United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: 6843752 missing code for an anti-dependent Phi in GCM
6843752 : missing code for an anti-dependent Phi in GCM

Details
Type:
Bug
Submit Date:
2009-05-21
Status:
Closed
Updated Date:
2011-03-08
Project Name:
JDK
Resolved Date:
2011-03-08
Component:
hotspot
OS:
windows_xp
Sub-Component:
compiler
CPU:
x86
Priority:
P3
Resolution:
Fixed
Affected Versions:
6u13,6u14
Fixed Versions:
hs16

Related Reports
Backport:
Backport:
Duplicate:
Relates:
Relates:

Sub Tasks

Description
FULL PRODUCT VERSION :
java version "1.6.0_12"
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) Server VM (build 11.2-b01, mixed mode)

FULL OS VERSION :
Microsoft Windows XP [Version 5.1.2600]


A DESCRIPTION OF THE PROBLEM :
We've found one of our programs suddenly started failing in a simple loop; depending on the exact code it would loop infinitely or cause a null pointer exception.  This happened when upgrading to 1.6.0 update 10.

We've now managed to extract the loop and trigger the bug with a simple test program, and found that:

Java 1.6.0 update 6 and update 7 works OK.
Java 1.6.0 update 10 to 13 fails.

It always fails with the "-server" option, but works with "-client".
Adding an assert for the pointer that is null, makes the program work when running with the "-ea" option.
Using the "-Xint" option also makes the program work OK.


THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: No

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
run the attached reproduction program using the "-server" option.

EXPECTED VERSUS ACTUAL BEHAVIOR :
expected:

C:\Documents and Settings\arnej\break>java -server -ea BreakJava
successfully performed 1000000 cases
successfully performed 2000000 cases
successfully performed 3000000 cases
successfully performed 4000000 cases
successfully performed 5000000 cases
successfully performed 6000000 cases
successfully performed 7000000 cases
[...etc...]


actual:

C:\Documents and Settings\arnej\break>java -server BreakJava
ERROR: crashed during case 27456
java.lang.NullPointerException
        at BreakJava.removeItems(BreakJava.java:53)
        at BreakJava.perform(BreakJava.java:65)
        at BreakJava.main(BreakJava.java:78)

C:\Documents and Settings\arnej\break>java -server BreakJava
ERROR: crashed during case 10496
java.lang.NullPointerException
        at BreakJava.removeItems(BreakJava.java:53)
        at BreakJava.perform(BreakJava.java:65)
        at BreakJava.main(BreakJava.java:78)

C:\Documents and Settings\arnej\break>java -server BreakJava
ERROR: crashed during case 10816
java.lang.NullPointerException
        at BreakJava.removeItems(BreakJava.java:53)
        at BreakJava.perform(BreakJava.java:65)
        at BreakJava.main(BreakJava.java:78)

C:\Documents and Settings\arnej\break>java -server BreakJava
ERROR: crashed during case 37440
java.lang.NullPointerException
        at BreakJava.removeItems(BreakJava.java:53)
        at BreakJava.perform(BreakJava.java:65)
        at BreakJava.main(BreakJava.java:78)

ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.lang.NullPointerException
        at BreakJava.removeItems(BreakJava.java:53)
        at BreakJava.perform(BreakJava.java:65)
        at BreakJava.main(BreakJava.java:78)


REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
public class BreakJava {

    Item list;

    static class Item {
        public Item    next;
        public Item    prev;
        public boolean remove;

        Item(boolean r) { remove = r; }
    }

    private void linkIn(Item item) {
        Item head = list;
        if (head == null) {
            item.next = item;
            item.prev = item;
            list = item;
        } else {
            item.next = head;
            item.prev = head.prev;
            head.prev.next = item;
            head.prev = item;
        }
    }

    private void linkOut(Item item) {
        Item head = list;
        if (item.next == item) {
            list = null;
        } else {
            item.prev.next = item.next;
            item.next.prev = item.prev;
            if (head == item) {
                list = item.next;
            }
        }
        item.next = null;
        item.prev = null; // is this the null pointer we are seeing?
    }

    private void removeItems() {
        Item item = list;
        if (item == null) {
            return;
        }
        Item last = item.prev;
        assert(last != null);
        boolean done = false;
        while (!done) {
            // the original code "done = (item == last);" triggered an infinite loop
            // and was changed slightly in order to produce an exception instead.
            done = (item.next == last.next);
            item = item.next;
            if (item.prev.remove) {
                linkOut(item.prev);
            }
        }
    }

    public void perform(int numItems) {
        for (int i = 0; i < numItems; i++) {
            linkIn(new Item(i == 0));
        }
        removeItems();
        list = null;
    }

    static public void main(String[] args) {
        int caseCnt = 0;
        BreakJava bj = new BreakJava();
        try {
            for (;;) {
                int numItems = (++caseCnt % 2);
                if ((caseCnt % 64) == 0) {
                    numItems = 5;
                }
                bj.perform(numItems);
                if ((caseCnt % 1000000) == 0) {
                    System.out.println("successfully performed " + caseCnt + " cases");
                }
            }
        } catch (Exception e) {
            System.out.println("ERROR: crashed during case " + caseCnt);
            e.printStackTrace(System.out);
        }
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
using an array-based container instead of a double-linked list is a workaround in this case, but we're a bit worried about other loops using similar constructs that might trigger the same bug.

Release Regression From : 6u7
The above release value was the last known release where this 
bug was not reproducible. Since then there has been a regression.

                                    

Comments
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/1851e1fb420e
                                     
2009-05-28
PUBLIC COMMENTS

Problem:
Changes for 6470497 (part G) incorrectly removed the code
for anti-dependent PHI pinned below load's 'early' block.
As result the load placed below this phi into the block with
lowest frequency and a wrong value is loaded.
In the test case the load placed inside loop which modifies
the loaded value to NULL. As result we got NULL exception.

Solution:
Don't place a load below anti-dependent PHI.
Added the regression test modified to show the problem
in latest VM (the loop body is executed less time then
the code before the loop).
                                     
2009-05-28
SUGGESTED FIX

src/share/vm/opto/gcm.cpp	Fri May 22 18:20:43 2009 -0700
@@ -595,6 +595,9 @@ Block* PhaseCFG::insert_anti_dependences
             assert(!LCA_orig->dominates(pred_block) ||
                    early->dominates(pred_block), "early is high enough");
             must_raise_LCA = true;
+          } else {
+            // anti-dependent upon PHI pinned below 'early', no edge needed
+            LCA = early;             // but can not schedule below 'early'
           }
         }
       }
                                     
2009-05-23
EVALUATION

The regression introduced by changes:

6470497 (part G) C2 cleanups before larger changes

which incorrectly removed the code for anti-dependent PHI pinned below load's 'early' block.
                                     
2009-05-23
EVALUATION

Fix for 6384206 may hided the problem. The regression was introduced with next changes:

6510732: Compute consistent block frequencies

But even that change may only expose a pre-existing problem.

The root of the exception is GCM moved the next load inside the loop

        Item last = item.prev;

which is wrong since during call to linkOut() item.prev = null. The comment in the test is correct -
it is the null which produce the exception on the next loop iteration.

025   B2: #	B3 <- B1  Freq: 3267.82
025   	MOV    EDI,EAX
027   	MOV    [ESP + #8],EAX
02b   	XOR    EDX,EDX
02d
02d   B3: #	B25 B4 <- B2 B28 	Loop: B3-B28  Freq: 3667.9
02d   	TEST   EDX,EDX
02f   	Jne    B25  P=0.471259 C=4645.000000
02f
035   B4: #	B33 B5 <- B3  Freq: 1939.37
035   	MOV    [ESP + #0],ECX
038   	MOV    EBX,[EAX + #16] ! Field Test$Item.prev <<<<<<<<<<<<<< incorrectly placed inside the loop
03b   	MOV    [ESP + #4],EBX
03f   	MOV    EBX,[EBX + #12] ! Field Test$Item.next <<<<<<<<<<<<<< EBX == NULL
042   	NullCheck EBX


With 6384206 fix it is:

025   B2: #	B3 <- B1  Freq: 3267.82
025   	MOV    ESI,[EDI + #16] ! Field Test$Item.prev <<<<<<<<<< correctly placed before the loop
028   	XOR    EBP,EBP
02a
02a   B3: #	B22 B4 <- B2 B20 	Loop: B3-B20  Freq: 6534.11
02a   	TEST   EBP,EBP
02c   	Jne    B22  P=0.471259 C=4645.000000
02c
032   B4: #	B27 B5 <- B3  Freq: 3454.85
032   	MOV    EAX,[ESI + #12] ! Field Test$Item.next
035   	NullCheck ESI


But we could be simply "lucky" since the load is placed into the lowest-fequency block.
B2:Freq > B4:Freq in the failing case and B2:Freq < B4:Freq in the passing case.

Note, the load has the control edge before the loop:

 31     IfTrue  ===  28  [[ 325  37 ]] #1 !jvms: Test::removeItems @ bci:6
 37     LoadP   ===  31  7  36  [[ 330  98  154  280  330 ]]  @Test$Item+16 *, name=prev, idx=5; #Test$Item *  Oop:Test$Item * !orig=[52] !jvms: Test::removeItems @ bci:11
 325    Loop    ===  325  31  427  [[ 325  324  321  59  322  323  61  66  379 ]]  !orig=[55] !jvms: Test::removeItems @ bci:35
                                     
2009-05-22
EVALUATION

It seems this is a duplicate of 6384206, which is integrated in HS14b06.  At least starting with this changeset the bug is fixed, but Tom should comment on this one and confirm that it's a duplicate.
                                     
2009-05-21



Hardware and Software, Engineered to Work Together