Bug Database
Bug ID: 5048441
Votes 22
Synopsis Intermittent crashes in Java MarkSweep garbage collection methods
Category hotspot:runtime_system
Reported Against 1.4.2_03
Release Fixed
State 11-Closed, Not Reproducible, bug
Priority: 3-Medium
Related Bugs 6276921 , 6306530 , 5048446 , 4951940
Submit Date 17-MAY-2004
Description
Cingular is experiencing a crash in the JVM 3-4 times per week in production.  The runtime environment consists of JVM 1.4.2_03 with WebLogic Server 8.1 SP2.  The crashes are exhibiting multiple symptoms (some of which look like Bug ID 5008819), but appear to be C2 HotSpot compiler crashes:

 --- called from signal handler with signal -14217216 (SIG Unknown) ---
const Type*URShiftINode::Value(PhaseTransform*)const
Node*PhaseIterGVN::transform_old(Node*)
void PhaseIterGVN::optimize()
void Compile::Optimize()
Compile::Compile(ciEnv*,ciScope*,ciMethod*,int,int,int)
void C2Compiler::compile_method(ciEnv*,ciScope*,ciMethod*,int,int)
void CompileBroker::invoke_compiler_on_method(CompileTask*)
void CompileBroker::compiler_thread_loop()
void JavaThread::run()
_start (f0970, ff271000, 0, 0, 0, 0) + 134        
_lwp_start (0, 0, 0, 0, 0, 0)          


  xxxxx@xxxxx   2004-05-17
Work Around
N/A
Evaluation
  xxxxx@xxxxx   2004-05-17

The crash in URShiftINode::Value() must be bug 4951940, which was
fixed in 1.4.2_05. Try using 1.4.2_05.


  xxxxx@xxxxx   2004-06-03

I investigated the second crash and I think it is a GC problem.
I have a 2 GB core file in /net/jaberwocky/export/home2/work/bugs/5048441/ccbmw01

The crash happened in the following part of the MarkSweep::preserve_mark() method:

    _preserved_mark_stack->push(mark);

0xfed1c110: preserve_mark+0x00e8:       mov     %l3, %o0
0xfed1c114: preserve_mark+0x00ec:       ld      [%l3 + 0x4], %g2
0xfed1c118: preserve_mark+0x00f0:       ld      [%l3], %g5
0xfed1c11c: preserve_mark+0x00f4:       cmp     %g5, %g2
0xfed1c120: preserve_mark+0x00f8:       bne,pt %icc,preserve_mark+0x110
0xfed1c124: preserve_mark+0x00fc:       sethi   %hi(0x5400), %g2
0xfed1c128: preserve_mark+0x0100:       call    grow
0xfed1c12c: preserve_mark+0x0104:       mov     %g5, %o1
0xfed1c130: preserve_mark+0x0108:       ld      [%l3], %g5
0xfed1c134: preserve_mark+0x010c:       sethi   %hi(0x5400), %g2
0xfed1c138: preserve_mark+0x0110:       ld      [%l3 + 0x8], %g3
0xfed1c13c: preserve_mark+0x0114:       add     %g5, 0x1, %g4
0xfed1c140: preserve_mark+0x0118:       add     %g2, 0x11c, %g2
0xfed1c144: preserve_mark+0x011c:       st      %g4, [%l3]
0xfed1c148: preserve_mark+0x0120:       sll     %g5, 0x2, %g4
0xfed1c14c: preserve_mark+0x0124:       ld      [%l2 + %g2], %g2
0xfed1c150: preserve_mark+0x0128:       st      %l1, [%g3 + %g4] <<< SIGBUS here

g0-g3    0x00000000 0x00004000 0xff1c2514 0x00000002
g4-g7    0x00000008 0x00000002 0x00000000 0xff270200
o0-o3    0x02916f14 0x8212d588 0x00000028 0xff17e000
o4-o7    0x00000000 0x0000e805 0xfc77f620 0xfed1c02c
l0-l3    0xf23c5b10 0x47197381 0xff17e000 0x02916f14
l4-l7    0x02916b18 0xff1d7998 0xffffffff 0x00000004
i0-i3    0xf23c5b10 0x47197381 0x00e23120 0xff17e000
i4-i7    0x00000000 0x029170ac 0xfc77f680 0xfecc72f0
y        0x00000000
ccr      0x00000009
pc       0xfed1c150:preserve_mark+0x128 st      %l1, [%g3 + %g4]

So it has nothing to do with oops.
I think it has something to do with the size of the heap == 2^31
(and we use a 32-bit VM), and the fact that it is full:

Heap at VM Abort:
Heap
 def new generation   total 235968K, used 235968K [0x72000000, 0x82000000, 0x82000000)
  eden space 209792K, 100% used [0x72000000, 0x7ece0000, 0x7ece0000)
  from space 26176K, 100% used [0x80670000, 0x82000000, 0x82000000)
  to   space 26176K,   0% used [0x7ece0000, 0x7ece0000, 0x80670000)
 tenured generation   total 1835008K, used 1606773K [0x82000000, 0xf2000000, 0xf2000000)
   the space 1835008K,  87% used [0x82000000, 0xe411d558, 0xe411d600, 0xf2000000)
 compacting perm gen  total 131072K, used 47593K [0xf2000000, 0xfa000000, 0xfa000000)
   the space 131072K,  36% used [0xf2000000, 0xf4e7a718, 0xf4e7a800, 0xfa000000)

According to the core file, the following parameters were used:

java -server -verbose:gc -Xms2048m -Xmx2048m -XX:MaxPermSize=128m -XX:PermSize=128m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:SurvivorRatio=8 ...
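
As a cross-check, the generation sizes in the heap dump can be re-derived from these flags. The following is only a sketch: the helper names are hypothetical (not HotSpot code), and it deliberately ignores the alignment rounding the VM applies to eden and the survivor spaces.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers that redo the heap arithmetic for
// -Xmx2048m -XX:MaxNewSize=256m. Sizes are in KB, as in the heap dump.

// Tenured generation = total heap minus the new generation.
inline std::uint64_t tenured_kb(std::uint64_t heap_mb, std::uint64_t new_mb) {
    assert(new_mb <= heap_mb);
    return (heap_mb - new_mb) * 1024;
}

// Total heap in bytes; 2048 MB is exactly 2^31.
inline std::uint64_t heap_bytes(std::uint64_t heap_mb) {
    return heap_mb * 1024 * 1024;
}

// The "total" printed for the new generation matches eden plus one
// survivor space in this dump (the other survivor is held in reserve).
inline std::uint64_t new_gen_reported_kb(std::uint64_t eden_kb,
                                         std::uint64_t survivor_kb) {
    return eden_kb + survivor_kb;
}
```

With the values from this report, tenured_kb(2048, 256) gives 1835008, which matches the tenured generation total, and new_gen_reported_kb(209792, 26176) gives 235968, which matches the new generation total.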

It is Solaris 8:

@(#)SunOS 5.8 Generic 111297-01 April 2001

Assuming that the first crash is fixed in 1.4.2_05, I will pass this
bug to GC to investigate the second crash. It could be a runtime issue
with memory allocations of such size, but I don't see how it could be
a C2 problem (all compiler threads are in the waiting-for-new-task state).

----------------------------------------------------------------

Let me see if I can disassemble this.

MarkSweep::preserve_mark gets to this point:

    _preserved_mark_stack->push(mark);

and _preserved_mark_stack is declared as

    static GrowableArray<markOop>*         _preserved_mark_stack;

so that's really a call to GrowableArray<markOop>::push
which is defined as

    void push(const E elem) { append(elem); }

which inlines to 

    void append(const E elem) {
      check_nesting();
      if (_len == _max) grow(_len);
      _data[_len++] = (GrET*) elem;
    }

and knowing that GenericGrowableArray defines the fields

    int    _len;		// current length
    int    _max;		// maximum length
    GrET** _data;		// data array

so _len is at offset 0, _max is at offset 4, and _data is 
at offset 8, that corresponds to the disassembly you show:

    /* original_len = _len */
    0xfed1c130: preserve_mark+0x0108:       ld      [%l3], %g5
    /* ??? */
    0xfed1c134: preserve_mark+0x010c:       sethi   %hi(0x5400), %g2
    /* address_of_data = _data */
    0xfed1c138: preserve_mark+0x0110:       ld      [%l3 + 0x8], %g3
    /* new_len = original_len + 1 */
    0xfed1c13c: preserve_mark+0x0114:       add     %g5, 0x1, %g4
    /* ??? */
    0xfed1c140: preserve_mark+0x0118:       add     %g2, 0x11c, %g2
    /* _len = new_len */
    0xfed1c144: preserve_mark+0x011c:       st      %g4, [%l3]
    /* offset = original_len * sizeof(GrET*) */
    0xfed1c148: preserve_mark+0x0120:       sll     %g5, 0x2, %g4
    /* ??? */
    0xfed1c14c: preserve_mark+0x0124:       ld      [%l2 + %g2], %g2
    /* address_of_data+offset = l1 */
    0xfed1c150: preserve_mark+0x0128:       st      %l1, [%g3 + %g4] <<< SIGBUS here

which would be okay, except that we have 

    original_len:    g5: 0x00000002
    address_of_data: g3: 0x00000002
    offset:          g4: 0x00000008

so address_of_data+offset is going to be 0x0000000a, 
which is misaligned *and* on the zeroth page, so if 
we hadn't gotten a SIGBUS we would have gotten a 
SIGSEGV.
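
The field offsets and the faulting address can be re-derived with a small sketch. The struct is a simplified stand-in for the GenericGrowableArray fields, not the HotSpot definition; the helper names are hypothetical, and the 8 KB page size is an assumption about the SPARC Solaris machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the three GenericGrowableArray fields.
// With a 4-byte int the offsets are 0, 4 and 8 on 32-bit and 64-bit ABIs.
struct GrowableArrayLayout {
    int    len;   // current length (offset 0)
    int    max;   // maximum length (offset 4)
    void** data;  // data array     (offset 8)
};

static_assert(offsetof(GrowableArrayLayout, len) == 0, "len at offset 0");
static_assert(offsetof(GrowableArrayLayout, max) == 4, "max at offset 4");
static_assert(offsetof(GrowableArrayLayout, data) == 8, "data at offset 8");

// Address targeted by "st %l1, [%g3 + %g4]": data + (len << 2),
// computed in 32-bit arithmetic as on the 32-bit VM.
inline std::uint32_t store_address(std::uint32_t data, std::uint32_t len) {
    return data + (len << 2);  // sll %g5, 0x2, %g4, then %g3 + %g4
}

// A 4-byte store has to be 4-byte aligned on SPARC, otherwise SIGBUS.
inline bool is_word_aligned(std::uint32_t addr) {
    return (addr & 0x3u) == 0;
}

// Assumption: 8 KB base page size; the report does not state it.
inline bool in_zeroth_page(std::uint32_t addr, std::uint32_t page_size = 8192) {
    assert(page_size > 0);
    return addr < page_size;
}
```

With g3 = 0x2 and g5 = 0x2 from the register dump, store_address(0x2, 0x2) is 0xa, which fails is_word_aligned and falls inside the zeroth page.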

It is curious that original_len (g5) is 2.  That's 
either because we are just starting to push things 
onto the GrowableArray<markOop>, or because we've wrapped 
around the *int* used for _len and _max.  I have trouble 
believing we've wrapped, given that we only have a 2GB 
heap, so we shouldn't be able to have more than 256M 
minimal objects, so even if they all were locked or had 
hashcodes (and so had to be pushed), we wouldn't be 
anywhere near wrapping.  (This is a concern for the 64-bit 
VM, though, where we could have more than 2G of objects 
that had to be pushed.)
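
The wrap-around argument can be checked with a few lines of arithmetic. This is a sketch under one assumption not stated in the report: a minimal object on the 32-bit VM occupies 8 bytes (two 4-byte header words). The helper names are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Upper bound on the number of objects in a heap, given the smallest
// possible object size.
inline std::uint64_t max_objects(std::uint64_t heap_bytes,
                                 std::uint64_t min_object_bytes) {
    assert(min_object_bytes > 0);
    return heap_bytes / min_object_bytes;
}

// True if that many pushes still fit in the int used for _len and _max.
inline bool fits_in_int(std::uint64_t count) {
    return count <= static_cast<std::uint64_t>(INT_MAX);
}
```

For a 2^31-byte heap and 8-byte objects the bound is 2^28 (about 268 million), which fits in an int with a factor of eight to spare. A count of 2^31, which a 64-bit VM with a much larger heap could in principle reach, does not fit.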

If you run with -XX:+PrintGC -XX:+Verbose, you should 
get lines with your full collections from

    Restoring %d marks

that say how many marks were pushed and restored.  It 
would be interesting to see if this number is relatively 
modest (e.g., a few thousands or less) or ridiculously 
large (e.g., approaching 2^31).
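
To pull that number out of a captured GC log, something like the following sketch could be used. It assumes only that the line contains the text "Restoring N marks" as in the format string above; any surrounding prefix or suffix in real output is unknown here, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Extracts N from a log line containing "Restoring N marks".
// Returns true and stores N in marks_out on a match, false otherwise.
inline bool parse_restored_marks(const std::string& line, long& marks_out) {
    static const char kPrefix[] = "Restoring ";
    const std::string::size_type pos = line.find(kPrefix);
    if (pos == std::string::npos) return false;

    const char* start = line.c_str() + pos + std::strlen(kPrefix);
    char* end = nullptr;
    const long value = std::strtol(start, &end, 10);
    if (end == start) return false;                        // no digits found
    if (std::strncmp(end, " marks", 6) != 0) return false; // wrong suffix

    marks_out = value;
    return true;
}
```

A caller could feed each log line through parse_restored_marks and flag any count that is negative or far larger than a few thousand for closer inspection.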

  xxxxx@xxxxx   2004-06-07
----------------------------------------------------------------
Since we haven't seen a failure of the GrowableArray code 
in quite a while, what's the chance this is a memory smash?  
The fact that both the _len and _data fields are 0x2 is 
suspicious.  Is there any user JNI code running?  How does 
the program behave when run with -Xcheck:jni?  (Though, 
in JDK-1.4.2, -Xcheck:jni wasn't nearly as good as it is 
in JDK-1.5.0.)

Do we have any more core files to examine for similarities?
Without more data I might have to mark this bug "incomplete".

  xxxxx@xxxxx   2004-06-22
----------------------------------------------------------------

Sending to runtime team for evaluation as this appears to be a memory stomp.
  xxxxx@xxxxx   2005-06-13 18:51:05 GMT

Will be closing this bug in one month (7/13/2005) unless we hear back from
the customer with more details. Please let us know if this issue is present in
the 1.4.2_09 release of the JDK. It has been almost a year since the last update, so I guess this is no longer an issue or the customer found a workaround. De-committing from Mustang.

  xxxxx@xxxxx   2005-06-13 20:52:06 GMT
Posted Date : 2005-08-08 16:58:21.0
Comments
  

Submitted On 30-JUN-2004
mechner
Hello, 

I'm experiencing what might be a related bug. I'm getting regular VM crashes, but it doesn't even generate the standard messages. Instead, I get this: 

Exception java.lang.OutOfMemoryError: requested 24536 bytes for GrET* in D:/BUILD_AREA/jdk142-update/ws/fcs/hotspot\src\share\vm\utilities\growableArray.cpp. Out of swap space?

Printed to standard out, and the app terminates with no other output. 

I'm experiencing it in win2k on both dual Athlon and dual Opteron machines. By the way, memory is not (close to) exhausted.. in this one the memory usage was around 50MB, with these vm args: 

-Djava.library.path="c:/pragma/lib" -ea -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xms512m -Xmx1g

If there's anything I can do to generate more useful information, please let me know: mechner at pragmafs dot com. 

Thanks


Submitted On 30-JUN-2004
mechner
Oh, sorry, the VM is build 1.4.2_04-b05


Submitted On 01-JUL-2004
mechner
By the way again, my crashes are not intermittent; they occur every few minutes (!!) as long as -XX:+UseConcMarkSweepGC is set. With no GC opts it seems stable. 


Submitted On 07-JUL-2004
mechner
I also experience the problem with no GC arguments at all. 


Submitted On 12-JUL-2004
BeanRobert
Even build 1.5.0-beta2-b51 !


Submitted On 12-JUL-2004
BeanRobert
Hello,

I can reproduce this error  within minutes in my environment on the following VMs:
- build 1.4.2_04-b05
- build 1.4.2_05-b04
Does anybody know any workaround yet?


Submitted On 11-MAR-2005
albjon42
We get these regularly on our production systems, 1.4.2_06-b03


Submitted On 06-APR-2005
albjon42
Also present in tests in 1.5.0_02-b09


Submitted On 17-OCT-2005
I'm getting the same bug  with 1.5.0_05-b05. Like everyone else, I'm nowhere near actually running out of memory. I'm watching memory usage with JProfiler, and with everything looking good and no signs of leaks... my app suddenly dies.

I very desperately need a fix or a workaround for this, because my code needs to run reliably 24/7 for months at a time.


Submitted On 23-AUG-2006
Hello. 
I've got the same problem on 1.5.0_06. 
Exception java.lang.OutOfMemoryError: requested 8192000 bytes for GrET* in C:/BUILD_AREA/jdk1.5.0_06/hotspot\src\share\vm\utilities\growableArray.cpp. Out of swap space?
ze = 4472

The application terminated without any additional output. The application uses JNI code.
The OS is WinXP SP2.


Submitted On 29-SEP-2006
jwcarman
We're having the same problem.  Is anyone watching this bug?


Submitted On 27-OCT-2006
AnmolB
Same issue with 1.5.0_09 on a Server VM, with heavy network I/O over RMI and a 900M heap.


Submitted On 23-APR-2007
Having the same problem in our production environment.  It occurs approximately once every 2 days.  Error looks as follows:  

Exception java.lang.OutOfMemoryError: requested 41943040 bytes for GrET* in C:/BUILD_AREA/jdk1.5.0_10/hotspot\src\share\vm\utilities\growableArray.cpp. Out of swap space?

Any info would be greatly appreciated.


Submitted On 02-OCT-2007
avci
Same problem.
Java servers terminate with the same output as in the comments above. I have seen this problem 3 times in the last week, but I can't reproduce it.
JVM parameters are -Xconcurrentio -XX:+UseLWPSynchronization -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xms2048m -Xmx2048m

Servers heavily use JNI.
Server uses more than 1GB memory.
I have seen SIGSEGV and SIGBUS errors, too.

System definition:
-----------------------------
Sun Solaris,
java version "1.5.0_08"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_08-b03) Java HotSpot(TM) Server VM (build 1.5.0_08-b03, mixed mode)

error output
------------------
Exception java.lang.OutOfMemoryError: requested 8192000 bytes for GrET* in /BUILD_AREA/jdk1.5.0_08/hotspot/src/share/vm/utilities/growableArray.cpp. Out of swap space?




Submitted On 30-JAN-2009
ravi999
Hello, 
we saw this issue with jre1.5.0_6. Is this resolved? Please let us know what the fix is.


Submitted On 20-FEB-2009
jmelvin
This bug was marked as Closed-Not-Reproducible some 3 years ago.  It may have been resolved as a byproduct of another fix.  Please try the latest update in the Java 5 release train available here...

http://java.sun.com/javase/downloads/index_jdk5.jsp

You might also check out the support offerings from Java For Business available here...

http://java.sun.com/javase/support/javaseforbusiness/index.jsp



PLEASE NOTE: JDK6 is formerly known as Project Mustang