Bug Database
Bug ID: 4753265
Votes: 2
Synopsis: [1.4.0_02] Crash in 64 bit HotSpot Server JVM, backport 4746263
Category: hotspot:compiler2
Reported Against: 1.4.0_03
Release Fixed: 1.4.0_04
State: 10-Fix Delivered, bug
Priority: 2-High
Related Bugs: 4746263
Submit Date: 25-SEP-2002
Description
The customer is trying to benchmark their application using 64-bit J2SE 1.4 on
E4500 and E10K systems, but is running into crashes.  They have
tried the 1.4.0_02, early access 1.4.0_03, and 1.4.1 64-bit
versions, and the problem occurs in all of them with the server HotSpot
compiler.  They have also tried 1.4.0_02 -client using the
32-bit binary; the crashes don't occur there, but the performance is
only 40% of what -server 64-bit can provide.

The first crash message below is seen consistently using 64-Bit Server VM
1.4.0_03-ea-b01. The customer also tested with a 64-bit fastdebug build of 1.4.0_03,
which failed the assertion assert(top <= end, "pointers out of order").
The stack trace is located in the attachments, and the actual core files can
be provided upon request.

Unexpected Signal : 10 occurred at PC=0xFFFFFFFF39A3001C
Function=printString (compiled Java code)
Library=(N/A)

Current Java thread:

Dynamic libraries:
0x100000000 	/c1t1/tal/jre/14003/bin/sparcv9/java
0xffffffff7f300000 	/usr/lib/64/libthread.so.1
0xffffffff7f500000 	/usr/lib/64/libdl.so.1
0xffffffff7ef00000 	/usr/lib/64/libc.so.1
0xffffffff7ee00000 	/usr/platform/SUNW,Ultra-Enterprise-10000/lib/sparcv9/libc_psr.so.1
0xffffffff7d000000 	/c1t1/tal/jre/14003/lib/sparcv9/server/libjvm.so
0xffffffff7ce00000 	/usr/lib/64/libCrun.so.1
0xffffffff7cc00000 	/usr/lib/64/libsocket.so.1
0xffffffff7ca00000 	/usr/lib/64/libnsl.so.1
0xffffffff7c800000 	/usr/lib/64/libm.so.1
0xffffffff7db00000 	/usr/lib/64/libw.so.1
0xffffffff7c500000 	/usr/lib/64/libmp.so.2
0xffffffff7c200000 	/c1t1/tal/jre/14003/lib/sparcv9/native_threads/libhpi.so
0xffffffff7c000000 	/c1t1/tal/jre/14003/lib/sparcv9/libverify.so
0xffffffff7be00000 	/c1t1/tal/jre/14003/lib/sparcv9/libjava.so
0xffffffff7bb00000 	/c1t1/tal/jre/14003/lib/sparcv9/libzip.so
0xfffffffee3700000 	/usr/lib/locale/en_US.ISO8859-1/sparcv9/en_US.ISO8859-1.so.2
0xfffffffee1a00000 	/t3-6/gatherer/g20/Shared/libxacct_native_sparcv9_SunOS.so
0xfffffffee1800000 	/c1t1/tal/jre/14003/lib/sparcv9/libnet.so
0xfffffffee0400000 	/t3-6/gatherer/g20/Shared/libdb_java-3.3-sparcv9-SunOS.so
0xfffffffedf400000 	/c1t1/tal/jre/14003/lib/sparcv9/libawt.so
0xfffffffedf200000 	/c1t1/tal/jre/14003/lib/sparcv9/libmlib_image.so
0xfffffffedf000000 	/c1t1/tal/jre/14003/lib/sparcv9/motif21/libmawt.so
0xfffffffedeb00000 	/usr/dt/lib/sparcv9/libXm.so.4
0xfffffffede900000 	/usr/openwin/lib/sparcv9/libXt.so.4
0xfffffffede700000 	/usr/openwin/lib/sparcv9/libXext.so.0
0xfffffffede500000 	/usr/openwin/lib/sparcv9/libXtst.so.1
0xfffffffede200000 	/usr/openwin/lib/sparcv9/libX11.so.4
0xfffffffede000000 	/usr/openwin/lib/sparcv9/libdps.so.5
0xfffffffedde00000 	/usr/openwin/lib/sparcv9/libSM.so.6
0xfffffffeddb00000 	/usr/openwin/lib/sparcv9/libICE.so.6
0xfffffffedd900000 	/usr/openwin/lib/sparcv9/libdga.so.1

Local Time = Wed Sep 18 00:34:50 2002
Elapsed Time = 577
#
# HotSpot Virtual Machine Error : 10
# Error ID : 4F530E43505002D5 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.0_03-ea-b01 mixed mode)
#
# An error report file has been saved as hs_err_pid22037.log.
# Please refer to the file for further information.
#

#
# HotSpot Virtual Machine Error, assertion failure
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.0_03-internal-20020821-debug mixed mode)
#
# assert(top <= end, "pointers out of order")
#
# Error ID: /net/jdk/export/jpse01/hshen/J2SE/140/hotspot/src/share/vm/memory/collectedHeap.inline.hpp, 121 [ Patched ]
#
# Problematic Thread: prio=5 tid=0x10573c2b8 nid=0x5c runnable 
#
Dumping core....
Work Around
N/A
Evaluation
This is the basic information:

HeapWord*CollectedHeap::common_mem_allocate_noinit

 ffffffff7c4bb978 HeapWord*CollectedHeap::common_mem_allocate_noinit(unsigned long,Thread*) (5, 10573c2b8, fffffffed5c00c40, ffffffff7ce31112, ffffffff7c8688e4,0) + 80
 ffffffff7c4badb8 oopDesc*CollectedHeap::obj_allocate(KlassHandle,int,Thread*) (10573c5b8, 5, 10573c2b8, 1f, ffffffff7c8688e4, 0) + c0
 ffffffff7c4b30c8 instanceOopDesc*instanceKlass::allocate_instance(Thread*) (ffffffff3300bfb8, 10573c2b8, 0, 0, 2, 1001ced58) + 4b8
 ffffffff7cd07f40 void OptoRuntime::new_C(klassOopDesc*,JavaThread*) (ffffffff3300bfa8, 10573c2b8, 0, 0, 30, fffffffefaeba558) + 1a0
 ffffffff38998bbc ???????? (ffffffff3300bfa8, fffffffefaeba558, 2, fffffffefaeba540, 2, 86)

# assert(top <= end, "pointers out of order")
#
# Error ID: /net/jdk/export/jpse01/hshen/J2SE/140/hotspot/src/share/vm/memory/collectedHeap.inline.hpp, 121 [ Patched ]
#
   115  HeapWord* CollectedHeap::allocate_from_tlab(Thread* thread, size_t size) {
   116    assert(UseTLAB, "should use UseTLAB");
   117  
   118    HeapWord* top = thread->tlab().top();
   119    HeapWord* end = thread->tlab().end();
   120    
   121    assert(top <= end, "pointers out of order");
   122  
   123    if (pointer_delta(end, top) >= size) {
   124      // successful thread-local allocation
   125      if (!ZeroTLAB) {
   126        // need to clear individual objects
   127        Memory::set_words(top, size);
   128      }
   129      // This addition is safe because we know that top is
   130      // at least size below end, so the add can't wrap.
   131      thread->tlab().set_top(top + size);
   132
   133      assert(thread->tlab().invariants(), "TLAB integrity violated");
   134      return top;
   135    }
   136    // Otherwise...
   137    return allocate_from_tlab_slow(thread, size);
   138  }
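
For readers unfamiliar with thread-local allocation buffers, the listing above amounts to a bump-pointer allocator. The following is a minimal standalone sketch, not HotSpot code: `Tlab`, `word_delta`, `pointers_in_order`, `allocate_from_tlab_sketch`, and the word-sized `HeapWord` alias are hypothetical names chosen for illustration. It assumes the free space is computed as an unsigned difference, which would explain why `top <= end` matters: if `top` ever exceeds `end`, the subtraction wraps and the size check can pass spuriously.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a heap word; not the HotSpot type.
using HeapWord = std::uintptr_t;

// Hypothetical thread-local allocation buffer: [top, end) is the free space.
struct Tlab {
    HeapWord* top;
    HeapWord* end;
};

// Unsigned distance in words from 'right' up to 'left'.
// If left < right, the subtraction wraps around to a very large value.
inline std::size_t word_delta(const HeapWord* left, const HeapWord* right) {
    return (reinterpret_cast<std::uintptr_t>(left) -
            reinterpret_cast<std::uintptr_t>(right)) / sizeof(HeapWord);
}

// The invariant checked by the failing assert in the bug report.
inline bool pointers_in_order(const Tlab& t) { return t.top <= t.end; }

// Bump-pointer fast path: returns the old top on success, or nullptr when
// the buffer is too small (where a real VM would take a slow path instead).
inline HeapWord* allocate_from_tlab_sketch(Tlab& t, std::size_t size) {
    assert(pointers_in_order(t) && "pointers out of order");
    if (word_delta(t.end, t.top) >= size) {
        HeapWord* obj = t.top;
        t.top = obj + size;   // safe: at least 'size' words remain below end
        return obj;
    }
    return nullptr;
}
```

In a build without asserts, an out-of-order pair under this assumption would let the fast path hand out memory beyond the buffer, which is consistent with a product VM crashing where the debug VM asserts.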
The stack trace and source above are the basic information in the bug report.

Awaiting core file/machine access... cycling home now.

  xxxxx@xxxxx   2002-09-25

----- -----

  xxxxx@xxxxx   2002-09-25

Hitting this assert in 1.4.1 is probably an occurrence of the following bug:
4746263 JDK 1.4.1 dumps core during ECperf; fails debug assert top <= end

A fix for this is going into 1.4.1_01 and is available in HotSpot's current main/baseline:
/net/balvenie/export/imgr_home/archive/main/baseline/20020920073822.mpal.c2_merge_20020919/product.tgz

----- -----
  xxxxx@xxxxx   2002-09-27

Customer has confirmed that crashes no longer occur with the fix for 4746263 in 1.4.1_01
and would like the fix backported to the next 1.4.0_x as well.

===============================
  xxxxx@xxxxx   2002-10-03

Unfortunately, the following does not appear to be sufficient to
fix this in 1.4.0:

------- memnode.hpp -------
*** sccs.HmayZJ	Thu Oct  3 10:35:43 2002
--- memnode.hpp	Mon Sep 30 12:33:33 2002
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_HDR
! #pragma ident "%W% %E% %U% JVM"
  #endif
  /*
   * Copyright 1991-2002 Sun Microsystems, Inc.  All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_HDR
! #pragma ident "@(#)memnode.hpp	1.87 02/09/30 12:31:36 JVM"
  #endif
  /*
   * Copyright 1991-2002 Sun Microsystems, Inc.  All rights reserved.
***************
*** 115,120 ****
--- 115,130 ----
    virtual uint ideal_reg() const { return Op_RegI; }
    virtual Node *Ideal(PhaseGVN *phase, bool can_reshape);
    virtual int store_Opcode() const { return Op_StoreB; }
+   // depends_only_on_test is almost always true, and needs to be almost always
+   // true to enable key hoisting & commoning optimizations.  However, for the
+   // special case of RawPtr loads from TLS top & end, the control edge carries
+   // the dependence preventing hoisting past a Safepoint instead of the memory
+   // edge.  (An unfortunate consequence of having Safepoints not set Raw
+   // Memory; itself an unfortunate consequence of having Nodes which produce
+   // results (new raw memory state) inside of loops preventing all manner of
+   // other optimizations).  Basically, it's ugly but so is the alternative.
+   // See comment in graphkit.cpp, around line 1923 GraphKit::allocate_heap.
+   virtual bool depends_only_on_test() const { return adr_type() != TypeRawPtr::BOTTOM; } 
  };
  
  //------------------------------LoadCNode--------------------------------------
***************
*** 148,153 ****
--- 158,164 ----
      : LoadINode(c,mem,adr,TypeAryPtr::RANGE,ti) {}
    virtual int Opcode() const;
    virtual const Type *Value( PhaseTransform *phase ) const;
+   virtual bool depends_only_on_test() const { return true; } 
  };
  
  //------------------------------LoadLNode--------------------------------------
***************
*** 322,327 ****
--- 333,339 ----
      : LoadPNode(c,mem,adr,TypeRawPtr::BOTTOM, TypeRawPtr::BOTTOM) {}
    virtual int Opcode() const;
    virtual int store_Opcode() const { return Op_StorePConditional; }
+   virtual bool depends_only_on_test() const { return true; } 
  };
  
  //------------------------------LoadLLockedNode---------------------------------
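
The comment added to memnode.hpp above describes loads of the TLAB top and end being hoisted past a Safepoint. As a rough illustration only, assuming a safepoint can retire a thread's buffer and install a fresh one, the sketch below models what a stale `end` combined with a fresh `top` looks like. All names (`Tlab`, `Snapshot`, `safepoint_refill`, `loads_after_safepoint`, `end_load_hoisted`) are hypothetical; this is plain C++, not C2 IR or HotSpot code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a heap word; not the HotSpot type.
using HeapWord = std::uintptr_t;

// Hypothetical thread-local allocation buffer: [top, end) is the free space.
struct Tlab {
    HeapWord* top;
    HeapWord* end;
};

// The pair of values the allocation fast path ends up comparing.
struct Snapshot {
    HeapWord* top;
    HeapWord* end;
};

// Assumed safepoint action: the old buffer is retired, a new one installed.
inline void safepoint_refill(Tlab& t, HeapWord* new_start, std::size_t words) {
    assert(new_start != nullptr);
    t.top = new_start;
    t.end = new_start + words;
}

// Intended order: both loads happen after the safepoint.
inline Snapshot loads_after_safepoint(Tlab& t, HeapWord* new_start,
                                      std::size_t words) {
    safepoint_refill(t, new_start, words);
    return Snapshot{t.top, t.end};
}

// Models the suspected miscompile: the load of 'end' is hoisted above the
// safepoint, so it is paired with a 'top' read from the new buffer.
inline Snapshot end_load_hoisted(Tlab& t, HeapWord* new_start,
                                 std::size_t words) {
    HeapWord* stale_end = t.end;            // read before the safepoint
    safepoint_refill(t, new_start, words);
    return Snapshot{t.top, stale_end};      // fresh top, stale end
}
```

When the new buffer sits at a higher address than the old one, the hoisted variant yields `top > end`, the condition the debug assert reports.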

------- loopopts.cpp -------
*** sccs.pcai0J	Thu Oct  3 10:36:19 2002
--- loopopts.cpp	Mon Sep 30 12:33:33 2002
***************
*** 1,5 ****
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "%W% %E% %U% JVM"
  #endif
  //
  // Copyright 1997-2002 Sun Microsystems, Inc.  All rights reserved.
--- 1,5 ----
  #ifdef USE_PRAGMA_IDENT_SRC
! #pragma ident "@(#)loopopts.cpp	1.162 02/09/30 12:31:22 JVM"
  #endif
  //
  // Copyright 1997-2002 Sun Microsystems, Inc.  All rights reserved.
***************
*** 629,638 ****
      // If trying to do a 'Split-If' at the loop head, it is only
      // profitable if the cmp folds up on BOTH paths.  Otherwise we
      // risk peeling a loop forever.
-     // CNC - Disabled for now.
-     if( n_ctrl->is_Loop() )
-       policy = 999;             // Policy requires BOTH paths to win  
  
      // Split compare 'n' through the merge point if it is profitable
      Node *phi = split_thru_phi( n, n_ctrl, policy );
      if( !phi ) return;
--- 629,647 ----
      // If trying to do a 'Split-If' at the loop head, it is only
      // profitable if the cmp folds up on BOTH paths.  Otherwise we
      // risk peeling a loop forever.
  
+     // CNC - Disabled for now.  Requires careful handling of loop
+     // body selection for the cloned code.  Also, make sure we check
+     // for any input path not being in the same loop as n_ctrl.  For
+     // irreducible loops we cannot check for 'n_ctrl->is_Loop()'
+     // because the alternative loop entry points won't be converted
+     // into LoopNodes.
+     IdealLoopTree *n_loop = get_loop(n_ctrl);
+     for( uint j = 1; j < n_ctrl->req(); j++ )
+       if( get_loop(n_ctrl->in(j)) != n_loop )
+         return;
+ 
+ 
      // Split compare 'n' through the merge point if it is profitable
      Node *phi = split_thru_phi( n, n_ctrl, policy );
      if( !phi ) return;
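
The guard added to loopopts.cpp bails out of Split-If whenever any input path of the merge point belongs to a different loop than the merge point itself. A simplified sketch of that membership test follows; `Loop`, `CtrlNode`, and `all_inputs_in_same_loop` are hypothetical stand-ins, not C2 classes, and input slot 0 is skipped only to mirror the loop in the diff, which starts at j = 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical loop identity; two nodes are in the same loop when they
// point at the same Loop object.
struct Loop {};

// Hypothetical control node: the loop it belongs to plus its input edges.
struct CtrlNode {
    const Loop* loop;
    std::vector<const CtrlNode*> inputs;
};

// Returns true only if every input from slot 1 onward is in the same loop
// as 'n'. A caller would skip the Split-If transformation on false.
inline bool all_inputs_in_same_loop(const CtrlNode& n) {
    for (std::size_t j = 1; j < n.inputs.size(); ++j) {
        const CtrlNode* in = n.inputs[j];
        if (in == nullptr || in->loop != n.loop) {
            return false;
        }
    }
    return true;
}
```

A loop head fails this test because its entry path comes from outside the loop, and the test does not depend on the head having been recognized as a loop node, which is the case the diff's comment raises for irreducible loops.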

Any suggestions?

Here's the stack/info after applying the above, with the same assert:

# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.4.0+4753265-TEST+20020930.152537+chrisph-debug mixed mode)
#
# assert(top <= end, "pointers out of order")
#
# Error ID: /net/altair.east/terra/space5/chrisph/4746263/build/src/share/vm/memory/collectedHeap.inline.hpp, 121 [ Patched ]
#
# Problematic Thread: prio=5 tid=0x1059d8228 nid=0x38 runnable 
#
Dumping core....

core '/t3-1/gatherer/g29/Gatherer/core' of 15088:	/c1t1/tal/jre/14003e/bin/sparcv9/java -server -showversion -Xms1280m -
-----------------  lwp# 57 / thread# 56  --------------------
...
 ffffffff7c4bd800 HeapWord*CollectedHeap::common_mem_allocate_noinit(unsigned long,Thread*) (5, 1059d8228, fffffffeba000c40, ffffffff7ce3c692, ffffffff7c86b164, 0) + 80
 ffffffff7c4bcc40 oopDesc*CollectedHeap::obj_allocate(KlassHandle,int,Thread*) (1004dd608, 5, 1059d8228, 1f, ffffffff7c86b164, 0) + c0
 ffffffff7c4b4f5c instanceOopDesc*instanceKlass::allocate_instance(Thread*) (ffffffff3300bfb8, 1059d8228, 0, 0, 2, 1001ced98) + 4c4
 ffffffff7cd0ac38 void OptoRuntime::new_C(klassOopDesc*,JavaThread*) (ffffffff3300bfa8, 1059d8228, 0, 0, 37, fffffffeda724390) + 1a0

  xxxxx@xxxxx   2002-10-03

Hmm, I seem to have botched the backport... I am moving the code
from LoadBNode to LoadPNode in memnode.hpp; we'll see if that makes the
fix better.

  xxxxx@xxxxx   2002-10-03

OK... The customer confirms the fix  [See Suggested Fix.] 

  xxxxx@xxxxx   2002-10-04