Bug Database
Bug ID: 4873538
Votes: 0
Synopsis: signal handler deadlocked waiting on a malloc lock
Category: hotspot:runtime_system
Reported Against: 1.3.1_07, 1.3.1_09
Release Fixed:
State: 11-Closed, duplicate of 4515367, bug
Priority: 2-High
Related Bugs: 4515367, 4647546, 6194668
Submit Date: 03-JUN-2003
Description
The customer is running SunONE Application Server with JDK 1.3.1_07 as its JVM and occasionally sees the JVM crash.

When a SIGBUS occurred, the JVM should have dumped core and been restarted by iAS. However, it did not, and I believe this is the result of a bug in the JVM. The following is the relevant stack info:

current thread:   xxxxx@xxxxx  
  [1] __lwp_park(0x0, 0x0, 0x0, 0x0, 0xfe254000, 0xfe7c0600), at 0xfedb4ab0
  [2] mutex_lock_queue(0xe30e1c00, 0xfedc6b6c, 0xfe7c0600, 0xfedc6000, 0x1, 0xfe7c0608), at 0xfedb1524
  [3] slow_lock(0xfe7c0600, 0xe30e1c00, 0xfe7c0600, 0xfe7bc004, 0x0, 0x3), at 0xfedb1c00
  [4] free(0x25af648, 0x25af648, 0x45535400, 0x7efefeff, 0x81010100, 0xff00), at 0xfe742b14
  [5] tzcpy(0x25af648, 0xfe7c2938, 0x0, 0xb, 0xfe7bc004, 0xffbefef6), at 0xfe7534f4
  [6] getzname(0xffbeff01, 0xfe7bf55c, 0x0, 0xfe7bf55c, 0xffbefef6, 0x2), at 0xfe753458
  [7] _ltzset_u(0x3edb64f5, 0xfe7bc004, 0x0, 0x0, 0x0, 0x1), at 0xfe752f5c
  [8] localtime_u(0xe067d540, 0xfe7c2940, 0xe067d540, 0xfecc8000, 0xfe7bc004, 0xfebc5a48), at 0xfe752124
  [9] os::report_fatal_error(0x24fb4b8, 0xffffffff, 0xfec70c30, 0x22250, 0x0, 0xfe742464), at 0xfebc5a48
  [10] os::handle_unexpected_exception(0x24fb4b8, 0xfed38984, 0xfecdc18c, 0xfec70ff4, 0xfecc8000, 0x0), at 0xfebc5e0c
  [11] JVM_handle_solaris_signal(0x0, 0x24fb4b8, 0xe067e178, 0xfecc8000, 0xa, 0xe067e430), at 0xfea0a9bc
  [12] __sighndlr(0xa, 0xe067e430, 0xe067e178, 0xfea0a9d4, 0x0, 0x0), at 0xfedb4cc8
  [13] call_user_handler(0xe30e1c00, 0x10c, 0xfedc78c0, 0xe067e178, 0xe067e430, 0xa), at 0xfedafb00
  [14] sigacthandler(0xe30e1c00, 0xe067e430, 0xe067e178, 0xfedc6000, 0xe067e430, 0xa), at 0xfedafccc
  ---- called from signal handler with signal 10 (SIGBUS) ------
  [15] realfree(0x2ffc228, 0xfe7c2850, 0xfe7bc004, 0x2ffc0f0, 0x139, 0x2ffc0f8), at 0xfe742464
  [16] cleanfree(0x0, 0xfe7bc004, 0xfe7c27c4, 0xfe7c2844, 0xfe7c27c8, 0x0), at 0xfe742c60
  [17] _malloc_unlocked(0x1e, 0xe30e1c00, 0xfe7bc004, 0x20, 0x0, 0x0), at 0xfe741d94
  [18] malloc(0x1e, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfe741c88
  [19] operator new(0x1e, 0x0, 0x13640, 0xe4ad8c78, 0xff029978, 0x1e), at 0xff01635c

Note that when the SIGBUS occurred, the JVM's signal handler was called to process the signal. I believe the signal handler has deadlocked waiting on a malloc lock that is held by the JVM thread which caused the SIGBUS to be raised. The handler runs on top of that thread's interrupted malloc call (frames 15 to 19), so the lock cannot be released while the handler waits for it.
  xxxxx@xxxxx   2003-06-03
Work Around
The following flag has been added to 1.3.1 to avoid this hang:
 -XX:+SuppressFatalErrorMessage

  xxxxx@xxxxx   2003-11-18
Evaluation
4852773 (jdk1.2.2_15) is unrelated to the current problem. That 1.2.2_xx issue
was due to suspending a thread for GC while it was inside malloc. The GC suspension
code in 1.3.1_xx is totally different: 1.3.1_xx and higher will not suspend
a thread for GC while it is inside a malloc call.

In the JVM signal handler, if the signal is not an expected one, we print some
error info. As part of that info we print the time by calling 'localtime'. localtime indirectly calls 'free'. Our SIGBUS originated from a problem in malloc, so we already hold the malloc lock. The 'free' call tries to acquire the same lock again, which results in deadlock. I am checking whether we can get the current time value in some async-safe way (i.e., without touching the malloc lock).

  xxxxx@xxxxx   2003-06-04
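
The async-safe approach mentioned in the evaluation above could look roughly like the following C sketch. It is an illustration only, not the fix that went into 4515367. It assumes a POSIX system where time() and write() are async-signal-safe, and the helper names format_decimal and report_time_signal_safe are hypothetical. The timestamp is emitted as raw seconds since the epoch, so neither localtime nor anything else that may take the malloc lock is called.

```c
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write `value` as decimal digits into `buf` without
 * calling any libc routine that might allocate or take a lock.
 * Returns the number of characters written; no terminating NUL is added.
 * `buf` must have room for at least 20 characters. */
static size_t format_decimal(unsigned long long value, char *buf)
{
    char tmp[20];              /* enough for any 64-bit unsigned value */
    size_t n = 0;

    do {                       /* collect digits, least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    for (size_t i = 0; i < n; i++)   /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    return n;
}

/* Hypothetical helper: print the current time to `fd` using only
 * time() and write(), both of which POSIX lists as async-signal-safe. */
static void report_time_signal_safe(int fd)
{
    static const char prefix[] = "time (epoch seconds): ";
    char line[sizeof(prefix) + 21];
    size_t len = 0;

    for (; len < sizeof(prefix) - 1; len++)   /* copy prefix by hand */
        line[len] = prefix[len];

    time_t now = time(NULL);
    len += format_decimal(now < 0 ? 0ULL : (unsigned long long)now,
                          line + len);
    line[len++] = '\n';

    ssize_t rc = write(fd, line, len);
    (void)rc;                  /* nothing useful can be done on failure here */
}
```

A fatal-signal handler installed with sigaction() could call report_time_signal_safe(STDERR_FILENO) in place of a localtime-based message. The trade-off is that the output is not a human-readable local time; converting to local time is what pulled in _ltzset_u, getzname, tzcpy and free in the stack shown in the description.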

This is a duplicate of 4515367 which is fixed in 1.5.

  xxxxx@xxxxx   2003-10-24