Bug Database
Bug ID: 4343002
Votes: 0
Synopsis: Problematic Thread: prio=1, An unexpected exception has been detected
Category: hotspot:runtime_system
Reported Against: kestrel-linux, kest-linux-rc1
Release Fixed:
State: 11-Closed, Will Not Fix, bug
Priority: 2-High
Related Bugs: 4344153, 4345764, 4350165, 4531882
Submit Date: 02-JUN-2000
Description
I got the following crash when clicking on the Lines tab in Java2Demo,
but it is a random error. There is no definite sequence of actions
which reproduces the bug, but I have seen it happen several times.
Since the error is in native code outside the VM, it is not a HotSpot error.

$ java -version
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build mbron-000522-13:23)
Java HotSpot(TM) Client VM (build 1.3.0beta-b04, mixed mode)

$ java -jar Java2Demo.jar
#
# An unexpected exception has been detected in native code outside the VM.
# Program counter=0x40091598
#
# Problematic Thread: prio=1 tid=0x812ea40 nid=0x1c08 runnable
#
#
# An unexpected exception has been detected in native code outside the VM.
# Program counter=0x4074098a
#
# Problematic Thread: prio=1 tid=0x91e1c70 nid=0x18c17 runnable
Work Around
N/A
Evaluation
I was unable to reproduce the bug on two configurations I have: RedHat 6.1 and
RedHat 6.2.

I let the demo run for a couple of hours cycling through the tabs but the bug
didn't manifest.

A stack trace of the crash would be helpful.

  xxxxx@xxxxx   2000-06-19

Reproduced on RedHat 6.1 with HotSpot from the beta release.

It is very likely to be a bug in glibc or Java 2D that causes SIGSEGV (null pointer).

  xxxxx@xxxxx   2000-06-27

This is caused by a very rare race condition in linuxthreads contained
within glibc 2.1.2...

A Java thread enters pthread_cond_timedwait().  The thread unblocks the
restart signal, does a sigsetjmp (to enable a siglongjmp from signal handler)
and then goes to sleep for the specified timeout period.

After the sleep we may have either been signalled via pthread_cond_signal
(siglongjmp) or the sleep timed out.  In the first case the signalling thread
has removed us from the wait queue and the signal state has been reset.  In the
second case we must block signals and remove ourselves from the queue.

However, there is a race condition.  If a thread signals us between the point
where we block the signal and the point where we remove ourselves from the
queue, we will get an outstanding restart signal.

This isn't the bug we are seeing here.  Instead, it's an even rarer race
condition: the thread gets signalled, leaving a pending restart, AND another
thread attempts to signal a thread on the same condvar at the same time as we
attempt to obtain the low-level lock protecting access to the wait queue.  In
this case the signalling thread can obtain the lock before us, so we are
suspended when we attempt to obtain it.  However, we have a pending restart,
so as soon as the signal is unblocked we exit the suspend within the internal
lock function (__pthread_lock).  This means two threads now think they hold
the internal lock, and any operation protected by this lock is now unsafe.

In this particular case hotspot was always crashing within __pthread_unlock() -
the internal unlock function.  This is because the function assumes that only
one thread can ever have the lock and so can manipulate the thread queue
linked-list without using atomic operations.  In this case, two threads enter
__pthread_unlock which leads to a SEGV...
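
The pattern described in this evaluation (timed condition waits racing with
signals on the same condition variable) can be exercised from Java with a small
stress test.  The following is only a sketch, not the test used in this
evaluation: the class name TimedWaitStress, the thread counts, and the 1 ms
timeout are all hypothetical, and it assumes that Object.wait(timeout) and
Object.notify() end up in pthread_cond_timedwait() and pthread_cond_signal()
on the affected VM.  It is written for a current JDK, so it would need small
changes to compile on 1.3, and on a fixed threads library it should simply
run to completion.

```java
import java.util.concurrent.atomic.AtomicLong;

public class TimedWaitStress {
    // Single shared monitor so that all waiters and notifiers contend on it.
    private static final Object monitor = new Object();
    private static final AtomicLong wakeups = new AtomicLong();

    // Runs the stress loop for roughly 'millis' ms and returns the number of
    // times a waiter came back from wait(), by timeout or by notify.
    static long run(int waiters, int notifiers, long millis) throws InterruptedException {
        wakeups.set(0);
        final long deadline = System.currentTimeMillis() + millis;
        Thread[] threads = new Thread[waiters + notifiers];
        for (int i = 0; i < threads.length; i++) {
            final boolean isWaiter = i < waiters;
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    while (System.currentTimeMillis() < deadline) {
                        synchronized (monitor) {
                            if (isWaiter) {
                                try {
                                    // Short timeout so that timeouts and
                                    // notifications frequently coincide.
                                    monitor.wait(1);
                                } catch (InterruptedException e) {
                                    return;
                                }
                                wakeups.incrementAndGet();
                            } else {
                                monitor.notify();
                            }
                        }
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return wakeups.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("wakeups: " + run(8, 4, 2000));
    }
}
```

A crash in native code while this loop runs would be consistent with the race
described here, but a clean run does not prove the race is absent, since the
window is very small.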

Because testing was conducted using glibc 2.1.2, a note will be added to the
README concerning this bug.  Users who encounter it can upgrade to a later
linuxthreads.

  xxxxx@xxxxx   2000-07-28
Comments
  

Submitted On 08-FEB-2001
rpandya
I am seeing this problem very consistently in certain
applications using Sun JVM 1.3 on Redhat 7.0 Update 1.
I don't believe I am running an SMP kernel, and I have tried
updating to glibc-2.2 and it still happens.

Ravi Pandya
ravi at iecommerce dot com


Submitted On 13-FEB-2001
doubrava
I am also seeing this problem consistently.  I am using Sun 
JVM 1.3 on Redhat 6.2.  I am using glibc-2.2.

Mark Doubrava


Submitted On 14-FEB-2001
jskovron
We're also seeing this problem frequently, using RedHat 6.2
upgraded to the 2.4.0 kernel and the glibc 2.1.3-22 rpm.

John Skovron
john@datasynapse.com


Submitted On 10-MAR-2001
dlemire
I see this problem with glibc-2.2.2-3


Submitted On 06-APR-2001
Juggler
I too am getting this consistently with Redhat 6.2 and 
glibc 2.1.3-15.  I'm creating/destroying many threads 
rapidly and make a Runtime.exec() call in each thread.  
This is really frustrating!


Submitted On 05-JUL-2001
dassh
I am seeing this issue running RedHat 7.0, JVM 1.3.  The issue
arises using a thread pool of 250 to handle incoming socket
requests.  If a request matches the protocol, a synchronized
method that makes 3-6 calls to MySQL is called to distribute it,
and the thread is then put back into the synchronized pool.
Sometimes I can slam this code with an average of 1000 requests
per 2.5 seconds.  Then again, sometimes it dies on me.  Very
unreliable and, as someone else said, very frustrating.


Submitted On 10-SEP-2001
patrickvc
I found this exception on Solaris 7.0 running JVM 1.3.  Can anyone suggest a workaround?
Thanks in advance.
Patrick


Submitted On 16-NOV-2001
cathyrein
I am also able to reproduce this problem consistently on
Red Hat 6.2, JVM 1.3.  I get the same error, preceded by:

sem_lock->semop->op_op: Identifier removed

Any suggestions or workarounds?


Submitted On 13-AUG-2004
jagilgen
I am also seeing this condition hit.  I am spawning roughly 1000 threads and this happens very often.  


