Bug ID: 6475157 RMIConnectorServer.stop: deadlock

Details
Type: Bug
Submit Date: 2006-09-26
Status: Resolved
Updated Date: 2010-12-03
Project Name: JDK
Resolved Date: 2006-12-12
Component: core-svc
OS: solaris_9
Sub-Component: javax.management
CPU: sparc
Priority: P2
Resolution: Fixed
Affected Versions: 5.0
Fixed Versions: 6u1

Description
The deadlock happens when:
1) the last client is closed;
2) the server is stopped immediately afterwards.

Here is the client's report:
<###@###.###>
While running test cases against the server, I encountered a deadlock that prevented the tests from completing.  From the stack trace, it looks like it's in the RMI communication performed by JMX.  This needs to be looked into more closely because it has the potential to hang our tests (including anything that runs them, like daily builds), as well as obviously causing problems for the server itself.

Neil


2006-09-24 14:42:35
Full thread dump Java HotSpot(TM) Client VM (1.6.0-rc-b99 mixed mode, sharing):

...............

Found one Java-level deadlock:
=============================
"RMI Unreferenced-0":
  waiting to lock monitor 0x08099b88 (object 0x9ff0d648, a java.util.ArrayList),
  which is held by "main"
"main":
  waiting to lock monitor 0x08099bec (object 0x9ff162b8, a javax.management.remote.rmi.RMIConnectionImpl),
  which is held by "RMI Unreferenced-0"

Java stack information for the threads listed above:
===================================================
"RMI Unreferenced-0":
	at javax.management.remote.rmi.RMIServerImpl.clientClosed(RMIServerImpl.java:324)
	- waiting to lock <0x9ff0d648> (a java.util.ArrayList)
	at javax.management.remote.rmi.RMIConnectionImpl.close(RMIConnectionImpl.java:182)
	- locked <0x9ff162b8> (a javax.management.remote.rmi.RMIConnectionImpl)
	at javax.management.remote.rmi.RMIConnectionImpl.unreferenced(RMIConnectionImpl.java:190)
	at sun.rmi.transport.Target$1.run(Target.java:310)
	at java.lang.Thread.run(Thread.java:619)
"main":
	at javax.management.remote.rmi.RMIConnectionImpl.close(RMIConnectionImpl.java:162)
	- waiting to lock <0x9ff162b8> (a javax.management.remote.rmi.RMIConnectionImpl)
	at javax.management.remote.rmi.RMIServerImpl.close(RMIServerImpl.java:411)
	- locked <0x9ff0d648> (a java.util.ArrayList)
	- locked <0x9fef8cc8> (a org.opends.server.protocols.jmx.OpendsRMIJRMPServerImpl)
	at javax.management.remote.rmi.RMIConnectorServer.stop(RMIConnectorServer.java:528)
	at org.opends.server.protocols.jmx.RmiConnector.finalizeConnectionHandler(RmiConnector.java:433)
	at org.opends.server.protocols.jmx.JmxConnectionHandler.finalizeConnectionHandler(JmxConnectionHandler.java:508)
	at org.opends.server.protocols.jmx.JmxConnectionHandler.applyNewConfiguration(JmxConnectionHandler.java:798)
	at org.opends.server.protocols.jmx.JmxConnectionHandler.applyNewConfiguration(JmxConnectionHandler.java:771)
	at org.opends.server.protocols.jmx.JmxConnectTest.configureJmx(JmxConnectTest.java:375)
	at org.opends.server.protocols.jmx.JmxConnectTest.sslConnect(JmxConnectTest.java:336)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.testng.internal.MethodHelper.invokeMethod(MethodHelper.java:552)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:411)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:785)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:114)
	at org.testng.TestRunner.privateRun(TestRunner.java:695)
	at org.testng.TestRunner.run(TestRunner.java:574)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:241)
	at org.testng.SuiteRunner.run(SuiteRunner.java:145)
	at org.testng.TestNG.createAndRunSuiteRunners(TestNG.java:901)
	at org.testng.TestNG.runSuitesLocally(TestNG.java:863)
	at org.testng.TestNG.run(TestNG.java:613)
	at org.testng.TestNG.privateMain(TestNG.java:1001)
	at org.testng.TestNG.main(TestNG.java:938)

Found 1 deadlock.

                                    

Comments
EVALUATION

The deadlock can be reproduced in another way that leads to a valid regression test.  The idea is to subclass RMIJRMPServerImpl in order to be able to override its clientClosed method.  Within that method, it can create another thread that will call connectorServer.stop(), and wait for that thread to complete or block before calling super.clientClosed.  The test is triggered by creating and closing a connection.

If the bug is not fixed, this will lead to the following deadlock:

Initial thread:
  RMIConnectionImpl.close
  - locks RMIConnectionImpl
  -> RMIJRMPServerImpl.clientClosed (called from overriding method)
     - tries to lock clientList

Created thread:
  RMIConnectorServer.stop
  -> RMIJRMPServerImpl.close
     - locks clientList
     -> RMIConnectionImpl.close
        - tries to lock RMIConnectionImpl

With the fix (move the call to clientClosed out of the synchronized block), the test passes.
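
The structure of that test can be sketched without RMI. The following is only a model, not the actual regression test: ModelServer, ModelConnection, StoppingServer and awaitDoneOrBlocked are hypothetical stand-ins for RMIJRMPServerImpl, RMIConnectionImpl, the test subclass and its wait logic. ModelConnection.close is assumed to already have the fixed shape (clientClosed called outside the synchronized block), so the run completes; making close a synchronized method again should reproduce the two-thread cycle shown above.

```java
import java.util.ArrayList;
import java.util.List;

class StopDuringCloseModel {

    /** Hypothetical stand-in for RMIServerImpl; clientList is the contended ArrayList. */
    static class ModelServer {
        private final List<ModelConnection> clientList = new ArrayList<>();

        ModelConnection newClient() {
            ModelConnection c = new ModelConnection(this);
            synchronized (clientList) {
                clientList.add(c);
            }
            return c;
        }

        /** Called by a connection after it has marked itself closed. */
        protected void clientClosed(ModelConnection c) {
            synchronized (clientList) {
                clientList.remove(c);
            }
        }

        /** Models RMIConnectorServer.stop -> RMIServerImpl.close. */
        void stop() {
            synchronized (clientList) {
                // Iterate over a copy: close() may call back into clientClosed.
                for (ModelConnection c : new ArrayList<>(clientList)) {
                    c.close();
                }
                clientList.clear();
            }
        }
    }

    /** Hypothetical stand-in for RMIConnectionImpl, with the fixed close(). */
    static class ModelConnection {
        private final ModelServer server;
        private boolean terminated;

        ModelConnection(ModelServer server) {
            this.server = server;
        }

        void close() {
            synchronized (this) {
                if (terminated) return;
                terminated = true;
            }
            // Outside the monitor, so stop() can lock this connection meanwhile.
            server.clientClosed(this);
        }
    }

    /** The test hook: stop the server from another thread inside clientClosed. */
    static class StoppingServer extends ModelServer {
        volatile Thread stopper;

        @Override
        protected void clientClosed(ModelConnection c) {
            if (stopper == null) {
                Thread t = new Thread(this::stop, "stopper");
                t.setDaemon(true);
                stopper = t;
                t.start();
                awaitDoneOrBlocked(t);
            }
            super.clientClosed(c);
        }
    }

    /** Spins until the thread has finished or is blocked on a monitor. */
    static void awaitDoneOrBlocked(Thread t) {
        while (t.getState() != Thread.State.TERMINATED
                && t.getState() != Thread.State.BLOCKED) {
            Thread.yield();
        }
    }

    static void joinQuietly(Thread t) {
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if both the closing and the stopping thread finished. */
    static boolean run() {
        StoppingServer server = new StoppingServer();
        ModelConnection c = server.newClient();
        Thread closer = new Thread(c::close, "closer");
        closer.setDaemon(true);
        closer.start();
        joinQuietly(closer);
        Thread stopper = server.stopper;
        if (stopper != null) joinQuietly(stopper);
        return !closer.isAlive() && stopper != null && !stopper.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + run());
    }
}
```

Both threads are daemon threads and are joined with a timeout, so a reintroduced deadlock would show up as a false result rather than as a hung test run.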
                                     
2006-09-27
EVALUATION

The deadlock can occur if an individual RMI connection is closed at the same time as the RMI connector server is closed.  An individual connection can be closed either explicitly by the client, or implicitly because the client has gone away and RMI's Distributed Garbage Collection (DGC) detects this.  The latter seems less likely, but it is in fact what the stack trace shows: the close originates from RMIConnectionImpl.unreferenced on the "RMI Unreferenced-0" thread.

To reproduce this failure, we could subclass RMIConnectionImpl and override its close() method like this:

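@Override
public synchronized void close() throws IOException {
    if (!alreadyCalled) {
        alreadyCalled = true;
        // Stop the connector server from a second thread while this
        // thread still holds the RMIConnectionImpl monitor.
        Thread t = new Thread() {
            @Override
            public void run() {
                try {
                    connectorServer.stop();
                } catch (IOException e) {
                    // Ignored: this thread is expected to block in stop().
                }
            }
        };
        t.start();
        try {
            // Give the new thread time to lock clientList.
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
    super.close();
}

(alreadyCalled and connectorServer are fields that the test subclass would have to supply.)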

Then the test can open a single connection and close it again (with JMXConnector.close()), which should produce the deadlock.

The deadlock will look like this:

Original thread:
    RMIConnectionImplSubclass.close
    - locks RMIConnectionImpl
    - connectorServer.stop thread created here
    -> RMIConnectionImpl.close
       -> RMIServerImpl.clientClosed
          - tries to lock clientList

Created thread:
    RMIConnectorServer.stop
    -> RMIServerImpl.close
       - locks clientList
       -> RMIConnectionImpl.close
          - tries to lock RMIConnectionImpl

I think the simplest fix is to change RMIConnectionImpl.close so that it is no longer a synchronized method.  Instead, a synchronized(this) block should surround the body of the method, with the final call to rmiServer.clientClosed placed outside that block.  Unfortunately, the modified code above will still deadlock even with this fix, because the synchronized override artificially extends the synchronization of RMIConnectionImpl.close to cover the entire method again.  It may be possible to reproduce the deadlock in another way that would not have this problem (so that we can have a regression test).
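
The difference between the two shapes of close() can be illustrated in isolation. This is a sketch under assumptions, not the JDK source: ClosedCallback, SynchronizedClose and NarrowedClose are hypothetical names, and the terminated flag stands in for whatever state the real method updates. The callback records whether the connection's monitor is still held when the server is notified.

```java
import java.util.ArrayList;
import java.util.List;

class CloseLockScope {

    /** Hypothetical stand-in for the rmiServer.clientClosed callback. */
    interface ClosedCallback {
        void clientClosed(Object connection);
    }

    /** Original shape: the callback runs while the connection monitor is held. */
    static class SynchronizedClose {
        private final ClosedCallback server;
        private boolean terminated;

        SynchronizedClose(ClosedCallback server) {
            this.server = server;
        }

        public synchronized void close() {
            if (terminated) return;
            terminated = true;
            server.clientClosed(this); // still inside the monitor
        }
    }

    /** Proposed shape: state changes under the lock, callback outside it. */
    static class NarrowedClose {
        private final ClosedCallback server;
        private boolean terminated;

        NarrowedClose(ClosedCallback server) {
            this.server = server;
        }

        public void close() {
            synchronized (this) {
                if (terminated) return;
                terminated = true;
            }
            server.clientClosed(this); // monitor already released
        }
    }

    /** Closes one connection of each shape and reports the lock state seen by the callback. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        ClosedCallback cb = c ->
                log.add(c.getClass().getSimpleName() + " holdsLock=" + Thread.holdsLock(c));
        new SynchronizedClose(cb).close();
        new NarrowedClose(cb).close();
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With the narrowed block, the thread that goes on to lock clientList no longer holds the connection monitor, so the two locks are never held in opposite orders by the closing thread and the stopping thread.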
                                     
2006-09-26


