|
Bug ID:
|
4916766
|
|
Votes
|
0
|
|
Synopsis
|
CORBA COMM_FAILURE when destroy() takes too long and close() happens
|
|
Category
|
idl:orb
|
|
Reported Against
|
1.4.1_02
|
|
Release Fixed
|
1.4.1(1.4.1_07),
1.4.2(1.4.2_04) (Bug ID:2074266)
|
|
State
|
10-Fix Delivered,
Needs Verification,
bug
|
|
Priority:
|
2-High
|
|
Related Bugs
|
4936203, 6660037
|
|
Submit Date
|
03-SEP-2003
|
|
Description
|
The test case and related files are in
/net/cores.east/cores/63693175
BEA logged the case, and it is reproducible under their application server.
The installer is in the same directory in case it is needed:
server811_solaris32.bin
1) Run the installer and choose the default install location (/usr/local/bea).
2) After the installation, create a directory named user_projects under
/usr/local/bea/weblogic81/
and a directory named domains under user_projects. Into this domains directory, extract the mydomain.zip file, which is available in
/net/cores.east/cores/63693175
3) Deploy the application 'fxtransact' and open the applet in the browser at one of:
http://host:port/fxtransact/applet.html
http://host:port/fxtransact/iiop-applet.html
http://host:port/fxtransact/http-applet.html
4) You will see a submit button in the browser. Hit the submit button and wait for the message that says the applet has started, then hit the submit button again. You will see CORBA errors in the Java Plug-in console:
Caused by: org.omg.CORBA.COMM_FAILURE: vmcid: SUN minor code: 208 completed: Maybe
at com.sun.corba.se.internal.iiop.IIOPConnection.purge_calls(Unknown Source)
at com.sun.corba.se.internal.iiop.ReaderThread.run(Unknown Source)
On applet refresh/reload, two things happen at the same time: one is the call to destroy(), and the other is an event that kills the whole applet context (AppContext) and its corresponding resources.
So one thread is executing destroy() while another thread is cleaning up the AppContext. As part of that cleanup, all threads and thread groups are killed, including com.sun.corba.se.internal.iiop.ReaderThread. At the same time, the thread executing destroy() still depends on the ReaderThread, which is running processInput() on the IIOPConnection inside its run() method. The ReaderThread therefore receives ThreadDeath, sets a COMM_FAILURE SystemException on the connection, and that exception is finally thrown from the destroy() method.
Based on the various combinations we tried, the failure occurs only when both of these happen at the same time. It does not fail if destroy() or init() completes entirely before or after ThreadDeath is issued.
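The mechanism described above can be illustrated with a small stand-alone sketch. It is not the ORB or Plug-in code: the class and thread names are hypothetical, a sleeping thread stands in for the ReaderThread, and ThreadGroup.interrupt() stands in for the AppContext cleanup (the original cleanup stopped threads, which raised ThreadDeath; Thread.stop() is deprecated and unsupported on recent JDKs, so the sketch uses interruption instead). It assumes a current JDK with no security manager installed. The point it shows is that a thread created without an explicit ThreadGroup joins the group of the thread that created it, so a long-lived helper started from applet code dies when the applet's group is torn down.

```java
import java.util.concurrent.CountDownLatch;

class ThreadGroupTeardownDemo {

    /** Name of the group a thread lands in when created without an explicit group. */
    static String inheritedGroupName() {
        ThreadGroup appletGroup = new ThreadGroup("applet-group");
        final String[] seen = new String[1];
        Thread appletThread = new Thread(appletGroup, () -> {
            // No ThreadGroup argument: the new thread joins the creator's group.
            Thread reader = new Thread(() -> { }, "reader");
            seen[0] = reader.getThreadGroup().getName();
        }, "applet-init");
        appletThread.start();
        join(appletThread);
        return seen[0];
    }

    /** True if tearing down the applet's group also ends the long-lived reader. */
    static boolean readerDiesWithGroup() {
        ThreadGroup appletGroup = new ThreadGroup("applet-group");
        final boolean[] interrupted = new boolean[1];
        final Thread[] readerRef = new Thread[1];
        final CountDownLatch running = new CountDownLatch(1);
        Thread appletThread = new Thread(appletGroup, () -> {
            Thread reader = new Thread(() -> {
                running.countDown();
                try {
                    Thread.sleep(60_000); // stands in for blocking on connection input
                } catch (InterruptedException e) {
                    interrupted[0] = true; // ended by the group teardown
                }
            }, "reader");
            readerRef[0] = reader;
            reader.start();
        }, "applet-init");
        appletThread.start();
        join(appletThread);
        try {
            running.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        appletGroup.interrupt(); // stand-in for the AppContext cleanup
        join(readerRef[0]);
        return interrupted[0];
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reader group: " + inheritedGroupName());
        System.out.println("reader ended by teardown: " + readerDiesWithGroup());
    }
}
```

Any caller still waiting on such a reader at the moment of teardown would see its request fail, which is the role COMM_FAILURE plays in the trace above.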
|
|
Work Around
|
It appears that the AppContext is being closed before destroy() completes,
since destroy() takes a long time to finish. As part of closing the
AppContext, the ClassLoader is also released, and releasing the
ClassLoader goes through all ThreadGroups and stops all threads.
As a result, the ReaderThread that corresponds to the ORB is stopped.
Meanwhile, the applet's destroy() is closing the JMSProducer, and
JMSProducer does a dispatchSync() in close(). By this time the reader
thread has already been stopped, hence the COMM_FAILUREs.
It works if we put a 3-second delay in the destroy() method.
|
|
Evaluation
|
The root of the problem is that the WebLogic InitialContext caches and reuses ORB reader threads, as Tao Ma suggested. So the WebLogic InitialContext should not be created inside the applet's thread group, whose lifecycle is shorter than that of the reader threads.
I developed a workaround that appears to fix the problem by creating the InitialContext in a different thread group.
init() {
    ...
    InitialContext adminContext = null;
    try {
        System.out.println("Getting new initial context");
        // adminContext = new InitialContext(props);
        // Create the InitialContext on a thread outside the applet's thread group.
        InitialContextThread t = new InitialContextThread(getInitialContextThreadGroup());
        adminContext = t.getInitialContext(props);
    } catch (NamingException ne) {
        ne.printStackTrace();
    }
    ...
}

private static final String INITIALCONTEXT_THREADGROUP = "InitialContextThreadGroup";

// Finds (or creates) a sibling of the applet's thread group under the same parent.
private ThreadGroup getInitialContextThreadGroup() {
    ThreadGroup tg = Thread.currentThread().getThreadGroup().getParent();
    int count = tg.activeGroupCount();
    ThreadGroup[] tgs = new ThreadGroup[count];
    count = tg.enumerate(tgs);
    for (int index = 0; index < count; index++) {
        if (INITIALCONTEXT_THREADGROUP.equals(tgs[index].getName()))
            return tgs[index];
    }
    return new ThreadGroup(tg, INITIALCONTEXT_THREADGROUP);
}

// Helper thread that constructs the InitialContext, so that any threads created
// during construction inherit this thread's group rather than the applet's.
class InitialContextThread extends Thread {
    private Properties props;
    private InitialContext initCtx;
    private NamingException ne;

    public InitialContextThread(ThreadGroup tg) {
        super(tg, "InitialContextCreationThread");
    }

    public void run() {
        try {
            initCtx = new InitialContext(props);
        } catch (NamingException e) {
            this.ne = e;
        }
    }

    // Starts this thread, waits for it to finish, and returns the context it
    // created (or rethrows the NamingException it hit).
    public InitialContext getInitialContext(Properties props)
            throws NamingException {
        this.props = props;
        initCtx = null;
        ne = null;
        this.start();
        try {
            this.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        if (this.ne != null)
            throw this.ne;
        return initCtx;
    }
}
xxxxx@xxxxx 2003-09-18
----------------------------------
xxxxx@xxxxx 2003-11-19
Re-opening, as a new fix was discovered for handling the 548145 escalation from BEA/BofA.
Investigation, with lots of help from Tao Ma and Ken, identified the accidental, unexpected death of the ReaderThread and the ListenerThread as the root cause of this behaviour.
The code changes ensure that these threads are created in a thread group that is more persistent than the thread group associated with the applet's threads, even though it is applet activity that causes the ReaderThread and the ListenerThread to be created.
See the suggested fix for details.
This needs to be fixed in the CORBA code.
xxxxx@xxxxx 2003-11-24
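The idea of creating the reader and listener threads in a more persistent thread group can be sketched as follows. This is only an illustration under stated assumptions, not the actual CORBA change (the suggested fix is not reproduced in this report): the class and method names are hypothetical, the root of the ThreadGroup hierarchy is used as the "more persistent" group, and no security manager is installed. Inside an applet sandbox, creating a thread in an ancestor group would likely need to run as privileged code.

```java
import java.util.concurrent.CountDownLatch;

class PersistentGroupSketch {

    /** Walks up from the current thread's group to the top of the hierarchy. */
    static ThreadGroup rootGroup() {
        ThreadGroup tg = Thread.currentThread().getThreadGroup();
        while (tg.getParent() != null) {
            tg = tg.getParent();
        }
        return tg;
    }

    /** Creates a daemon thread in the root group, regardless of the caller's group. */
    static Thread newReaderThread(Runnable body, String name) {
        Thread t = new Thread(rootGroup(), body, name);
        t.setDaemon(true);
        return t;
    }

    /** True if a reader started from "applet" code survives the applet group's teardown. */
    static boolean readerSurvivesTeardown() {
        ThreadGroup appletGroup = new ThreadGroup("applet-group");
        final boolean[] interrupted = new boolean[1];
        final Thread[] readerRef = new Thread[1];
        final CountDownLatch release = new CountDownLatch(1);
        Thread appletThread = new Thread(appletGroup, () -> {
            Thread reader = newReaderThread(() -> {
                try {
                    release.await(); // stands in for blocking on connection input
                } catch (InterruptedException e) {
                    interrupted[0] = true;
                }
            }, "reader");
            readerRef[0] = reader;
            reader.start();
        }, "applet-init");
        appletThread.start();
        join(appletThread);
        appletGroup.interrupt(); // teardown of the applet's group only
        release.countDown();     // let the reader finish normally
        join(readerRef[0]);
        return !interrupted[0];
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reader survived teardown: " + readerSurvivesTeardown());
    }
}
```

The earlier workaround in this section reaches a similar result from application code by choosing a sibling group under the applet group's parent; the sketch instead picks the group at thread-creation time, which is closer to where the evaluation says the fix belongs.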
|
|
Comments
|
Submitted On 10-DEC-2005
This is shp.
Solaris 10 x86 3/05
JDS (Java Desktop System) 3
CORBA COMM_FAILURE:1.0 occurs.
If anyone knows, or can guess at, the cause or a solution,
I would be grateful for your advice.
|