To make a Java call, a native thread has to look like a Java thread: it has to have a Java thread object and reside in the threads list. The thread must exist in the threads list prior to the Java call so that GC and safepoints will work correctly if they occur during the Java call (whether triggered by the current thread or not).
To that end, JNI_AttachCurrentThread:
- creates a JavaThread object
- initializes its basic VM state and TLS
- under the Threads_lock:
    - initializes active handles
    - adds it to the Threads list
- creates a default initialized java.lang.Thread object
- binds the JavaThread to the Thread
- sets the Thread's priority
- binds the Thread to the JavaThread
- if a name is supplied then
    - creates a Java String for the name
    - invokes the Thread(ThreadGroup tg, String name) constructor
  otherwise
    - invokes the Thread(ThreadGroup tg, Runnable r) constructor (which creates a default name)
- sets the daemon status of the Thread
- invokes ThreadGroup.add passing in the new Thread object (this emulates what would occur when a normal Thread was started)
- sets the Thread state to runnable
- informs JVMTI and JVMPI of a "thread start" event
Note there is a little bit of trickery here. The Thread constructor expects to ascertain some properties of the new Thread from the current "parent" Thread. But in this case the current Thread IS the new Thread, so it has to have all the right properties set for when the constructor queries it for its own attributes.
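To make the "parent" dependency concrete, here is a small sketch of what an ordinary Thread constructor call picks up from the thread that runs it. It uses only public java.lang.Thread API; the class and method names (ParentInheritanceDemo, inheritedSettings) and the group and thread names are made up for illustration, and it says nothing about how the VM itself sets these properties during attach.

```java
// Illustration only: shows what the Thread constructor reads from the
// creating ("parent") thread in an ordinary Java program.
final class ParentInheritanceDemo {

    /** Builds a child Thread on a parent with known settings and reports what the child inherited. */
    static String inheritedSettings() {
        final String[] result = new String[1];
        ThreadGroup group = new ThreadGroup("demo-group");
        Thread parent = new Thread(group, () -> {
            // The constructor runs on the parent thread, so it queries the
            // parent for its group, daemon status and priority.
            Thread child = new Thread(() -> { });
            result[0] = child.getThreadGroup().getName()
                    + " daemon=" + child.isDaemon()
                    + " priority=" + child.getPriority();
        }, "demo-parent");
        parent.setDaemon(true);
        parent.setPriority(Thread.MAX_PRIORITY);
        parent.start();
        try {
            parent.join(); // join() makes the write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(inheritedSettings());
    }
}
```

In the attach case there is no separate parent, so every property read this way has to be in place on the attaching thread before the constructor runs.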
The problems: during the Java call to the Thread constructor the newly attached thread has a partly constructed Thread object:
1. the name is null until after the name is assigned,
2. the group is null until assigned,
3. the context classloader is null until assigned (but may be null anyway),
4. the TID is zero until assigned.
So far we have seen (1) trigger a VM crash due to accessing the null name (fixed by an explicit null check); and (4) causes this IllegalArgumentException in the ThreadMXBean code. I think (3) is safe because the CCL can be null. (2) might also cause a problem because a null group is only expected for threads that have terminated; but at this stage the Thread won't be found by any code that enumerates the ThreadGroups in the system - so this would only be a problem if the VM cared and I don't think it does.
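As a sketch of how a zero TID can turn into an IllegalArgumentException: the documented contract of ThreadMXBean.getThreadInfo(long) rejects any id that is not positive. Whether this is the exact call path behind the reported failure is an assumption on my part; the class and method names (ZeroTidDemo, rejectsZeroId) are invented for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Illustration only: a thread id of zero is not a valid argument to the
// ThreadMXBean lookup methods.
final class ZeroTidDemo {

    /** Returns true if the platform ThreadMXBean rejects a thread id of zero. */
    static boolean rejectsZeroId() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        try {
            bean.getThreadInfo(0L); // documented to throw if id <= 0
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("zero id rejected: " + rejectsZeroId());
    }
}
```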
Possible solutions for this include:
(a) perform the necessary initialization in native code so that the Thread object is never observed with default initialized fields
Setting the TID is possible but the problem is giving it a valid TID value: it must be unique across all live threads and it mustn't change. We could maintain a native "nextTID" value that starts at max and works down to avoid conflict with the Thread version that starts at zero and works up.
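A minimal sketch of the two-counter idea, written in Java for readability even though the real "nextTID" would live in native VM code. Everything here is hypothetical: the class name TidAllocators, both method names, and the choice to never hand out zero (so that zero can keep meaning "not yet assigned").

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: two id sources that cannot collide in practice because
// they consume the long range from opposite ends.
final class TidAllocators {

    // Stand-in for the counter in the Thread class: counts up from the low end.
    private static final AtomicLong javaNext = new AtomicLong(0);

    // Hypothetical native-side counter: counts down from the maximum.
    private static final AtomicLong nativeNext = new AtomicLong(Long.MAX_VALUE);

    /** Next id for a Thread constructed in Java; never returns zero. */
    static long nextJavaTid() {
        return javaNext.incrementAndGet();
    }

    /** Next id for a Thread initialized natively during attach. */
    static long nextNativeTid() {
        return nativeNext.getAndDecrement();
    }

    public static void main(String[] args) {
        System.out.println("java tid:   " + nextJavaTid());
        System.out.println("native tid: " + nextNativeTid());
    }
}
```

Both counters only ever move in one direction, so an id is never reused and never changes once assigned, which is the property the TID needs.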
Setting the name might be possible, if creating a Java String isn't a Java call. It doesn't appear to be but I'm not clear on how the allocation is handled and whether GC might get involved. Setting the right name is harder because we don't know what count the Thread class is up to. This might seem minor but someone is bound to complain if they suddenly get Thread-49875645321 instead of Thread-5.
(b) make the sequence of attaching the partly initialized Thread and completing the call to the constructor atomic
Holding any lock whilst making the up-call into Java would be very risky. To see why it is unacceptably risky, note that the Thread constructor invokes methods on the installed SecurityManager. These are non-final methods, so to all intents and purposes we could end up executing application-defined code, which could do anything.
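The risk can be sketched with a toy model: a lock held across a call to an overridable method means the override runs with that lock held. Nothing below is VM code; vmLock, Hook and constructUnderLock are hypothetical stand-ins for the VM lock, a non-final SecurityManager method, and the constructor up-call respectively.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustration only: application-defined code ends up running while the
// lock taken around the up-call is still held.
final class UpcallUnderLockDemo {

    // Hypothetical stand-in for a lock the VM would hold during the up-call.
    static final ReentrantLock vmLock = new ReentrantLock();

    /** Stand-in for a non-final hook such as a SecurityManager check. */
    static class Hook {
        void check() { }
    }

    /** Models option (b): the hook is invoked with the lock held. */
    static void constructUnderLock(Hook hook) {
        vmLock.lock();
        try {
            hook.check(); // may be overridden by the application
        } finally {
            vmLock.unlock();
        }
    }

    /** Returns true if an application override observes the lock as held. */
    static boolean overrideRunsWithLockHeld() {
        final boolean[] held = new boolean[1];
        constructUnderLock(new Hook() {
            @Override
            void check() {
                held[0] = vmLock.isHeldByCurrentThread();
            }
        });
        return held[0];
    }

    public static void main(String[] args) {
        System.out.println("override ran with lock held: " + overrideRunsWithLockHeld());
    }
}
```

Once the override is running under the lock it can block, wait on another thread that needs the same lock, or call back into code that takes it, and the code that took the lock has no way to prevent that.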
(c) have the Thread object construction performed by the VM thread and hand it back to the native thread
This is certainly possible, but the performance implications make it impractical. For a point of reference: asking the VMThread to obtain the stack trace of the current thread is 10X slower than getting the current thread's stack trace directly. We do not want to bring the VM to a safepoint each time JNI_AttachCurrentThread is called.
(d) allow the native thread to temporarily impersonate an existing fully-initialized Thread
When the VM is created we create a java.lang.Thread object with a valid name (e.g. "jni_attaching_thread") and TID. Instead of binding the JavaThread to the newly allocated Thread object we bind it to this pre-existing object for the duration of the constructor call. (Note the pre-existing Thread would have to satisfy the "parent" role that the Thread constructor expects.) After the constructor we re-bind the JavaThread to the now fully constructed new Thread.
This seems doable. The problem is dealing with concurrent JNI_AttachCurrentThread calls. If the pre-initialized Thread is shared then we need to serialize JNI_AttachCurrentThread by taking a lock. This is less risky with a dedicated lock than with the ThreadList lock, but still holds some risk. There are also performance implications if we serialize JNI_AttachCurrentThread.
(e) patch the library code to skip Threads with default initialized fields (in this case have the ThreadMXBean ignore any thread with a TID of zero)
A simple and immediate fix, but probably short-sighted. It would probably come as a surprise to most library developers that they might encounter partly initialized Thread objects, and it seems likely that other bugs like this exist in library code.
(f) hide the partly initialized thread from those parts of the system that might fail if they encounter a partly initialized Thread
The list of JavaThreads is obtained using the ThreadsListEnumerator, and this gets used by three clients:
- the Thread class (getAllStackTraces()) and ThreadMXBean class - both via JVM_GetAllThreads; and
- JVMTI via JVMTIEnv::GetAllThreads
We could add state to JavaThread that tracks if the thread "is attaching", the enumerator could then skip threads that are attaching.
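A rough model of the filtering, again in Java rather than VM code. JavaThreadModel stands in for the VM's JavaThread, the attaching flag is the hypothetical new state, and enumerate plays the role of the ThreadsListEnumerator; none of these names come from the VM sources.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: an enumerator that hides threads still in the middle
// of attaching.
final class AttachingFilterDemo {

    /** Hypothetical stand-in for the VM's JavaThread. */
    static final class JavaThreadModel {
        final String name;
        volatile boolean attaching; // set for the duration of the attach

        JavaThreadModel(String name, boolean attaching) {
            this.name = name;
            this.attaching = attaching;
        }
    }

    /** Returns the names of all threads that have finished attaching. */
    static List<String> enumerate(List<JavaThreadModel> threads) {
        List<String> visible = new ArrayList<>();
        for (JavaThreadModel t : threads) {
            if (t.attaching) {
                continue; // partly initialized: skip it
            }
            visible.add(t.name);
        }
        return visible;
    }

    public static void main(String[] args) {
        List<JavaThreadModel> all = new ArrayList<>();
        all.add(new JavaThreadModel("main", false));
        JavaThreadModel nativeThread = new JavaThreadModel("native-1", true);
        all.add(nativeThread);
        System.out.println(enumerate(all)); // native-1 is hidden
        nativeThread.attaching = false;     // constructor call has completed
        System.out.println(enumerate(all)); // native-1 is now visible
    }
}
```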
This solution would be the obvious choice if not for one thing: the attaching thread might be a thread of interest to Thread.getAllStackTraces, or the JVMTI client. The reason is that, as stated previously, the thread could end up executing arbitrary application code via the Thread constructor and its calls to the SecurityManager; furthermore, the thread may hold monitor locks due to calls on ThreadGroup and/or SecurityManager. Note: VM operations like dumping stacks, tracing deadlocks etc. do not use the ThreadsListEnumerator, so they would have direct control over the threads they see - and must take care to watch for things like NULL thread names.
I think (f) is the way to proceed, but it requires feedback from the MMX, Thread and JVMTI folk as to whether this change in behaviour would be acceptable. The MMX folk should be okay with the change as they are the ones getting the exception because of this. JVMTI might be more concerned, but note that during the constructor call the JVMTI "thread started" events have not been posted for this thread anyway.