Adding some of the explanations from the JDC comments.
Q. Could you change the implementation to use data structures of a fixed size, pre-allocated so that errors just send you back to a known state?
A. Regarding the comment about using pre-allocated, fixed-size data structures to avoid the problem of the JVM not being able to get additional memory during a garbage collection: the garbage collector does use pre-allocated data structures. However, there are data structures that may need space proportional to the amount of live data in the heap. It is very unusual for this to actually occur, so space is not pre-allocated for this contingency (because such pre-allocated space would not be available to the application). Rather, a fixed amount of space is pre-allocated and more is requested only in the rare situations where it is actually needed. There are also a few situations where the JVM has asked the operating system to reserve space for it, and when that space is actually needed, it turns out not to be available.
Q. How about simply releasing the JVM memory consumed and returning to a fresh state so it can continue servicing new calls, and throwing an error for existing processes that were consuming memory?
A. Regarding the comment from the respondent, if I understand correctly, you are asking whether an out-of-memory error can be delivered only to the thread that is consuming memory and that caused the JVM to call vm_exit_out_of_memory(). In the case of 4697804, the JVM is trying to get memory so that it can complete a garbage collection. At that point there is no free space in the Java heap, so a request to allocate an object from any thread would fail. The JVM could release all its ancillary data structures, but that doesn't change the fact that the Java heap needs to be collected.
Sometimes when the VM is in a low-resource situation, it currently
cannot throw an OutOfMemory error because the heap is corrupt -- no
further Java code can run. This often happens when we're trying to
expand the heap and cannot successfully malloc space for the supplemental
data structures that we need.
When this happens, the VM prints OutOfMemory and exits (cleanly, rather
than with a fault).
(see bug 4317486 [sol/c1/b04] assert generation.cpp 469 during scavenge.)
We should try to ensure we can never get into such a state by atomically
allocating all data that we require, so we can roll back to a consistent
state if any step fails.
I haven't yet found where exactly we're calling vm_exit_out_of_memory
for this test case (and it might change from machine to machine, depending
on how many resources we have)... but finding it might give us a good place
to start cleaning this up.
Please note that in typical use, users are not trying to provoke
OutOfMemory exceptions. If they hit this condition, they'd see an indication
of what went wrong, e.g.
Exception java.lang.OutOfMemoryError: requested 1048592 bytes
and the VM will exit. This at least indicates a possible solution (bumping
-Xmx and perhaps -Xms).
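As a quick way to confirm what those flags actually gave the running VM, a
small standalone class like the following can be compiled and run with the
same -Xms/-Xmx settings as the application; the class name and output format
here are only illustrative, but the Runtime methods are standard:

    // HeapReport.java -- hypothetical helper that prints the heap limits the
    // running VM was given (useful when experimenting with -Xms/-Xmx).
    public class HeapReport {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            long mb = 1024 * 1024;
            System.out.println("max heap (-Xmx):   " + rt.maxMemory() / mb + " MB");
            System.out.println("committed heap:    " + rt.totalMemory() / mb + " MB");
            System.out.println("free in committed: " + rt.freeMemory() / mb + " MB");
        }
    }

For example: java -Xms64m -Xmx256m HeapReport.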
This doesn't help deployed Java applications, though, which might catch the
OutOfMemory error and handle the situation more robustly, so any changes
we can make to allow the error to be thrown would be useful.
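To illustrate the kind of handling such an application might attempt when the
error can actually be thrown (rather than the VM exiting), here is a rough
sketch; the class, the workload, and the reserve-buffer trick are all
illustrative, not something the VM provides or guarantees:

    // RobustWorker.java -- sketch of an application catching OutOfMemoryError,
    // dropping references so the GC can reclaim space, and carrying on.
    // This only helps when the VM is still healthy enough to throw the error.
    import java.util.ArrayList;
    import java.util.List;

    public class RobustWorker {
        // A small reserve we can drop to give the recovery code some room.
        private static byte[] reserve = new byte[512 * 1024];

        public static void main(String[] args) {
            List cache = new ArrayList();
            try {
                while (true) {
                    cache.add(new byte[1024 * 1024]);   // simulate a growing workload
                }
            } catch (OutOfMemoryError e) {
                reserve = null;     // release the reserve buffer
                cache.clear();      // drop the cached data so the GC can reclaim it
                System.err.println("OutOfMemoryError caught; cached data released");
            }
        }
    }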
The customer test case from the JDC comments shows one of the scenarios
where we can exit the VM: when we're attempting to resize the heap after
a GC and cannot get enough resources to do so. Growing the heap is
actually implemented as a multi-step process, as we also grow supporting
data structures for the heap.
Ideally, if one of these steps fails we should roll back to a known good
heap state. I'm looking into how to do that for the Mantis time frame.
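The heap-growth code itself is native VM code, so purely to illustrate the
roll-back idea being described, a sketch of the pattern in Java (every name
here is made up) might look like this:

    // GrowSketch.java -- hypothetical all-or-nothing growth: either every
    // step (heap plus its supporting data structures) succeeds, or the ones
    // that did succeed are undone and the old, known-good sizes are kept.
    public class GrowSketch {

        interface Growable {
            boolean grow(long delta);   // returns false if the resource cannot grow
            void shrink(long delta);    // undoes a successful grow
        }

        static boolean growAll(Growable[] steps, long delta) {
            for (int done = 0; done < steps.length; done++) {
                if (!steps[done].grow(delta)) {
                    // One step failed: roll back the steps that succeeded
                    // so the whole structure stays in its previous state.
                    for (int i = done - 1; i >= 0; i--) {
                        steps[i].shrink(delta);
                    }
                    return false;
                }
            }
            return true;    // every step succeeded; the new size is committed
        }
    }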
Note there are other failure modes which also need to be looked at.
Committing to fix the most common causes of this problem for Mantis, 1.4.2.
Further work showed that in fixing the common cause of this, we
just moved the problem: typically we'd still get unrecoverable errors,
and there was simply nothing Java code could do to attempt a cleanup.
This is a very difficult problem to solve. We are working on some possible
solutions, but our best suggestion for users running into this problem is
to avoid it by increasing swap space on their machines so there is
enough to cover the Java application's needs plus those of all other
applications running on the machine at the same time.
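For users who want to see from inside the application whether swap is running
short, something along these lines may work on a 5.0 or later Sun VM that
exposes the com.sun.management extension (the instanceof check below is the
only safeguard; the class name is illustrative):

    // SwapCheck.java -- prints swap sizes via the com.sun.management
    // extension of OperatingSystemMXBean, where that extension is available.
    import java.lang.management.ManagementFactory;

    public class SwapCheck {
        public static void main(String[] args) {
            Object os = ManagementFactory.getOperatingSystemMXBean();
            if (os instanceof com.sun.management.OperatingSystemMXBean) {
                com.sun.management.OperatingSystemMXBean sunOs =
                        (com.sun.management.OperatingSystemMXBean) os;
                long mb = 1024 * 1024;
                System.out.println("total swap: " + sunOs.getTotalSwapSpaceSize() / mb + " MB");
                System.out.println("free swap:  " + sunOs.getFreeSwapSpaceSize() / mb + " MB");
            } else {
                System.out.println("com.sun.management extension not available on this VM");
            }
        }
    }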
###@###.### 2005-2-04 19:22:49 GMT