Bug ID: 7003860 G1: assert(_cur_alloc_region == NULL || !expect_null_cur_alloc_region) fails

Details
Type: Bug
Submit Date: 2010-12-01
Status: Closed
Updated Date: 2011-03-07
Project Name: JDK
Resolved Date: 2011-03-07
Component: hotspot
OS: generic
Sub-Component: gc
CPU: generic
Priority: P3
Resolution: Fixed
Affected Versions: hs20
Fixed Versions: hs20

Description
We're hitting a failure in the nightlies after I pushed the slow allocation path changes (6974966). It looks like this:

[2010-12-01T10:24:53.05] # To suppress the following error report, specify this argument
[2010-12-01T10:24:53.05] # after -XX: or in .hotspotrc:  SuppressErrorAt=/g1CollectedHeap.cpp:924
[2010-12-01T10:24:53.05] #
[2010-12-01T10:24:53.05] # A fatal error has been detected by the Java Runtime Environment:
[2010-12-01T10:24:53.05] #
[2010-12-01T10:24:53.05] #  Internal Error (/tmp/jprt/P1/B/162626.ap31282/source/src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp:924), pid=23081, tid=1096509776
[2010-12-01T10:24:53.05] #  assert(_cur_alloc_region == NULL || !expect_null_cur_alloc_region) failed: The current alloc region should only be non-NULL if we're expecting it not to be NULL
[2010-12-01T10:24:53.05] #
[2010-12-01T10:24:53.05] # JRE version: 7.0
[2010-12-01T10:24:53.05] # Java VM: OpenJDK 64-Bit Server VM (20.0-b03-201011301626.ap31282.hotspot-g1-push-fastdebug mixed mode linux-amd64 compressed oops)
[2010-12-01T10:24:53.05] # An error report file with more information is saved as:
[2010-12-01T10:24:53.05] # /export/local/40214.JDK7.NIGHTLY.VM+linux-amd64_vm_server_mixed_nsk.stress.testlist/results/ResultDir/jck122001/hs_err_pid23081.log
[2010-12-01T10:24:53.05] #
[2010-12-01T10:24:53.05] # If you would like to submit a bug report, please visit:
[2010-12-01T10:24:53.05] #   http://java.sun.com/webapps/bugreport/crash.jsp
[2010-12-01T10:24:53.05] #

                                    

Comments
EVALUATION

http://hg.openjdk.java.net/jdk7/build/hotspot/rev/d9310331a29c
                                     
2010-12-25
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/d9310331a29c
                                     
2010-12-09
SUGGESTED FIX

The failing test ran for almost 900 iterations without failures after the fix was applied (before the fix it would usually fail on the first attempt).
                                     
2010-12-02
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/d9310331a29c
                                     
2010-12-02
SUGGESTED FIX

One liner!

@@ -1594,7 +1598,7 @@
   assert(regions_accounted_for(), "Region leakage!");
 
   return attempt_allocation_at_safepoint(word_size,
-                                      true /* expect_null_cur_alloc_region */);
+                                     false /* expect_null_cur_alloc_region */);
                                     
2010-12-01
EVALUATION

This is caused by an incorrect assumption. We do an allocation attempt at the end of expand_and_allocate() and we claim that the current alloc region should be NULL (expect_null_cur_alloc_region == true):

HeapWord* G1CollectedHeap::expand_and_allocate(size_t word_size) {
  ...

  return attempt_allocation_at_safepoint(word_size,
                                     true /* expect_null_cur_alloc_region */);
}

However, this is incorrect. expand_and_allocate() is called from satisfy_failed_allocation():

HeapWord*
G1CollectedHeap::satisfy_failed_allocation(size_t word_size,
                                           bool* succeeded) {
  ....
  HeapWord* result = attempt_allocation_at_safepoint(word_size,
                                     false /* expect_null_cur_alloc_region */);
  if (result != NULL) {
    assert(*succeeded, "sanity");
    return result;
  }
  ...
  result = expand_and_allocate(word_size);
  ...

Note that in the first allocation attempt in satisfy_failed_allocation() we pass false for expect_null_cur_alloc_region. By the time we get into expand_and_allocate(), however, nothing has attempted to retire the current alloc region. So the assumption about whether the current alloc region is NULL must be the same in both places: expect_null_cur_alloc_region should be false in both.

The failure is triggered by a humongous allocation attempt. The original allocation attempts in mem_allocate() failed, so a safepoint was triggered, which called the satisfy_failed_allocation() method. It's perfectly correct to have a non-NULL current alloc region at this point: the earlier allocation attempts did not use it, because the request is for a humongous object.
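
To make the reasoning above concrete, here is a minimal standalone sketch of the precondition only. It is not the HotSpot code: Region, ToyHeap and safepoint_path_fails() are hypothetical names, the real allocation and heap expansion work is omitted, and an exception stands in for the fastdebug assert so the outcome can be observed. The method names and the expect_null_cur_alloc_region flag are borrowed from the snippets above.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a heap region; not HotSpot's HeapRegion.
struct Region {};

// Toy model of the call chain; only the precondition check is modelled.
struct ToyHeap {
  Region* cur_alloc_region;

  explicit ToyHeap(Region* r) : cur_alloc_region(r) {}

  // Same condition as the failing assert:
  //   _cur_alloc_region == NULL || !expect_null_cur_alloc_region
  // Throws instead of asserting (an assumption made for this sketch).
  void attempt_allocation_at_safepoint(bool expect_null_cur_alloc_region) const {
    if (!(cur_alloc_region == nullptr || !expect_null_cur_alloc_region)) {
      throw std::logic_error("current alloc region unexpectedly non-NULL");
    }
  }

  // The flag is a parameter here so both variants of the one-line fix
  // can be compared; the real method hard-codes it.
  void expand_and_allocate(bool expect_null_cur_alloc_region) const {
    attempt_allocation_at_safepoint(expect_null_cur_alloc_region);
  }

  // First attempt passes false; nothing retires the region before the
  // second attempt inside expand_and_allocate().
  void satisfy_failed_allocation(bool flag_in_expand) const {
    attempt_allocation_at_safepoint(false);
    expand_and_allocate(flag_in_expand);
  }
};

// Returns true if the safepoint path trips the precondition.
bool safepoint_path_fails(Region* cur, bool flag_in_expand) {
  try {
    ToyHeap(cur).satisfy_failed_allocation(flag_in_expand);
    return false;
  } catch (const std::logic_error&) {
    return true;
  }
}

void demo() {
  Region r;  // models a live current alloc region left untouched by a humongous request
  assert(safepoint_path_fails(&r, true));        // flag before the fix: check trips
  assert(!safepoint_path_fails(&r, false));      // flag after the fix: check passes
  assert(!safepoint_path_fails(nullptr, true));  // no current region: either flag passes
}
```

In this model the check can only trip when a current alloc region exists and the caller claims it should be NULL, which is the combination the humongous allocation path produces.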
                                     
2010-12-01


