United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: 6904967 G1: some CollectionUsageThreshold tests fail
6904967 : G1: some CollectionUsageThreshold tests fail

Details
Type:
Bug
Submit Date:
2009-11-25
Status:
Closed
Updated Date:
2011-03-08
Project Name:
JDK
Resolved Date:
2011-03-08
Component:
hotspot
OS:
generic
Sub-Component:
gc
CPU:
generic
Priority:
P4
Resolution:
Fixed
Affected Versions:
hs17
Fixed Versions:
hs17

Related Reports
Backport:
Backport:
Relates:
Relates:

Sub Tasks

Description
Several tests involving CollectionUsageThreshold, mostly from the nsk.quick-monitoring suite, fail on all platforms:

nsk/monitoring/MemoryPoolMBean/isCollectionUsageThresholdExceeded/isexceeded001
nsk/monitoring/MemoryPoolMBean/isCollectionUsageThresholdExceeded/isexceeded002
nsk/monitoring/MemoryPoolMBean/isCollectionUsageThresholdExceeded/isexceeded004
nsk/monitoring/MemoryPoolMBean/isCollectionUsageThresholdExceeded/isexceeded005
nsk/monitoring/MemoryPoolMBean/setCollectionUsageThreshold/setthreshold002
nsk/monitoring/MemoryPoolMBean/setCollectionUsageThreshold/setthreshold003
nsk/monitoring/MemoryPoolMBean/setCollectionUsageThreshold/setthreshold004
nsk/monitoring/MemoryPoolMBean/setCollectionUsageThreshold/setthreshold005
java/lang/management/MemoryPoolMXBean/ThresholdTest.java

Here's a partial log from the first test above:

[2009-11-25T10:03:39.97] MemoryMonitor   > Test mode:	DIRECTLY access to MBean
[2009-11-25T10:03:39.97] MemoryMonitor   > Memory:	heap	(This setting is used in lowmem* tests only)
[2009-11-25T10:03:39.97] MemoryMonitor   > Monitoring:	notification	(This setting is used in lowmem* tests only)
[2009-11-25T10:03:39.97] MemoryMonitor   > Threshold:	usage	(This setting is used in lowmem* tests only)
[2009-11-25T10:03:39.97] MemoryMonitor   > Timeout:	30	(This setting is used in lowmem* tests only)
[2009-11-25T10:03:39.97] 0 pool Code Cache
[2009-11-25T10:03:39.97]   does not support collection usage thresholds
[2009-11-25T10:03:39.97]   UnsupportedOperationException is thrown
[2009-11-25T10:03:39.97] 1 pool G1 Eden
[2009-11-25T10:03:39.97]   supports collection usage thresholds
[2009-11-25T10:03:39.97]   setting threshold 1 init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K)
[2009-11-25T10:03:39.97]   threshold 1 is set, (used = 0, isExceeded = false)
[2009-11-25T10:03:39.97]   used value (0) did not cross the threshold value (1)
[2009-11-25T10:03:39.97] 2 pool G1 Survivor
[2009-11-25T10:03:39.97]   supports collection usage thresholds
[2009-11-25T10:03:39.97]   setting threshold 0 init = 0(0K) used = 0(0K) committed = 0(0K) max = 0(0K)
[2009-11-25T10:03:39.97]   threshold 0 is set, (used = 0, isExceeded = false)
[2009-11-25T10:03:39.97]   used Turning off use of shared archive because of choice of garbage collector or large pages 
[2009-11-25T10:03:40.00] value (0) crossed the threshold value (0)
[2009-11-25T10:03:40.00] # ERROR: isCollectionUsageThresholdExceeded() returned false, while threshold = 0 and used = 0
[2009-11-25T10:03:40.00] 3 pool G1 Old Gen
[2009-11-25T10:03:40.00]   supports collection usage thresholds
[2009-11-25T10:03:40.00]   setting threshold 1 init = 200278016(195584K) used = 0(0K) committed = 0(0K) max = 3218079744(3142656K)
[2009-11-25T10:03:40.00]   threshold 1 is set, (used = 0, isExceeded = false)
[2009-11-25T10:03:40.00]   used value (0) did not cross the threshold value (1)
[2009-11-25T10:03:40.00] 4 pool G1 Perm Gen
[2009-11-25T10:03:40.00]   supports collection usage thresholds
[2009-11-25T10:03:40.00]   setting threshold 1 init = 20971520(20480K) used = 0(0K) committed = 0(0K) max = 88080384(86016K)
[2009-11-25T10:03:40.00]   threshold 1 is set, (used = 0, isExceeded = false)
[2009-11-25T10:03:40.00]   used value (0) did not cross the threshold value (1)
[2009-11-25T10:03:40.00] TEST FAILED
[2009-11-25T10:03:40.00] 
[2009-11-25T10:03:40.00] 
[2009-11-25T10:03:40.00] #>  
[2009-11-25T10:03:40.00] #>  SUMMARY: Following errors occured
[2009-11-25T10:03:40.00] #>      during test execution:
[2009-11-25T10:03:40.00] #>  
[2009-11-25T10:03:40.00] # ERROR: isCollectionUsageThresholdExceeded() returned false, while threshold = 0 and used = 0
[2009-11-25T10:03:40.00] # Test level exit status: 97

                                    

Comments
EVALUATION

The suggested fix resolves the failures it was supposed to resolve in the nightlies.
                                     
2009-12-09
EVALUATION

To summarize. Out of the three distinct failures we see, the following two (setCollectionUsageThreshold) are caused by test issues which do not take into account the fact that max is not constant over time:

[2009-11-25T13:12:56.92] # ERROR: Unexpected java.lang.IllegalArgumentException: Invalid threshold: 932184064 > max (930086912). in pool G1 Old Gen

[2009-11-25T13:12:56.92] # ERROR: IllegalArgumentException is not thrown in pool G1 Eden for threshold 1048577 (init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K)) 

We will only fix the third type of failure (isCollectionUsageThresholdExceeded):

[2009-11-25T13:11:59.71] # ERROR: isCollectionUsageThresholdExceeded() returned false, while threshold = 0 and used = 0
                                     
2009-12-07
SUGGESTED FIX

For the failure that we'll fix (isCollectionUsageThresholdExceeded):

[2009-11-25T13:11:59.71] # ERROR: isCollectionUsageThresholdExceeded() returned false, while threshold = 0 and used = 0

it's sufficient to ensure that max / committed return a non-zero value (let's say a minimum of the G1 region size). See Evaluation for more details.
                                     
2009-12-07
EVALUATION

From an e-mail I sent:

One more thing. If I change the test to get the max from getUsage() instead of getCollectionUsage() the test passes and both failures are eliminated:

[2009-11-25T13:12:56.92] # ERROR: Unexpected java.lang.IllegalArgumentException: Invalid threshold: 932184064 > max (930086912). in pool G1 Old Gen

[2009-11-25T13:12:56.92] # ERROR: IllegalArgumentException is not thrown in pool G1 Eden for threshold 1048577 (init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K))

Thinking about this for a minute, yes, I understand why the test does getCollectionUsage() instead of getUsage(). So, the real cause for both failures is really that max is assumed not to change, right?
                                     
2009-12-07
EVALUATION

From an e-mail I sent:

> [2009-11-25T13:12:56.92] # ERROR: IllegalArgumentException is not thrown in pool G1 Eden for threshold 1048577 (init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K))
>
> I'm not quite sure why this happens. I'm investigating right now...

I _think_ I have an explanation for this.

The test (nsk/monitoring/MemoryPoolMBean/setCollectionUsageThreshold//setthreshold004) gets max by doing getCollectionUsage() on the pool.

>>>>   init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K)
>>>> max = 1048576

Then it tries to do setCollectionUsageThreshold with max + 1 (i.e., 1048577) and it doesn't get an illegal argument error.

# ERROR: IllegalArgumentException is not thrown in pool G1 Eden for threshold 1048577 (init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K))

Now, the setCollectionUsageThreshold(), it gets max from getUsage() (the native method getUsage0() in fact) and that actually returns a different set of values:

>>> init = 1048576(1024K) used = 2097152(2048K) committed = 2097152(2048K) max = 2097152(2048K)

(and I'm pretty sure this is the max that setCollectionUsageThreshold uses to do the bounds check, which is why it succeeds)

So, is the test correct? Should it get the max value from getUsage() instead of getCollectionUsage()?
                                     
2009-12-07
EVALUATION

Mandy's reply:

> So, is the test correct? Should it get the max value from getUsage() instead of getCollectionUsage()?

This is a test bug.  Since max can be changing as specified in the spec, the test has to use a better way to determine a threshold value big enough to cause the IAE.  The usage or collection usage is not a reliable source for such test case.

G1 is the first collector that changes the max value dynamically (to be accurate, it's the first collector being tested with varying max value).  I am sorry for causing more work for you.

> [2009-11-25T13:12:56.92] # ERROR: Unexpected java.lang.IllegalArgumentException: Invalid threshold: 932184064 > max (930086912). in pool G1 Old Gen
>
> This happens because the max size of the old gen actually decreases over time (as the max size of the eden and survivor increase; remember: old max  = overall_max - eden_max - survivor_max). Mandy: is this a test bug?

This is a test bug again although I think the API has to be enhanced to help determining the right threshold to set with variable max value (e.g. setting a percentage instead of a fixed value in bytes).  I will file a RFE to add the new API for that.

> Or shall I bail out of the policy of having a dynamically changing max?

I can imagine that it's not easy to set a fixed max value for G1 since their sum should be equal to the max heap size returned from MemoryMXBean.  Or set the max to undefined (-1) which may uncover more test bugs.  We should check with SQE if the NSK monitoring tests deal with undefined max well (but I doubt it as the test failures you ran into use the max value to determine the threshold).

Mandy
                                     
2009-12-07
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/9118860519b6
                                     
2009-12-07
EVALUATION

Thanks to Mandy for pointing me to 6546089. This was the reason why I couldn't reproduce the isCollectionUsageThresholdExceeded (I was using a JDK 6 release to test my libjvm.so). With a JDK 7 release I can now reproduce the isCollectionUsageThresholdExceeded failures. Setting the max and committed sizes to be at least as large as a region size (i.e., > 0) resolves this failure.
                                     
2009-12-04
PUBLIC COMMENTS

This (clearly!) looks related to the introduction of the G1 memory pools (see CR 6815790).
                                     
2009-11-26
EVALUATION

The failing tests seem to fall into two categories: isCollectionUsageThresholdExceeded and setCollectionUsageThreshold. I haven't been able to reproduce the former, I've been able to reproduce some of the latter).

For the former (isCollectionUsageThresholdExceeded tests), here's the typical error:

[2009-11-25T13:11:59.62] 2 pool G1 Survivor
[2009-11-25T13:11:59.62]   supports collection usage thresholds
[2009-11-25T13:11:59.62]   setting threshold 0 init = 0(0K) used = 0(0K) committed = 0(0K) max = 0(0K)
[2009-11-25T13:11:59.62]   threshold 0 is set, (used = 0, isExceeded = false)
[2009-11-25T13:11:59.62]   used value (0) crossed the threshold value (0)
[2009-11-25T13:11:59.71] # ERROR: isCollectionUsageThresholdExceeded() returned false, while threshold = 0 and used = 0

What the test does is to set the threshold to used + 1 and notice that the pool is reporting that the threshold has not been crossed. However, it also uses max as a ceiling for used + 1 and in G1, unlike the other GCs, max can be 0 (for the eden and survivor pools). So, the threshold is actually set to 0 and it looks as if isCollectionUsageThresholdExceeded() returns true of used == threshold (which is true in this case: 0 == 0).

from ThresholdSupport class in lowMemoryDetector.cpp:

 bool        is_high_threshold_crossed(MemoryUsage usage) {
   if (_support_high_threshold && _high_threshold > 0) {
     return (usage.used() >= _high_threshold);
   }
   return false;
 }

First, should isCollectionUsageThresholdExceeded() return true if used == threshold? Second, is it reasonable for the test to assume that max > 0? I think I can fix that by assuming that the max of each pool is at least region_size.

For the latter (setCollectionUsageThreshold tests), which I can reproduce, these are the two errors I see:

[2009-11-25T13:12:56.92] # ERROR: Unexpected java.lang.IllegalArgumentException: Invalid threshold: 932184064 > max (930086912). in pool G1 Old Gen
[2009-11-25T13:12:56.92] # ERROR: IllegalArgumentException is not thrown in pool G1 Eden for threshold 1048577 (init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K))

I'm still trying to figure out what's causing (I _think_ the first one might be caused by the fact that in G1 the max size of the old gen can change, and in fact decrease, over time).
                                     
2009-11-26



Hardware and Software, Engineered to Work Together