Leonid pointed me to the getthreshold* failures. They seem to be caused by the same root issues as the setthreshold* failures. Here's an example:
[2010-04-06T20:49:12.33] 0 pool java.lang:type=MemoryPool,name=G1 Old Gen
[2010-04-06T20:49:12.33] supports collection usage thresholds
[2010-04-06T20:49:12.33] setting threshold 266338304
[2010-04-06T20:49:12.33] # ERROR: Unexpected exception nsk.share.Failure: java.lang.IllegalArgumentException: Invalid threshold: 266338304 > max (265289728). in pool java.lang:type=MemoryPool,name=G1 Old Gen
[2010-04-06T20:49:12.33] nsk.share.Failure: java.lang.IllegalArgumentException: Invalid threshold: 266338304 > max (265289728).
[2010-04-06T20:49:12.33] at nsk.share.monitoring.Monitor.setLongAttribute(Monitor.java:421)
[2010-04-06T20:49:12.33] at nsk.share.monitoring.MemoryMonitor.setCollectionThresholdOnServer(MemoryMonitor.java:1067)
[2010-04-06T20:49:12.33] at nsk.share.monitoring.MemoryMonitor.setCollectionThreshold(MemoryMonitor.java:667)
[2010-04-06T20:49:12.33] at nsk.monitoring.MemoryPoolMBean.getCollectionUsageThreshold.getthreshold001.test(getthreshold001.java:80)
[2010-04-06T20:49:12.33] at nsk.monitoring.MemoryPoolMBean.getCollectionUsageThreshold.getthreshold001.run(getthreshold001.java:39)
[2010-04-06T20:49:12.34] at nsk.monitoring.MemoryPoolMBean.getCollectionUsageThreshold.getthreshold001.main(getthreshold001.java:17)
[2010-04-06T20:49:12.34] Caused by: java.lang.IllegalArgumentException: Invalid threshold: 266338304 > max (265289728).
[2010-04-06T20:49:12.34] at sun.management.MemoryPoolImpl.setCollectionUsageThreshold(MemoryPoolImpl.java:200)
[2010-04-06T20:49:12.34] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2010-04-06T20:49:12.34] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[2010-04-06T20:49:12.34] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[2010-04-06T20:49:12.34] at java.lang.reflect.Method.invoke(Method.java:597)
[2010-04-06T20:49:12.34] at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:167)
[2010-04-06T20:49:12.34] at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:96)
[2010-04-06T20:49:12.34] at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:33)
[2010-04-06T20:49:12.34] at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeSetter(MBeanIntrospector.java:238)
[2010-04-06T20:49:12.34] at com.sun.jmx.mbeanserver.PerInterface.setAttribute(PerInterface.java:84)
[2010-04-06T20:49:12.34] at com.sun.jmx.mbeanserver.MBeanSupport.setAttribute(MBeanSupport.java:240)
[2010-04-06T20:49:12.35] at javax.management.StandardMBean.setAttribute(StandardMBean.java:369)
[2010-04-06T20:49:12.35] at nsk.share.monitoring.CustomMBeanServer.setAttribute(CustomMBeanServer.java:316)
[2010-04-06T20:49:12.35] at nsk.share.monitoring.Monitor.setLongAttribute(Monitor.java:418)
[2010-04-06T20:49:12.35] ... 5 more
One potential fix (pasted from an e-mail I sent to Mandy):
I can't think of a straightforward way of resolving all these failures in the G1 pools, as long as we want to keep them organized the way they are now (i.e., separate pools for young, survivor, and old logical spaces). If we collapsed them into one big pool, we would certainly not have the above issues. But we would also lose a lot of information that is, IMHO, helpful.
Pasted from an e-mail I sent to Mandy. It explains the failures in the setthreshold00* tests; I'd guess the failures in the getthreshold00* tests are of a similar nature.
The failures are generally caused by the fact that G1 breaks some assumptions that the other GCs make. Until now, the framework and the tests have only been run against the other GCs.
A couple of examples (from setthreshold002):
a) ERROR: Unexpected nsk.share.Failure: javax.management.RuntimeMBeanException: java.lang.IllegalArgumentException: Invalid threshold: 1071644672 > max (1070596096). in pool java.lang:type=MemoryPool,name=G1 Old Gen
This is caused by the fact that the max value can change dynamically for the G1 pools (as we had discussed in the past), and the pool seems to think the max is lower than it currently is.
b) ERROR: IllegalArgumentException is not thrown in pool java.lang:type=MemoryPool,name=G1 Eden for threshold 1048577 (init = 1048576(1024K) used = 0(0K) committed = 0(0K) max = 1048576(1024K))
This is a similar issue. The test tried to set the threshold to max + 1 but didn't get an error, given that the max had increased. Note that the pool is still reporting the old max; I'm not quite sure which max the threshold is being compared against here. Maybe the max was saved when the pool was initialized?
Note that the issues are not only test issues but also framework issues. b) might be a test issue (the test assumes that max will not change and that max + 1 will therefore be invalid). However, a) is really a framework issue: the framework is making assumptions that do not hold in G1.
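To make the pattern behind a) and b) concrete, here is a minimal sketch of a caller that reads max and then sets a collection usage threshold, tolerating the case where the pool rejects the value. ThresholdSketch and trySetCollectionThreshold are hypothetical names for illustration, not part of the nsk framework; only the java.lang.management calls are real API. It assumes that a rejection means the pool's view of max differs from the one the caller read.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of the nsk framework.
final class ThresholdSketch {

    // Returns true if the pool accepted the threshold, false if the pool
    // does not support collection thresholds or rejected the value.
    static boolean trySetCollectionThreshold(MemoryPoolMXBean pool, long threshold) {
        if (!pool.isCollectionUsageThresholdSupported()) {
            return false;
        }
        try {
            pool.setCollectionUsageThreshold(threshold);
            return true;
        } catch (IllegalArgumentException e) {
            // The pool checked the threshold against its own view of max,
            // which may differ from the max the caller read earlier.
            return false;
        }
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!pool.isValid() || !pool.isCollectionUsageThresholdSupported()) {
                continue;
            }
            long saved = pool.getCollectionUsageThreshold();
            MemoryUsage usage = pool.getUsage();
            long max = usage.getMax(); // -1 means the max is undefined
            long candidate = (max == -1) ? usage.getCommitted() : max;
            boolean accepted = trySetCollectionThreshold(pool, candidate);
            System.out.println(pool.getName() + ": max=" + max
                    + " threshold=" + candidate + " accepted=" + accepted);
            // Restore the previous threshold; ignore a rejection here too.
            trySetCollectionThreshold(pool, saved);
        }
    }
}
```

With a pool whose max can move between the getUsage() call and the setter, "accepted=false" would be expected occasionally even though the caller never exceeded the max it saw.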
As you might recall, we have a "funky" way to calculate the maximum for the G1 spaces:
// 3) Another decision that is again not straightforward is what is
// the max size that each memory pool can grow to. Right now, we set
// that the committed size for the eden and the survivors and
// calculate the old gen max as follows (basically, it's a similar
// pattern to what we use for the committed space, as described
// old_gen_max = overall_max - eden_max - survivor_max
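As a worked example of that formula, the sketch below shows how the old gen max shrinks when the eden max grows by 1 MB. The 256 MB overall max and the 1 MB eden and survivor sizes are assumptions chosen because they reproduce the two numbers in the getthreshold001 log (266338304 and 265289728, which differ by exactly 1048576); they are not values taken from the actual run. OldGenMaxSketch is a hypothetical name.

```java
// Hypothetical illustration of old_gen_max = overall_max - eden_max - survivor_max.
final class OldGenMaxSketch {

    static long oldGenMax(long overallMax, long edenMax, long survivorMax) {
        return overallMax - edenMax - survivorMax;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        long overallMax = 256 * MB;  // assumed heap size
        long survivorMax = 1 * MB;   // assumed survivor size

        // Max as seen when the threshold is chosen (eden at 1 MB, assumed).
        long before = oldGenMax(overallMax, 1 * MB, survivorMax);
        // Max as seen when the threshold is checked (eden grown to 2 MB, assumed).
        long after = oldGenMax(overallMax, 2 * MB, survivorMax);

        System.out.println("old gen max before: " + before); // 266338304
        System.out.println("old gen max after:  " + after);  // 265289728
        System.out.println("threshold " + before + " exceeds max by "
                + (before - after) + " bytes");               // 1048576
    }
}
```

Under these assumptions, a threshold equal to the earlier max is rejected as soon as eden grows by 1 MB, which is the shape of the "Invalid threshold: 266338304 > max (265289728)" error.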
One way I thought of fixing the issues is to set the max for each G1 space to "total reserved". This does seem to eliminate the failures in the setthreshold00* tests. However, it might break other assumptions (e.g., the sum of the max values of all G1 spaces will be larger than the total reserved space) and I wouldn't be surprised if it causes failures elsewhere (JConsole or VisualVM?).
I also noticed that the API for getMax() says the following:
"This method returns -1 if the maximum memory size is undefined."
Is it worth trying to make the max undefined (at least for the young / survivor pools; we could use total reserved for the old pool)? How do I do this? I tried returning -1 from max_size() (am I supposed to do this? max_size() returns size_t), but I got another failure from the same test, complaining that setting the threshold to (max + 1) did not throw an IllegalArgumentException. The reason is that the test doesn't seem to check whether max is defined. It also uses abs(getMax()) as max for some reason, so abs(-1) + 1 == 2, which is a valid threshold. (I suppose this really is a test issue.)
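The abs(getMax()) problem can be sketched as follows. This is not the actual test code; MaxPlusOneSketch and both method names are hypothetical. It only contrasts the computation described above with one that treats -1 as "undefined" and skips the max + 1 check in that case.

```java
import java.util.OptionalLong;

// Hypothetical illustration; not the code of the setthreshold tests.
final class MaxPlusOneSketch {

    // The computation as described above: an undefined max (-1) turns
    // into abs(-1) + 1 == 2, which is a perfectly valid threshold.
    static long naiveInvalidThreshold(long max) {
        return Math.abs(max) + 1;
    }

    // Returns a threshold that should be rejected, or empty when no such
    // value can be derived (max undefined, or max + 1 would overflow).
    static OptionalLong invalidThreshold(long max) {
        if (max < 0 || max == Long.MAX_VALUE) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(max + 1);
    }

    public static void main(String[] args) {
        System.out.println(naiveInvalidThreshold(-1L));     // 2
        System.out.println(invalidThreshold(-1L));          // OptionalLong.empty
        System.out.println(invalidThreshold(1048576L));     // OptionalLong[1048577]
    }
}
```

A test written along the lines of invalidThreshold would skip the "max + 1 must throw" check for pools that report an undefined max, instead of failing on them.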