United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: 4979031 BitSet.toString() is too slow on sparse large bitsets
4979031 : BitSet.toString() is too slow on sparse large bitsets

Details
Type:
Enhancement
Submit Date:
2004-01-15
Status:
Resolved
Updated Date:
2005-02-19
Project Name:
JDK
Resolved Date:
2005-02-19
Component:
core-libs
OS:
windows_xp
Sub-Component:
java.util
CPU:
x86
Priority:
P5
Resolution:
Fixed
Affected Versions:
1.4.2
Fixed Versions:
6

Related Reports

Sub Tasks

Description
Name: jl125535			Date: 01/15/2004


A DESCRIPTION OF THE REQUEST :
The BitSet.toString() method is too slow

JUSTIFICATION :
On sparse large bitsets, the BitSet.toString() method takes a considerable time to scan the whole set, when only a few elements are actually set in the BitSet.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
A much faster code would use the nextClearBit(int) and nextSetBit(int) methods to remove a lot of method calls, and use their optimized scanners.
ACTUAL -
It scans each bit one efter the other one, to see which one is set and must be sent to the output string buffer:

    public String toString() {
        int numBits = unitsInUse << ADDRESS_BITS_PER_UNIT;
        StringBuffer buffer = new StringBuffer(8 * numBits + 2);
        String separator = "";
        buffer.append('{');

        for (int i = nextSetBit(0); i < numBits; i++) {
            if (get(i)) {
                buffer.append(separator);
                separator = ", ";
                buffer.append(i);
            }
        }

        buffer.append('}');
        return buffer.toString();
    }


---------- BEGIN SOURCE ----------
Faster implementation:

    public String toString() {
        int numBits = unitsInUse << ADDRESS_BITS_PER_UNIT;
        StringBuffer buffer = new StringBuffer(7 * numBits + 1);
        String separator = "";
        buffer.append('{');

        for (int i = nextSetBit(0); i >= 0; i = nextSetBit(i + 1)) {
            buffer.append(separator);
            separator = ", ";
            buffer.append(i);
        }

        buffer.append('}');
        return buffer.toString();
    }

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Don't use BitSet.toString(). But create a derived class based on BitSet that overrides the toString() method with the faster implementation above.

Also, if capacity() is optimized later to cache in a transient field the number of bits actually set, use it instead of the excessive "numBits" to determine the initial allocation size for the working StringBuffer, as the method will return an excessively big string backing store in the case of large and sparse BitSet's.

(Note that the current method as well as this patch may simply fail to allocate the backing store with Bitsets containing a single bit set at a position of more than 256K):
Example:
  bs = new BitSet(300000);
  bs.set(265000);
  System.out.println(bs.toString()); // fails!

In that case we need to call capacity() to determine the initial size of StringBuffer. As currently it's impossible to have more than 6 digits in decimal numbers printed by BitSet.toString(), the multiplier is also excessive. It should be 7 instead of 8.
(Incident Review ID: 233281) 
======================================================================

                                    

Comments
EVALUATION

This can be investigated although since this is a general libraries class we are not interested in specializing the implementation for one particular type of usage pattern, such as sparse large bitsets.
###@###.### 2004-01-16

I like the submitter's ideas, and am implementing them.
###@###.### 2005-1-27 03:12:37 GMT
                                     
2004-01-16



Hardware and Software, Engineered to Work Together