|
Comments
|
Submitted On 18-MAY-2002
raojx
Ok, how come 4637640 is a duplicate of this one? Does that
mean they'll be fixed together? Because I don't think
they're related. Please take a closer look at 4637640, it
has the potential to cause very serious and hard to find
memory leak, or at least fix the javadoc so that people
will know substring and its parent share the same
underlying array.
Submitted On 29-AUG-2002
dlipp
As raojx wrote this is an extremely serious and hard to find
memory leak. We've been searching for it (in a complex
application) for several weeks right now. In combination with
heavy load and the implementation of the HotSpot GC we
always blew our VM (OutOfMemoryError) although we did not
keep references on most of this substrings.
It's unbelievable this bug is still there! Even if HotSpot GC was
working better under heavy load this would be unacceptable
as people might want to keep references of substrings in a
Hashtable or whatever.
At one point we import 30'000 records from file to db. Keeping
references on all Ids we already imported (first fews chars of
a record) gave us a memory waste of > 100 MB compared to
only keep the mini Strings i.e. created by new String
(unacceptableSubString) (!!!)
I cannot believe substring is still implemented that way.
Our workaround is not to use String.substring. We create an
empty char[] and copy what we want using the getChar(int,
int, char[], int) method.
By the way. We also replaced StringTokenizer as it uses the
substring mechanism...
Submitted On 15-NOV-2002
quelgar
I agree that this behaviour is not obvious. Even less
obvious is that you get a similar problem when using
String.substring() on a large String. In both cases, the
orignal backing char array is kept, even though only a small
part of it may be being used.
However, once you know that this is happening, you don't
need a trimToSize() method to fix it, just use the
String(String) constructor. This constructor copies the char
array if the source string has extra capacity - the comments
indicate this is explicitly for the purpose of trimming
baggage. Eg:
StringBuffer sb = new StringBuffer(1000000);
s.append('a');
String s = new String(sb.toString());
sb = null;
// that 1 million byte char array should be available for GC now
In fact, I cannot see any other use for the String(String)
constructor other than trimming the backing array. I think
the correct way to close this bug is for the documentation
for this constructor to explicitly spell out that it should
be used to trim the backing storage if you need to.
Submitted On 05-DEC-2002
bestsss
quelgar, simply use StringBuffer.substring(0); it's
more a bit more effective
Submitted On 11-FEB-2003
skells
[radical view] I have always thought that trimToSize() should
be a method on Object with empty implementation. This is
polymorphic behavior.
In the issue of the stringbuffer and string buffer reuse, i
believe that it is the job of the string constructor to decide if
there is too much wasted space, and it there is then copy to
a new array otherwise share. Something like
public String (StringBuffer buffer) {
synchronized(buffer) {
// not required remove, as it is the case anyway
this.offset = 0;
value = buffer.getValue();
count = buffer.length();
if (value.length > 16 && value.length > 2 * count) {
//there is too much wasted space, so copy the
value - too much to be detenmined by sun, maybe by
enviroment parameters
value = new char[count];
System.arraycopy(buffer.getValue(),0, this.value,
0, count);
} else {
buffer.setShared();
}
}
Submitted On 05-MAR-2003
rnaseef
I agree with markt regarding this feature (see bug 4310930).
Optimizations that are questionable such as this one must be
considered options to be explicitly agreed-to by the
programmer, not default behavior.
We have some java apps that are very quickly using up tons
of memory, probably because of this issue. We use
StringBuffer all over the place so the cost of fixing this may
be extremely high.
This is a memory leak in the classical sense. Memory is not
being freed even after the StringBuffer has been freed
because there are String objects around using some portion
of that memory.
Perhaps a more focused optimization for string+string is in
order here.
Submitted On 21-MAR-2003
verdy_p
For now the workaround is to use:
new String(stringBuffer.toString())
to allocate a String that won't leak unused space in the
StringBuffer,
and avoid reusing StringBuffers (which will constantly grow in
memory) because only 1 use really requires the indicated
length: when expanding StringBuffers, it performs an
unneeded copy of its existing content and it stresses the CPU.
Its much better to use this code before reusing StringBuffers:
stringBuffer.setLength(0);
stringBuffer.setLength(desiredCapacity);
The setLength(0) call will force reducing the capacity of the
backing array to 16 characters by allocating a new array.
The setLength(desiredCapacity) will reallocate this small
buffer and initialize it to 0s.
I just wonder why the StringBuffer initialize these characters
to 0 when expanding or allocating this array. May be this is
for security reasons, but I think that this should be the role of
the native VM allocator which will perform this array erasure
much faster than in the StringBuffer.java class.
In my opinion, after all calls to "new char[length]" one should
not need to clear the content of the allocated array and so
the StringBuffer does not need to perform this initialization
too: this is waste of time and of CPU ressource.
Submitted On 21-MAR-2003
verdy_p
This bug also affects String+String operations that are
compiled as:
new StringBuffer().append(str1).append(str2).toString()
I.e. it reallocates the backing store in the StringBuffer for
each string appended, requiring backing store expansion with
copies. But this expansion uses a new size which is the
largest of twice (the older size +1) and the needed size.
When appending small strings to an already large string, this
expansion wastes memory.
The typical leak occurs when appending characters or lines in
a large StringBuffer for a whole text: the wasted space can
be but to 50% in the final String returned by toString().
Partial Workaround:
-> replace "string1+string2" by "new String(string1+string2)"
You won't avoid the StringBuffer expansion with intermediate
copies (which stresses the CPU), but at least the backing
store shared in the return of toString() will be deallocated,
and the new String will fit exactly the string size.
Better workaround:
replace "string1+string2;" by:
new StringBuffer(string1.length()+string2.length())
.append(string1).append(string2).toString()
This version allocates the backing store only once, removes
all expansions and additional copies, and return a String that
wastes no space, by just sharing the backingbuffer used in
StringBuffer.
This code should be the one compiled for String
concatenations with the + operator: il will just require
creating temporary variables in the bytecode to store the
evaluation of each String operand, then dereferencing them in
the correct left-to-right order to get their length, and finally
append them in the StringBuffer as before, before freeing the
temporary variables from the bytecode operand stack.
Because such optimization does only uses a temporary
StringBuffer that is not used after calling toString(), there is
no need to mark the StringBuffer with setShared(): we could
remove this operation, by making StringBuffer inherit from a
new internal class "TemporaryStringBuffer" that would be used
for String Concatenations, so that the String(StringBuffer)
constructor used by the StringBuffer.toString() method will
not need to set it to shared state (the setShared() method
would be a noop for TemporaryStringBuffers).
Submitted On 11-OCT-2006
diroussel2
Note that the fix Java 1.5 is not just to add trimToSize(), but also to change the toString() implementation from
return new String(this);
to
return new String(value, 0, count);
so it does look like the bug is fixed properly
PLEASE NOTE: JDK6 is formerly known as Project Mustang
|