SUGGESTED FIX
Deflate no more than stride bytes at a time. This avoids excess copying in deflateBytes.
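A minimal sketch of the idea, written directly against java.util.zip.Deflater rather than as a patch to DeflaterOutputStream. The class name, the method name, and the STRIDE value of 512 are illustrative assumptions; the report does not say what stride the fix uses.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

final class StrideDeflate {
    // Hypothetical stride; the report only says "stride bytes".
    static final int STRIDE = 512;

    /** Compresses 'in', handing the Deflater at most STRIDE bytes per setInput call. */
    static byte[] deflateInStrides(byte[] in) {
        Deflater def = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[STRIDE];
        try {
            for (int off = 0; off < in.length; off += STRIDE) {
                // Expose only a small window of the input, so any per-call
                // copy of the pending input is bounded by STRIDE bytes.
                def.setInput(in, off, Math.min(STRIDE, in.length - off));
                while (!def.needsInput()) {
                    int n = def.deflate(buf);
                    out.write(buf, 0, n);
                }
            }
            // Flush whatever the Deflater still holds and write the trailer.
            def.finish();
            while (!def.finished()) {
                int n = def.deflate(buf);
                out.write(buf, 0, n);
            }
        } finally {
            def.end(); // release native zlib resources
        }
        return out.toByteArray();
    }
}
```

With this shape, the amount of input visible to any single deflate call no longer depends on the size of the caller's array.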
|
|
|
EVALUATION
The problem is that when the client code invokes DeflaterOutputStream.write() with a byte[] that is much larger than the Deflater's buffer size (512 by default), the client's byte[] can get copied many times in Deflater.c's deflateBytes function.
|
|
|
EVALUATION
Use of Deflater and Inflater becomes inefficient if the input buffer is very large
compared to the output buffer, because the input is copied repeatedly, for O(N**2) total copying.
Here is, in my opinion, a better benchmark that clearly illustrates the
loss of performance without using unnecessary higher-level classes
like DeflaterOutputStream:
----------------------------------------------------------------
import java.util.*;
import java.util.zip.*;

public class Bench {
    // Doubles the array until it can hold at least 'capacity' bytes.
    private static byte[] grow(byte[] a, int capacity) {
        while (a.length < capacity) {
            byte[] a2 = new byte[a.length * 2];
            System.arraycopy(a, 0, a2, 0, a.length);
            a = a2;
        }
        return a;
    }

    // Returns a copy of the first 'length' bytes of the array.
    private static byte[] trim(byte[] a, int length) {
        byte[] res = new byte[length];
        System.arraycopy(a, 0, res, 0, length);
        return res;
    }

    // Large input, deliberately tiny (32-byte) output buffer.
    private static byte[] deflate(byte[] in) throws Throwable {
        final Deflater flater = new Deflater();
        flater.setInput(in);
        flater.finish();
        final byte[] smallBuffer = new byte[32];
        byte[] flated = new byte[32];
        int count = 0;
        int n;
        while ((n = flater.deflate(smallBuffer)) > 0) {
            flated = grow(flated, count + n);
            System.arraycopy(smallBuffer, 0, flated, count, n);
            count += n;
        }
        return trim(flated, count);
    }

    // Same pattern for the Inflater.
    private static byte[] inflate(byte[] in) throws Throwable {
        final Inflater flater = new Inflater();
        flater.setInput(in);
        final byte[] smallBuffer = new byte[32];
        byte[] flated = new byte[32];
        int count = 0;
        int n;
        while ((n = flater.inflate(smallBuffer)) > 0) {
            flated = grow(flated, count + n);
            System.arraycopy(smallBuffer, 0, flated, count, n);
            count += n;
        }
        return trim(flated, count);
    }

    public static void main(String[] args) throws Throwable {
        // 1 MB of random (incompressible) data, round-tripped and verified.
        byte[] data = new byte[1024*1024];
        new Random().nextBytes(data);
        byte[] deflated = deflate(data);
        byte[] inflated = inflate(deflated);
        if (! Arrays.equals(data,inflated))
            throw new Error();
    }
}
----------------------------------------------------------------
1.4.2_08
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench 2.27s user 0.53s system 41% cpu 6.802 total
1.4.2_09
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench 2.30s user 0.42s system 28% cpu 9.479 total
1.4.2_10
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench 197.51s user 0.51s system 75% cpu 4:20.80 total
1.4.2_11
==> javac -source 1.4 Bench.java
==> java -esa -ea Bench
jver $v jr Bench 47.22s user 0.56s system 72% cpu 1:05.47 total
I agree with the submitter that this performance problem is important to fix.
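A back-of-the-envelope estimate of the copying cost in the benchmark above. It rests on two assumptions that are mine, not measurements: every deflate call re-copies all input not yet consumed, and the random data is incompressible, so each call consumes roughly one output buffer's worth of input. The class and method names are made up for illustration.

```java
final class CopyEstimate {
    /**
     * Estimated total bytes copied, ASSUMING each deflate() call copies all
     * remaining input and consumes about outBufLen bytes of it (roughly 1:1
     * compression, as with random data).
     */
    static long estimateCopiedBytes(long inputLen, long outBufLen) {
        long calls = inputLen / outBufLen;
        long total = 0;
        for (long i = 0; i < calls; i++) {
            total += inputLen - i * outBufLen; // input still pending at call i
        }
        return total;
    }

    public static void main(String[] args) {
        // Benchmark parameters: 1 MB input, 32-byte output buffer.
        System.out.println(estimateCopiedBytes(1024 * 1024, 32));
    }
}
```

Under those assumptions the total is about N*N/(2*B) for input size N and output buffer size B, which comes to roughly 17 GB of copying to compress 1 MB with a 32-byte buffer.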
|
|
|
WORK AROUND
Workaround and complete explanation from Dave Bristor (PDE).
Current code:
dout = new DataOutputStream(new GZIPOutputStream(sout));
Summary:
We can probably solve the performance issues by changing the above to:
dout = new DataOutputStream(
new BufferedOutputStream(new GZIPOutputStream(sout), 4096));
I have to guess, but from the fact that they're using a DataOutputStream and
from the stack trace you sent earlier, some pretty small things might be
written. By adding buffering, I hope to see a performance improvement.
What follows is maybe more detail than you need to know. But I had to investigate this some, and so did some testing.
As an example of the performance differences, see the attached test. It's an
extreme case, doing I/O byte-at-a-time. Here are some results with the same
JDK versions for which the customer reported timings. The "bytes" number is byte-at-a-time (no buffering); "sized" means that the GZIP stream was created with a second (size) constructor parameter; "buffered" means that the GZIP stream was wrapped in a Buffered stream, e.g.
new BufferedOutputStream(new GZIPOutputStream(...))
and "reversed" means that the wrapping was done backwards, e.g.
new GZIPOutputStream(new BufferedOutputStream(...))
% jver 1.4.2_09 java GZIPTest;jver 1.4.2_09 java GZIPTest
writing bytes: 9769, for sized: 9491, for buffered: 970, for reversed: 9517
reading bytes: 5993, for sized: 5450, for buffered: 316, for reversed: 5516
writing bytes: 9650, for sized: 9687, for buffered: 964, for reversed: 9572
reading bytes: 5878, for sized: 5513, for buffered: 495, for reversed: 5548
% jver 1.4.2_11 java GZIPTest;jver 1.4.2_11 java GZIPTest
writing bytes: 13853, for sized: 13997, for buffered: 1014, for reversed: 13877
reading bytes: 8727, for sized: 8563, for buffered: 345, for reversed: 8461
writing bytes: 14116, for sized: 13842, for buffered: 1026, for reversed: 14215
reading bytes: 8660, for sized: 8633, for buffered: 338, for reversed: 8642
% jver 1.5.0_06 java GZIPTest;jver 1.5.0_06 java GZIPTest
writing bytes: 9538, for sized: 9655, for buffered: 1030, for reversed: 9848
reading bytes: 6476, for sized: 6353, for buffered: 369, for reversed: 6429
writing bytes: 9698, for sized: 9660, for buffered: 1033, for reversed: 9689
reading bytes: 6459, for sized: 6518, for buffered: 347, for reversed: 6415
(FWIW: GZIPTest was compiled with 1.4.2_09 for all runs, which were done on a SunBlade 150.)
Note that providing a size to GZIP stream constructors doesn't make much difference (the "sized" times). Also note that wrapping a GZIP stream with a Buffered stream gives about a 10x improvement, and that the differences across JDK versions are small.
Importantly, note that the "reversed" case is no better than byte-at-a-time. This means that even though the customer's streams which are being passed to the GZIP stream's constructor have some buffering, performance is still dominated by the deflating and inflating. I.e., I think that the "reversed" case is what the customer is getting with their code.
Why?
Consider only GZIPOutputStream. Each time GZIPOutputStream.write(byte[], int, int) is invoked, data is compressed. The goal is to invoke that as few times as possible. Only wrapping the GZIP stream with a Buffered stream accomplishes that.
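To make that concrete, here is a small sketch (not the attached GZIPTest; all class and method names are made up for illustration) that counts how often GZIPOutputStream.write(byte[], int, int) is reached when a DataOutputStream writes one byte at a time under each wrapping order. It relies on DeflaterOutputStream.write(int) delegating to write(byte[], int, int).

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

final class GzipWrapDemo {
    /** GZIPOutputStream that counts invocations of write(byte[], int, int). */
    static final class CountingGzip extends GZIPOutputStream {
        int calls;
        CountingGzip(OutputStream out) throws IOException { super(out); }
        @Override
        public synchronized void write(byte[] b, int off, int len) throws IOException {
            calls++;
            super.write(b, off, len);
        }
    }

    /** mode is "bytes", "buffered" or "reversed", as in the timings above. */
    static int countCompressCalls(String mode, int nBytes) throws IOException {
        OutputStream sink = new ByteArrayOutputStream();
        if (mode.equals("reversed")) {
            sink = new BufferedOutputStream(sink, 4096); // buffer under the GZIP stream
        }
        CountingGzip gz = new CountingGzip(sink);
        OutputStream top = mode.equals("buffered")
                ? new BufferedOutputStream(gz, 4096)      // buffer over the GZIP stream
                : gz;
        DataOutputStream dout = new DataOutputStream(top);
        for (int i = 0; i < nBytes; i++) {
            dout.writeByte(i); // byte-at-a-time, the extreme case
        }
        dout.close();
        return gz.calls;
    }

    public static void main(String[] args) throws IOException {
        String[] modes = { "bytes", "buffered", "reversed" };
        for (int i = 0; i < modes.length; i++) {
            System.out.println(modes[i] + ": " + countCompressCalls(modes[i], 100000));
        }
    }
}
```

In the "bytes" and "reversed" modes the GZIP stream should be entered once per byte written. In the "buffered" mode it should be entered only about once per 4096-byte buffer flush.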
|
|
|
WORK AROUND
Use 1.4.2_09 rather than 1.4.2_11-b02.
Customer can't use JRE 5.0 for the SAP Software.
|
|
|
|