Bug Database
Bug ID: 4274779
Votes 6
Synopsis GZIPInputStream and InflaterInputStream is very memory inefficient
Category java:jar
Reported Against kestrel
Release Fixed 1.4.1(hopper)
State 10-Fix Delivered, Verified, bug
Priority: 4-Low
Related Bugs 4411230 , 4986239
Submit Date 23-SEP-1999
Description
JDK 1.2.2 (Win95)

The symptoms
------------
+ I am reading in a large object net (~10 MB uncompressed) which I
  wrote to disk compressed using GZIPOutputStream.
+ Because of the large size of my application I am starting with a
  large heap (256m), and must avoid garbage collection
  (above 200m, garbage collection time rises very rapidly).
+ Reading in my large object using GZIPInputStream caused memory
  to fill (invoking gc), even though there should have been plenty
  of space for the object.
+ Reading in the same object uncompressed caused no problems.
+ Using freeMemory() I determined that GZIPInputStream was using
  up roughly 10 times the uncompressed size of the object in heap.
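
A rough way to repeat this kind of measurement on a current JDK might look like the sketch below. The class and method names (GzipHeapProbe, usedHeap, gzip, countBytesSingleRead) are made up for illustration, the payload is a zero-filled array rather than the original data, and the code uses try-with-resources, which JDK 1.2.2 does not have. Runtime counters are only approximate, and a JDK that already contains the fix should not show the growth described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical probe class; not part of the JDK or of the original report.
class GzipHeapProbe {
    /** Heap currently in use, estimated from the Runtime counters. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Compresses the given bytes in memory with GZIPOutputStream. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        }
        return bos.toByteArray();
    }

    /** Drains the stream through the single-byte read() and counts the bytes. */
    static long countBytesSingleRead(InputStream in) throws IOException {
        long n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = gzip(new byte[1 << 20]); // 1 MB of zeros as stand-in data
        long before = usedHeap();
        long n;
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            n = countBytesSingleRead(in);
        }
        long after = usedHeap();
        System.out.println(n + " bytes read, heap delta roughly " + (after - before) + " bytes");
    }
}
```

The heap delta is noisy because the collector may run at any point between the two samples, so it is best read as an indication rather than an exact figure.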

The Cause
---------
+ Digging into GZIPInputStream.java and InflaterInputStream.java
  I determined that each had one or two core member functions
  which declared new scratch array objects each time the function
  is called.  In particular, InflaterInputStream.read()
  contains the line:

    byte[] b = new byte[1];

  This seems pretty innocuous, but
  (a) "b" is a full array object, not just one byte, and
  (b) read() is called once for each byte of *uncompressed* data.
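
The pattern described above can be illustrated with a small stand-alone stream. This is not the JDK source, only a sketch of the same shape: PerCallScratchStream and its scratchAllocations counter are hypothetical names added so the number of scratch arrays can be observed.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that mimics the allocation pattern; not the JDK class.
class PerCallScratchStream extends FilterInputStream {
    /** Number of scratch arrays created so far (for illustration only). */
    long scratchAllocations = 0;

    PerCallScratchStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        byte[] b = new byte[1]; // a fresh array object on every call
        scratchAllocations++;
        // Delegate to the bulk read and widen the byte to an unsigned int.
        return read(b, 0, 1) == -1 ? -1 : (b[0] & 0xff);
    }
}
```

Draining a stream of N bytes through read() this way creates N + 1 short-lived arrays (one extra for the call that hits end of stream), each carrying a full object header for a single byte of payload.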

Suggested Fix
-------------
I implemented the following changes in local copies of
GZIPInputStream.java and InflaterInputStream.java, and reading now
takes a small fixed amount of heap (~3 KB) regardless of file size.

InflaterInputStream.java

1) add member field: private byte[] b = new byte[512];

2) remove line 106 : read() : byte[] b = new byte[1];
   remove line 177 : skip() : byte[] b = new byte[512];

GZIPInputStream.java

1) add member field: private byte[] skipBuff = new byte[128];

2) remove line 215 : skip() : byte[] buf = new byte[128];

3) line 217 : skip() : change "buf" to "skipBuff"
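
The shape of these changes can be sketched as follows. This is a stand-alone illustration rather than the patched JDK code: ReusedScratchStream is a hypothetical class, and only the idea of one member buffer shared by read() and skip() is taken from the list above.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream showing the reuse pattern; not the JDK class.
class ReusedScratchStream extends FilterInputStream {
    /** Scratch buffer allocated once per stream and shared by read() and skip(). */
    private final byte[] b = new byte[512];

    ReusedScratchStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        // Reuse the first slot of the member buffer instead of a new byte[1].
        return read(b, 0, 1) == -1 ? -1 : (b[0] & 0xff);
    }

    @Override
    public long skip(long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            // Read into the shared buffer and discard the contents.
            int len = read(b, 0, (int) Math.min(b.length, remaining));
            if (len == -1) {
                break; // end of stream before n bytes were skipped
            }
            remaining -= len;
        }
        return n - remaining;
    }
}
```

One trade-off of this layout is that the shared buffer makes concurrent read() and skip() calls on the same stream unsafe unless the caller synchronizes them.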


Summary
-------
GZIP compression is most likely to be used with large objects and
large heap sizes.  Because garbage collection for large heap sizes
does not currently work adequately, what should be just an inefficiency
in the inflater has become a serious liability.  I would,
therefore, recommend that these or some equivalent changes be made to
the JDK for the earliest possible release.

Note for SQE team: performance improvement, no test case needed
Work Around
N/A
Evaluation
This is a valid problem and we should fix it along the lines of the suggested
changes.

  xxxxx@xxxxx   1999-11-29