Class BZip2CompressorOutputStream

java.lang.Object
java.io.OutputStream
org.apache.commons.compress.compressors.CompressorOutputStream
org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream
All Implemented Interfaces:
Closeable, Flushable, AutoCloseable, BZip2Constants

public class BZip2CompressorOutputStream extends CompressorOutputStream implements BZip2Constants
An output stream that compresses data into the BZip2 format and writes the result to another stream.

Compression requires a large amount of memory, so you should call the close() method as soon as possible to force BZip2CompressorOutputStream to release the allocated memory.

You can reduce the amount of allocated memory, and possibly increase the compression speed, by choosing a smaller blocksize, which in turn may lower the compression ratio. To avoid unnecessary memory allocation, do not use a blocksize larger than the size of the input.

You can compute the memory usage for compression with the following formula:

 400k + (9 * blocksize)

To get the memory required for decompression by BZip2CompressorInputStream, use:

 65k + (5 * blocksize)

Memory usage by blocksize

Blocksize   Compression memory usage   Decompression memory usage
100k        1300k                      565k
200k        2200k                      1065k
300k        3100k                      1565k
400k        4000k                      2065k
500k        4900k                      2565k
600k        5800k                      3065k
700k        6700k                      3565k
800k        7600k                      4065k
900k        8500k                      4565k

For decompression BZip2CompressorInputStream allocates less memory if the bzipped input is smaller than one block.
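
The two formulas above can be checked with a small stand-alone sketch. The class and method names here (BZip2MemoryEstimate, compressionKilobytes, decompressionKilobytes) are hypothetical and not part of Commons Compress; the code only restates the documented formulas, with the blocksize given in 100k units as the constructor expects.

```java
/** Hypothetical helper that restates the documented memory formulas. Not part of the library. */
final class BZip2MemoryEstimate {

    private BZip2MemoryEstimate() {
    }

    /** Compression estimate in k: 400k + (9 * blocksize), blocksize given in 100k units. */
    static int compressionKilobytes(int blockSize100k) {
        requireValid(blockSize100k);
        return 400 + 9 * blockSize100k * 100;
    }

    /** Decompression estimate in k: 65k + (5 * blocksize), blocksize given in 100k units. */
    static int decompressionKilobytes(int blockSize100k) {
        requireValid(blockSize100k);
        return 65 + 5 * blockSize100k * 100;
    }

    private static void requireValid(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blockSize100k must be in 1..9: " + blockSize100k);
        }
    }

    public static void main(String[] args) {
        // Prints one row per blocksize: blocksize, compression estimate, decompression estimate.
        for (int b = 1; b <= 9; b++) {
            System.out.printf("%d00k %dk %dk%n", b, compressionKilobytes(b), decompressionKilobytes(b));
        }
    }
}
```

These figures are estimates derived from the formulas only; actual heap usage also depends on the JVM.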

Instances of this class are not threadsafe.

TODO: Update to BZip2 1.0.1

  • Field Details

    • MIN_BLOCKSIZE

      public static final int MIN_BLOCKSIZE
      The minimum supported blocksize == 1.
    • MAX_BLOCKSIZE

      public static final int MAX_BLOCKSIZE
      The maximum supported blocksize == 9.
    • GREATER_ICOST

      private static final int GREATER_ICOST
    • LESSER_ICOST

      private static final int LESSER_ICOST
    • last

      private int last
      Index of the last char in the block, so the block size == last + 1.
    • blockSize100k

      private final int blockSize100k
      Always: in the range 0 .. 9. The current block size is 100000 * this number.
    • bsBuff

      private int bsBuff
    • bsLive

      private int bsLive
    • crc

      private final CRC crc
    • nInUse

      private int nInUse
    • nMTF

      private int nMTF
    • currentChar

      private int currentChar
    • runLength

      private int runLength
    • blockCRC

      private int blockCRC
    • combinedCRC

      private int combinedCRC
    • allowableBlockSize

      private final int allowableBlockSize
    • data

      All memory intensive stuff.
    • blockSorter

      private BlockSort blockSorter
    • out

      private OutputStream out
    • closed

      private volatile boolean closed
  • Constructor Details

    • BZip2CompressorOutputStream

      public BZip2CompressorOutputStream(OutputStream out) throws IOException
      Constructs a new BZip2CompressorOutputStream with a blocksize of 900k.
      Parameters:
      out - the destination stream.
      Throws:
      IOException - if an I/O error occurs in the specified stream.
      NullPointerException - if out == null.
    • BZip2CompressorOutputStream

      public BZip2CompressorOutputStream(OutputStream out, int blockSize) throws IOException
      Constructs a new BZip2CompressorOutputStream with specified blocksize.
      Parameters:
      out - the destination stream.
      blockSize - the blocksize, in units of 100k.
      Throws:
      IOException - if an I/O error occurs in the specified stream.
      IllegalArgumentException - if (blockSize < 1) || (blockSize > 9).
      NullPointerException - if out == null.
  • Method Details

    • hbMakeCodeLengths

      private static void hbMakeCodeLengths(byte[] len, int[] freq, BZip2CompressorOutputStream.Data dat, int alphaSize, int maxLen)
    • chooseBlockSize

      public static int chooseBlockSize(long inputLength)
      Chooses a blocksize based on the given length of the data to compress.
      Parameters:
      inputLength - The length of the data which will be compressed by BZip2CompressorOutputStream.
      Returns:
      The blocksize, between MIN_BLOCKSIZE and MAX_BLOCKSIZE, both inclusive. For a negative inputLength this method always returns MAX_BLOCKSIZE.
    • write

      public void write(int b) throws IOException
      Specified by:
      write in class OutputStream
      Throws:
      IOException
    • writeRun

      private void writeRun() throws IOException
      Writes the current byte to the buffer, run-length encoding it if it has been repeated at least four times (the first step RLEs sequences of four identical bytes).

      Flushes the current block before writing data if it is full.

      "write to the buffer" means adding to data.buffer starting two steps "after" this.last - initially starting at index 1 (not 0) - and updating this.last to point to the last index written minus 1.

      Throws:
      IOException
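
The first-stage run-length encoding described above can be illustrated with a stand-alone sketch. This is not the library's implementation: the class Rle1Sketch and its encode method are hypothetical, and the code works on a whole byte array instead of an incremental block buffer. It assumes the usual bzip2 convention that a run of four or more identical bytes is written as four copies followed by a count byte holding the number of further repeats, with runs capped at 255 bytes.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration of bzip2's initial run-length encoding. Not the library's code. */
final class Rle1Sketch {

    private Rle1Sketch() {
    }

    static byte[] encode(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length) {
            byte value = input[i];
            // Measure the run starting at i, capped at 255 bytes.
            int run = 1;
            while (i + run < input.length && input[i + run] == value && run < 255) {
                run++;
            }
            // Runs of 1 to 3 bytes are copied literally; longer runs emit exactly four copies.
            int literal = Math.min(run, 4);
            for (int k = 0; k < literal; k++) {
                out.write(value);
            }
            // A run of four or more is followed by a count of the remaining repeats (0..251).
            if (run >= 4) {
                out.write(run - 4);
            }
            i += run;
        }
        return out.toByteArray();
    }
}
```

For example, six identical bytes become four copies followed by the count 2, while a run of exactly four bytes grows by one byte because the count 0 is still written.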
    • finalize

      protected void finalize() throws Throwable
      Overridden to warn about an unclosed stream.
      Overrides:
      finalize in class Object
      Throws:
      Throwable
    • finish

      public void finish() throws IOException
      Throws:
      IOException
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Overrides:
      close in class OutputStream
      Throws:
      IOException
    • flush

      public void flush() throws IOException
      Specified by:
      flush in interface Flushable
      Overrides:
      flush in class OutputStream
      Throws:
      IOException
    • init

      private void init() throws IOException
      Writes the magic bytes BZ at the start of the stream, followed by the marker for the file format (huffmanised) and a digit indicating blockSize100k.
      Throws:
      IOException - if the magic bytes could not be written
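
As a rough illustration of the header described above, the following stand-alone sketch builds the four leading bytes of a bzip2 stream. The class BZip2HeaderSketch and its streamHeader method are hypothetical and do not exist in the library; the sketch assumes the standard bzip2 layout of the ASCII characters B and Z, then h for the huffmanised format, then the blocksize digit 1 to 9.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of the bzip2 stream header. Not the library's code. */
final class BZip2HeaderSketch {

    private BZip2HeaderSketch() {
    }

    /** Returns the header bytes for a blocksize given in 100k units (1..9). */
    static byte[] streamHeader(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blockSize100k must be in 1..9: " + blockSize100k);
        }
        // 'B' 'Z' magic, 'h' for huffmanised, then the blocksize as an ASCII digit.
        return new byte[] {'B', 'Z', 'h', (byte) ('0' + blockSize100k)};
    }

    public static void main(String[] args) {
        // Prints the header for the default blocksize of 900k.
        System.out.println(new String(streamHeader(9), StandardCharsets.US_ASCII));
    }
}
```

A reader of such a stream can use the fourth byte to learn the blocksize, and therefore the memory it needs, before decoding any block.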
    • initBlock

      private void initBlock()
    • endBlock

      private void endBlock() throws IOException
      Throws:
      IOException
    • endCompression

      private void endCompression() throws IOException
      Throws:
      IOException
    • getBlockSize

      public final int getBlockSize()
      Returns the blocksize parameter specified at construction time.
      Returns:
      the blocksize parameter specified at construction time
    • write

      public void write(byte[] buf, int offs, int len) throws IOException
      Overrides:
      write in class OutputStream
      Throws:
      IOException
    • write0

      private void write0(int b) throws IOException
      Keeps track of the last bytes written and implicitly performs run-length encoding as the first step of the bzip2 algorithm.
      Throws:
      IOException
    • hbAssignCodes

      private static void hbAssignCodes(int[] code, byte[] length, int minLen, int maxLen, int alphaSize)
    • bsFinishedWithStream

      private void bsFinishedWithStream() throws IOException
      Throws:
      IOException
    • bsW

      private void bsW(int n, int v) throws IOException
      Throws:
      IOException
    • bsPutUByte

      private void bsPutUByte(int c) throws IOException
      Throws:
      IOException
    • bsPutInt

      private void bsPutInt(int u) throws IOException
      Throws:
      IOException
    • sendMTFValues

      private void sendMTFValues() throws IOException
      Throws:
      IOException
    • sendMTFValues0

      private void sendMTFValues0(int nGroups, int alphaSize)
    • sendMTFValues1

      private int sendMTFValues1(int nGroups, int alphaSize)
    • sendMTFValues2

      private void sendMTFValues2(int nGroups, int nSelectors)
    • sendMTFValues3

      private void sendMTFValues3(int nGroups, int alphaSize)
    • sendMTFValues4

      private void sendMTFValues4() throws IOException
      Throws:
      IOException
    • sendMTFValues5

      private void sendMTFValues5(int nGroups, int nSelectors) throws IOException
      Throws:
      IOException
    • sendMTFValues6

      private void sendMTFValues6(int nGroups, int alphaSize) throws IOException
      Throws:
      IOException
    • sendMTFValues7

      private void sendMTFValues7() throws IOException
      Throws:
      IOException
    • moveToFrontCodeAndSend

      private void moveToFrontCodeAndSend() throws IOException
      Throws:
      IOException
    • blockSort

      private void blockSort()
    • generateMTFValues

      private void generateMTFValues()