java.lang.Object
org.apache.commons.compress.archivers.sevenz.SevenZFile
All Implemented Interfaces:
Closeable, AutoCloseable

public class SevenZFile extends Object implements Closeable
Reads a 7z file, using SeekableByteChannel under the covers.

The 7z file format is a flexible container that can contain many compression and encryption types, but at the moment only Copy, LZMA, LZMA2, BZIP2, Deflate and AES-256 + SHA-256 are supported.

The format is very Windows/Intel specific, so it uses little-endian byte order, doesn't store user/group or permission bits, and represents times using NTFS timestamps (100 nanosecond units since 1 January 1601). Hence the official tools recommend against using it for backup purposes on *nix, and recommend .tar.7z or .tar.lzma or .tar.xz instead.
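
To make the timestamp representation concrete, here is a minimal sketch of converting an NTFS timestamp (a count of 100-nanosecond units since 1 January 1601 UTC) into a java.time.Instant. The class and method names are hypothetical and are not part of this library; the sketch uses only the JDK.

```java
import java.time.Instant;

// Hypothetical helper, not part of Commons Compress.
final class NtfsTimeSketch {
    // Seconds between 1601-01-01T00:00:00Z and 1970-01-01T00:00:00Z.
    private static final long EPOCH_DIFF_SECONDS = 11_644_473_600L;
    // Number of 100-nanosecond ticks in one second.
    private static final long TICKS_PER_SECOND = 10_000_000L;

    // Converts a tick count since 1601-01-01 UTC into an Instant.
    static Instant ntfsToInstant(long ntfsTicks) {
        long seconds = Math.floorDiv(ntfsTicks, TICKS_PER_SECOND) - EPOCH_DIFF_SECONDS;
        long nanos = Math.floorMod(ntfsTicks, TICKS_PER_SECOND) * 100L;
        return Instant.ofEpochSecond(seconds, nanos);
    }
}
```

Instant keeps nanosecond precision, so the 100-nanosecond resolution of the stored value is preserved by the conversion.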

Both the header and file contents may be compressed and/or encrypted. With both encrypted, neither file names nor file contents can be read, but the use of encryption isn't plausibly deniable.

Multi-volume archives can be read by concatenating the parts in the correct order, either manually or by using org.apache.commons.compress.utils.MultiReadOnlySeekableByteChannel, for example.
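
As an illustration of the manual route, the following sketch concatenates the parts of a multi-volume archive into a single file using only the JDK. The class name, method name and the idea of writing to a separate target file are assumptions made for this example; the caller is responsible for passing the parts in the correct order.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Commons Compress.
final class VolumeConcatSketch {
    // Appends each part, in the given order, to the target file.
    static void concatenate(List<Path> partsInOrder, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : partsInOrder) {
                Files.copy(part, out); // copies the whole part into the stream
            }
        }
    }
}
```

The resulting file could then be opened with one of the File-based constructors below. For large archives a channel that presents the parts as one logical channel avoids the extra copy.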

Since:
1.6
  • Field Details

    • SIGNATURE_HEADER_SIZE

      static final int SIGNATURE_HEADER_SIZE
    • DEFAULT_FILE_NAME

      private static final String DEFAULT_FILE_NAME
    • fileName

      private final String fileName
    • channel

      private SeekableByteChannel channel
    • archive

      private final Archive archive
    • currentEntryIndex

      private int currentEntryIndex
    • currentFolderIndex

      private int currentFolderIndex
    • currentFolderInputStream

      private InputStream currentFolderInputStream
    • password

      private byte[] password
    • options

      private final SevenZFileOptions options
    • compressedBytesReadFromCurrentEntry

      private long compressedBytesReadFromCurrentEntry
    • uncompressedBytesReadFromCurrentEntry

      private long uncompressedBytesReadFromCurrentEntry
    • deferredBlockStreams

      private final ArrayList<InputStream> deferredBlockStreams
    • sevenZSignature

      static final byte[] sevenZSignature
    • PASSWORD_ENCODER

      private static final CharsetEncoder PASSWORD_ENCODER
  • Constructor Details

    • SevenZFile

      public SevenZFile(File fileName, char[] password) throws IOException
      Reads a file as a 7z archive.
      Parameters:
      fileName - the file to read
      password - optional password if the archive is encrypted
      Throws:
      IOException - if reading the archive fails
      Since:
      1.17
    • SevenZFile

      public SevenZFile(File fileName, char[] password, SevenZFileOptions options) throws IOException
      Reads a file as a 7z archive with additional options.
      Parameters:
      fileName - the file to read
      password - optional password if the archive is encrypted
      options - the options to apply
      Throws:
      IOException - if reading the archive fails or the memory limit (if set) is too small
      Since:
      1.19
    • SevenZFile

      @Deprecated public SevenZFile(File fileName, byte[] password) throws IOException
      Deprecated.
      Use the char[]-arg version for the password instead.
      Reads a file as a 7z archive.
      Parameters:
      fileName - the file to read
      password - optional password if the archive is encrypted; the byte array is expected to be the UTF-16LE encoded representation of the password
      Throws:
      IOException - if reading the archive fails
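The byte layout expected by this deprecated constructor can be illustrated with a small JDK-only sketch that encodes a char[] password as UTF-16LE. The class and method names are hypothetical; new code should prefer the char[] constructors and let the library do the encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons Compress.
final class PasswordBytesSketch {
    // Encodes the password as UTF-16LE: two bytes per char, low byte first.
    static byte[] toUtf16Le(char[] password) {
        ByteBuffer encoded = StandardCharsets.UTF_16LE.encode(CharBuffer.wrap(password));
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        return bytes;
    }
}
```

For the password "ab" this yields the four bytes 0x61 0x00 0x62 0x00.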
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel) throws IOException
      Reads a SeekableByteChannel as a 7z archive.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      Throws:
      IOException - if reading the archive fails
      Since:
      1.13
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, SevenZFileOptions options) throws IOException
      Reads a SeekableByteChannel as a 7z archive with additional options.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      options - the options to apply
      Throws:
      IOException - if reading the archive fails or the memory limit (if set) is too small
      Since:
      1.19
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, char[] password) throws IOException
      Reads a SeekableByteChannel as a 7z archive.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      password - optional password if the archive is encrypted
      Throws:
      IOException - if reading the archive fails
      Since:
      1.17
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, char[] password, SevenZFileOptions options) throws IOException
      Reads a SeekableByteChannel as a 7z archive with additional options.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      password - optional password if the archive is encrypted
      options - the options to apply
      Throws:
      IOException - if reading the archive fails or the memory limit (if set) is too small
      Since:
      1.19
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, String fileName, char[] password) throws IOException
      Reads a SeekableByteChannel as a 7z archive.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      fileName - name of the archive - only used for error reporting
      password - optional password if the archive is encrypted
      Throws:
      IOException - if reading the archive fails
      Since:
      1.17
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, String fileName, char[] password, SevenZFileOptions options) throws IOException
      Reads a SeekableByteChannel as a 7z archive with additional options.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      fileName - name of the archive - only used for error reporting
      password - optional password if the archive is encrypted
      options - the options to apply
      Throws:
      IOException - if reading the archive fails or the memory limit (if set) is too small
      Since:
      1.19
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, String fileName) throws IOException
      Reads a SeekableByteChannel as a 7z archive.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      fileName - name of the archive - only used for error reporting
      Throws:
      IOException - if reading the archive fails
      Since:
      1.17
    • SevenZFile

      public SevenZFile(SeekableByteChannel channel, String fileName, SevenZFileOptions options) throws IOException
      Reads a SeekableByteChannel as a 7z archive with additional options.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      fileName - name of the archive - only used for error reporting
      options - the options to apply
      Throws:
      IOException - if reading the archive fails or the memory limit (if set) is too small
      Since:
      1.19
    • SevenZFile

      @Deprecated public SevenZFile(SeekableByteChannel channel, byte[] password) throws IOException
      Deprecated.
      Use the char[]-arg version for the password instead.
      Reads a SeekableByteChannel as a 7z archive.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      password - optional password if the archive is encrypted; the byte array is expected to be the UTF-16LE encoded representation of the password
      Throws:
      IOException - if reading the archive fails
      Since:
      1.13
    • SevenZFile

      @Deprecated public SevenZFile(SeekableByteChannel channel, String fileName, byte[] password) throws IOException
      Deprecated.
      Use the char[]-arg version for the password instead.
      Reads a SeekableByteChannel as a 7z archive.

      SeekableInMemoryByteChannel allows you to read from an in-memory archive.

      Parameters:
      channel - the channel to read
      fileName - name of the archive - only used for error reporting
      password - optional password if the archive is encrypted; the byte array is expected to be the UTF-16LE encoded representation of the password
      Throws:
      IOException - if reading the archive fails
      Since:
      1.13
    • SevenZFile

      private SevenZFile(SeekableByteChannel channel, String filename, byte[] password, boolean closeOnError, SevenZFileOptions options) throws IOException
      Throws:
      IOException
    • SevenZFile

      public SevenZFile(File fileName) throws IOException
      Reads a file as an unencrypted 7z archive.
      Parameters:
      fileName - the file to read
      Throws:
      IOException - if reading the archive fails
    • SevenZFile

      public SevenZFile(File fileName, SevenZFileOptions options) throws IOException
      Reads a file as an unencrypted 7z archive with additional options.
      Parameters:
      fileName - the file to read
      options - the options to apply
      Throws:
      IOException - if reading the archive fails or the memory limit (if set) is too small
      Since:
      1.19
  • Method Details

    • close

      public void close() throws IOException
      Closes the archive.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException - if closing the file fails
    • getNextEntry

      public SevenZArchiveEntry getNextEntry() throws IOException
      Returns the next Archive Entry in this archive.
      Returns:
      the next entry, or null if there are no more entries
      Throws:
      IOException - if the next entry could not be read
    • getEntries

      public Iterable<SevenZArchiveEntry> getEntries()
      Returns a copy of meta-data of all archive entries.

      This method only provides meta-data; the returned entries cannot be used to read the contents. To read contents you still need to process all entries in order using getNextEntry().

      The content methods are only available for entries that have already been reached via getNextEntry().

      Returns:
      a copy of meta-data of all archive entries.
      Since:
      1.11
    • readHeaders

      private Archive readHeaders(byte[] password) throws IOException
      Throws:
      IOException
    • tryToLocateEndHeader

      private Archive tryToLocateEndHeader(byte[] password) throws IOException
      Throws:
      IOException
    • initializeArchive

      private Archive initializeArchive(StartHeader startHeader, byte[] password, boolean verifyCrc) throws IOException
      Throws:
      IOException
    • readStartHeader

      private StartHeader readStartHeader(long startHeaderCrc) throws IOException
      Throws:
      IOException
    • readHeader

      private void readHeader(ByteBuffer header, Archive archive) throws IOException
      Throws:
      IOException
    • sanityCheckAndCollectStatistics

      private SevenZFile.ArchiveStatistics sanityCheckAndCollectStatistics(ByteBuffer header) throws IOException
      Throws:
      IOException
    • readArchiveProperties

      private void readArchiveProperties(ByteBuffer input) throws IOException
      Throws:
      IOException
    • sanityCheckArchiveProperties

      private void sanityCheckArchiveProperties(ByteBuffer header) throws IOException
      Throws:
      IOException
    • readEncodedHeader

      private ByteBuffer readEncodedHeader(ByteBuffer header, Archive archive, byte[] password) throws IOException
      Throws:
      IOException
    • sanityCheckStreamsInfo

      private void sanityCheckStreamsInfo(ByteBuffer header, SevenZFile.ArchiveStatistics stats) throws IOException
      Throws:
      IOException
    • readStreamsInfo

      private void readStreamsInfo(ByteBuffer header, Archive archive) throws IOException
      Throws:
      IOException
    • sanityCheckPackInfo

      private void sanityCheckPackInfo(ByteBuffer header, SevenZFile.ArchiveStatistics stats) throws IOException
      Throws:
      IOException
    • readPackInfo

      private void readPackInfo(ByteBuffer header, Archive archive) throws IOException
      Throws:
      IOException
    • sanityCheckUnpackInfo

      private void sanityCheckUnpackInfo(ByteBuffer header, SevenZFile.ArchiveStatistics stats) throws IOException
      Throws:
      IOException
    • readUnpackInfo

      private void readUnpackInfo(ByteBuffer header, Archive archive) throws IOException
      Throws:
      IOException
    • sanityCheckSubStreamsInfo

      private void sanityCheckSubStreamsInfo(ByteBuffer header, SevenZFile.ArchiveStatistics stats) throws IOException
      Throws:
      IOException
    • readSubStreamsInfo

      private void readSubStreamsInfo(ByteBuffer header, Archive archive) throws IOException
      Throws:
      IOException
    • sanityCheckFolder

      private int sanityCheckFolder(ByteBuffer header, SevenZFile.ArchiveStatistics stats) throws IOException
      Throws:
      IOException
    • readFolder

      private Folder readFolder(ByteBuffer header) throws IOException
      Throws:
      IOException
    • readAllOrBits

      private BitSet readAllOrBits(ByteBuffer header, int size) throws IOException
      Throws:
      IOException
    • readBits

      private BitSet readBits(ByteBuffer header, int size) throws IOException
      Throws:
      IOException
    • sanityCheckFilesInfo

      private void sanityCheckFilesInfo(ByteBuffer header, SevenZFile.ArchiveStatistics stats) throws IOException
      Throws:
      IOException
    • readFilesInfo

      private void readFilesInfo(ByteBuffer header, Archive archive) throws IOException
      Throws:
      IOException
    • checkEntryIsInitialized

      private void checkEntryIsInitialized(Map<Integer,SevenZArchiveEntry> archiveEntries, int index)
    • calculateStreamMap

      private void calculateStreamMap(Archive archive) throws IOException
      Throws:
      IOException
    • buildDecodingStream

      private void buildDecodingStream(int entryIndex, boolean isRandomAccess) throws IOException
      Builds the decoding stream for the entry to be read. This method may be called for random access (getInputStream) or sequential access (getNextEntry). When called for random access, some entries may need to be skipped; they are put into deferredBlockStreams and skipped only when actually needed, to improve performance.
      Parameters:
      entryIndex - the index of the entry to be read
      isRandomAccess - is this called in a random access
      Throws:
      IOException - if there are exceptions when reading the file
    • reopenFolderInputStream

      private void reopenFolderInputStream(int folderIndex, SevenZArchiveEntry file) throws IOException
      Discards any queued streams and the folder stream, then reopens the current folder input stream.
      Parameters:
      folderIndex - the index of the folder to reopen
      file - the 7z entry to read
      Throws:
      IOException - if exceptions occur when reading the 7z file
    • skipEntriesWhenNeeded

      private boolean skipEntriesWhenNeeded(int entryIndex, boolean isInSameFolder, int folderIndex) throws IOException
      Skips entries if needed. Entries need to be skipped when:

      1. it's a random access, and

      2. one of these two conditions is met:

      2.1 currentEntryIndex != entryIndex: either there are some entries to be skipped (currentEntryIndex < entryIndex) or the entry has already been read (currentEntryIndex > entryIndex)

      2.2 currentEntryIndex == entryIndex && hasCurrentEntryBeenRead: the entry to be read is the current entry, but some of its data has already been read, so the stream of the folder must be reopened and all entries before the current entry skipped

      Parameters:
      entryIndex - the entry to be read
      isInSameFolder - are the entry to be read and the current entry in the same folder
      folderIndex - the index of the folder which contains the entry
      Returns:
      true if there are entries actually skipped
      Throws:
      IOException - if there are exceptions when skipping entries
      Since:
      1.21
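The decision described above, read according to the prose of each case, can be sketched as a pure function. This is an illustrative model only, not the library's implementation; the class, method and parameter names are hypothetical.

```java
// Hypothetical model of the skip decision, not part of Commons Compress.
final class SkipDecisionSketch {
    // Returns true when entries must be skipped (or the folder stream reopened)
    // before the entry at entryIndex can be read.
    static boolean needsSkipping(boolean isRandomAccess, int currentEntryIndex,
                                 int entryIndex, boolean currentEntryPartlyRead) {
        if (!isRandomAccess) {
            return false; // condition 1 is not met
        }
        if (currentEntryIndex != entryIndex) {
            return true; // case 2.1: entries lie in between, or the target was already passed
        }
        // case 2.2: same entry, skipping is only needed if some of its data was consumed
        return currentEntryPartlyRead;
    }
}
```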
    • hasCurrentEntryBeenRead

      private boolean hasCurrentEntryBeenRead()
      Find out if any data of current entry has been read or not. This is achieved by comparing the bytes remaining to read and the size of the file.
      Returns:
      true if any data of current entry has been read
      Since:
      1.21
    • buildDecoderStack

      private InputStream buildDecoderStack(Folder folder, long folderOffset, int firstPackStreamIndex, SevenZArchiveEntry entry) throws IOException
      Throws:
      IOException
    • read

      public int read() throws IOException
      Reads a byte of data.
      Returns:
      the byte read, or -1 if end of input is reached
      Throws:
      IOException - if an I/O error has occurred
    • getCurrentStream

      private InputStream getCurrentStream() throws IOException
      Throws:
      IOException
    • getInputStream

      public InputStream getInputStream(SevenZArchiveEntry entry) throws IOException
      Returns an InputStream for reading the contents of the given entry.

      For archives using solid compression, randomly accessing entries will be significantly slower than reading the archive sequentially.

      Parameters:
      entry - the entry to get the stream for.
      Returns:
      a stream to read the entry from.
      Throws:
      IOException - if unable to create an input stream from the entry
      Since:
      1.20
    • read

      public int read(byte[] b) throws IOException
      Reads data into an array of bytes.
      Parameters:
      b - the array to write data to
      Returns:
      the number of bytes read, or -1 if end of input is reached
      Throws:
      IOException - if an I/O error has occurred
    • read

      public int read(byte[] b, int off, int len) throws IOException
      Reads data into an array of bytes.
      Parameters:
      b - the array to write data to
      off - offset into the buffer to start filling at
      len - the number of bytes to read
      Returns:
      the number of bytes read, or -1 if end of input is reached
      Throws:
      IOException - if an I/O error has occurred
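The return convention above (a byte count, or -1 at end of input) is the same one java.io.InputStream uses, so the usual drain loop applies. The sketch below demonstrates the loop against a plain InputStream so that it runs with the JDK alone; the class and method names are hypothetical. With a SevenZFile, the same loop shape would presumably call its read(byte[], int, int) after getNextEntry() has returned an entry.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Commons Compress.
final class ReadLoopSketch {
    // Reads until read(...) reports end of input with -1 and returns all bytes.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n); // only the first n bytes of the buffer are valid
        }
        return out.toByteArray();
    }
}
```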
    • getStatisticsForCurrentEntry

      public InputStreamStatistics getStatisticsForCurrentEntry()
      Provides statistics for bytes read from the current entry.
      Returns:
      statistics for bytes read from the current entry
      Since:
      1.17
    • readUint64

      private static long readUint64(ByteBuffer in) throws IOException
      Throws:
      IOException
    • getChar

      private static char getChar(ByteBuffer buf) throws IOException
      Throws:
      IOException
    • getInt

      private static int getInt(ByteBuffer buf) throws IOException
      Throws:
      IOException
    • getLong

      private static long getLong(ByteBuffer buf) throws IOException
      Throws:
      IOException
    • get

      private static void get(ByteBuffer buf, byte[] to) throws IOException
      Throws:
      IOException
    • getUnsignedByte

      private static int getUnsignedByte(ByteBuffer buf) throws IOException
      Throws:
      IOException
    • matches

      public static boolean matches(byte[] signature, int length)
      Checks if the signature matches what is expected for a 7z file.
      Parameters:
      signature - the bytes to check
      length - the number of bytes to check
      Returns:
      true, if this is the signature of a 7z archive.
      Since:
      1.8
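For readers who want to see what such a check involves, here is a standalone sketch that compares a buffer against the six-byte 7z signature ('7', 'z', 0xBC, 0xAF, 0x27, 0x1C). It is an independent illustration with hypothetical names, and its handling of short or null input is an assumption rather than a description of how matches behaves.

```java
// Hypothetical helper, not part of Commons Compress.
final class SevenZSignatureSketch {
    // The six signature bytes found at the start of a 7z archive.
    private static final byte[] SIGNATURE = {'7', 'z', (byte) 0xBC, (byte) 0xAF, 0x27, 0x1C};

    // Returns true if the first bytes of header equal the 7z signature.
    static boolean looksLikeSevenZ(byte[] header, int length) {
        if (header == null || length < SIGNATURE.length || header.length < SIGNATURE.length) {
            return false; // not enough bytes to hold the signature
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (header[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }
}
```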
    • skipBytesFully

      private static long skipBytesFully(ByteBuffer input, long bytesToSkip) throws IOException
      Throws:
      IOException
    • readFully

      private void readFully(ByteBuffer buf) throws IOException
      Throws:
      IOException
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getDefaultName

      public String getDefaultName()
      Derives a default file name from the archive name - if known.

      This implements the same heuristics the 7z tools use. If an archive contains entries without a name (i.e. SevenZArchiveEntry.getName() returns null), 7z's command line and GUI tools use this default name when extracting those entries.

      Returns:
      null if the name of the archive is unknown. Otherwise if the name of the archive has got any extension, it is stripped and the remainder returned. Finally if the name of the archive hasn't got any extension then a ~ character is appended to the archive name.
      Since:
      1.19
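The rule stated under Returns can be sketched in a few lines of JDK-only code. The names are hypothetical and this is not the library's implementation; in particular, treating a leading dot as part of the name rather than as an extension separator is an assumption of this sketch.

```java
import java.io.File;

// Hypothetical helper, not part of Commons Compress.
final class DefaultNameSketch {
    // Derives a default entry name from an archive name, or null if unknown.
    static String defaultName(String archiveName) {
        if (archiveName == null) {
            return null; // archive name unknown
        }
        String lastSegment = new File(archiveName).getName();
        int dot = lastSegment.lastIndexOf('.');
        if (dot > 0) {
            return lastSegment.substring(0, dot); // strip the extension
        }
        return lastSegment + "~"; // no extension: append a tilde
    }
}
```

Under these assumptions "archive.7z" maps to "archive" and "archive" maps to "archive~".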
    • utf16Decode

      private static byte[] utf16Decode(char[] chars) throws IOException
      Throws:
      IOException
    • assertFitsIntoNonNegativeInt

      private static int assertFitsIntoNonNegativeInt(String what, long value) throws IOException
      Throws:
      IOException