Pcompress
=========

Copyright (C) 2012 Moinak Ghosh. All rights reserved.
Use is subject to license terms.
moinakg (_at) gma1l _dot com.
Comments, suggestions, code, rants etc are welcome.

Pcompress is a utility to do compression and decompression in parallel by
splitting input data into chunks. It has a modular structure and includes
support for multiple algorithms like LZMA, Bzip2, PPMD, etc., with SKEIN
checksums for data integrity. It can also do Lempel-Ziv pre-compression
(derived from libbsc) to improve compression ratios across the board. SSE
optimizations for the bundled LZMA are included. It also implements
chunk-level Content-Aware Deduplication and Delta Compression features
based on a Semi-Rabin Fingerprinting scheme. Delta Compression is done
via the widely popular bsdiff algorithm. Similarity is detected using a
custom hashing of maximal features of a block. When doing chunk-level
dedupe it attempts to merge adjacent non-duplicate block index entries
into a single larger entry to reduce metadata. In addition to all this,
it can internally split chunks at Rabin boundaries to help dedupe and
compression.

It has low metadata overhead and overlaps I/O and compression to achieve
maximum parallelism. It also bundles a simple slab allocator to speed up
repeated allocation of similar chunks. It can work in pipe mode, reading
from stdin and writing to stdout. It also provides adaptive compression
modes in which multiple algorithms are tried per chunk to determine the best
one for the given chunk. Finally, it supports 14 compression levels to allow
for ultra compression modes in some algorithms.

NOTE: This utility is not an archiver. It compresses only single files or
datastreams. To archive, use something else like tar, cpio or pax.

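For instance, a directory tree can be archived with tar first and the resulting
tarball then compressed with pcompress (the directory and file names here are
just illustrative):

tar -cf mydir.tar mydir
pcompress -c lzma -l6 -s64m mydir.tar
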
Usage
=====

To compress a file:

pcompress -c <algorithm> [-l <compress level>] [-s <chunk size>] <file>

Where <algorithm> can be the following:

lzfx   - Very fast and small algorithm based on LZF.
lz4    - Ultra fast, high-throughput algorithm reaching RAM bandwidth at level 1.
zlib   - The base Zlib format compression (not Gzip).
lzma   - The LZMA (Lempel-Ziv Markov) algorithm from 7Zip.
lzmaMt - Multithreaded version of LZMA. This is a faster version but uses
         more memory for the dictionary. The thread count is balanced between
         chunk processing threads and algorithm threads.
bzip2  - Bzip2 algorithm from libbzip2.
ppmd   - The PPMd algorithm, excellent for textual data. PPMd requires at
         least 64MB times the number of CPUs more memory than the other modes.
libbsc - A Block Sorting Compressor using the Burrows-Wheeler Transform
         like Bzip2, but runs faster and gives better compression than
         Bzip2 (see: libbsc.com).
adapt  - Adaptive mode where ppmd or bzip2 will be used per chunk, depending
         on heuristics. If at least 50% of the input data is 7-bit text then
         PPMd will be used, otherwise Bzip2.
adapt2 - Adaptive mode which includes ppmd and lzma. If at least 80% of the
         input data is 7-bit text then PPMd will be used, otherwise LZMA. It
         has significantly higher memory usage than adapt.
none   - No compression. This is only meaningful with -D and -E so dedupe can
         be done for post-processing with an external utility (see the example
         after this list).

<chunk_size> - This can be in bytes or can use the following suffixes:
               g - Gigabyte, m - Megabyte, k - Kilobyte.
               Larger chunks produce better compression at the cost of memory.

<compress_level> - Can be a number from 0 meaning minimum to 14 meaning
                   maximum compression.

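As a sketch of the dedupe-only workflow mentioned for the "none" algorithm
above, the deduplicated but uncompressed output can later be compressed by any
external utility (the file name and chunk size are just an example):

pcompress -D -c none -s64m file.tar
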
NOTE: The option "libbsc" uses Ilya Grebnov's block sorting compression library
from http://libbsc.com/ . It is only available if pcompress is built with
that library. See the INSTALL file for details.

To decompress a file compressed using the above command:

pcompress -d <compressed file> <target file>

To operate as a pipe, read from stdin and write to stdout:

pcompress -p ...

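A minimal pipe-mode sketch, assuming '-p' is combined with the usual
compression options and the output is redirected to a file (the names and
settings here are illustrative):

tar -cf - mydir | pcompress -p -c lz4 -l3 -s16m > mydir.tar.pz
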
Attempt Rabin fingerprinting based deduplication on chunks:

pcompress -D ...
pcompress -D -r ... - Do NOT split chunks at a Rabin boundary. The default
                      is to split.

Perform Delta Encoding in addition to Identical Dedup:

pcompress -E ...  - This also implies '-D'. This performs Delta Compression
                    between 2 blocks if they are 40% to 60% similar. The
                    similarity percentage is selected based on the dedupe
                    block size to balance performance and effectiveness.
pcompress -EE ... - This causes Delta Compression to happen if 2 blocks are
                    at least 40% similar regardless of block size. This can
                    yield a greater final compression ratio at the cost of
                    higher processing overhead.

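An illustrative aggressive-delta run could look like the following (the file
name and the particular compression settings are just an example):

pcompress -EE -c lzma -l10 -s64m file.tar
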
The number of threads can optionally be specified: -t <1 - 256 count>

Other flags:

'-L' - Enable LZP pre-compression. This improves the compression ratio of all
       algorithms with some extra CPU and very low RAM overhead. Using delta
       encoding in conjunction with this may not always be beneficial.

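A possible invocation enabling LZP pre-compression (the file name is
illustrative; see also the note below about combining '-L' with libbsc):

pcompress -L -c bzip2 -l9 -s64m file.tar
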
'-S' <cksum>
     - Specify the chunk checksum to use: CRC64, SKEIN256, SKEIN512.
       The default is SKEIN256. The implementation actually uses SKEIN
       512-256. This is 25% slower than simple CRC64 but is many times more
       robust than CRC64 in detecting data integrity errors. SKEIN is a
       finalist in the NIST SHA-3 standard selection process and is one of
       the fastest in the group, especially on x86 platforms. BLAKE is faster
       than SKEIN on a few platforms.
       SKEIN 512-256 is about 60% faster than SHA 512-256 on x64 platforms.

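A sketch of selecting a different checksum, assuming the name is passed
exactly as listed above (the file name is illustrative):

pcompress -c lzma -l8 -s64m -S SKEIN512 file.tar
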
'-F' - Perform Fixed-Block Deduplication. This is faster than fingerprinting
       based content-aware deduplication in some cases. However, it is mostly
       useful for disk dumps, especially virtual machine images. This generally
       gives a lower dedupe ratio than content-aware dedupe (-D) and does not
       support delta compression.

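For example, fixed-block dedupe on a virtual machine image could be invoked as
follows ("vm-disk.img" is a hypothetical file name):

pcompress -F -c zlib -l6 -s64m vm-disk.img
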
'-M' - Display memory allocator statistics
'-C' - Display compression statistics

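As an illustration, statistics can be requested along with a normal compression
run (this assumes both display flags can be given in one invocation; the file
name is just an example):

pcompress -M -C -c bzip2 -l6 -s64m file.tar
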
NOTE: It is recommended not to use '-L' with libbsc compression since libbsc
uses LZP internally as well.

Environment Variables
=====================

Set ALLOCATOR_BYPASS=1 in the environment to avoid using the built-in
allocator. Due to the way it rounds up an allocation request to the nearest
slab, the built-in allocator can allocate extra unused memory. In addition,
you may want to use a different allocator in your environment.

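For instance, the bypass can be enabled for a single run using ordinary shell
syntax (the compression options are just an example):

ALLOCATOR_BYPASS=1 pcompress -c zlib -l9 -s64m file.tar
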
Examples
========

Compress "file.tar" using bzip2 level 6, a 64MB chunk size and 4 threads. In
addition, perform identity deduplication and delta compression prior to
compression:

pcompress -D -E -c bzip2 -l6 -s64m -t4 file.tar

Compress "file.tar" using the extreme compression mode of LZMA and a chunk
size of 1GB. Allow pcompress to detect the number of CPU cores and use that
many threads:

pcompress -c lzma -l14 -s1g file.tar

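To decompress such an archive back to its original form (this assumes the
compressed output was written to a file named "file.tar.pz"; substitute
whatever name the compressed file actually has):

pcompress -d file.tar.pz file.tar
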
Compression Algorithms
======================

LZFX   - Ultra fast, average compression. This algorithm is the fastest overall.
         Levels: 1 - 5

LZ4    - Very fast, better compression than LZFX.
         Levels: 1 - 3

Zlib   - Fast, better compression.
         Levels: 1 - 9

Bzip2  - Slow, much better compression than Zlib.
         Levels: 1 - 9

LZMA   - Very slow. Extreme compression.
         Levels: 1 - 14
         Up to level 9 the standard LZMA parameters are used. Levels 10 - 12
         use more memory and more match iterations, so they are slower.
         Levels 13 and 14 use larger dictionaries of up to 256MB and consume
         a great deal of RAM. Use these levels only if you have at least 4GB
         of RAM on your system.

PPMD   - Slow. Extreme compression for text, average compression for binary.
         In addition, PPMd decompression time is also high for large chunks.
         This requires lots of RAM, similar to LZMA.
         Levels: 1 - 14

Adapt  - Very slow synthetic mode. Both Bzip2 and PPMD are tried per chunk and
         the better result is selected.
         Levels: 1 - 14

Adapt2 - Ultra slow synthetic mode. Both LZMA and PPMD are tried per chunk and
         the better result is selected. Can give the best compression ratio
         when splitting a file into multiple chunks.
         Levels: 1 - 14
         Since both LZMA and PPMD are used together, memory requirements are
         quite large, especially if you are also using extreme levels above 10.
         For example with a 64MB chunk, level 14, 2 threads, and with or
         without dedupe, it uses up to 3.5GB of physical RAM and requires 6GB
         of virtual memory space.

It is possible for a single chunk to span the entire file if enough RAM is
available. However, for adaptive modes to be effective on large files,
especially multi-file archives, splitting into chunks is required so that the
best compression algorithm can be selected for textual and binary portions.

Caveats
=======

This utility is not meant for resource-constrained environments. Minimum memory
usage (RES/RSS) with barely meaningful settings is around 10MB. This occurs when
using the minimal LZFX compression algorithm at level 2 with a 1MB chunk size
and running 2 threads.

Normally this utility requires lots of RAM, depending on the compression
algorithm, the compression level, and whether dedupe is enabled. Larger chunk
sizes can give a better compression ratio but at the same time use more RAM.