Update README.

Moinak Ghosh 2012-11-18 23:19:22 +05:30
parent 393ced991a
commit 2909a3abff

@@ -8,29 +8,29 @@ Comments, suggestions, code, rants etc are welcome.
 Pcompress is a utility to do compression and decompression in parallel by
 splitting input data into chunks. It has a modular structure and includes
-support for multiple algorithms like LZMA, Bzip2, PPMD, etc, with SKEIN
-checksums for data integrity. It can also do Lempel-Ziv pre-compression
+support for multiple algorithms like LZMA, Bzip2, PPMD, etc, with SKEIN/
+SHA checksums for data integrity. It can also do Lempel-Ziv pre-compression
 (derived from libbsc) to improve compression ratios across the board. SSE
 optimizations for the bundled LZMA are included. It also implements
 chunk-level Content-Aware Deduplication and Delta Compression features
 based on a Semi-Rabin Fingerprinting scheme. Delta Compression is done
 via the widely popular bsdiff algorithm. Similarity is detected using a
-custom hashing of maximal features of a block. When doing chunk-level
-dedupe it attempts to merge adjacent non-duplicate blocks index entries
-into a single larger entry to reduce metadata. In addition to all these it
-can internally split chunks at rabin boundaries to help dedupe and
-compression.
+technique based on MinHashing. When doing chunk-level dedupe it attempts
+to merge adjacent non-duplicate blocks index entries into a single larger
+entry to reduce metadata. In addition to all these it can internally split
+chunks at rabin boundaries to help dedupe and compression.

 It has low metadata overhead and overlaps I/O and compression to achieve
 maximum parallelism. It also bundles a simple slab allocator to speed
 repeated allocation of similar chunks. It can work in pipe mode, reading
-from stdin and writing to stdout. It also provides some adaptive compression
-modes in which multiple algorithms are tried per chunk to determine the best
-one for the given chunk. Finally it supports 14 compression levels to allow
+from stdin and writing to stdout. It also provides adaptive compression
+modes in which data analysis heuristics are used to identify near-optimal
+algorithms per chunk. Finally it supports 14 compression levels to allow
 for ultra compression modes in some algorithms.

 Pcompress also supports encryption via AES and uses Scrypt from Tarsnap
-for Password Based Key generation.
+for Password Based Key generation. A unique key is generated per session
+even if the same password is used and HMAC is used to do authentication.

 NOTE: This utility is Not an archiver. It compresses only single files or
 datastreams. To archive use something else like tar, cpio or pax.
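
The README's central idea, splitting input into chunks and compressing them in parallel, can be illustrated with a minimal Python sketch. This is not Pcompress code (the project itself is written in C). The chunk size, worker count, and the function names `split_chunks`, `compress_parallel`, and `decompress_parallel` are all hypothetical, and the sketch cuts chunks at fixed offsets rather than at Rabin boundaries.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size (1 MiB); not taken from Pcompress.
CHUNK_SIZE = 1 << 20


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the input into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(data, size=CHUNK_SIZE, workers=4):
    """Compress every chunk independently, several at a time.

    pool.map returns results in input order, so the compressed chunks
    can be stored back to back and decoded in sequence.
    """
    chunks = split_chunks(data, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lzma.compress, chunks))


def decompress_parallel(blobs, workers=4):
    """Decompress the chunks in parallel and join them in original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(lzma.decompress, blobs))
```

Because each chunk is a self-contained compressed stream, chunks can be processed on separate threads without shared state. The cost is that redundancy spanning a chunk boundary is not exploited, so larger chunks generally trade parallelism for compression ratio.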