Update README.
This commit is contained in:
parent 393ced991a
commit 2909a3abff
1 changed file with 11 additions and 11 deletions

README.md

@@ -8,29 +8,29 @@ Comments, suggestions, code, rants etc are welcome.
Pcompress is a utility to do compression and decompression in parallel by
splitting input data into chunks. It has a modular structure and includes
support for multiple algorithms like LZMA, Bzip2, PPMD, etc, with SKEIN/
SHA checksums for data integrity. It can also do Lempel-Ziv pre-compression
(derived from libbsc) to improve compression ratios across the board. SSE
optimizations for the bundled LZMA are included. It also implements
chunk-level Content-Aware Deduplication and Delta Compression features
based on a Semi-Rabin Fingerprinting scheme. Delta Compression is done
via the widely popular bsdiff algorithm. Similarity is detected using a
technique based on MinHashing. When doing chunk-level dedupe it attempts
to merge adjacent non-duplicate blocks' index entries into a single larger
entry to reduce metadata. In addition to all these, it can internally split
chunks at rabin boundaries to help dedupe and compression.
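To make the chunking and dedupe ideas above concrete, here is a minimal Python sketch, not Pcompress's actual C implementation: the hash below is a cheap stand-in for a real Rabin fingerprint, and the MASK/MIN_CHUNK parameters are invented for illustration. It splits data at content-defined boundaries, indexes chunks by digest, and merges runs of adjacent non-duplicate chunks into single index entries.

```python
import hashlib

# Illustrative sketch only, not Pcompress's implementation. The shift-add
# hash effectively depends on the last ~32 bytes, so it behaves like a
# small sliding window; a real Rabin fingerprint would be used in practice.
MASK = (1 << 12) - 1   # boundary on average every ~4 KiB of random data
MIN_CHUNK = 512        # suppress degenerate tiny chunks

def chunk_boundaries(data):
    """Yield (start, end) offsets of content-defined chunks."""
    h, start = 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + b) & 0xFFFFFFFF
        if i - start >= MIN_CHUNK and (h & MASK) == MASK:
            yield (start, i + 1)
            start, h = i + 1, 0
    if start < len(data):
        yield (start, len(data))

def dedupe_index(data):
    """Index chunks by digest; merge adjacent unique chunks' entries."""
    seen, entries = {}, []   # entry: [start, end, dup_digest_or_None]
    for start, end in chunk_boundaries(data):
        d = hashlib.sha256(data[start:end]).hexdigest()
        if d in seen:                          # duplicate: reference by digest
            entries.append([start, end, d])
        elif entries and entries[-1][2] is None:
            entries[-1][1] = end               # extend the previous unique run
            seen[d] = start
        else:                                  # start a new unique entry
            entries.append([start, end, None])
            seen[d] = start
    return entries
```

Because boundaries depend only on local content, duplicated input regions resynchronize to the same chunk layout, which is what makes chunk-level dedupe effective.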
It has low metadata overhead and overlaps I/O and compression to achieve
maximum parallelism. It also bundles a simple slab allocator to speed up
repeated allocation of similar-sized chunks. It can work in pipe mode,
reading from stdin and writing to stdout. It also provides adaptive
compression modes in which data analysis heuristics are used to identify
near-optimal algorithms per chunk. Finally, it supports 14 compression
levels to allow for ultra compression modes in some algorithms.
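As a rough illustration of the adaptive-mode idea, the sketch below picks a compressor per chunk from a cheap data-analysis pass (byte entropy). The thresholds and codec choices are invented for the example; they are not Pcompress's actual heuristics.

```python
import bz2, lzma, math, zlib

# Toy per-chunk adaptive selection; entropy thresholds are illustrative.
def entropy(chunk):
    """Shannon entropy of the chunk in bits per byte."""
    counts = [0] * 256
    for b in chunk:
        counts[b] += 1
    n = len(chunk)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def compress_adaptive(chunk):
    """Pick a compressor from a cheap data-analysis pass over the chunk."""
    if not chunk:
        return ("store", chunk)
    e = entropy(chunk)
    if e > 7.5:                                # near-random: don't waste CPU
        return ("store", chunk)
    if e > 6.0:                                # mixed data: fast codec
        return ("zlib", zlib.compress(chunk, 6))
    return ("lzma", lzma.compress(chunk))      # redundant data: strong codec
```

A real implementation would weigh more signals than raw entropy (text vs. binary detection, repetition structure), but the shape is the same: analyze once, then commit one codec per chunk instead of compressing the chunk several times.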
Pcompress also supports encryption via AES and uses Scrypt from Tarsnap
for password-based key generation. A unique key is generated per session
even if the same password is used, and HMAC is used for authentication.
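A minimal sketch of this keying scheme using Python's standard library, with illustrative (not Pcompress's actual) scrypt cost parameters: a fresh random salt makes each session's key unique even for the same password, and an HMAC tag authenticates the data.

```python
import hashlib, hmac, os

def session_key(password, salt=None):
    """Derive a key; a fresh random salt yields a unique key per session."""
    salt = os.urandom(16) if salt is None else salt
    # n/r/p cost parameters here are example values, not Pcompress's.
    key = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)
    return key, salt

def authenticate(key, data):
    """HMAC-SHA256 tag so tampering with stored data is detectable."""
    return hmac.new(key, data, hashlib.sha256).digest()
```

The salt is stored alongside the output so decryption can re-derive the same key from the password, while two sessions with identical passwords still never share a key.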
NOTE: This utility is NOT an archiver. It compresses only single files or
datastreams. To archive, use something else like tar, cpio or pax.