Update Changelog, docs and bump version for 2.1 release.

Moinak Ghosh 2013-05-09 18:53:11 +05:30
parent a755d59dff
commit 8b3761ee81
3 changed files with 38 additions and 9 deletions

@@ -1,3 +1,32 @@
+== 2.1 Update Release ==
+Add more tests covering Segmented Global Dedupe.
+Fix some tests.
+Switch location of Dedupe context creation to allow correct index memory sizing.
+Update README with details of Global Dedupe block hash selection.
+Add SSE2 optimizations for Segmented Dedupe.
+Fix segment offset sorting.
+Get rid of incorrect duplicate checks in index.
+Allow SKEIN to be used as a Global Dedupe chunk lookup hash.
+Add a qsort variant optimized for integers and use in global dedupe.
+Cleanup LZMA CRC64/32 declarations and add a header.
+Fix heapq header.
+Use openmp parallelism always for chunk hash computation during Global Dedupe.
+Use SHA256 for Global Dedupe chunk lookup hash by default.
+Allow changing Global Dedupe chunk lookup hash via env variable.
+Fix crash with some older GCC versions. Reported in issue #7.
+Fix issue #7.
+Ensure tempfile cleanup even with error abort.
+Fix bugs and improve accuracy in Segmented Dedupe.
+Fix segment hashlist size computation.
+Remove unnecessary sync of segment hashlist file writes.
+Pass correct number of threads to index creation routine.
+Add more error checks.
+Handle correct positioning of segment hashlist file offset on write error.
+Add missing semaphore signaling at dedupe abort points with global dedupe.
+Use closer min-values sampling for improved segmented dedupe accuracy.
+Update proper checksum info in README.
+Fix sizing of similarity hash buffer.
+Tweak index size computation.
== 2.0 Major Release ==
Add test cases for Global Deduplication.
Update documentation and code comments.

@@ -1,7 +1,7 @@
Pcompress
=========
-Copyright (C) 2012 Moinak Ghosh. All rights reserved.
+Copyright (C) 2012-2013 Moinak Ghosh. All rights reserved.
Use is subject to license terms.
moinakg (_at) gma1l _dot com.
Comments, suggestions, code, rants etc are welcome.
@@ -12,13 +12,13 @@ support for multiple algorithms like LZMA, Bzip2, PPMD, etc, with SKEIN/
SHA checksums for data integrity. It can also do Lempel-Ziv pre-compression
(derived from libbsc) to improve compression ratios across the board. SSE
optimizations for the bundled LZMA are included. It also implements
-chunk-level Content-Aware Deduplication and Delta Compression features
-based on a Semi-Rabin Fingerprinting scheme. Delta Compression is done
-via the widely popular bsdiff algorithm. Similarity is detected using a
-technique based on MinHashing. When doing chunk-level dedupe it attempts
-to merge adjacent non-duplicate blocks index entries into a single larger
-entry to reduce metadata. In addition to all these it can internally split
-chunks at rabin boundaries to help dedupe and compression.
+Variable Block Deduplication and Delta Compression features based on a
+Semi-Rabin Fingerprinting scheme. Delta Compression is done via the widely
+popular bsdiff algorithm. Similarity is detected using a technique based
+on MinHashing. When doing Dedupe it attempts to merge adjacent non-
+duplicate block index entries into a single larger entry to reduce metadata.
+In addition to all these it can internally split chunks at rabin boundaries
+to help Dedupe and compression.
It has low metadata overhead and overlaps I/O and compression to achieve
maximum parallelism. It also bundles a simple slab allocator to speed
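
Reviewer note: the README text in this hunk says that adjacent non-duplicate block index entries are merged into a single larger entry to reduce metadata. The C sketch below illustrates that idea only; it is not taken from the Pcompress sources. The `index_entry_t` layout and the `merge_unique_entries` name are hypothetical, and the sketch assumes entries are already sorted by offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry; NOT Pcompress's actual index layout. */
typedef struct {
    uint64_t offset;  /* start of the block within the chunk */
    uint32_t length;  /* block length in bytes */
    int      is_dup;  /* nonzero if the block duplicates an earlier one */
} index_entry_t;

/*
 * Coalesce runs of adjacent, contiguous non-duplicate entries in place.
 * Duplicate entries are kept as-is because each one must still point at
 * its own reference block. Returns the number of entries that remain.
 */
static size_t
merge_unique_entries(index_entry_t *e, size_t n)
{
    size_t out = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (out > 0 && !e[i].is_dup && !e[out - 1].is_dup &&
            e[out - 1].offset + e[out - 1].length == e[i].offset &&
            e[out - 1].length <= UINT32_MAX - e[i].length) {
            /* Extend the previous unique entry instead of adding one. */
            e[out - 1].length += e[i].length;
        } else {
            e[out++] = e[i];
        }
    }
    return out;
}
```

With five input entries laid out as unique, unique, duplicate, unique, unique, this sketch leaves three entries: one merged unique run, the duplicate, and a second merged unique run.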

@@ -44,7 +44,7 @@ extern "C" {
#define FLAG_DEDUP 1
#define FLAG_DEDUP_FIXED 2
#define FLAG_SINGLE_CHUNK 4
-#define UTILITY_VERSION "2.0"
+#define UTILITY_VERSION "2.1"
#define MASK_CRYPTO_ALG 0x30
#define MAX_LEVEL 14