Moinak Ghosh
224fb529e9
Get rid of size_t in places where 64-bitness is assumed.
2012-12-09 10:15:06 +05:30
Moinak Ghosh
d250322490
Fix issues with error handling.
Add new tests for out of range values and corrupted file.
2012-11-24 23:53:07 +05:30
Moinak Ghosh
d054e0f713
Zlib optimizations. Use raw deflate streams to avoid unnecessary adler32.
Change some function signatures to improve algo init function behavior.
Fix corner case dedupe bug in error handling flow.
Bump archive version signature.
2012-11-22 21:02:50 +05:30
Moinak Ghosh
393ced991a
A couple of minor cleanups.
2012-11-18 20:20:16 +05:30
Moinak Ghosh
b2cbf0699e
Use fixed rolling-hash mask for better block size approximation.
2012-11-11 14:58:39 +05:30
Moinak Ghosh
eacbf207aa
Tweak chunking parameters for better block size distribution and dedupe ratio.
2012-11-08 19:41:33 +05:30
Moinak Ghosh
e437390e53
Add some more debug mode info.
2012-11-07 22:55:14 +05:30
Moinak Ghosh
c5ebe1f30a
Improve portability to Debian-based distros.
Enable SSE4/AVX detection for AMD platforms (Bulldozer has both).
Use portable long long int print formatting to silence gcc 4.6 warnings.
2012-10-21 21:03:07 +05:30
Moinak Ghosh
9475ccc3d6
Fix polynomial computation.
Fix incorrect block length when doing fixed-block dedupe.
2012-10-02 00:20:44 +05:30
Moinak Ghosh
0019efbadb
Adjust break pattern check mask for closer approximation to average block size.
Remove unused structure member.
2012-09-29 23:31:45 +05:30
Moinak Ghosh
24e6f4e629
Switch to multiplicative rolling hash for good distribution properties.
2012-09-29 00:09:49 +05:30
Moinak Ghosh
8f8af7ed6b
Update adaptive mode heuristic based on algorithms.
Remove incorrect check in PPMd decompression code.
More refactoring of variable names.
2012-09-27 22:29:08 +05:30
Moinak Ghosh
449dc35675
Speed up adaptive modes by using heuristics to select compression algorithm.
Select similarity percentage based on dedupe block size for effectiveness.
2012-09-26 19:47:32 +05:30
Moinak Ghosh
333b7b011e
Fix check for size reduction in dedupe.
Tweak debug message for clarity.
2012-09-25 23:24:33 +05:30
Moinak Ghosh
3544a8c708
Fix polynomial table computation.
Change hashing and length bias to reduce hashtable bucket collisions.
Add support for user-selectable 60% or 40% similarity for Delta Compression.
Overall slight speedup.
2012-09-24 22:20:27 +05:30
Moinak Ghosh
8386e72566
Rewrite core dedupe logic to simplify code and improve performance.
Use hashtable-based chunk-level deduplication instead of Quicksort.
Fix a corner case bug in Dedupe decompression.
2012-09-23 14:57:09 +05:30
Moinak Ghosh
99a8e4cd98
Speed up Hash computation for dedupe blocks.
Add missing initialization of sliding window.
Update help text.
2012-09-19 20:29:44 +05:30
Moinak Ghosh
e3befd9e16
Add support for Fixed-Block deduplication.
More refactoring of symbol names.
2012-09-16 11:12:58 +05:30
Moinak Ghosh
b9355a5dcc
Reduce dedupe loop checks for a slight speed gain.
Beginnings of Fixed-block dedupe.
Update variable name for clarity.
2012-09-15 11:14:58 +05:30
Moinak Ghosh
f3f472b860
Implement K-min-values Sketch for Similarity detection.
2012-09-11 20:26:36 +05:30
Moinak Ghosh
e6f042aaf8
Allow user-specified minimum Dedupe block size.
Compute similarity sketch only if Delta Compression enabled.
2012-09-05 22:43:54 +05:30
Moinak Ghosh
560fa85aab
Fix secondary sketch computation and improve accuracy of diff detection.
2012-09-04 23:28:02 +05:30
Moinak Ghosh
262566b59a
Add xxHash for Rabin block checksums, slightly faster than CRC64.
Fix missing initialization of character counts table.
Some file reorganization.
2012-09-02 20:40:32 +05:30