Commit graph

115 commits

Author SHA1 Message Date
Moinak Ghosh
2740a00c76 Switch location of Dedupe context creation to allow correct index memory sizing. 2013-05-07 20:50:13 +05:30
Moinak Ghosh
b23b5789fb Fix bugs and improve accuracy in Segmented Dedupe.
Fix segment hashlist size computation.
Remove unnecessary sync of segment hashlist file writes.
Pass correct number of threads to index creation routine.
Add more error checks.
Handle correct positioning of segment hashlist file offset on write error.
Add missing semaphore signaling at dedupe abort points with global dedupe.
Use closer min-values sampling for improved segmented dedupe accuracy.
Update proper checksum info in README.
2013-04-30 19:35:18 +05:30
Moinak Ghosh
79a6e7f770 Capability to output data to stdout when compressing.
Always use segmented similarity bases dedupe when using -G option in pipe mode.
Standardize on average 8MB segment size for segmented dedupe.
Fix hashtable sizing.
Some miscellaneous cleanups.
Update README with details of new features.
2013-04-24 23:03:58 +05:30
Moinak Ghosh
d29f125ca7 Clean up temp cache dir handling.
Allow temp dir setting via specific env variable to point to fast devices like ramdisk,ssd.
2013-04-22 22:57:31 +05:30
Moinak Ghosh
c0b4aa0116 Many optimizations and changes to Segmented Global Dedupe.
Use chunk hash based similarity matching rather than content based.
Use sorting to order hash buffer rather than min-heap for better accuracy.
Use fast CRC64 for similarity hash for speed and lower memory requirements.
2013-04-21 18:11:16 +05:30
Moinak Ghosh
2f6ccca6e5 Update usage text and add minor tweaks. 2013-04-18 22:55:49 +05:30
Moinak Ghosh
8ae571124d Complete implementation for Segmented Global Deduplication. 2013-04-18 21:26:24 +05:30
Moinak Ghosh
50251107de Work in progress changes for Segmented Global Deduplication. 2013-04-09 22:23:51 +05:30
Moinak Ghosh
3d7a179a77 Work in progress changes for scalable segmented global deduplication.
Allow user-specified environment setting to control in-memory index size.
2013-04-06 15:15:27 +05:30
Moinak Ghosh
c357452079 Implement global dedupe in pipe mode.
Update hash index calculations to use upto 75% memavail when file size is not known.
Use little-endian nonce format for Salsa20.
2013-03-29 15:18:25 +05:30
Moinak Ghosh
19b304f30c Add global index cleanup function.
Fix location of sem_wait().
More comments.
2013-03-25 21:04:16 +05:30
Moinak Ghosh
1143207cd5 Add check to disable Delta Compression with Global deduplication for now. 2013-03-24 23:30:40 +05:30
Moinak Ghosh
fbf4658635 Implement Global Deduplication. 2013-03-24 23:21:17 +05:30
Moinak Ghosh
876796be5c Work in progress changes for global dedupe. 2013-03-21 22:00:38 +05:30
Moinak Ghosh
b7fdeb08bc Work in progress global dedupe changes. 2013-03-20 22:47:03 +05:30
Moinak Ghosh
f8f23e5200 Major License text cleanup. 2013-03-07 20:26:48 +05:30
Moinak Ghosh
e41f156beb Update README and test cases with new crypto options.
Update usage text.
2013-03-05 21:07:54 +05:30
Moinak Ghosh
dce424ec85 Use 128-bit key length when decompressing older version archives. 2013-03-04 22:35:33 +05:30
Moinak Ghosh
20250aa5dc Add XSalsa20 encryption algorithm from the NaCL library.
Include 128-bit key support based on the Salsa20 eSTREAM submission.
Allow variable-length nonces.
Use random bytes for initial nonce value.
Increase PBE hash rounds to 50000.
2013-03-04 21:56:07 +05:30
Moinak Ghosh
7a29c7be1e Change default encryption key length to 256 bits.
Add optional ability to change key length at runtime via cli option.
Include key length property in archive header.
Fix header HMAC to include salt, nonce and key length properties.
Retain backward compatibility to handle older format archives.
Fix compilation of AES ASM code.
2013-03-03 20:02:14 +05:30
Moinak Ghosh
efe5232cdc Add compatibility to decode old-format parallel hashes created with version 1.2.
Bump archive version to 7 as parallel hashes are now merkle style hashes.
2013-02-24 20:05:16 +05:30
Moinak Ghosh
3e1737b4ab Use OpenMP parallelism when computing xxHashes for chunks. 2013-02-02 09:27:58 +05:30
Moinak Ghosh
9983d79e62 Bump version and update changelog for 1.3.0 release.
Fix issue #3.
2013-01-29 21:42:54 +05:30
Moinak Ghosh
468044d816 Add parallel versions of various checksums for single-segment, single-thread compression. 2013-01-27 23:47:55 +05:30
Moinak Ghosh
2da0d0950b Use BLAKE2 parallel version for single-chunk archives (whole file in one chunk).
Set decompression threads correctly for single-chunk archives.
2013-01-26 18:28:13 +05:30
Moinak Ghosh
68f60ba50b Update default checksum to BLAKE256. 2013-01-26 17:40:23 +05:30
Moinak Ghosh
d08b5ea399 Add optimized BLAKE2 implementations with runtime detection of CPU capability (SSE/AVX).
Minor cleanups.
2013-01-26 15:39:10 +05:30
Moinak Ghosh
43af97042a Major changes to use Intel's optimized SHA512 code for SHA512 and SHA512/256.
Remove earlier SHA256 code which is slower than SHA512/256 (on 64-bit CPU).
Use HMAC from Alan Saddi's implementation for cleaner, faster code.
2013-01-25 22:55:55 +05:30
Moinak Ghosh
26bb137257 Changes for generalized runtime SSE/AVX/XOP detection.
Multi instruction set XXhash build with runtime selection.
Extend CPUID code to detect more instruction sets.
Add options for BLAKE2 hash.
Move GCC builtins into utils header.
Bump file format version number due to extended digest flags.
Add descriptions to digest list.
2013-01-25 00:10:12 +05:30
Moinak Ghosh
3888c8d316 Many optimization tweaks
Optimize Rabin Deduplication and Bsdiff
Vectorize XXHash using SE4
2013-01-20 22:02:26 +05:30
Moinak Ghosh
49ec3a054d Add SSE2 improvements to CTR mode AES.
Add debug print of encryption and HMAC throughput.
Fix error message for invalid option.
2013-01-16 19:52:46 +05:30
Moinak Ghosh
39dbc4be43 Implement algo-specific minimum distance match for Delta Compression. 2013-01-14 13:20:07 +05:30
Moinak Ghosh
a8fd60fb06 Fix issue #1 and issue #2.
Enable building with older openssl (at least 0.9.8e).
Add variants of missing functions in older openssl versions.
Allow proper linking with libraries in alternate locations and setting RUNPATH.
Increase hash rounds for nonce generation.
2013-01-05 00:16:15 +05:30
Moinak Ghosh
16b1d9e7a3 Bump version, update Changelog and documentation for 1.2 release. 2013-01-03 23:40:21 +05:30
Moinak Ghosh
47ebd5b752 Fix calculation of extra scratch space for Dedupe.
Add missing return value check in small buffer Delta2.
Update test failure detection in test driver.
2013-01-03 22:25:14 +05:30
Moinak Ghosh
d9eb82e0e8 Fix numeric parsing.
Fix dedupe bug introduced in last commit.
Reset valid flag when resetting dedupe context.
Cleanup test suites.
Do not abort test suite on failure of a test case.
2013-01-03 00:27:18 +05:30
Moinak Ghosh
6b756eb165 Fix Delta2 handling for small buffers.
Fix LZP handling during preprocessing.
Fix type flag handling during preprocessing.
Update test cases to use configurable list of test corpus files.
Update some more int64_t datatypes to uint64_t
Add a gitignore.
2013-01-02 22:56:21 +05:30
Moinak Ghosh
26a4f42506 Introduce strict compiler flags and fix scores of warnings/issues.
Avoid different optimization flags for Dedupe sources.
Fix liberal mixing of uint64_t and int64_t (should all be uint64_t).
Fix corner case crash when decompressing.
2012-12-27 23:06:48 +05:30
Moinak Ghosh
063624d0d3 Fix Delta2 decode handling when not compressing.
More debug statements and other minor changes.
2012-12-24 23:21:02 +05:30
Moinak Ghosh
d597f0f05c Major enhancements to Delta2 encoding.
Avoid transposing below-threshold spans. Reduces compression ratio.
Use little-endian storage format for numbers to optimize for x86.
Improve embedded table detection.
Reduce header sizes.
Get rid of Gcc's LTO flag. Causes a performance drop.
Fix preprocessing behavior when LZP does not compress but Delta2 works.
2012-12-23 23:50:45 +05:30
Moinak Ghosh
a43fdd7d2c Improve Delta2 scanning speed and effectiveness.
Add destination buffer overflow check in Delta2.
Add rough speed computation.
2012-12-23 00:44:56 +05:30
Moinak Ghosh
c2df666daf Major improvement to Delta2 encoding algorithm. 2012-12-18 23:32:55 +05:30
Moinak Ghosh
fb30b5c295 Enable building with alternate Zlib and Bzlib.
Update README and comments.
Fix correct setting of output size when using Delta2 without LZP.
2012-12-16 23:17:04 +05:30
Moinak Ghosh
ef0191729e Make Delta2 encoding independent of LZP.
Tweak Delta2 parameters.
Update README and test cases.
2012-12-15 22:03:23 +05:30
Moinak Ghosh
a98778d62f Fine tune transpose parameters.
Fix minor nits.
2012-12-14 19:12:48 +05:30
Moinak Ghosh
b0f41c2888 Add matrix transpose to Delta2 encoding.
Change confusing structure member name.
2012-12-13 21:18:16 +05:30
Moinak Ghosh
375ebefa0d Add Matrix Transpose of Dedupe index to compress it better.
Fix handling of Dedupe index compression failure.
2012-12-13 00:00:47 +05:30
Moinak Ghosh
03840b31c5 Update to latest LZ4.
Update a couple of comments.
2012-12-11 11:38:42 +05:30
Moinak Ghosh
224fb529e9 Get rid of size_t in places where 64-bitness is assumed. 2012-12-09 10:15:06 +05:30
Moinak Ghosh
75c81b5e9c Add support for 64-bit Keccak implementation.
Sanitize error message from tests.
Add more tests.
Improve platform detection in config script.
2012-12-08 14:19:01 +05:30