Use SSE4 register as sliding window for default 16-byte window size.
Use local variable for sliding window position to avoid spurios memory access in non-SIMD case.
Avoid computing breakpoint check value if processed length < minimum block length.
Avoid different optimization flags for Dedupe sources.
Fix liberal mixing of uint64_t and int64_t (should all be uint64_t).
Fix corner case crash when decompressing.
Change some function signatures to improve algo init function behavior.
Fix corner case dedupe bug in error handling flow.
Bump archive version signature.
Change hashing and length bias to reduce hashtable bucket collisions.
Add support for user-selectable 60% or 40% similarity for Delta Compression.
Overall slight speedup.