9 commits

Author SHA1 Message Date
Moinak Ghosh
7ef20ec5be Specialized dictionary encoding for FASTA files. 2015-02-01 16:20:03 -08:00
Moinak Ghosh
077da83d5d A bunch of small fixes in Dict.
Improve text analysis for markup tags.
Use Libbsc for plain text and PPMd for markup mixed text.
Change thresholds.
2015-01-11 17:36:46 +05:30
Moinak Ghosh
66a482c968 A new Dictionary preprocessor for text files. 2015-01-09 22:13:24 +05:30
Moinak Ghosh
f970b41e34 A bunch of improvements and fixes.
- Fix heap corruption in DICT Filter.
- Make the default Dedup block size 8KB.
- Revamp executable file handling: Part#1.
- Developed new E8E9 filter that works better than Dispack on raw data blocks.
- Remove block-based Dispack encoding. File-specific Dispack filter to be added.
- Improve file header based executable file detection.
- Introduce new sorting algorithm for filenames without extension.
2014-12-11 19:15:36 +05:30
Moinak Ghosh
507e7c75d3 Centralise data analysis routine for optimum performance and leverage.
Utilise buffer data analysis for preprocessing filters.
2014-11-06 22:23:33 +05:30
Moinak Ghosh
e7081eb5a3 Git commit - rehash. Incorrect earlier commit.
Implement Separate metadata stream.
Fix blatantly wrong check in Bzip2 compressor.
Implement E8E9 filter fallback in Dispack.
Improve dict buffer size checks.
Reduce thread count to control memory usage in archive mode.
2014-10-24 23:30:40 +05:30
Moinak Ghosh
2e5f2d8aab Make DICT filter useful.
Improve data analysis in adaptive_compress.
2014-09-20 21:49:06 +05:30
Moinak Ghosh
071a9e2b26 Update, simplify analyzer function to indicate text data for Dict filter.
Fix archive header writing bug.
Strip ^M chars from dict filter files.
Include DICT preprocessing type.
Fix a bunch of bugs found by Xcode.
2014-09-20 12:49:00 +05:30
Moinak Ghosh
4fedebc607 Dict filter work in progress. 2014-09-18 22:51:25 +05:30