Commit graph

59 commits

Author SHA1 Message Date
Moinak Ghosh
6a757ddb2c Multitude of tweaks and improvements.
* Use BSC for PNM type and Markup containing binary data.
* Change thresholds in analyzer.
* Properly use double precision in analyzer for accuracy.
* Indicate BSC processing of packPNM output
* Bring back raw-block Dispack for file not processed by Dispack filter.
2015-03-22 23:36:04 +05:30
Moinak Ghosh
678a6a2da4 A few small fixes.
Effect same compression algo for Jpeg and PackJPG output.
Fix compiler warning in PackPNM.
Allow unknown type (0) to be specified for Dispack output (for analyzer).
2015-01-17 20:03:06 +05:30
Moinak Ghosh
d5e1d2cdef Some fixes in the Dictionary preprocessor.
Fix checking of data type flags.
Allow file-level filters to change output data type.
Tweak analyzer threshold for markup type.
2015-01-13 19:59:09 +05:30
Moinak Ghosh
73307c3996 Multiple checks and balances in Dispack to avoid buffer overflow.
Allow filter variants to omit the standard header.
Use E8E9 in Dispack filter as a fallback.
Fix integer overflow for type value in thread data struct.
Do not inline functions in DEBUG build.
2014-12-21 14:13:58 +05:30
Moinak Ghosh
1db822d866 Add Dispack file-level filter in the libarchive chain.
Add new file type for Win32-PE executables (Dispack).
Reset file type flag after filter processing for better compression.
Fix array index handling for file type list.
2014-12-20 11:24:09 +05:30
Moinak Ghosh
2cd41ec257 Revamp Filter handling code.
1) Really avoid adding filter xattr for non-processed files.
2) Clean up filter error handling.
3) Avoid libarchive data writes in filter callbacks.
4) Have libarchive data writes in a single place.
5) Properly handle skipping filter processing for a file.
6) Fix temporary file pathname handling.
2014-12-14 23:37:40 +05:30
Moinak Ghosh
dfe18ef48f Fix missed archive entry record.
Fix enabling of metadata stream feature.
Fix log message text.
Use macro for path separator.
2014-12-11 23:16:26 +05:30
Moinak Ghosh
f970b41e34 A bunch of improvements and fixes.
- Fix heap corruption in DICT Filter.
- Make default Dedup block size as 8KB.
- Revamp executable file handling: Part#1.
- Developed new E8E9 filter that works better than Dispack on raw data blocks.
- Remove block-based Dispack encoding. File-specific Dispack filter to be added.
- Improve file header based executable file detection.
- Introduce new sorting algorithm for filenames without extension.
2014-12-11 19:15:36 +05:30
Moinak Ghosh
b257c83f33 Detect a few Mozilla file signatures.
Add missing option to suppress pathname sorting.
Fix chunk sizing to properly auto-enable deduplication.
Fix default dedupe block size to 8KB.
2014-11-16 22:57:47 +05:30
Moinak Ghosh
29b5efc988 Add a couple of Mozilla file extensions.
Check for files > INT64_T when sorting.
Makefile targets to help development.
2014-11-15 19:17:33 +05:30
Moinak Ghosh
62c7590f26 Detailed listing of archive members (-i). 2014-11-04 00:36:18 +05:30
Moinak Ghosh
b2ad225fbb Implement fast TOC listing for metadata streams.
Fix help text.
Removed redundant allocator code.
Actually free memory on exit.
2014-11-03 20:20:05 +05:30
Moinak Ghosh
b7804a0caa Improve file sorting algorithm.
Add more file extension names.
Fix data type mask size.
2014-10-27 19:23:03 +05:30
Moinak Ghosh
cc68550670 Add metadata stream flag for archive.
Change flag bit to not collide with checksum id.
Handle '-T' option properly.
2014-10-25 22:57:31 +05:30
Moinak Ghosh
e7081eb5a3 Git commit - rehash. Incorrect earlier commit.
Implement Separate metadata stream.
Fix blatant wrong check in Bzip2 compressor.
Implement E8E9 filter fallback in Dispack.
Improve dict buffer size checks.
Reduce thread count to control memory usage in archive mode.
2014-10-24 23:30:40 +05:30
Moinak Ghosh
e3c32ed6d6 Remove unneeded archive writing function.
Improve filter scratch buffer handling.
Improve memory accounting.
Remove delayed allocation when compressing. Allows better memory estimation.
Some cstyle fixes.
2014-09-24 21:54:36 +05:30
Moinak Ghosh
2e5f2d8aab Make DICT filter useful.
Improve data analysis in adaptive_compress.
2014-09-20 21:49:06 +05:30
Moinak Ghosh
071a9e2b26 Update and simplify analyzer function to indicate text data for Dict filter.
Fix archive header writing bug.
Strip ^M chars from dict filter files.
Include DICT preprocessing type.
Fix a bunch of bugs found by Xcode.
2014-09-20 12:49:00 +05:30
Moinak Ghosh
f34962f8cc Set Wavpack compression mode based on compression level. 2014-09-17 21:43:00 +05:30
Moinak Ghosh
af39994a59 Working Wavpack filter for compressing WAV files.
Improved error handling of filter routines.
Improved verbose logging.
2014-09-17 20:34:38 +05:30
Moinak Ghosh
fd087a8949 Step 0 of adding WavPack filter - does not work yet.
WAV file detection.
Rename libarchive dir to be generic.
2014-09-14 23:56:38 +05:30
Moinak Ghosh
5a875f3174 Regenerate extensions hash. 2014-09-12 17:00:36 +05:30
Moinak Ghosh
3e9a46a602 Add tagging of filter-processed entries with custom XATTR.
Add magic number based detection of JPEG and PNM formats.
2014-09-11 20:29:53 +05:30
Moinak Ghosh
9ecbbbafd0 Pull in private copy of libarchive to add pcompress-specific functionality.
First step to add packPNM support.
2014-09-11 18:34:43 +05:30
Moinak Ghosh
10f40e1c6f Part 1 changes to allow dual licensing to MPLV2.
Make external LGPL code/features disabled in MPLV2 variant.
Nuke some unwanted whitespace (cstyle).
2014-07-24 22:20:30 +05:30
Moinak Ghosh
63bef473cc Working Mac OS X port.
Compatibility layer for semaphore handling.
2014-05-04 21:11:31 +05:30
Moinak Ghosh
4a9cd8c48e More portability tweaks.
Fix compiler warnings.
2014-05-04 13:32:11 +05:30
Moinak Ghosh
518ecf23a7 Fix issue #16. 2014-01-28 20:44:45 +05:30
Moinak Ghosh
62568e9066 Basic capability to list contents of an archive without extracting to disk. 2014-01-12 20:38:20 +05:30
Moinak Ghosh
aef48f715f Change nftw() to depth-first scan to handle restoring directory permissions correctly.
When sorting, cause directories to be sorted after files and in descending order of nesting level.
Take out stray printf().
2014-01-01 21:38:17 +05:30
Moinak Ghosh
683c3e48b5 Detect some DICOM formats and use BSC for DICOM data. 2014-01-01 19:44:58 +05:30
Moinak Ghosh
ea345a902a Overhaul documentation part #1
Detect and handle uncompressed PDF files using libbsc.
Force binary/text data detection for tar archives.
Get rid of unnecessary CLI option.
Add full pipeline mode check when archiving.
2013-12-30 23:24:37 +05:30
Moinak Ghosh
4c75a2da48 Fix issue #12.
Fix issue #13.
Create output directory with correct mode.
Fix the flow where pathname list is not sorted.
Fix ppmd decompression bug introduced in previous commit.
Reduce compression level for automatic pathname sorting.
Change to extraction directory only after opening archive.
2013-12-27 23:49:47 +05:30
Moinak Ghosh
5521955a94 Detect AR archives and set the type.
Re-use a less common type code for AR.
Use Dispack generically for all executables and AR archives.
2013-12-18 23:00:39 +05:30
Moinak Ghosh
a741f34f78 Move MSDOS COM single-byte magic number checks to last in the list.
Move advanced options flag into context structure.
Include dtd files as text type.
2013-12-18 00:09:32 +05:30
Moinak Ghosh
393fd790b0 Add more robust checks for Jpeg and packJPG format files in filter routine.
Use case-insensitive checks for extension names.
Enable more features based on compression level, when archiving.
2013-12-08 23:24:06 +05:30
Moinak Ghosh
306f145f22 Use libbsc/ppmd for BMP files.
Fix extension based hashing.
Do not append .pz extension to filenames already having it.
Some code formatting changes.
2013-11-28 22:42:51 +05:30
Moinak Ghosh
0192790c02 Add Dispack filter with auto-detection of x86 executables in archive mode.
More elaborate magic header based detection of 32-bit and 64-bit x86 binaries.
Always use fast-mode LZ4 in Adaptive modes.
2013-11-24 19:45:58 +05:30
Moinak Ghosh
1e2c3e479a Optimize preprocessed compression and avoid a bunch of memory copies.
Fix a crash.
Add a few more file types.
More comments.
2013-11-22 20:44:26 +05:30
Moinak Ghosh
664c8ef75b Fix fd leak. 2013-11-15 23:06:31 +05:30
Moinak Ghosh
c09a2b7b81 Fix issues when handling Jpegs where packJPG borks. 2013-11-15 23:02:09 +05:30
Moinak Ghosh
c567a1d2f5 Enable auto-filtering of archive entries based on compression level.
Miscellaneous fixes.
2013-11-14 21:54:46 +05:30
Moinak Ghosh
e90c52e516 Work in progress changes for packJPG encoding and decoding.
Enhance custom LibArchive filter functionality.
2013-11-13 23:28:01 +05:30
Moinak Ghosh
75dfa6a6fb Add basic framework for file type based filters during libarchive stage.
Add packJPG filter for Jpeg files (not active yet).
Directory format changes for clarity.
2013-11-10 23:09:42 +05:30
Moinak Ghosh
a5f1624a33 Add own implementation of archive entry extraction to allow custom filters.
Fix magic number check for endianness.
2013-11-09 21:55:18 +05:30
Moinak Ghosh
6aacd903ff Structured handling of file types.
Handling of already compressed data based on compression algorithm.
Add a few more extension types.
2013-11-09 16:46:19 +05:30
Moinak Ghosh
cae9de9b2e Leverage file type detection(archiver) to improve compression performance.
Use detected file/data type(archiver) for Adaptive compression modes.
Update type flags and add more extensions.
2013-11-08 23:50:28 +05:30
Moinak Ghosh
b7facc929e Add file type detection based on magic values.
Add more comments.
Add more extensions.
2013-11-07 23:57:15 +05:30
Moinak Ghosh
991482403b Add extension based file type detection and setting segment data type.
Use Bob Jenkins Minimal Perfect Hash to check for known extensions.
Use semaphore signaling and direct buffer copy for extraction.
Miscellaneous fixes.
2013-11-07 21:48:54 +05:30
Moinak Ghosh
489b97cc79 Clear off private xattrs when extracting.
Enable pathname sorting only for high compression levels.
2013-11-04 18:35:22 +05:30