Gregory Burd 2024-07-01 04:12:37 -04:00
parent 6c8ad3b25f
commit 3801bbdaf5
3 changed files with 27 additions and 2 deletions


@ -87,7 +87,7 @@ target_link_libraries(soak PUBLIC m)
add_custom_target(run_soak COMMAND soak WORKING_DIRECTORY ${CMAKE_BINARY_DIR})
# Add fuzzer program
# add_executable(fuzzer tests/fuzzer.c)
# add_executable(fuzzer EXCLUDE_FROM_ALL tests/fuzzer.c)
# target_link_libraries(fuzzer PRIVATE sparsemap)
# target_include_directories(fuzzer PRIVATE ${HEADER_DIR} lib)
# target_link_libraries(fuzzer PUBLIC m)


@ -42,6 +42,31 @@ The actual memory sequence looks like this:
Instead of storing 8 Words (16 bytes), we only store 2 Words (2 bytes): one
for the descriptor, and one for the last sm_bitvec_t #7.
A second example shows a sequence of 4 x 16 bits (here, each sm_bitvec_t and the
descriptor word has 16 bits). In this example, a run of adjacent 1s more than
64 bits long follows a run of 16 0s.
Descriptor: Vector for descriptor #1 as follows:
00 01 00 00 00 00 00 10 00 00 00 00 10 00 01 00
^^-- sm_bitvec_t #0 is "0000000000000000"
   ^^-- sm_bitvec_t #1 is a run of 132 (0x84 or 0b0000000010000100) 1s
The first 2 bits of the descriptor above (itself a sm_bitvec_t) are '00',
indicating a run of 16 zeros. The number of zeros is 16 because a sm_bitvec_t
is 16 bits wide in this somewhat contrived example; in practice it is commonly
32 or 64 bits wide, depending on your system's architecture. The next 2 bits in
the descriptor are '01', indicating a run-length encoded set of adjacent 1s
longer than 16 bits (again, because the sm_bitvec_t is 16 bits wide here). The
corresponding sm_bitvec_t contains the actual length of the run, in this case
132 (0x84 or 0b0000000010000100).
The actual memory sequence for this second example looks like this:
0001000000000000 0000000010000100
Using this method of RLE for adjacent 1s, a single sm_bitvec_t can represent
(again, in this case where sm_bitvec_t is 16 bits wide) a run of up to 2^16 or
65536 adjacent 1s.
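
The decoding described in this second example can be sketched in C roughly as
follows. This is a minimal illustration, not the sparsemap implementation: the
names `count_set_bits`, `popcount16`, `example_mem` and the `FLAG_*` constants
are hypothetical, the word width is fixed at 16 bits to match the example, and
the sketch assumes the descriptor is read from the leftmost pair of bits, the
way it is drawn above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 16u /* width of one sm_bitvec_t in this example */

/* Hypothetical names for the 2-bit descriptor flags. */
enum {
    FLAG_ZEROES = 0x0, /* 00: all zero, nothing stored              */
    FLAG_RLE    = 0x1, /* 01: stored word holds the length of a run */
    FLAG_MIXED  = 0x2, /* 10: stored word is a plain bitmap         */
    FLAG_ONES   = 0x3  /* 11: all one, nothing stored               */
};

/* The memory sequence from the example: descriptor, then the run length. */
static const uint16_t example_mem[] = { 0x1000, 0x0084 };

/* Count the set bits in one 16-bit word. */
static unsigned popcount16(uint16_t w)
{
    unsigned n = 0;
    while (w) {
        n += w & 1u;
        w >>= 1;
    }
    return n;
}

/*
 * Walk one descriptor (mem[0]) and the words stored after it.
 * Returns the number of 1 bits encoded; *used receives the number of
 * 16-bit words read, including the descriptor itself.
 * Assumption: pair #0 is the most significant pair of the descriptor.
 */
static unsigned long count_set_bits(const uint16_t *mem, size_t *used)
{
    assert(mem != NULL && used != NULL);
    uint16_t desc = mem[0];
    size_t next = 1;
    unsigned long ones = 0;

    for (unsigned i = 0; i < WORD_BITS / 2; i++) {
        unsigned flag = (desc >> (WORD_BITS - 2 - 2 * i)) & 0x3u;
        switch (flag) {
        case FLAG_ZEROES:                      /* 16 zeros, no payload */
            break;
        case FLAG_ONES:                        /* 16 ones, no payload  */
            ones += WORD_BITS;
            break;
        case FLAG_RLE:                         /* payload = run length */
            ones += mem[next++];
            break;
        case FLAG_MIXED:                       /* payload = bitmap     */
            ones += popcount16(mem[next++]);
            break;
        }
    }
    *used = next;
    return ones;
}
```

Called on `example_mem`, the function reads two words (the descriptor and the
run length) and reports 132 set bits, which matches the example: 148 bits of
input (16 zeros followed by 132 ones) are described by 32 bits of storage.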
The sparsemap stores a list of chunk maps, and for each chunk map, it stores the
absolute address (i.e. if the user sets bit 0 and bit 10000, and the chunk map
capacity is 2048, the sparsemap creates two chunk maps; the first starts at

View file

@ -49,8 +49,8 @@
*
* 00 The sm_bitvec_t is all zero -> sm_bitvec_t is not stored
* 11 The sm_bitvec_t is all one -> sm_bitvec_t is not stored
* 01 The sm_bitvec_t holds the length of a run of 1s -> sm_bitvec_t is stored
* 10 The sm_bitvec_t contains a bitmap -> sm_bitvec_t is stored
* 01 The sm_bitvec_t is not used (**)
*
* The serialized size of a chunk map in memory therefore is at least
* one sm_bitvec_t for the flags, and (optionally) additional sm_bitvec_ts