Enormous node memory usage for tiny values #1
Hi gburd, I'm ecstatic that someone built this and would love to use it in place of an existing gb_tree storing and searching thousands of keys, but I noticed that I start getting page faults when storing 12k dwords on a Linux machine with 16GB of RAM running R15B, making each hamt_node clock in at something like 1250000/8 bytes. I haven't yet run it through erlang:debugger(); any idea what gives?
I just got it working today, it's only a week old! ;-P There are a ton of places to improve performance, this really is just my learning exercise at this point. I'll continue to make this better as I work toward building a real ctrie for Erlang. We (the Erlang community) need HAMT/Ctrie, no more being jealous of Haskell and Clojure. I did manage in tests to store 1M {int(), int()} pairs, but that was just a quick test.
Ok, I am going to look into this tonight and see if I can figure out the origin of this and replicate it on R16.
You are the man, thanks for the help. BTW, there is at least one issue I know of in the delete code, where a cnode isn't collapsed when it has only one node left. Here's how to trigger it:
2> hamt:delete(<<5>>, hamt:put([{<<X>>, <<X>>} || X <- lists:seq(1,6)], hamt:new())).
{hamt,{cnode,17629440,
[{snode,<<3>>,<<3>>},
{snode,<<6>>,<<6>>},
{snode,<<1>>,<<1>>},
{cnode,131072,[{snode,<<2>>,<<2>>}]},
{snode,<<4>>,<<4>>}]}}
the {cnode,131072,[{snode,<<2>>,<<2>>}]} should have collapsed to an {snode, <<2>>, <<2>>}.
BTW, FWIW I've fixed the delete issue but haven't yet looked closely at memory consumption.
Awesome! I couldn't replicate the issue on either R16 or R15 at home (very strange). I'm working on a NIF now to try to implement CTPOP and see if that helps me here.
Cool, you should consider this C implementation (http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel) or, if you're game for optimizing down to the ASM level, the POPCNT instruction on the architectures that support it:
https://github.com/coapp-packages/nasm/blob/master/test/popcnt.asm
https://patchwork.kernel.org/patch/81479/
https://subversion.assembla.com/svn/min-loss-hashing/trunk/utils/bitops.h using __builtin_popcount*
Or, on some architectures, there's a POPCNT instruction in the CPU. This is a bit like the wonderful world of CRC... it's deeper than expected.
The thing to do is to lobby for a popcnt BIF. :)
Last thing regarding bitpop: I never did fully debug why this wasn't working. I could spend a bit more time looking into it... to avoid the need for a NIF.
bitpop(0) -> 0;
bitpop(X) when is_integer(X), X > 0, X =< 16#FFFFFFFF ->
    X1 = X - ((X bsr 1) band 16#55555555),
    X2 = (X1 band 16#33333333) + ((X1 bsr 2) band 16#33333333),
    %% Note the explicit parentheses: band binds tighter than + in Erlang,
    %% the opposite of C's & vs. +, so the C original can't be ported verbatim.
    X3 = (X2 + (X2 bsr 4)) band 16#0F0F0F0F,
    %% Mask to 32 bits before the shift: Erlang integers are bignums, so the
    %% multiply does not wrap at 32 bits the way the C version relies on.
    ((X3 * 16#01010101) band 16#FFFFFFFF) bsr 24;
bitpop(X) when is_integer(X) -> %, X > 16#FFFFFFFF, X =< 16#FFFFFFFFFFFFFFFF ->
    M1 = 16#5555555555555555,  %% 0101...
    M2 = 16#3333333333333333,  %% 00110011...
    M4 = 16#0f0f0f0f0f0f0f0f,  %% 4 zeros, 4 ones ...
    H01 = 16#0101010101010101, %% the sum of 256 to the power of 0,1,2,3...