Using basho_bench to twiddle with Machi
"Twiddle"? Really, is that a word? (Yes, it is a real English word.)
Benchmarking Machi's performance ... no, don't do it.
Machi isn't ready for benchmark testing. Its public-facing API isn't finished yet. Its internal APIs aren't quite finished yet either. So any result of a "benchmarking" effort will have even less value N months from now than the result of the usual benchmarking effort.
However, there are uses for a benchmark tool. For example, one of my favorites is to put stress on a system. I don't care about average or 99th-percentile latencies, but I might care very much about behavior.
- What happens if a Machi system is under moderate load, and then I stop one of the servers? What happens?
  - How quickly do the chain managers react?
  - How quickly do the client libraries within Machi react?
  - How quickly do the external client API libraries react?
- What happens if a Machi system is under heavy load, for example, 100% CPU load? Not all 100% might be the Machi services. Some CPU consumption might be from the load generator, like basho_bench itself that is running on the same machine as a Machi server. Or perhaps it's a tiny C program that I wrote: main() { while (1) { ; } }
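The busy-loop idea can also be sketched in shell with a time bound, so that a forgotten load generator does not spin forever. This is only a minimal sketch and not part of Machi or basho_bench; the function name burn_cpu is hypothetical, and because it forks date on every pass it loads a core less evenly than the C loop does.

```shell
#!/bin/sh
# burn_cpu SECONDS: keep roughly one CPU core busy for about SECONDS of
# wall-clock time, then return. (Hypothetical helper, not part of Machi
# or basho_bench.)
burn_cpu() {
    secs=$1
    deadline=$(( $(date +%s) + secs ))
    # Each pass re-reads the clock, so the loop stops close to the deadline.
    while [ "$(date +%s)" -lt "$deadline" ]; do
        :
    done
}
```

Starting one copy per core in the background (for example, burn_cpu 60 & repeated four times on a four-core laptop) adds about a minute of competing CPU load while basho_bench runs.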
An example of how adding moderate stress can find weird bugs
The driver/plug-in module for basho_bench is only a few hours old. (I'm writing on Wednesday, 2015-05-20.) But just now, I configured my basho_bench config file to try to contact a Machi cluster of three nodes ... but really, only one was running. The client library, machi_cr_client.erl, has an extremely simple method for dealing with failed servers. I know it's simple and dumb, but that's OK in many cases.
However, basho_bench and the machi_cr_client.erl were acting very, very badly. I couldn't figure it out until I took a peek at my OS's dmesg output, namely: dmesg | tail. It said things like this:
Limiting closed port RST response from 690 to 50 packets per second
Limiting closed port RST response from 367 to 50 packets per second
Limiting closed port RST response from 101 to 50 packets per second
Limiting closed port RST response from 682 to 50 packets per second
Limiting closed port RST response from 467 to 50 packets per second
Well, isn't that interesting?
This system was running on a single OS X machine: my MacBook Pro laptop, running OS X 10.10 (Yosemite). I have seen that error before. And I know how to fix it.
- Option 1: Change the client library config to ignore the Machi servers that I know will always be down during my experiment.
- Option 2: Use the following to change my OS's TCP stack RST behavior. (If a TCP port is not being listened to, the OS will send a RST packet to signal "connection refused".)
On OS X, the limit for RST packets is 50/second. The machi_cr_client.erl client can generate far more than 50/second, as the "Limiting closed port RST response..." messages above show. So, I used some brute-force to change the environment:
sudo sysctl -w net.inet.icmp.icmplim=20000
... and the problem disappeared.
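To see how far above the limit the client went, the kernel messages can be summarized. The sketch below reads dmesg-style text on standard input and prints the largest attempted RST rate it finds. It assumes the exact message wording shown above; the function name peak_rst_rate is hypothetical.

```shell
#!/bin/sh
# peak_rst_rate: read kernel log text on stdin and print the largest
# "from N" value among the RST rate-limit messages (0 if there are none).
# Assumes the message format quoted above; the helper name is hypothetical.
peak_rst_rate() {
    awk '
        /Limiting closed port RST response from/ {
            # Find the word "from"; the attempted rate is the next field.
            for (i = 1; i < NF; i++)
                if ($i == "from" && $(i + 1) + 0 > max)
                    max = $(i + 1) + 0
        }
        END { print max + 0 }
    '
}
```

Fed the five lines quoted above (for example with dmesg | tail | peak_rst_rate), it prints 690, which is a rough guide to how far the limit needs to be raised.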
Starting with basho_bench: a step-by-step tutorial
First, clone the basho_bench source code, then compile it. You will need Erlang/OTP version R16B or later to compile. I recommend using Erlang/OTP 17.x, because I've been doing my Machi development using 17.x.
cd /some/nice/dev/place
git clone https://github.com/basho/basho_bench.git
cd basho_bench
make
In order to create graphs of basho_bench output, you'll need one of the following installed:
- R (the statistics package)
- gnuplot
If you don't have either available on the machine(s) you're testing, but you do have R (or gnuplot) on some other machine Y, then you can copy the output files to machine Y and generate the graphs there.
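A quick way to find out which graphing tool a machine has is to look for the executables on the PATH. The sketch below assumes that R provides its usual Rscript front end and that gnuplot's executable is named gnuplot; the function name first_available is hypothetical.

```shell
#!/bin/sh
# first_available NAME...: print the first NAME that is an executable
# command on this machine, or return non-zero if none of them is found.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}
```

Running first_available Rscript gnuplot prints whichever of the two is installed (preferring Rscript). A non-zero exit status means the output files need to be copied to another machine for graphing.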
Compiling the Machi source
First, clone the Machi source code, then compile it. You will need Erlang/OTP version 17.x to compile.
cd /some/nice/dev/place
git clone https://github.com/basho/machi.git
cd machi
make
Creating a basho_bench test configuration file.
There are a couple of example basho_bench configuration files in the Machi priv directory.
- basho_bench.append-example.config, an example for writing Machi files.
- basho_bench.read-example.config, an example for reading Machi files.
If you want a test to do both reading & writing ... well, the driver/plug-in is not mature enough to do it well. If you really want to, refer to the basho_bench docs for how to use the operations config option.
The basho_bench config file is written in Erlang term format. Each configuration item is a 2-tuple followed by a period. Comments begin with a % character and continue to the end-of-line.
%% Mandatory: adjust this code path to top of your compiled Machi source distro
{code_paths, ["/Users/fritchie/b/src/machi"]}.
{driver, machi_basho_bench_driver}.
%% Choose your maximum rate (per worker proc, see 'concurrent' below)
{mode, {rate, 25}}.
%% Runtime & reporting interval
{duration, 10}. % minutes
{report_interval, 1}. % seconds
%% Choose your number of worker procs
{concurrent, 5}.
%% Here's a chain of (up to) length 3, all on localhost
{machi_server_info,
 [
  {p_srvr,a,machi_flu1_client,"localhost",4444,[]},
  {p_srvr,b,machi_flu1_client,"localhost",4445,[]},
  {p_srvr,c,machi_flu1_client,"localhost",4446,[]}
 ]}.
{machi_ets_key_tab_type, set}. % 'set' or 'ordered_set'
%% Workload-specific definitions follow....
%% 10 parts 'append' operation + 0 parts anything else = 100% 'append' ops
{operations, [{append, 10}]}.
%% For append, key = Machi file prefix name
{key_generator, {concat_binary, <<"prefix">>,
                                {to_binstr, "~w", {uniform_int, 30}}}}.
%% Increase size of value_generator_source_size if value_generator is big!!
{value_generator_source_size, 2111000}.
{value_generator, {fixed_bin, 32768}}. % 32 KB
In summary:
- Yes, you really need to change code_paths to be the same as your /some/nice/dev/place/machi directory (the top of your compiled Machi source) ... and that directory must be on the same machine(s) that you intend to run basho_bench.
- Each worker process will have a rate limit of 25 ops/sec.
- The test will run for 10 minutes and report stats every 1 second.
- There are 5 concurrent worker processes. Each worker will concurrently issue commands from the operations list, within the workload throttle limit.
- The Machi cluster is a collection of three servers, all on "localhost", and using TCP ports 4444-4446.
- Don't change the machi_ets_key_tab_type.
- Our workload operation mix is 100% append operations.
- The key generator for the append operation specifies the file prefix that will be chosen (at pseudo-random). In this case, we'll choose uniformly randomly between file prefix prefix0 and prefix29.
- The values that we append will be fixed 32KB length, but they will be chosen from a random byte string of 2,111,000 bytes.
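The numbers in this summary combine into an offered load that is worth working out before a run. The sketch below multiplies them out with shell arithmetic. The variable names are my own, and the totals are upper bounds that assume every worker sustains its full rate for the whole run.

```shell
#!/bin/sh
# Upper bound on the load offered by the example config above.
workers=5            # {concurrent, 5}
rate=25              # {mode, {rate, 25}}: ops/sec per worker
value_bytes=32768    # {value_generator, {fixed_bin, 32768}}
minutes=10           # {duration, 10}

ops_per_sec=$((workers * rate))
bytes_per_sec=$((ops_per_sec * value_bytes))
total_ops=$((ops_per_sec * minutes * 60))
total_bytes=$((bytes_per_sec * minutes * 60))

echo "ops/sec:     $ops_per_sec"
echo "bytes/sec:   $bytes_per_sec"
echo "total ops:   $total_ops"
echo "total bytes: $total_bytes"
```

That works out to at most 125 appends per second, 4,096,000 bytes per second, and 2,457,600,000 bytes (75,000 appends) submitted by the clients over the 10 minutes.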
There are many other options for basho_bench, especially for the key_generator and value_generator options. Please see the basho_bench docs for further information.
Running basho_bench
You can run basho_bench using the command:
/some/nice/dev/place/basho_bench/basho_bench /path/to/config/file
... where /path/to/config/file is the path to your config file. (If you use an example from the priv dir, we recommend that you make a copy elsewhere, edit the copy, and then use the copy to run basho_bench.)
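The copy-then-edit step can be scripted. The sketch below copies an example config to a new file and rewrites its code_paths line to point at a given Machi directory. It assumes the code_paths entry sits on a single line, as in the example config shown earlier; the function name and its arguments are hypothetical.

```shell
#!/bin/sh
# copy_config SRC DEST MACHI_DIR: copy the example config SRC to DEST,
# replacing the single-line {code_paths, [...]}. entry so that it points
# at MACHI_DIR. SRC is left untouched. (Hypothetical helper; assumes
# MACHI_DIR contains no '|', '&', '\' or '"' characters.)
copy_config() {
    src=$1 dest=$2 machi_dir=$3
    sed "s|^{code_paths,.*|{code_paths, [\"$machi_dir\"]}.|" "$src" > "$dest"
}
```

For example, copy_config /some/nice/dev/place/machi/priv/basho_bench.append-example.config ./append.config /some/nice/dev/place/machi leaves the original in priv alone and produces an edited copy in the current directory.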
You'll create a stats output directory, called tests, in the current working directory. (Add {results_dir, "/some/output/dir"}. to change the default!)
Each time basho_bench is run, a new output stats directory is created in the tests directory. The symbolic link tests/current will always point to the last basho_bench run's output. But all prior results are always accessible! Take a look in this directory for all of the output.
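To look back through earlier runs, it helps to list the per-run directories and see which one the current link points to. This sketch assumes the layout described above (one subdirectory per run plus a current symlink); the function name list_runs is hypothetical.

```shell
#!/bin/sh
# list_runs RESULTS_DIR: print each run directory under RESULTS_DIR, one
# per line, marking the one that the "current" symlink resolves to.
list_runs() {
    results=$1
    # Resolve the symlink physically so it can be compared with real paths.
    cur=$(cd "$results/current" 2>/dev/null && pwd -P) || cur=
    for d in "$results"/*/; do
        [ -d "$d" ] || continue                 # glob matched nothing
        d=${d%/}
        [ "${d##*/}" = current ] && continue    # skip the symlink itself
        if [ "$(cd "$d" && pwd -P)" = "$cur" ]; then
            echo "${d##*/} (current)"
        else
            echo "${d##*/}"
        fi
    done
}
```

Calling list_runs tests from the directory where basho_bench was run prints one line per run, with the most recent one tagged.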
Generating some pretty graphs
If you are using R, then the following command will create a graph:
Rscript --vanilla /some/nice/dev/place/basho_bench/priv/summary.r -i $CWD/tests/current
If the tests directory is not in your current working dir (i.e. not in $CWD), then please alter the command accordingly.
R will create the final results graph in $CWD/tests/current/summary.png.
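The R step can be wrapped so that a wrong path fails early with a readable message. This is a sketch only: it assumes summary.r sits in the priv directory at the top of the basho_bench clone, and the function name make_graph is hypothetical.

```shell
#!/bin/sh
# make_graph BENCH_DIR [RESULTS_DIR]: run basho_bench's summary.r against
# RESULTS_DIR (default: tests/current under the working directory).
# Returns non-zero without running R if either path is missing.
make_graph() {
    bench_dir=$1
    results=${2:-$PWD/tests/current}
    script=$bench_dir/priv/summary.r
    [ -f "$script" ] || { echo "missing $script" >&2; return 1; }
    [ -d "$results" ] || { echo "missing $results" >&2; return 1; }
    Rscript --vanilla "$script" -i "$results"
}
```

For example, make_graph /some/nice/dev/place/basho_bench runs the same Rscript command as above against tests/current.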
If you are using gnuplot, please look at /some/nice/dev/place/basho_bench/Makefile to see how to use gnuplot to create the final results graph.
An example graph
So, without a lot of context about the Machi system or about the basho_bench system or about the ops being performed, here is an example graph that was created by R:
Without context?? How do I remember the context?
My recommendation is: always keep the .config file together with the graph file. In the tests directory, basho_bench will always make a copy of the config file used to generate the test data.
This config tells you very little about the environment of the load generator machine or the Machi cluster, but ... you need to maintain that documentation yourself, please! You'll thank me for that advice, someday, 11 months from now when you can't remember the details of that important test that you ran so very long ago.
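One way to keep that environment documentation is to write a small notes file into the results directory right after each run. The sketch below records the date, host name, OS details, and a free-form note. The file name context.txt and the function name are my own choices, not something basho_bench creates or reads.

```shell
#!/bin/sh
# save_context RESULTS_DIR [NOTE...]: write a context.txt file into
# RESULTS_DIR describing when and where the run happened.
# (Hypothetical helper; basho_bench knows nothing about this file.)
save_context() {
    dir=$1
    shift
    {
        echo "date:  $(date -u '+%Y-%m-%dT%H:%M:%SZ')"
        echo "host:  $(uname -n)"
        echo "os:    $(uname -sr)"
        echo "notes: $*"
    } > "$dir/context.txt"
}
```

For example, save_context tests/current "3-node chain on localhost, only server a running" stores the note next to the config copy and the graph for that run.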
Conclusion
Really, we don't recommend using basho_bench for any serious performance measurement of Machi yet: Machi needs more maturity before it's reasonable to measure & judge its performance. But stress testing is indeed useful for reasons other than measuring Nth-percentile latency of operation flarfbnitz. We hope that this tutorial has been helpful!
If you encounter any difficulty with this tutorial or with Machi, please open an issue/ticket at GH Issues for Machi ... use the green "New issue" button. There are bugs and misfeatures in the basho_bench plugin, sorry, but please help us fix them.
-Scott Lystig Fritchie, Machi Team @ Basho