Add README.basho_bench.md
This commit is contained in:
parent
9e41162e65
commit
b44c88fb97
4 changed files with 266 additions and 10 deletions
258
README.basho_bench.md
Normal file
258
README.basho_bench.md
Normal file
|
@ -0,0 +1,258 @@
|
|||
|
||||
# Using basho_bench to twiddle with Machi
|
||||
|
||||
"Twiddle"? Really, is that a word? (Yes, it is a real English word.)
|
||||
|
||||
## Benchmarking Machi's performance ... no, don't do it.
|
||||
|
||||
Machi isn't ready for benchmark testing. Its public-facing API isn't
|
||||
finished yet. Its internal APIs aren't quite finished yet either. So
|
||||
any results of "benchmarking" effort is something that has even less
|
||||
value **N** months from now than the usual benchmarking effort.
|
||||
|
||||
However, there are uses for a benchmark tool. For example, one of my
|
||||
favorites is to put **stress** on a system. I don't care about
|
||||
average or 99th-percentile latencies, but I **might** care very much
|
||||
about behavior.
|
||||
|
||||
* What happens if a Machi system is under moderate load, and then I
|
||||
stop one of the servers? What happens?
|
||||
* How quickly do the chain managers react?
|
||||
* How quickly do the client libraries within Machi react?
|
||||
* How quickly do the external client API libraries react?
|
||||
|
||||
* What happens if a Machi system is under heavy load, for example,
|
||||
100% CPU load. Not all 100% might be the Machi services. Some CPU
|
||||
consumption might be from the load generator, like `basho_bench`
|
||||
itself that is running on the same machine as a Machi server. Or
|
||||
perhaps it's a tiny C program that I wrote:
|
||||
|
||||
main()
|
||||
{ while (1) { ; } }
|
||||
|
||||
## An example of how adding moderate stress can find weird bugs
|
||||
|
||||
The driver/plug-in module for `basho_bench` is only a few hours old.
|
||||
(I'm writing on Wednesday, 2015-05-20.) But just now, I configured my
|
||||
basho_bench config file to try to contact a Machi cluster of three
|
||||
nodes ... but really, only one was running. The client library,
|
||||
`machi_cr_client.erl`, has **an extremely simple** method for dealing
|
||||
with failed servers. I know it's simple and dumb, but that's OK in
|
||||
many cases.
|
||||
|
||||
However, `basho_bench` and the `machi_cr_client.erl` were acting very,
|
||||
very badly. I couldn't figure it out until I took a peek at my OS's
|
||||
`dmesg` output, namely: `dmesg | tail`. It said things like this:
|
||||
|
||||
Limiting closed port RST response from 690 to 50 packets per second
|
||||
Limiting closed port RST response from 367 to 50 packets per second
|
||||
Limiting closed port RST response from 101 to 50 packets per second
|
||||
Limiting closed port RST response from 682 to 50 packets per second
|
||||
Limiting closed port RST response from 467 to 50 packets per second
|
||||
|
||||
Well, isn't that interesting?
|
||||
|
||||
This system was running on a single OS X machine: my MacBook Pro
|
||||
laptop, running OS X 10.10 (Yosemite). I have seen that error
|
||||
before. And I know how to fix it.
|
||||
|
||||
* **Option 1**: Change the client library config to ignore the Machi
|
||||
servers that I know will always be down during my experiment.
|
||||
* ** Option 2**: Use the following to change my OS's TCP stack RST
|
||||
behavior. (If a TCP port is not being listened to, the OS will
|
||||
send a RST packet to signal "connection refused".)
|
||||
|
||||
On OS X, the limit for RST packets is 50/second. The
|
||||
`machi_cr_client.erl` client can generate far more than 50/second, as
|
||||
the `Limiting closed port RST response...` messages above show. So, I
|
||||
used some brute-force to change the environment:
|
||||
|
||||
sudo sysctl -w net.inet.icmp.icmplim=20000
|
||||
|
||||
... and the problem disappeared.
|
||||
|
||||
## Starting with basho_bench: a step-by-step tutorial
|
||||
|
||||
First, clone the `basho_bench` source code, then compile it. You will
|
||||
need Erlang/OTP version R16B or later to compile. I recommend using
|
||||
Erlang/OTP 17.x, because I've been doing my Machi development using
|
||||
17.x.
|
||||
|
||||
cd /some/nice/dev/place
|
||||
git clone https://github.com/basho/basho_bench.git
|
||||
cd basho_bench
|
||||
make
|
||||
|
||||
In order to create graphs of `basho_bench` output, you'll need
|
||||
installed one of the following:
|
||||
|
||||
* R (the statistics package)
|
||||
* gnuplot
|
||||
|
||||
If you don't have either available on the machine(s) you're testing,
|
||||
but you do have R (or gnuplot) on some other machine **Y**, then you can
|
||||
copy the output files to machine **Y** and generate the graphs there.
|
||||
|
||||
## Compiling the Machi source
|
||||
|
||||
First, clone the `basho_bench` source code, then compile it. You will
|
||||
need Erlang/OTP version 17.x to compile.
|
||||
|
||||
cd /some/nice/dev/place
|
||||
git clone https://github.com/basho/machi.git
|
||||
cd machi
|
||||
make
|
||||
|
||||
## Creating a basho_bench test configuration file.
|
||||
|
||||
There are a couple of example `basho_bench` configuration files in the
|
||||
Machi `priv` directory.
|
||||
|
||||
* [basho_bench.append-example.config](priv/basho_bench.append-example.config),
|
||||
an example for writing Machi files.
|
||||
* [basho_bench.read-example.config](priv/basho_bench.read-example.config),
|
||||
an example for reading Machi files.
|
||||
|
||||
If you want a test to do both reading & writing ... well, the
|
||||
driver/plug-in is not mature enough to do it **well**. If you really
|
||||
want to, refer to the `basho_bench` docs for how to use the
|
||||
`operations` config option.
|
||||
|
||||
The `basho_bench` config file is configured in Erlang term format.
|
||||
Each configuration item is a 2-tuple followed by a period. Comments
|
||||
begin with a `%` character and continue to the end-of-line.
|
||||
|
||||
%% Mandatory: adjust this code path to top of your compiled Machi source distro
|
||||
{code_paths, ["/Users/fritchie/b/src/machi"]}.
|
||||
{driver, machi_basho_bench_driver}.
|
||||
|
||||
%% Chose your maximum rate (per worker proc, see 'concurrent' below)
|
||||
{mode, {rate, 25}}.
|
||||
|
||||
%% Runtime & reporting interval
|
||||
{duration, 10}. % minutes
|
||||
{report_interval, 1}. % seconds
|
||||
|
||||
%% Choose your number of worker procs
|
||||
{concurrent, 5}.
|
||||
|
||||
%% Here's a chain of (up to) length 3, all on localhost
|
||||
{machi_server_info,
|
||||
[
|
||||
{p_srvr,a,machi_flu1_client,"localhost",4444,[]},
|
||||
{p_srvr,b,machi_flu1_client,"localhost",4445,[]},
|
||||
{p_srvr,c,machi_flu1_client,"localhost",4446,[]}
|
||||
]}.
|
||||
{machi_ets_key_tab_type, set}. % 'set' or 'ordered_set'
|
||||
|
||||
%% Workload-specific definitions follow....
|
||||
|
||||
%% 10 parts 'append' operation + 0 parts anything else = 100% 'append' ops
|
||||
{operations, [{append, 10}]}.
|
||||
|
||||
%% For append, key = Machi file prefix name
|
||||
{key_generator, {concat_binary, <<"prefix">>,
|
||||
{to_binstr, "~w", {uniform_int, 30}}}}.
|
||||
|
||||
%% Increase size of value_generator_source_size if value_generator is big!!
|
||||
{value_generator_source_size, 2111000}.
|
||||
{value_generator, {fixed_bin, 32768}}. % 32 KB
|
||||
|
||||
In summary:
|
||||
|
||||
* Yes, you really need to change `code_paths` to be the same as your
|
||||
`/some/nice/dev/place/basho_bench` directory ... and that directory
|
||||
must be on the same machine(s) that you intend to run `basho_bench`.
|
||||
* Each worker process will have a rate limit of 25 ops/sec.
|
||||
* The test will run for 10 minutes and report stats every 1 second.
|
||||
* There are 5 concurrent worker processes. Each worker will
|
||||
concurrently issue commands from the `operations` list, within the
|
||||
workload throttle limit.
|
||||
* The Machi cluster is a collection of three servers, all on
|
||||
"localhost", and using TCP ports 4444-4446.
|
||||
* Don't change the `machi_ets_key_tab_type`
|
||||
* Our workload operation mix is 100% `append` operations.
|
||||
* The key generator for the `append` operation specifies the file
|
||||
prefix that will be chosen (at pseudo-random). In this case, we'll
|
||||
choose uniformly randomly between file prefix `prefix0` and
|
||||
`prefix29`.
|
||||
* The values that we append will be fixed 32KB length, but they will
|
||||
be chosen from a random byte string of 2,111,000 bytes.
|
||||
|
||||
There are many other options for `basho_bench`, especially for the
|
||||
`key_generator` and `value_generator` options. Please see the
|
||||
`basho_bench` docs for further information.
|
||||
|
||||
## Running basho_bench
|
||||
|
||||
You can run `basho_bench` using the command:
|
||||
|
||||
/some/nice/dev/place/basho_bench/basho_bench /path/to/config/file
|
||||
|
||||
... where `/path/to/config/file` is the path to your config file. (If
|
||||
you use an example from the `priv` dir, we recommend that you make a
|
||||
copy elsewhere, edit the copy, and then use the copy to run
|
||||
`basho_bench`.)
|
||||
|
||||
You'll create a stats output directory, called `tests`, in the current
|
||||
working directory. (Add `{results_dir, "/some/output/dir"}.` to change
|
||||
the default!)
|
||||
|
||||
Each time `basho_bench` is run, a new output stats directory is
|
||||
created in the `tests` directory. The symbolic link `tests/current`
|
||||
will always point to the last `basho_bench` run's output. But all
|
||||
prior results are always accessible! Take a look in this directory
|
||||
for all of the output.
|
||||
|
||||
## Generating some pretty graphs
|
||||
|
||||
If you are using R, then the following command will create a graph:
|
||||
|
||||
Rscript --vanilla /some/nice/dev/place/basho_bench/basho_bench/priv/summary.r -i $CWD/tests/current
|
||||
|
||||
If the `tests` directory is not in your current working dir (i.e. not
|
||||
in `$CWD`), then please alter the command accordingly.
|
||||
|
||||
R will create the final results graph in `$CWD/tests/current/summary.png`.
|
||||
|
||||
If you are using gnuplot, please look at
|
||||
`/some/nice/dev/place/basho_bench/basho_bench/Makefile` to see how to
|
||||
use gnuplot to create the final results graph.
|
||||
|
||||
## An example graph
|
||||
|
||||
So, without a lot of context about the **Machi system** or about the
|
||||
**basho_bench system** or about the ops being performed, here is an
|
||||
example graph that was created by R:
|
||||
|
||||
![](basho.github.io/machi/images/basho_bench.example0.png)
|
||||
|
||||
**Without context??* How do I remember the context?
|
||||
|
||||
My recommendation is: always keep the `.config` file together with the
|
||||
graph file. In the `tests` directory, `basho_bench` will always make
|
||||
a copy of the config file used to generate the test data.
|
||||
|
||||
This config tells you very little about the environment of the load
|
||||
generator machine or the Machi cluster, but ... you need to maintain
|
||||
that documentation yourself, please! You'll thank me for that advice,
|
||||
someday, 11 months from now when you can't remember the details of
|
||||
that important test that you ran so very long ago.
|
||||
|
||||
## Conclusion
|
||||
|
||||
Really, we don't recommend using `basho_bench` for any serious
|
||||
performance measurement of Machi yet: Machi needs more maturity before
|
||||
it's reasonable to measure & judge its performance. But stress
|
||||
testing is indeed useful for reasons other than measuring
|
||||
Nth-percentile latency of operation `flarfbnitz`. We hope that this
|
||||
tutorial has been helpful!
|
||||
|
||||
If you encounter any difficulty with this tutorial or with Machi,
|
||||
please open an issue/ticket at [GH Issues for
|
||||
Machi](https://github.com/basho/machi/issues) ... use the green "New
|
||||
issue" button. There are bugs and misfeatures in the `basho_bench`
|
||||
plugin, sorry, but please help us fix them.
|
||||
|
||||
> -Scott Lystig Fritchie
|
||||
> Machi Team @ Basho
|
|
@ -39,7 +39,7 @@ func, and pattern match Erlang style in that func.
|
|||
|
||||
** DONE Adapt the projection-aware, CR-implementing client from demo-day
|
||||
** DONE Add major comment sections to the CR-impl client
|
||||
** TODO Simple basho_bench driver, put some unscientific chalk on the benchtop
|
||||
** DONE Simple basho_bench driver, put some unscientific chalk on the benchtop
|
||||
** TODO Create parallel PULSE test for basic API plus chain manager repair
|
||||
** TODO Add client-side vs. server-side checksum type, expand client API?
|
||||
** TODO Add gproc and get rid of registered name rendezvous
|
||||
|
|
|
@ -7,9 +7,9 @@
|
|||
%{mode, {rate,20}}.
|
||||
{mode, max}.
|
||||
|
||||
%% Runtime & reporting interval (seconds)
|
||||
{duration, 10}.
|
||||
{report_interval, 1}.
|
||||
%% Runtime & reporting interval
|
||||
{duration, 10}. % minutes
|
||||
{report_interval, 1}. % seconds
|
||||
|
||||
%% Choose your number of worker procs
|
||||
%{concurrent, 1}.
|
||||
|
@ -37,11 +37,9 @@
|
|||
{operations, [{append, 10}]}.
|
||||
|
||||
%% For append, key = Machi file prefix name
|
||||
{key_generator, {concat_binary, <<"prefix">>,
|
||||
{to_binstr, "~w", {uniform_int, 30}}}}.
|
||||
{key_generator, {to_binstr, "prefix~w", {uniform_int, 30}}}.
|
||||
|
||||
%% Increase size of value_generator_source_size if value_generator is big!!
|
||||
{value_generator_source_size, 2111000}.
|
||||
{value_generator, {fixed_bin, 32768}}. % 32 KB
|
||||
%{value_generator, {fixed_bin, 1048576}}. % 1024 KB
|
||||
|
||||
|
|
|
@ -7,9 +7,9 @@
|
|||
%{mode, {rate,20}}.
|
||||
{mode, max}.
|
||||
|
||||
%% Runtime & reporting interval (seconds)
|
||||
{duration, 10}.
|
||||
{report_interval, 1}.
|
||||
%% Runtime & reporting interval
|
||||
{duration, 10}. % minutes
|
||||
{report_interval, 1}. % seconds
|
||||
|
||||
%% Choose your number of worker procs
|
||||
%{concurrent, 1}.
|
||||
|
|
Loading…
Reference in a new issue