In progress of updating README.

2021-02-04 22:08:39 -08:00 · 2021-02-04 22:08:39 -08:00 · 1348ee4527
commit 1348ee4527
parent 49b9c17dd4
2 changed files with 293 additions and 87 deletions
--- a/README.md
+++ b/README.md
@@ -1,24 +1,23 @@
-Quorums
+Quoracle
-=======
+========
-## Installation
+Quoracle is a library for constructing and analyzing [read-write quorum
-TODO(mwhittaker): Make this package pip'able. For now, you have to clone and
+systems](https://scholar.google.com/scholar?cluster=4847365665094368145). Run
-install the dependencies yourself:
+`pip install quoracle` and then follow along with the tutorial below to get
 started.
-```
+## Quorum Systems
 pip install -r requirements.txt
 ```
 ## Tutorial
 Given a set of nodes `X`, a _read-write quorum system_ is a pair `(R, W)` where
-`R` is a set of subsets of `X` called _read quorums_ and `W` is a set of
+
-subsets of `X` called _write quorums_. A read-write quorum system satisfies the
+1. `R` is a set of subsets of `X` called _read quorums_,
-property that every read quorum intersects every write quorum. This library
+2. `W` is a set of subsets of `X` called _write quorums_, and
-allows us to construct and analyze arbitrary read-write quorum systems. First,
+3. every read quorum intersects every write quorum.
-we import the library.
+
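The intersection property in the definition above can be checked with plain Python sets. This is only a minimal sketch that does not use the library; the helper name `is_read_write_quorum_system` is made up for illustration.

```python
from itertools import product


def is_read_write_quorum_system(reads, writes):
    """Return True if every read quorum intersects every write quorum."""
    # `r & w` is the set intersection; an empty set is falsy.
    return all(r & w for r, w in product(reads, writes))


reads = [{'a', 'b', 'c'}, {'d', 'e', 'f'}]
writes = [{'a', 'd'}, {'b', 'e'}, {'c', 'f'}]
print(is_read_write_quorum_system(reads, writes))      # True
print(is_read_write_quorum_system([{'a'}], [{'b'}]))   # False
```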
 Quoracle allows us to construct and analyze arbitrary read-write quorum
 systems. First, we import the library.
 ```python
-from quorums import *
+from quoracle import *
 ```
 Next, we specify the nodes in our quorum system. Our nodes can be strings,
@@ -33,7 +32,7 @@ e = Node('e')
 f = Node('f')
 ```
-Here, we construct a two by three grid of nodes. Every row is read quorum, and
+Now, we construct a two by three grid of nodes. Every row is a read quorum, and
 one element from every row is a write quorum. Note that when we construct a
 quorum system, we only have to specify the set of read quorums. The library
 figures out the optimal set of write quorums automatically.
@@ -42,32 +41,44 @@ figures out the optimal set of write quorums automatically.
 grid = QuorumSystem(reads=a*b*c + d*e*f)
 ```
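To see why one element from every row is a write quorum, here is a brute-force sketch that enumerates the minimal sets of nodes intersecting every read quorum. It is an assumption-laden illustration in plain Python, not the algorithm the library uses; `minimal_hitting_sets` is a hypothetical helper.

```python
from itertools import combinations


def minimal_hitting_sets(nodes, quorums):
    """Brute-force every minimal node set that intersects all `quorums`."""
    found = []
    # Visit candidates by increasing size so smaller hitting sets are found first.
    for size in range(1, len(nodes) + 1):
        for candidate in map(frozenset, combinations(sorted(nodes), size)):
            hits_all = all(candidate & q for q in quorums)
            is_minimal = not any(s < candidate for s in found)
            if hits_all and is_minimal:
                found.append(candidate)
    return found


reads = [frozenset('abc'), frozenset('def')]
writes = minimal_hitting_sets(set('abcdef'), reads)
print(len(writes))  # 9
```

For the two by three grid this yields the nine pairs made of one node from the top row and one node from the bottom row.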
-This prints `{'a', 'b', 'c'}` and `{'d', 'e', 'f'}`.
+This next code snippet prints out the read quorums `{'a', 'b', 'c'}` and `{'d',
 'e', 'f'}`.
 ```python
 for r in grid.read_quorums():
    print(r)
 ```
-This prints `{'a', 'd'}`, `{'a', 'e'}`, `{'b', 'f'}`, `{'b', 'd'}`, ...
+And this next code snippet prints out the write quorums `{'a', 'd'}`, `{'a',
 'e'}`, `{'b', 'f'}`, `{'b', 'd'}`, ...
 ```python
 for w in grid.write_quorums():
    print(w)
 ```
-Alternatively, we could specify the write quorums...
+Alternatively, we can construct a quorum system by specifying the write
 quorums.
 ```python
 QuorumSystem(writes=(a + b + c) * (d + e + f))
 ```
-or both the read and write quorums.
+Or, we can specify both the read and write quorums.
 ```python
 QuorumSystem(reads=a*b*c + d*e*f, writes=(a + b + c) * (d + e + f))
 ```
 But, remember that every read quorum must intersect every write quorum. If we
 try to construct a quorum system with non-overlapping quorums, an exception
 will be thrown.
 ```python
 QuorumSystem(reads=a+b+c, writes=d+e+f)
 # ValueError: Not all read quorums intersect all write quorums
 ```
 We can check whether a given set is a read or write quorum. Note that any
 superset of a quorum is also considered a quorum.
@@ -81,10 +92,19 @@ grid.is_write_quorum({'a', 'd', 'd'}) # True
 grid.is_write_quorum({'a', 'b'})      # False
 ```
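The superset rule can be sketched without the library: a set counts as a quorum if it contains at least one of the listed quorums. The helper `contains_quorum` below is hypothetical and only mirrors the behavior described above.

```python
from itertools import product


def contains_quorum(candidate, quorums):
    """Return True if `candidate` is a superset of some quorum."""
    return any(q <= candidate for q in quorums)


rows = [frozenset('abc'), frozenset('def')]
read_quorums = rows
# One node from every row forms a write quorum.
write_quorums = [frozenset(pair) for pair in product(*rows)]

print(contains_quorum({'a', 'b', 'c'}, read_quorums))       # True
print(contains_quorum({'a', 'b', 'c', 'd'}, read_quorums))  # True
print(contains_quorum({'a', 'b', 'd'}, read_quorums))       # False
print(contains_quorum({'a', 'b'}, write_quorums))           # False
```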
-The read resilience of our quorum system is the largest number `f` such that
+## Resilience
 The _read resilience_ of our quorum system is the largest number `f` such that
 despite the failure of any `f` nodes, we still have at least one read quorum.
-Write resilience is defined similarly, and resilience is the minimum of read
+_Write resilience_ is defined similarly, and _resilience_ is the minimum of
-and write resilience.
+read and write resilience.
 Here, we print out the read resilience, write resilience, and resilience of our
 grid quorum system. We can fail any one node and still have a read quorum, but
 if we fail one node from each row, we eliminate every read quorum, so the read
 resilience is 1. Similarly, we can fail any two nodes and still have a write
 quorum, but if we fail all three nodes in a row, we eliminate every write
 quorum, so our write resilience is 2. The resilience is the minimum of 1 and 2,
 which is 1.
 ```python
 grid.read_resilience()  # 1
@@ -92,84 +112,226 @@ grid.write_resilience() # 2
 grid.resilience()       # 1
 ```
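The resilience numbers above can be double-checked by brute force over every possible set of failed nodes. This is a sketch in plain Python under the definition given in this section; the function `resilience` here is a stand-in, not the library's implementation.

```python
from itertools import combinations, product


def resilience(nodes, quorums):
    """Largest f such that any f failures leave at least one quorum intact."""
    for f in range(1, len(nodes) + 1):
        for failed in map(set, combinations(sorted(nodes), f)):
            # A quorum survives only if none of its nodes failed.
            if not any(q.isdisjoint(failed) for q in quorums):
                return f - 1
    return len(nodes)  # only reached if some quorum is the empty set


nodes = set('abcdef')
rows = [frozenset('abc'), frozenset('def')]
read_quorums = rows
write_quorums = [frozenset(pair) for pair in product(*rows)]

read_res = resilience(nodes, read_quorums)
write_res = resilience(nodes, write_quorums)
print(read_res)                  # 1
print(write_res)                 # 2
print(min(read_res, write_res))  # 1
```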
 ## Strategies
 A _strategy_ is a discrete probability distribution over the set of read and
-write quorums. A strategy gives us a way to pick quorums at random. The load of
+write quorums. A strategy gives us a way to pick quorums at random. We'll see
-a node is the probability that the node is selected by the strategy, and the
+how to construct optimal strategies in a second, but for now, we'll construct a
-load of a strategy is the load of the most heavily loaded node. Using the
+strategy by hand. To do so, we have to provide a probability distribution over
-`strategy` method, we get a load-optimal strategy, i.e. the strategy with the
+the read quorums and a probability distribution over the write quorums. Here,
-lowest possible load.
+we'll pick the top row twice as often as the bottom row, and we'll pick each
 column uniformly at random. Note that when we specify a probability
 distribution, we don't have to provide exact probabilities. We can simply pass
 in weights, and the library will automatically normalize the weights into a
 valid probability distribution.
 ```python
 # The read quorum strategy.
 sigma_r = {
    frozenset({'a', 'b', 'c'}): 2.,
    frozenset({'d', 'e', 'f'}): 1.,
 }
 # The write quorum strategy.
 sigma_w = {
    frozenset({'a', 'd'}): 1.,
    frozenset({'b', 'e'}): 1.,
    frozenset({'c', 'f'}): 1.,
 }
 strategy = grid.make_strategy(sigma_r, sigma_w)
 ```
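The normalization step can be pictured as dividing each weight by the total. The sketch below uses `fractions.Fraction` to keep the results exact; the helper `normalize` is hypothetical and says nothing about how `make_strategy` is implemented internally.

```python
from fractions import Fraction


def normalize(weights):
    """Turn non-negative weights into probabilities that sum to 1."""
    total = Fraction(sum(weights.values()))
    return {quorum: Fraction(w) / total for quorum, w in weights.items()}


sigma_r = normalize({
    frozenset({'a', 'b', 'c'}): 2.,
    frozenset({'d', 'e', 'f'}): 1.,
})
print(sigma_r[frozenset({'a', 'b', 'c'})])  # 2/3
print(sigma_r[frozenset({'d', 'e', 'f'})])  # 1/3
```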
 Once we have a strategy, we can use it to sample read and write quorums. Here,
 we expect `get_read_quorum` to return the top row twice as often as the bottom
 row, and we expect `get_write_quorum` to return every column uniformly at
 random.
 ```python
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 ```
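Sampling from a weighted distribution like this can be sketched with the standard library's `random.choices`, which accepts unnormalized weights. The helper `sample_quorum` is an illustration only and is not claimed to be how `get_read_quorum` works.

```python
import random
from collections import Counter


def sample_quorum(sigma, rng):
    """Draw one quorum from a {quorum: weight} mapping."""
    quorums = list(sigma)
    weights = [sigma[q] for q in quorums]
    return rng.choices(quorums, weights=weights, k=1)[0]


top = frozenset({'a', 'b', 'c'})
bottom = frozenset({'d', 'e', 'f'})
sigma_r = {top: 2.0, bottom: 1.0}

rng = random.Random(0)  # seeded so repeated runs draw the same samples
counts = Counter(sample_quorum(sigma_r, rng) for _ in range(3000))
print(counts[top] / 3000)  # close to 2/3
```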
 ## Load and Capacity
 Typically in a distributed system, a read quorum of nodes is contacted to
 perform a read, and a write quorum of nodes is contacted to perform a write.
-Though we get to pick a strategy, we don't get to pick the fraction of
+Assume we have a workload with a _read fraction_ `fr` of reads and a _write
-operations that are reads and the fraction of operations that are writes.  This
+fraction_ `fw = 1 - fr` of writes. Given a strategy, the _load of a node_ is
-is determined by the workload. When constructing a strategy, we have to specify
+the probability that the node is selected by the strategy. The _load of a
-the workload. The returned strategy is optimal only against this workload.
+strategy_ is the load of the most heavily loaded node. The _load of a quorum
-Here, we construct a strategy assuming that 75% of all operations are reads.
+system_ is the load of the optimal strategy, i.e. the strategy that achieves
 the lowest load. The most heavily loaded node in a quorum system is a
 throughput bottleneck, so the lower the load the better.
 Let's calculate the load of our strategy assuming a 100% read workload (i.e. a
 workload with a read fraction of 1).
 - The load of `a` is 2/3 because the read quorum `{a, b, c}` is chosen 2/3 of
  the time.
 - The load of `b` is 2/3 because the read quorum `{a, b, c}` is chosen 2/3 of
  the time.
 - The load of `c` is 2/3 because the read quorum `{a, b, c}` is chosen 2/3 of
  the time.
 - The load of `d` is 1/3 because the read quorum `{d, e, f}` is chosen 1/3 of
   the time.
 - The load of `e` is 1/3 because the read quorum `{d, e, f}` is chosen 1/3 of
   the time.
 - The load of `f` is 1/3 because the read quorum `{d, e, f}` is chosen 1/3 of
   the time.
 The largest node load is 2/3, so our strategy has a load of 2/3. Rather than
 calculating load by hand, we can simply call the `load` function.
 ```python
-strategy = grid.strategy(read_fraction=0.75)
+print(strategy.load(read_fraction=1)) # 2/3
 ```
-We can use the strategy to sample read and write quorums.
+Now let's calculate the load of our strategy assuming a 100% write workload.
 Again, we calculate the load on every node.
 - The load of `a` is 1/3 because the write quorum `{a, d}` is chosen 1/3 of
  the time.
 - The load of `b` is 1/3 because the write quorum `{b, e}` is chosen 1/3 of
  the time.
 - The load of `c` is 1/3 because the write quorum `{c, f}` is chosen 1/3 of
  the time.
 - The load of `d` is 1/3 because the write quorum `{a, d}` is chosen 1/3 of
  the time.
 - The load of `e` is 1/3 because the write quorum `{b, e}` is chosen 1/3 of
  the time.
 - The load of `f` is 1/3 because the write quorum `{c, f}` is chosen 1/3 of
  the time.
 The largest node load is 1/3, so our strategy has a load of 1/3. Again, rather
 than calculating load by hand, we can simply call the `load` function. Note
 that we can pass in a `read_fraction` or `write_fraction` but not both.
 ```python
-print(strategy.get_read_quorum())
+print(strategy.load(write_fraction=1)) # 1/3
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 ```
-We can query the strategy's load.
+Now let's calculate the load of our strategy on a 25% read and 75% write
 workload.
 - The load of `a` is `0.25 * 2/3 + 0.75 * 1/3 = 5/12` because 25% of the time
  we perform a read and select the read quorum `{a, b, c}` with 2/3 probability
  and 75% of the time, we perform a write and select the write quorum `{a, d}`
  with 1/3 probability.
 - The load of `b` is `0.25 * 2/3 + 0.75 * 1/3 = 5/12` because 25% of the time
  we perform a read and select the read quorum `{a, b, c}` with 2/3 probability
  and 75% of the time, we perform a write and select the write quorum `{b, e}`
  with 1/3 probability.
 - The load of `c` is `0.25 * 2/3 + 0.75 * 1/3 = 5/12` because 25% of the time
  we perform a read and select the read quorum `{a, b, c}` with 2/3 probability
  and 75% of the time, we perform a write and select the write quorum `{c, f}`
  with 1/3 probability.
 - The load of `d` is `0.25 * 1/3 + 0.75 * 1/3 = 1/3` because 25% of the time
   we perform a read and select the read quorum `{d, e, f}` with 1/3 probability
   and 75% of the time, we perform a write and select the write quorum `{a, d}`
   with 1/3 probability.
 - The load of `e` is `0.25 * 1/3 + 0.75 * 1/3 = 1/3` because 25% of the time
   we perform a read and select the read quorum `{d, e, f}` with 1/3 probability
   and 75% of the time, we perform a write and select the write quorum `{b, e}`
   with 1/3 probability.
 - The load of `f` is `0.25 * 1/3 + 0.75 * 1/3 = 1/3` because 25% of the time
   we perform a read and select the read quorum `{d, e, f}` with 1/3 probability
   and 75% of the time, we perform a write and select the write quorum `{c, f}`
   with 1/3 probability.
 The largest node load is 5/12, so our strategy has a load of 5/12. At this
 point, you can see that calculating load by hand is extremely tedious. We could
 have skipped all that work and called `load` instead!
 ```python
-strategy.load(read_fraction=0.75) # 0.458
+print(strategy.load(read_fraction=0.25)) # 5/12
 ```
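The hand calculations above follow one formula: a node's load is the read fraction times the probability that a chosen read quorum contains it, plus the write fraction times the same probability for write quorums. The sketch below reproduces the three results with exact fractions; `node_load` and `load` here are plain functions written for illustration, not the library's methods.

```python
from fractions import Fraction

sigma_r = {frozenset('abc'): Fraction(2, 3), frozenset('def'): Fraction(1, 3)}
sigma_w = {
    frozenset('ad'): Fraction(1, 3),
    frozenset('be'): Fraction(1, 3),
    frozenset('cf'): Fraction(1, 3),
}


def node_load(node, read_fraction):
    """Probability that `node` is contacted by a single operation."""
    reads = sum(p for q, p in sigma_r.items() if node in q)
    writes = sum(p for q, p in sigma_w.items() if node in q)
    return read_fraction * reads + (1 - read_fraction) * writes


def load(read_fraction):
    """Load of the strategy: the load of the most heavily loaded node."""
    return max(node_load(x, read_fraction) for x in 'abcdef')


print(load(Fraction(1)))      # 2/3
print(load(Fraction(0)))      # 1/3
print(load(Fraction(1, 4)))   # 5/12
```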
-We can query the strategy's load on other workloads as well, though the
+We can also compute the load on every node.
 strategy may not be optimal.
 ```python
-strategy.load(read_fraction=0)   # 0.333
+print(strategy.node_load(a, read_fraction=0.25)) # 5/12
-strategy.load(read_fraction=0.5) # 0.416
+print(strategy.node_load(b, read_fraction=0.25)) # 5/12
-strategy.load(read_fraction=1)   # 0.5
+print(strategy.node_load(c, read_fraction=0.25)) # 5/12
 print(strategy.node_load(d, read_fraction=0.25)) # 1/3
 print(strategy.node_load(e, read_fraction=0.25)) # 1/3
 print(strategy.node_load(f, read_fraction=0.25)) # 1/3
 ```
-This is a shorthand for
+Our strategy has a load of 5/12 on a 25% read workload, but what about the
-`grid.strategy(read_fraction=0.25).load(read_fraction=0.25)`.
+quorum system? The quorum system does __not__ have a load of 5/12 because our
 strategy is not optimal. We can call the `strategy` function to compute the
 optimal strategy automatically.
 ```python
 strategy = grid.strategy(read_fraction=0.25)
 print(strategy)
 # Strategy(reads={('a', 'b', 'c'): 0.5,
 #                 ('d', 'e', 'f'): 0.5},
 #          writes={('a', 'f'): 0.33333333,
 #                  ('b', 'e'): 0.33333333,
 #                  ('c', 'd'): 0.33333333})
 print(strategy.load(read_fraction=0.25)) # 3/8
 ```
 Here, we see that the optimal strategy picks all rows and all columns
 uniformly. This strategy has a load of 3/8 on the 25% read workload. Since this
 strategy is optimal, that means our quorum system also has a load of 3/8 on a
 25% read workload.
 We can also query this strategy's load on other workloads. Note that
 this strategy is optimal for a read fraction of 25%, but it may not be optimal
 for other read fractions.
 ```python
 print(strategy.load(read_fraction=0))   # 1/3
 print(strategy.load(read_fraction=0.5)) # 5/12
 print(strategy.load(read_fraction=1))   # 1/2
 ```
 We can also use a quorum system's `load` function. The code snippet below is a
 shorthand for `grid.strategy(read_fraction=0.25).load(read_fraction=0.25)`.
 ```python
 grid.load(read_fraction=0.25) # 0.375
 ```
-In the real world, we don't often have a fixed workload. Workloads change
+The capacity of a strategy or quorum system is simply the inverse of the load. Our
-over time. Instead of specifying a fixed read fraction, we can provide a
+quorum system has a load of 3/8 on a 25% read workload, so it has a capacity of
-discrete probability distribution of read fractions. Here, we say that the
+8/3.
-read fraction is 10% half the time and 75% half the time. `strategy` will
+
-return the strategy that minimizes the expected load according to this
+```python
 print(grid.capacity(read_fraction=0.25)) # 8/3
 ```
 The _capacity_ of a quorum system is proportional to the maximum throughput
 that it can achieve before a node becomes bottlenecked. Here, if every node
 could process 100 commands per second, then our quorum system could process
 800/3 commands per second.
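The arithmetic behind these numbers is short enough to write out. Under the uniform strategy every node sits in one of two rows and one of three columns, so every node has the same load. The variable names below are chosen for illustration.

```python
from fractions import Fraction

read_fraction = Fraction(1, 4)

# Uniform strategy: each row is read with probability 1/2 and each
# column is written with probability 1/3, so all nodes share one load.
load = read_fraction * Fraction(1, 2) + (1 - read_fraction) * Fraction(1, 3)
capacity = 1 / load

per_node_throughput = 100  # commands per second each node can process
peak_throughput = per_node_throughput * capacity

print(load)             # 3/8
print(capacity)         # 8/3
print(peak_throughput)  # 800/3
```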
 ## Workload Distributions
 In the real world, we don't often have a workload with a fixed read fraction.
 Workloads change over time. Instead of specifying a fixed read fraction, we can
 provide a discrete probability distribution of read fractions. Here, we say
 that the read fraction is 10% half the time and 75% half the time. `strategy`
 will return the strategy that minimizes the expected load according to this
 distribution.
 ```python
-distribution = {0.1: 0.5, 0.75: 0.5}
+distribution = {0.1: 1, 0.75: 1}
 strategy = grid.strategy(read_fraction=distribution)
 strategy.load(read_fraction=distribution) # 0.404
 ```
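As a rough check of the `0.404` figure, the sketch below assumes two things that the text does not state outright: that the returned strategy is the uniform one (all rows and all columns equally likely), and that expected load is the probability-weighted average of the load at each read fraction. Under those assumptions the numbers agree.

```python
from fractions import Fraction


def uniform_grid_load(read_fraction):
    """Load of the uniform strategy on the two by three grid."""
    return read_fraction * Fraction(1, 2) + (1 - read_fraction) * Fraction(1, 3)


# Weights, not probabilities: they are normalized below.
weights = {Fraction(1, 10): 1, Fraction(3, 4): 1}
total = sum(weights.values())

expected_load = sum(
    Fraction(w, total) * uniform_grid_load(fr) for fr, w in weights.items()
)
print(expected_load)                   # 97/240
print(round(float(expected_load), 3))  # 0.404
```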
-We can also specify the write fraction instead of the read fraction, if we
+## Heterogeneous Nodes
 prefer.
 ```python
 strategy = grid.strategy(write_fraction=0.75)
 strategy.load(write_fraction=distribution) # 0.429
 ```
 In the real world, not all nodes are equal. We often run distributed systems on
-heterogenous hardware, so some nodes might be faster than others. To model
+heterogeneous hardware, so some nodes might be faster than others. To model
-this, we instatiate every node with its capacity. Here, nodes a, c, and e can
+this, we instantiate every node with its capacity. Here, nodes `a`, `c`, and
-process 1000 commands per second, while nodes b, d, and f can only process 500
+`e` can process 1000 commands per second, while nodes `b`, `d`, and `f` can
-requests per second.
+only process 500 commands per second.
 ```python
 a = Node('a', capacity=1000)
@@ -180,10 +342,10 @@ e = Node('e', capacity=1000)
 f = Node('f', capacity=500)
 ```
-Now, load can be interpreted as the inverse of the peak throughput of the
+Now, the definition of capacity becomes much simpler. The capacity of a quorum
-quorum system. We can also call `capacity` to get this inverse directly.
+system is simply the maximum throughput that it can achieve. The load can be
-Here, our quorum system is capable of processing 1333 commands per second for
+interpreted as the inverse of the capacity. Here, our quorum system is capable
-a workload of 75% reads.
+of processing 1333 commands per second for a workload of 75% reads.
 ```python
 grid = QuorumSystem(reads=a*b*c + d*e*f)
@@ -215,17 +377,18 @@ grid.capacity(read_fraction=0.5) # 3913
 grid.capacity(read_fraction=0)   # 2000
 ```
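One way to reason about capacity with heterogeneous nodes is to divide each node's selection probability by its capacity and invert the largest result. The sketch below assumes that model and uses a hand-picked strategy (rows read uniformly, writes split between `{a, e}` and `{c, e}`); the library may return a different strategy, but this one reproduces the 1333 commands per second quoted for a 75% read workload.

```python
from fractions import Fraction

capacity = {'a': 1000, 'b': 500, 'c': 1000, 'd': 500, 'e': 1000, 'f': 500}

# Hand-picked strategy (an assumption, not library output).
sigma_r = {frozenset('abc'): Fraction(1, 2), frozenset('def'): Fraction(1, 2)}
sigma_w = {frozenset('ae'): Fraction(1, 2), frozenset('ce'): Fraction(1, 2)}


def node_load(node, read_fraction):
    """Selection probability of `node`, scaled by its capacity."""
    reads = sum(p for q, p in sigma_r.items() if node in q)
    writes = sum(p for q, p in sigma_w.items() if node in q)
    mixed = read_fraction * reads + (1 - read_fraction) * writes
    return mixed / capacity[node]


read_fraction = Fraction(3, 4)
load = max(node_load(x, read_fraction) for x in capacity)
system_capacity = 1 / load

print(system_capacity)       # 4000/3
print(int(system_capacity))  # 1333
```

The slow nodes `b`, `d`, and `f` are the bottleneck here, which is why the write quorums in this sketch avoid them.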
 ## `f`-resilient Strategies
 Another real-world complication is the fact that machines sometimes fail and
 are sometimes slow. If we contact a quorum of nodes, some of them may fail, and
 we'll get stuck waiting to hear back from them. Or, some of them may be
 stragglers, and we'll wait longer than we'd like. We can address this problem
 by contacting more than the bare minimum number of nodes.
-Formally, we say a read quorum (or write quorum) q is _f-resilient_ if despite
+Formally, we say a read quorum (or write quorum) q is _`f`-resilient_ if
-the failure of any f nodes, q still forms a read quorum (or write quorum). A
+despite the failure of any `f` nodes, q still forms a read quorum (or write
-strategy is f-resilient if it only selects f-resilient quorums. By default,
+quorum). A strategy is `f`-resilient if it only selects `f`-resilient quorums.
-`strategy` returns 0-resilient quorums. We can pass in the `f` argument to get
+By default, `strategy` returns `0`-resilient quorums. We can pass in the `f`
-more resilient strategies.
+argument to get more resilient strategies.
 ```python
 strategy = grid.strategy(read_fraction=0.5, f=1)
@@ -238,6 +401,18 @@ strategy.get_read_quorum()
 strategy.get_write_quorum()
 ```
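The definition of an `f`-resilient quorum can be tested by brute force: remove every possible group of `f` nodes from the set and check that what remains still contains a quorum. This is a sketch in plain Python with a hypothetical helper `is_f_resilient`; it does not show how the library selects such quorums.

```python
from itertools import combinations, product


def is_f_resilient(q, quorums, f):
    """True if `q` still contains a quorum after any `f` of its nodes fail."""
    q = set(q)
    f = min(f, len(q))  # failures outside `q` cannot hurt it
    return all(
        any(base <= q - set(failed) for base in quorums)
        for failed in combinations(sorted(q), f)
    )


rows = [frozenset('abc'), frozenset('def')]
read_quorums = rows
write_quorums = [frozenset(pair) for pair in product(*rows)]

print(is_f_resilient('abc', read_quorums, 0))     # True
print(is_f_resilient('abc', read_quorums, 1))     # False
print(is_f_resilient('abcdef', read_quorums, 1))  # True
print(is_f_resilient('abde', write_quorums, 1))   # True
print(is_f_resilient('abde', write_quorums, 2))   # False
```

The only 1-resilient read quorum in this grid is the full set of six nodes, which illustrates the cost of resilience: more nodes are contacted per operation.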
 ## Latency
 TODO(mwhittaker): Write.
 ## Network Load
 TODO(mwhittaker): Write.
 ## Search
 TODO(mwhittaker): Write.
 ## Case Study
 TODO(mwhittaker): Update.
 Putting everything together, we can use this library to pick quorum systems
 that are well suited to our workload. For example, say we're implementing a
 distributed file system and want to pick a 5 node quorum system with a
--- a/examples/tutorial.py
+++ b/examples/tutorial.py
@@ -19,6 +19,9 @@ QuorumSystem(writes=(a + b + c) * (d + e + f))
 QuorumSystem(reads=a*b*c + d*e*f, writes=(a + b + c) * (d + e + f))
 # QuorumSystem(reads=a+b+c, writes=d+e+f)
 # ValueError: Not all read quorums intersect all write quorums
 print(grid.is_read_quorum({'a', 'b', 'c'}))       # True
 print(grid.is_read_quorum({'a', 'b', 'c', 'd'}))  # True
 print(grid.is_read_quorum({'a', 'b', 'd'}))       # False
@@ -31,30 +34,58 @@ print(grid.read_resilience())  # 1
 print(grid.write_resilience()) # 2
 print(grid.resilience())       # 1
-strategy = grid.strategy(read_fraction=0.75)
+# The read quorum strategy.
 sigma_r = {
    frozenset({'a', 'b', 'c'}): 2.,
    frozenset({'d', 'e', 'f'}): 1.,
 }
 # The write quorum strategy.
 sigma_w = {
    frozenset({'a', 'd'}): 1.,
    frozenset({'b', 'e'}): 1.,
    frozenset({'c', 'f'}): 1.,
 }
 strategy = grid.make_strategy(sigma_r, sigma_w)
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_read_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
 print(strategy.get_write_quorum())
-print(strategy.load(read_fraction=0.75)) # 0.458
+print(strategy.load(read_fraction=1)) # 2/3
-print(strategy.load(read_fraction=0))   # 0.333
+print(strategy.load(write_fraction=1)) # 1/3
 print(strategy.load(read_fraction=0.5)) # 0.416
 print(strategy.load(read_fraction=1))   # 0.5
-print(grid.load(read_fraction=0.25)) # 0.375
+print(strategy.load(read_fraction=0.25)) # 5/12
 print(strategy.node_load(a, read_fraction=0.25)) # 5/12
 print(strategy.node_load(b, read_fraction=0.25)) # 5/12
 print(strategy.node_load(c, read_fraction=0.25)) # 5/12
 print(strategy.node_load(d, read_fraction=0.25)) # 1/3
 print(strategy.node_load(e, read_fraction=0.25)) # 1/3
 print(strategy.node_load(f, read_fraction=0.25)) # 1/3
 strategy = grid.strategy(read_fraction=0.25)
 print(strategy)
 print(strategy.load(read_fraction=0.25)) # 3/8
 print(strategy.load(read_fraction=0))   # 1/3
 print(strategy.load(read_fraction=0.5)) # 5/12
 print(strategy.load(read_fraction=1))   # 1/2
 print(grid.load(read_fraction=0.25)) # 3/8
 print(grid.capacity(read_fraction=0.25)) # 8/3
 distribution = {0.1: 0.5, 0.75: 0.5}
 strategy = grid.strategy(read_fraction=distribution)
 print(strategy.load(read_fraction=distribution)) # 0.404
 strategy = grid.strategy(write_fraction=0.75)
 print(strategy.load(write_fraction=distribution)) # 0.429
 a = Node('a', capacity=1000)
 b = Node('b', capacity=500)
 c = Node('c', capacity=1000)