Omnium Gatherum
Praemissa propositio .
Consider
k
threads independently selecting one item each from a collection of
n
items. The probability that at least one thread selects a particular item is the complement of the probability that every thread selects an item other than that:
Utile theorema .
When
, this is approximately
.
More precisely, the relative difference between these can be made arbitrarily small by decreasing
k
or increasing
n. This can be seen
here.
Sometimes this approximation is more useful in a reärranged form:
Ratio demonstrandi .
When
k
= 1, this is exact:
From here on, we will assume that
k
is greater than one.
Considering
k
= 2 and
k
= 3 gives the general idea of an argument. We first reärrange
and then expand:
Here we can see the origin of the
term,
and indeed, simply applying the binomial theorem will suffice:
The absolute error term is
and the relative error term
is
We need to show that, for any
ε
less than one, the relative error is less than
ε
when
n
is
.
Each term on the right-hand is larger than the succeeding term, and since the signs alternate, each term is also larger than the succeeding series.
Lemma. Suppose that
Then and , because and .
Inductively, we have that
And so we conclude that
Praemissa propositio .
Consider
k
threads independently selecting one item each from a collection of
n
items. The probability that two or more threads select a particular item is the complement of the probability that
⌜every
thread selects an item other than that or exactly one thread selects the
item⌝ :
Utile theorema .
When
, this is approximately
Admonitio .
This is different than the probability of a collision occurring, that is, the probability of two or more threads selecting the same item (where the selected item may be
any
item rather than a particular item).
Praemissa propositio .
Suppose there are
k
threads selecting openings (and playing out games from them) and that these threads cannot communicate. How do we avoid collisions (book exits being selected by more than one thread) while also having randomized and uniform coverage? Broadly speaking, we simply make the space of possible book exits to sample from large enough that the chance of a collision is low.
Suppose we want to generate
g
book exits (play out
g
games) in total, or
g / k
games per thread. Let
s
be a multiplier, so that that the number of possible book exits to pick from is
g
×
s. Then the fraction of the space
g
×
s
that each thread will explore is
For any particular thread, and any particular book exit, this is the probability that the thread will pick that book exit. For convenience, let’s also define
, which is the probability that a particular thread does
not
pick a particular book exit. The probability that a particular book exit is picked by any thread is
and the probability that a particular book exit is picked by exactly one thread is
Then the probability that a particular book exit is picked by exactly one thread given that it was picked is
u / a.
Quite happily,
u / a
rapidly converges to a limit as
k
increases, which follows from our approximation. We first substitute the definitions of
u,
p,
q, and
a
to obtain
Then we have
when
, or more simply, when
.
When
, this is approximately
.
Alternatively, we might instead write
and then
and once again, when
, this is approximately
. However, these are fairly poor approximations; an approximation like
works much better.
We might try repeating the same procedure above using
and derive
when
, which is an improvement, but still worse than
.
We can instead take a different tack.
Utile theorema .
The probability that a particular item (of
k
×
s
items) is selected by exactly one thread (of
k
threads) given that it was selected by any thread is
When
, this is approximately
or simply
when
.
Ratio demonstrandi .
Let us rewrite the expression as
and consider the rightmost term
in isolation.
We first rewrite this term further as
Æquatio. For any z, by the following sequence of rewrites:
The subterm
can be rewritten as
and then we replace
with its binomial expansion:
(The elaboration of the binomial expansion has been omitted for brevity.) We create an approximation of the subterm by truncating the series. Substituting this in † gives us
which is exactly
(The steps of polynomial division have been omitted for brevity.)
We now substitute this in
⋆
and proceed to simplify:
Annotatio .
By
inspection, we can see that the probability of uniqueness is 0.95 around
s = 10;
in other words, for each book exit, the probability that the book exit is duplicated is less than 5% when
s = 10,
and this is true independent of the number of threads,
k, and the total number of games,
g, we want to play. (The chance of duplication is even less when
k
is small.)