Sunday, July 19, 2020

From fair to irrational and back

This article is kind of related to the previous post about random numbers.

The main questions we want to answer:

Can we simulate a biased coin with a fair coin? What about the reverse?

We are more interested in the case where the biased coin needs to have an irrational probability.

We will consider the classic and a not so classic approach.

(We will also explicitly look at $\frac{1}{\sqrt{2}}$ and $\frac{1}{e}$ in this post).

From fair to irrational

So suppose we have a random bit generator which generates bit 0 (and 1) with probability $\frac{1}{2}$. Also assume that each call takes constant time.

Can we use this to simulate a bit generator that generates bit 0 with probability $p$ where $p$ is some irrational in $(0, 1)$? 

One condition we will require is that the biased bit generator will run in expected constant time, without that it is not of much use.

For motivation for the method: Say we had a way to generate a real number $r$ in $(0, 1)$ and compare that against $p$, we could then have an easy algorithm to return 0 when $r \lt p$ and return 1 when $r \gt p$. This will return $0$ with probability $p$.

The trick is to have a representation of $p$ and that of $r$ which can be compared while we are in the process of generating $r$. The most obvious representation that comes to mind is the base-2 representation!

Base-2 generator


So say we had a base-2 representation of $p = 0.d_1 d_2 d_3 \dots $

To generate $r = 0.r_1 r_2 r_3 \dots $ we can randomly generate the bits $r_1, r_2, \dots$ and compare against $d_1, d_2, \dots$ while we are generating the $r_i$. With a very high likelihood there will be some finite $k$ where $r_k \neq d_k$ for the first time. We can stop and return bit 0 or bit 1 depending on whether $r_k \gt d_k$ (and so $r \gt p$) or $r_k \lt d_k$ (and so $r \lt p$).

This can be written in code as follows:


# Given bitstream (base 2 representation) generates random bit with
# pbt of 0 = number represented by bitstream.
def random_bit_base2(bias_bitstream, fair_random_bit):
  # We generate a random number and if < given number
  # return 0, else return 1.
  for bit in bias_bitstream():
    fair_bit = fair_random_bit()

    # the number represented by the fair_bit seq is different!
    if fair_bit != bit:

      # This basically is returning 0
      # if number formed by fair_bit seq < bitstream seq.
      # else returning 1.
      return fair_bit



The probability of generating a 0 is

$$\frac{d_1}{2} + \frac{1}{2}\times\frac{d_2}{2} + \dots = p$$

The expected number of random number calls is $\sum P(X \gt n) = \sum \frac{1}{2^n} = 2$. [See this math.se post]

The problem with this approach is generating the base-2 representation of the required bias $p$, but for practical purposes (depending on the application) we can choose to only work with the first "few" (eg 128) digits of $p$. Very likely $p$ can be approximated via infinite series etc and the digits placed in a lookup table.

For certain $p$ it is pretty easy to do though. For eg:


Problem 1: Can you write a simple algorithm to lazily generate the base-2 representation of $\frac{1}{\sqrt{2}}$?

Hint: If $x^2 \lt 2^{2n+1} \lt (x+1)^2$ then we only need to consider $2x$ or $2x+1$ for $2^{2n+3}$.


This base-2 version generator is in fact pretty well known (Googling will give similar results here).

There are other approaches like base-n instead of base-2. Choosing the base $n$ will depend on the bias $p$. For eg, with the existence of spigot algorithms for $\pi$, we could use base-16 for say $p = \frac{\pi}{16}$.

We will now discuss the not-so-classic approach.

Alternative to base-n: factorial base


If we assume that instead of a fair bit generator, we had a random integer generator which given input $K$ returned one of $\{0, 1, \dots, K-1\}$ with probability $\frac{1}{K}$ then for constants like $\frac{1}{e}$ we could do something other than base-n.

We can use a representation which uses factorials. For a given $r \in (0, 1)$ there is a representation of $r$ as

$$ r = \frac{a_2}{2!} + \frac{a_3}{3!} + \dots  = \sum_{n=2}^{\infty} \frac{a_n}{n!}$$

And $a_m \in \{0, 1, \dots , m - 1\}$.

This representation (say $(a_2, a_3, \dots)$) is finite only when $r$ is rational (we also need $a_k \neq k-1$ infinitely often, but we can ignore that for our purposes).

For eg.

$$ e - 2 = \sum_{n=2}^{\infty} \frac{1}{n!} = (1, 1, \dots )$$

It can also be shown that to compare $r = (a_2, a_3, \dots) $ against $p = (b_2, b_3, \dots)$ it is enough to find the first $k$ where $a_k \neq b_k$ and compare the "digits".

To generate $a_k$ we just generate a random integer in $\{0, 1, \dots, k-1\}$ using the random integer generator we have.

The code would look something like this:


# Given digits in factorial base
# i.e num = \sum_{n=2}^{\infty} d_n/n!
# Generate a random bit with probability num.
def random_bit_base_factorial(digit_stream, fair_random_int):
  # We generate a random number and if < given number
  # return 0, else return 1.
  base = 2

  # We generate a random digit (0, ..., base - 1)
  # and then increment the base.
  for digit in digit_stream():
    fair_digit = fair_random_int(base)
    base = base + 1
    if fair_digit < digit:
      return 0
    if fair_digit > digit:
      return 1

Say we need the probability $p = \frac{b_2}{2!} + \frac{b_3}{3!} + \dots $.

The probability of returning 0 is

$$ \frac{b_2}{2} + \frac{1}{2} \times \frac{b_3}{3} + \frac{1}{2} \times \frac{1}{3} \times \frac{b_4}{4} + \dots = \frac{b_2}{2!} + \frac{b_3}{3!} + \dots = p$$

The expected number of fair random number calls is $\sum \frac{1}{n!} = e - 1 \lt 2$.

[It might look like this approach is better than base-2 because it converges faster and  uses fewer random number calls but we need to consider the implementation of the random integer generator, which might be using the random bit generator under the covers.]

This representation has the same problem as generating the digits in base-n of the required probability. For certain constants it is easy to do, though.

For eg:

Problem 2: Show that $$ \frac{1}{e} = \frac{2}{3!} + \frac{4}{5!} + \dots = \sum_{n=1}^{\infty} \frac{2n}{(2n+1)!}$$


A note regarding the "actual" runtime (applies to both base-n and factorial)


We have conveniently ignored the time taken to generate both the digits of the bias $p$ and the random digit of the generated real number $r$ but that is not a huge problem.

In fact we can prove the following:

Problem 3: If the $n^{th}$ digit generation of the bias $p$ and random number $r$ takes polynomial time in $n$, show that the expected runtime of the biased random bit generator is $\mathcal{O}(1)$.

Hint: Use the generating functions and their derivatives.

So if we hardcode the first say 128 digits in a lookup table and then switch to generating the rest, we will get a very good practical algorithm.

From Irrational back to Fair


This is actually not too hard, since if we generate two bits, the probability of getting 0,1 is the same as the probability of getting 1,0. 

So the algorithm is simple, call the biased bit generator twice. return 0 if we get 0,1. return 1 if we get 1,0. Continue if we get 0,0 or 1,1. This has finite expected runtime too.



No comments:

Post a Comment