跳转至

5.1 The hiring problem

5.1-1

Show that the assumption that we are always able to determine which candidate is best in line 4 of procedure \(\text{HIRE-ASSISTANT}\) implies that we know a total order on the ranks of the candidates.

A total order is a partial order that is a total relation \((\forall a, b \in A:aRb \text{ or } bRa)\). A relation is a partial order if it is reflexive, antisymmetric and transitive.

Assume that the relation is good or better.

  • Reflexive: This is a bit trivial, but everybody is as good or better as themselves.
  • Transitive: If \(A\) is better than \(B\) and \(B\) is better than \(C\), then \(A\) is better than \(C\).
  • Antisymmetric: If \(A\) is better than \(B\), then \(B\) is not better than \(A\).

So far we have a partial order.

Since we assume we can compare any two candidates, then comparison must be a total relation and thus we have a total order.

5.1-2 \(\star\)

Describe an implementation of the procedure \(\text{RANDOM}(a, b)\) that only makes calls to \(\text{RANDOM}(0, 1)\). What is the expected running time of your procedure, as a function of \(a\) and \(b\)?

As \((b - a)\) could be any number, we need at least \(\lceil \lg(b - a) \rceil\) bits to represent the number. We set \(\lceil \lg(b - a) \rceil\) as \(k\). Basically, we need to call \(\text{RANDOM}(0, 1)\) \(k\) times. If the number represented by binary is bigger than \(b - a\), it's not valid number and we give it another try, otherwise we return that number.

C++
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
RANDOM(a, b)
    range = b - a
    bits = ceil(log(2, range))
    result = 0
    for i = 0 to bits - 1
        r = RANDOM(0, 1)
        result = result + r << i
    if result > range
        return RANDOM(a, b)
    else return a + result

The expectation of times of calling procedure \(\text{RANDOM}(a, b)\) is \(\frac{2^k}{b - a}\). \(\text{RANDOM}(0, 1)\) will be called \(k\) times in that procedure.

The expected running time is \(\Theta(\frac{2^k}{b - a} \cdot k)\), \(k\) is \(\lceil \lg(b - a) \rceil\). Considering \(2^k\) is less than \(2 \cdot (b - a)\), so the running time is \(O(k)\).

5.1-3 \(\star\)

Suppose that you want to output \(0\) with probability \(1 / 2\) and \(1\) with probability \(1 / 2\). At your disposal is a procedure \(\text{BIASED-RANDOM}\), that outputs either \(0\) or \(1\). It outputs \(1\) with some probability \(p\) and \(0\) with probability \(1 - p\), where \(0 < p < 1\), but you do not know what \(p\) is. Give an algorithm that uses \(\text{BIASED-RANDOM}\) as a subroutine, and returns an unbiased answer, returning \(0\) with probability \(1 / 2\) and \(1\) with probability \(1 / 2\). What is the expected running time of your algorithm as a function of \(p\)?

There are 4 outcomes when we call \(\text{BIASED-RANDOM}\) twice, i.e., \(00\), \(01\), \(10\), \(11\).

The strategy is as following:

  • \(00\) or \(11\): call \(\text{BIASED-RANDOM}\) twice again
  • \(01\): output \(0\)
  • \(10\): output \(1\)

We can calculate the probability of each outcome:

  • \(\Pr\\{00 | 11\\} = p^2 + (1 - p)^2\)
  • \(\Pr\\{01\\} = (1 - p)p\)
  • \(\Pr\\{10\\} = p(1 - p)\)

Since there's no other way to return a value, it returns \(0\) and \(1\) both with probability \(1 / 2\).

The pseudo code is as follow:

C++
1
2
3
4
5
6
UNBIASED-RANDOM
    while true
        x = BIASED-RANDOM
        y = BIASED-RANDOM
        if x != y
            return x

This algorithm actually uses the equivalence of the probability of occurrence of \(01\) and \(10\), and subtly converts the unequal \(00\) and \(11\) to \(01\) and \(10\), thus eliminating the probability that its probability is not equivalent.

Each iteration is a Bernoulli trial, where "success" means that the iteration does return a value.

We can view each iteration as a Bernoulli trial, where "success" means that the iteration returns a value.

\[ \begin{aligned} \Pr\\{\text{success}\\} & = \Pr\\{0\text{ is returned}\\} + \Pr\\{1\text{ is returned}\\} \\\\ & = 2p(1 - p). \end{aligned} \]

The expected number of trials for this scenario is \(1 / (2p(1 - p))\). Thus, the expected running time of \(\text{UNBIASED-RANDOM}\) is \(\Theta(1 / (2p(1 - p))\).