11-4 Hashing and authentication

Let \(\mathcal H\) be a class of hash functions in which each hash function \(h \in \mathcal H\) maps the universe \(U\) of keys to \(\\{0, 1, \ldots, m - 1 \\}\). We say that \(\mathcal H\) is k-universal if, for every fixed sequence of \(k\) distinct keys \(\langle x^{(1)}, x^{(2)}, \ldots, x^{(k)} \rangle\) and for any \(h\) chosen at random from \(\mathcal H\), the sequence \(\langle h(x^{(1)}), h(x^{(2)}), \ldots, h(x^{(k)}) \rangle\) is equally likely to be any of the \(m^k\) sequences of length \(k\) with elements drawn from \(\\{0, 1, \ldots, m - 1 \\}\).

a. Show that if the family \(\mathcal H\) of hash functions is \(2\)-universal, then it is universal.

b. Suppose that the universe \(U\) is the set of \(n\)-tuples of values drawn from \(\mathbb Z_p = \\{0, 1, \ldots, p - 1\\}\), where \(p\) is prime. Consider an element \(x = \langle x_0, x_1, \ldots, x_{n - 1} \rangle \in U\). For any \(n\)-tuple \(a = \langle a_0, a_1, \ldots, a_{n - 1} \rangle \in U\), define the hash function \(h_a\) by

\[h_a(x) = \Bigg(\sum_{j = 0}^{n - 1} a_j x_j \Bigg) \mod p.\]

Let \(\mathcal H = \\{h_a\\}\). Show that \(\mathcal H\) is universal, but not \(2\)-universal. (\(\textit{Hint:}\) Find a key for which all hash functions in \(\mathcal H\) produce the same value.)

c. Suppose that we modify \(\mathcal H\) slightly from part (b): for any \(a \in U\) and for any \(b \in \mathbb Z_p\), define

\[h'_{ab}(x) = \Bigg(\sum\_{j = 0}^{n - 1} a_j x_j + b \Bigg) \mod p\]

and \(\mathcal h' = \\{h'_{ab}\\}\). Argue that \(\mathcal H'\) is \(2\)-universal. (\(\textit{Hint:}\) Consider fixed \(n\)-tuples \(x \in U\) and \(y \in U\), with \(x_i \ne y_i\) for some \(i\). What happens to \(h'\_{ab}(x)\) and \(h'\_{ab}(y)\) as \(a_i\) and \(b\) range over \(\mathbb Z_p\)?)

d. Suppose that Alice and Bob secretly agree on a hash function \(h\) form \(2\)-universal family \(\mathcal H\) of hash functions. Each \(h \in \mathcal H\) maps from a universe of keys \(u\) to \(\mathbb Z_p\), where \(p\) is aprime. Later, Alice sends a message \(m\) to Bob over the Internet, where \(m \in U\). She authenticates this message to Bob by also sending an authentication tag \(t = h(m)\), and Bob checks that the pair \((m, t)\) he receives indeed satisfies \(t = h(m)\). Suppose that an adversary intercepts \((m, t)\) en route and tries to fool Bob by replacing the pair \((m, t)\) with a different pair \((m', t')\). Argue that the probability that the adversary succeeds in fooling Bob into accepting \((m', t')\) is at most \(1 / p\), no matter how much computing power the adversary has, and even if the adversary knows the family \(\mathcal H\) of hash functions used.

a. The number of hash functions for which \(h(k) = h(l)\) is \(\frac{m}{m^2}|\mathcal H| = \frac{1}{m}|\mathcal H|\), therefore the family is universal.

b. For \(x = \langle 0, 0, \ldots, 0 \rangle\), \(\mathcal H\) could not be \(2\)-universal.

c. Let \(x, y \in U\) be fixed, distinct \(n\)-tuples. As \(a_i\) and \(b\) range over \(\mathbb Z_p, h'_{ab}(x)\) is equally likely to achieve every value from \(1\) to \(p\) since for any sequence \(a\), we can let \(b\) vary from \(1\) to \(p - 1\).

Thus, \(\langle h'\_{ab}(x), h'\_{ab}(y) \rangle\) is equally likely to be any of the \(p^2\) sequences, so \(\mathcal H\) is \(2\)-universal.

d. Since \(\mathcal H\) is \(2\)-universal, every pair of \(\langle t, t' \rangle\) is equally likely to appear, thus \(t'\) could be any value from \(\mathbb Z_p\). Even the adversary knows \(\mathcal H\), since \(\mathcal H\) is \(2\)-universal, then \(\mathcal H\) is universal, the probability of choosing a hash function that \(h(k) = h(l)\) is at most \(1 / p\), therefore the probability is at most \(1 / p\).