17-5 Competitive analysis of self-organizing lists with move-to-front

A self-organizing list is a linked list of \(n\) elements, in which each element has a unique key. When we search for an element in the list, we are given a key, and we want to find an element with that key.

A self-organizing list has two important properties:

  1. To find an element in the list, given its key, we must traverse the list from the beginning until we encounter the element with the given key. If that element is the \(k\)th element from the start of the list, then the cost to find the element is \(k\).
  2. We may reorder the list elements after any operation, according to a given rule with a given cost. We may choose any heuristic we like to decide how to reorder the list.

Assume that we start with a given list of \(n\) elements, and we are given an access sequence \(\sigma = \langle \sigma_1, \sigma_2, \ldots, \sigma_m \rangle\) of keys to find, in order. The cost of the sequence is the sum of the costs of the individual accesses in the sequence.

Out of the various possible ways to reorder the list after an operation, this problem focuses on transposing adjacent list elements-switching their positions in the list—with a unit cost for each transpose operation. You will show, by means of a potential function, that a particular heuristic for reordering the list, move-to-front, entails a total cost no worse than \(4\) times that of any other heuristic for maintaining the list order—even if the other heuristic knows the access sequence in advance! We call this type of analysis a competitive analysis.

For a heuristic \(H\) and a given initial ordering of the list, denote the access cost of sequence \(\sigma\) by \(C_H(\sigma)\) Let \(m\) be the number of accesses in \(\sigma\).

a. Argue that if heuristic \(\text H\) does not know the access sequence in advance, then the worst-case cost for \(\text H\) on an access sequence \(\sigma\) is \(C_H(\sigma) = \Omega(mn)\).

With the move-to-front heuristic, immediately after searching for an element \(x\), we move \(x\) to the first position on the list (i.e., the front of the list).

Let \(\text{rank}\_L(x)\) denote the rank of element \(x\) in list \(L\), that is, the position of \(x\) in list \(L\). For example, if \(x\) is the fourth element in \(L\), then \(\text{rank}\_L(x) = 4\). Let \(c_i\) denote the cost of access \(\sigma_i\) using the move-to-front heuristic, which includes the cost of finding the element in the list and the cost of moving it to the front of the list by a series of transpositions of adjacent list elements.

b. Show that if \(\sigma_i\) accesses element \(x\) in list \(L\) using the move-to-front heuristic, then \(c_i = 2 \cdot \text{rank}\_L(x) - 1\).

Now we compare move-to-front with any other heuristic \(\text H\) that processes an access sequence according to the two properties above. Heuristic \(\text H\) may transpose elements in the list in any way it wants, and it might even know the entire access sequence in advance.

Let \(L_i\) be the list after access \(\sigma_i\) using move-to-front, and let \(L_i^\*\) be the list after access \(\sigma_i\) using heuristic \(\text H\). We denote the cost of access \(\sigma_i\) by \(c_i\) for move-to-front and by \(c_i^\*\) for heuristic \(\text H\). Suppose that heuristic \(\text H\) performs \(t_i^\*\) transpositions during access \(\sigma_i\).

c. In part (b), you showed that \(c_i = 2 \cdot \text{rank}\_{L_{i - 1}}(x) - 1\). Now show that \(c_i^\* = \text{rank}\_{L_{i - 1}^\*}(x) + t_i^\*\).

We define an inversion in list \(L_i\) as a pair of elements \(y\) and \(z\) such that \(y\) precedes \(z\) in \(L_i\) and \(z\) precedes \(y\) in list \(L_i^\*\). Suppose that list \(L_i\) has \(q_i\) inversions after processing the access sequence \(\langle \sigma_1, \sigma_2, \ldots, \sigma_i \rangle\). Then, we define a potential function \(\Phi\) that maps \(L_i\) to a real number by \(\Phi(L_i) = 2q_i\). For example, if \(L_i\) has the elements \(\langle e, c, a, d, b \rangle\) and \(L_i^\*\) has the elements \(\langle c, a, b, d, e \rangle\), then \(L_i\) has 5 inversions \(((e, c), (e, a), (e, d), (e, b), (d, b))\), and so \(\Phi(L_i) = 10\). Observe that \(\Phi(L_i) \ge 0\) for all \(i\) and that, if move-to-front and heuristic \(\text H\) start with the same list \(L_0\), then \(\Phi(L_0) = 0\).

d. Argue that a transposition either increases the potential by \(2\) or decreases the potential by \(2\).

Suppose that access \(\sigma_i\) finds the element \(x\). To understand how the potential changes due to \(\sigma_i\), let us partition the elements other than \(x\) into four sets, depending on where they are in the lists just before the \(i\)th access:

  • Set \(A\) consists of elements that precede \(x\) in both \(L_{i - 1}\) and \(L_{i - 1}^\*\).
  • Set \(B\) consists of elements that precede \(x\) in \(L_{i - 1}\) and follow \(x\) in \(L_{i - 1}^\*\).
  • Set \(C\) consists of elements that follow \(x\) in \(L_{i - 1}\) and precede \(x\) in \(L_{i - 1}^\*\).
  • Set \(D\) consists of elements that follow \(x\) in both \(L_{i - 1}\) and \(L_{i - 1}^\*\).

e. Argue that \(\text{rank}\_{L_{i - 1}}(x) = |A| + |B| + 1\) and \(\text{rank}\_{L_{i - 1}^\*}(x) = |A| + |C| + 1\).

f. Show that access \(\sigma_i\) causes a change in potential of

\[\Phi(L_i) - \Phi(L_{i - 1}) \le 2(|A| - |B| + t_i^*),\]

where, as before, heuristic \(\text H\) performs \(t_i^\*\) transpositions during access \(\sigma_i\).

Define the amortized cost \(\hat c_i\) of access \(\sigma_i\) by \(\hat c_i = c_i + \Phi(L_i) - \Phi(L_{i - 1})\).

g. Show that the amortized cost \(\hat c_i\) of access \(\sigma_i\) is bounded from above by \(4c_i^\*\).

h. Conclude that the cost \(C_{\text{MTF}}(\sigma)\) of access sequence \(\sigma\) with move-to-front is at most \(4\) times the cost \(C_H(\sigma)\) of \(\sigma\) withany other heuristic \(\text H\), assuming that both heuristics start with the same list.

a. Since the heuristic is picked in advance, given any sequence of requests given so far, we can simulate what ordering the heuristic will call for, then, we will pick our next request to be whatever element will of been in the last position of the list. Continuing until all the requests have been made, we have that the cost of this sequence of accesses is \(= mn\).

b. The cost of finding an element is \(= \text{rank}\_L(x)\) and since it needs to be swapped with all the elements before it, of which there are \(\text{rank}\_L(x) - 1\), the total cost is \(2 \cdot \text{rank}\_L(x) - 1\).

c. Regardless of the heuristic used, we first need to locate the element, which is left where ever it was after the previous step, so, needs \(\text{rank}\_{L_{i - 1}}(x)\). After that, by definition, there are \(t_i\) transpositions made, so, \(c^\*\_i = \text{rank}\_{L_{i - 1}}(x) + t_i^\*\).

d. If we perform a transposition of elements \(y\) and \(z\), where \(y\) is towards the left. Then there are two cases. The first is that the final ordering of the list in \(L_i^\*\) is with \(y\) in front of \(z\), in which case we have just increased the number of inversions by \(1\), so the potential increases by \(2\). The second is that in \(L_I^\*z\) occurs before \(y\), in which case, we have just reduced the number of inversions by one, reducing the potential by \(2\).

In both cases, whether or not there is an inversion between \(y\) or \(z\) and any other element has not changed, since the transposition only changed the relative ordering of those two elements.

e. By definition, \(A\) and \(B\) are the only two of the four categories to place elements that precede \(x\) in \(L_{i - 1}\), since there are \(|A| + |B|\) elements preceding it, it's rank in \(L_{i - 1}\) is \(|A| + |B| + 1\). Similarly, the two categories in which an element can be if it precedes \(x\) in \(L^\*_{i - 1}\) are \(A\) and \(C\), so, in \(L^\*\_{i - 1}\), \(x\) has \(\text{rank} |A| + |C| + 1\).

f. We have from part d that the potential increases by \(2\) if we transpose two elements that are being swapped so that their relative order in the final ordering is being screwed up, and decreases by two if they are begin placed into their correct order in \(L^\*_i\).

In particular, they increase it by at most \(2\), since we are keeping track of the number of inversions that may not be the direct effect of the transpositions that heuristic \(H\) made, we see which ones the Move to front heuristic may of added. In particular, since the move to front heuristic only changed the relative order of \(x\) with respect to the other elements, moving it in front of the elements that preceded it in \(L_{i - 1}\), we only care about sets \(A\) and \(B\). For an element in \(A\), moving it to be behind \(A\) created an inversion, since that element preceded \(x\) in \(L^\*_i\). However, if the element were in \(B\), we are removing an inversion by placing \(x\) in front of it.

g.

\[ \begin{aligned} \hat c_i & \le 2(|A| + |B| + 1) - 1 + 2(|A| - |B| + t_i^\*) \\\\ & = 4|A| + 1 + 2 t_i^\* \\\\ & \le 4(|A| + |C| + 1 + t_i^\*) \\\\ & = 4 c_i^\*. \end{aligned} \]

h. We showed that the amortized cost of each operation under the move to front heuristic was at most four times the cost of the operation using any other heuristic. Since the amortized cost added up over all these operation is at most the total (real) cost, so we have that the total cost with movetofront is at most four times the total cost with an arbitrary other heuristic.