32-1 String matching based on repetition factors
Let \(y^i\) denote the concatenation of string \(y\) with itself \(i\) times. For example, \((\text{ab})^3 = \text{ababab}\). We say that a string \(x \in \Sigma^\*\) has repetition factor \(r\) if \(x = y ^ r\) for some string \(y \in \Sigma^\*\) and some \(r > 0\). Let \(\rho(x)\) denote the largest \(r\) such that \(x\) has repetition factor \(r\).
a. Give an efficient algorithm that takes as input a pattern \(P[1 \ldots m]\) and computes the value \(\rho(P_i)\) for \(i = 1, 2, \ldots, m\). What is the running time of your algorithm?
b. For any pattern \(P[1 \ldots m]\), let \(\rho^\*(P)\) be defined as \(\max_{1 \le i \le m} \rho(P_i)\). Prove that if the pattern \(P\) is chosen randomly from the set of all binary strings of length \(m\), then the expected value of \(\rho^\*(P)\) is \(O(1)\).
c. Argue that the following string-matching algorithm correctly finds all occurrences of pattern \(P\) in a text \(T[1 \ldots n]\) in time \(O(\rho^\*(P)n + m)\):
C++ 1 2 3 4 5 6 7 8 9 10 11 12 13 14REPETITION_MATCHER(P, T) m = P.length n = T.length k = 1 + ρ*(P) q = 0 s = 0 while s ≤ n - m if T[s + q + 1] == P[q + 1] q = q + 1 if q == m print "Pattern occurs with shift" s if q == m or T[s + q + 1] != P[q + 1] s = s + max(1, ceil(q / k)) q = 0
This algorithm is due to Galil and Seiferas. By extending these ideas greatly, they obtained a linear-time string-matching algorithm that uses only \(O(1)\) storage beyond what is required for \(P\) and \(T\).
a. Compute \(\pi\), let \(l = m - \pi[m]\), if \(m ~\text{mod}~ l = 0\) and for all \(p = m - i \cdot l > 0\), \(p - \pi[p] = l\), then \(\rho(P_i) = m / l\), otherwise \(\rho(P_i) = 1\). The running time is \(\Theta(n)\).
b.
c.
(Omit!)
本页面的全部内容在 小熊老师 - 莆田青少年编程俱乐部 0594codes.cn 协议之条款下提供,附加条款亦可能应用