Longest Common Subsequence: Difference between revisions
No edit summary |
|||
| Line 34: | Line 34: | ||
<math> | <math> | ||
c[i,j] = \begin{cases} | c[i,j] = \begin{cases} | ||
0 & i = 0 or j = 0 \\ | 0 & i = 0 or j = 0 \\ | ||
c[i-1, j-1] + 1 & x_i = y_i \\ | c[i-1, j-1] + 1 & x_i = y_i \\ | ||
Revision as of 03:53, 4 March 2024
Problem description
A sequence Z is a subsequence of another sequence X if all the elements in Z also appear in order in X. Notably, the elements do not need to be consecutive.
In the longest-common-subsequence problem (LCS), we wish to find a maximum-length common subsequence of X and Y.
Approach: dynamic programming
Consider and , and let be an LCS of and .
Property 1
If , then and is an LCS of and .
Property 2
If and , then is an LCS of and .
Property 3
If and , then is an LCS of and .
Subproblems
Let us have be the length of the LCS for and . We have
Property 4
The first line is trivial. The second line stems from property 1 and makes enough sense to me.
The third line stems from property 3. This property tells us that depending on the last , must either be the LCS of and or the LCS of and . We don't know which one it is, so we simply compare them. Whichever is greater would be the LCS.
Implementation
Based on property 4, we have the following DP algorithm
LCS-Length(X, Y):
m = X.length
n = Y.length
let c[0...m, 0...n] be a new table // cache
// Initialize c (property 4.1)
for i = 0 to n:
c[0, i] = 0
for i = 0 to m:
c[i, 0] = 0
let traceback[0...m, 0...n] be a new table
// Construct c top -> down, left -> right
for row = 1 to n:
for col = 1 to m:
if (X[col] == Y[row]):
// Property 4.2
c[col, row] = c[col - 1, row - 1] + 1
traceback[col, row] = (col - 1, row - 1)
continue
else:
// Property 4.3
left = c[col - 1, row]
top = c[col, row - 1]
if (left > top):
c[col, row] = c[col - 1, row]
traceback[col, row] = (col - 1, row)
else:
c[col, row] = c[col, row - 1]
traceback[col, row] = (col, row - 1)
return (c, traceback)
