Revision as of 03:55, 4 March 2024

Problem description

A sequence Z is a subsequence of another sequence X if all the elements in Z also appear in order in X. Notably, the elements do not need to be consecutive.
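The definition above can be checked mechanically with a single left-to-right scan. The following is a minimal sketch, not part of the original notes; the function name `is_subsequence` is chosen here for illustration.

```python
from typing import Sequence


def is_subsequence(z: Sequence, x: Sequence) -> bool:
    """Return True if z is a subsequence of x (order kept, gaps allowed)."""
    pos = 0  # index of the next element of z that still has to be matched
    for item in x:
        if pos < len(z) and item == z[pos]:
            pos += 1  # matched z[pos]; move on to the next element of z
    return pos == len(z)


print(is_subsequence("BCBA", "ABCBDAB"))  # True: the elements need not be consecutive
print(is_subsequence("AEC", "ABCDE"))     # False: C would have to come after E
```

Matching each element of Z at its earliest possible position in X is enough, since an earlier match never rules out a later one.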

In the longest-common-subsequence problem (LCS), we are given two sequences X and Y and wish to find a maximum-length sequence that is a subsequence of both.

Approach: dynamic programming

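Consider $X=\langle x_{1},x_{2},\ldots ,x_{m}\rangle$ and $Y=\langle y_{1},y_{2},\ldots ,y_{n}\rangle$, and let $Z=\langle z_{1},z_{2},\ldots ,z_{k}\rangle$ be an LCS of $X$ and $Y$. Write $X_{i}=\langle x_{1},\ldots ,x_{i}\rangle$ for the prefix of $X$ of length $i$, and define $Y_{j}$ and $Z_{l}$ in the same way.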

Property 1

If $x_{m}=y_{n}$, then $z_{k}=x_{m}=y_{n}$ and $Z_{k-1}$ is an LCS of $X_{m-1}$ and $Y_{n-1}$.

Property 2

If $x_{m}\neq y_{n}$ and $z_{k}\neq x_{m}$, then $Z$ is an LCS of $X_{m-1}$ and $Y$.

Property 3

If $x_{m}\neq y_{n}$ and $z_{k}\neq y_{n}$, then $Z$ is an LCS of $X$ and $Y_{n-1}$.

Subproblems

Let $c[i,j]$ be the length of an LCS of the prefixes $X_{i}$ and $Y_{j}$. Then we have

Property 4

$c[i,j]={\begin{cases}0&i=0{\text{ or }}j=0\\c[i-1,j-1]+1&i,j>0{\text{ and }}x_{i}=y_{j}\\\max(c[i-1,j],c[i,j-1])&i,j>0{\text{ and }}x_{i}\neq y_{j}\end{cases}}$

The first line is trivial: an empty prefix has only the empty sequence in common with anything. The second line stems from property 1: when the last elements match, an LCS of $X_{i}$ and $Y_{j}$ is an LCS of $X_{i-1}$ and $Y_{j-1}$ with that element appended.

The third line stems from properties 2 and 3. Together they tell us that, depending on the last element $z_{k}$, $Z$ must either be an LCS of $X_{m-1}$ and $Y$ or an LCS of $X$ and $Y_{n-1}$. We don't know which one it is, so we simply compare them. Whichever is longer gives the LCS length.
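To see the recurrence for $c[i,j]$ at work before any table is built, it can be written directly as a memoized recursion. This is only a sketch: the name `lcs_length_recursive` is made up for this example, the recurrence used is the standard one with conditions $x_{i}=y_{j}$ and $\max(c[i-1,j],c[i,j-1])$, and Python strings are 0-indexed, so $x_{i}$ becomes `x[i - 1]`.

```python
from functools import lru_cache


def lcs_length_recursive(x: str, y: str) -> int:
    """Length of an LCS of x and y, computed straight from the recurrence."""

    @lru_cache(maxsize=None)
    def c(i: int, j: int) -> int:
        # c(i, j) is the LCS length of the prefixes x[:i] and y[:j].
        if i == 0 or j == 0:
            return 0  # an empty prefix shares only the empty sequence
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1  # last elements match (property 1)
        # Last elements differ: drop one of them and keep the better result.
        return max(c(i - 1, j), c(i, j - 1))

    return c(len(x), len(y))


print(lcs_length_recursive("ABCBDAB", "BDCABA"))  # 4
```

The cache keeps each $(i,j)$ pair from being evaluated more than once, so there are at most $(m+1)(n+1)$ distinct calls. For long inputs the recursion depth may become a problem, which is one reason to prefer the bottom-up table.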

Implementation

Based on property 4, we have the following DP algorithm (X and Y are indexed from 1):

LCS-Length(X, Y):
    m = X.length
    n = Y.length
    let c[0...m, 0...n] be a new table  // cache
    // Initialize c (property 4.1)
    for i = 0 to n:
        c[0, i] = 0
    for i = 0 to m:
        c[i, 0] = 0

    let traceback[0...m, 0...n] be a new table

    // Fill c one row at a time; col indexes X, row indexes Y
    for row = 1 to n:
        for col = 1 to m:
            if (X[col] == Y[row]):
                // Property 4.2
                c[col, row] = c[col - 1, row - 1] + 1
                traceback[col, row] = (col - 1, row - 1)
            else:
                // Property 4.3
                left = c[col - 1, row]
                top = c[col, row - 1]
                if (left > top):
                    c[col, row] = left
                    traceback[col, row] = (col - 1, row)
                else:
                    c[col, row] = top
                    traceback[col, row] = (col, row - 1)

    return (c, traceback)
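For experimenting, the pseudocode above can be translated almost line by line into Python. The sketch below keeps the same names (`c`, `traceback`, `row`, `col`, `left`, `top`); the only assumption added is the shift to 0-indexed input, so `X[col]` becomes `x[col - 1]`.

```python
def lcs_length(x, y):
    """Return (c, traceback), where c[col][row] is the LCS length of x[:col], y[:row]."""
    m, n = len(x), len(y)
    # (m + 1) x (n + 1) tables; index 0 is the empty prefix (property 4.1).
    c = [[0] * (n + 1) for _ in range(m + 1)]
    traceback = [[None] * (n + 1) for _ in range(m + 1)]

    for row in range(1, n + 1):
        for col in range(1, m + 1):
            if x[col - 1] == y[row - 1]:
                # Property 4.2: matching elements extend the diagonal entry.
                c[col][row] = c[col - 1][row - 1] + 1
                traceback[col][row] = (col - 1, row - 1)
            else:
                # Property 4.3: keep the larger of the two neighbours.
                left = c[col - 1][row]
                top = c[col][row - 1]
                if left > top:
                    c[col][row] = left
                    traceback[col][row] = (col - 1, row)
                else:
                    c[col][row] = top
                    traceback[col][row] = (col, row - 1)
    return c, traceback


c, traceback = lcs_length("ABCBDAB", "BDCABA")
print(c[7][6])  # 4, the LCS length of the two full strings
```

Both time and space are proportional to $mn$, since every cell of the table is filled exactly once.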

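Printing an LCS shouldn't be that hard. Start at traceback[m, n] and follow the stored pointers until reaching row 0 or column 0. Every diagonal step, which is the case where X[col] == Y[row], contributes one element of the LCS. The walk runs backwards, so the recorded elements must be reversed at the end.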
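The trace-back idea can be sketched as follows. This variant is an assumption of this example rather than the method in the notes: it does not store a separate traceback table but re-derives each step from `c`, recording an element whenever the two current elements are equal. The function name `lcs` is hypothetical.

```python
def lcs(x: str, y: str) -> str:
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    # c[i][j] is the LCS length of the prefixes x[:i] and y[:j].
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Walk back from the bottom-right cell to an edge of the table.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])  # same value: part of the LCS
            i, j = i - 1, j - 1
        elif c[i - 1][j] > c[i][j - 1]:
            i -= 1  # the longer LCS lies in the shorter prefix of x
        else:
            j -= 1  # otherwise shorten the prefix of y
    return "".join(reversed(out))  # elements were collected back to front


print(lcs("ABCBDAB", "BDCABA"))
```

When several common subsequences have the maximum length, which one is returned depends on how ties between the two neighbours are broken.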

