Revision as of 04:06, 4 March 2024

Problem description

A sequence Z is a subsequence of a sequence X if Z can be obtained from X by deleting zero or more elements. In other words, the elements of Z appear in X in the same order, but they do not need to be consecutive.
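The definition can be made concrete with a short check. This is a minimal sketch, not part of the original notes; the function name `is_subsequence` is a hypothetical helper chosen for illustration.

```python
def is_subsequence(z, x):
    """Return True if z can be obtained from x by deleting zero or more elements."""
    it = iter(x)
    # `elem in it` consumes the iterator up to and including the first match,
    # so each element of z must be found after the previous one.
    return all(elem in it for elem in z)
```

For example, `is_subsequence("BCBA", "ABCBDAB")` holds even though the matched letters are not adjacent, while `is_subsequence("BAC", "ABC")` does not, because the order differs.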

Given two sequences X and Y, the longest-common-subsequence problem (LCS) asks for a maximum-length sequence that is a subsequence of both X and Y.

Overview

Runtime complexity: $O(m*n)$
Space complexity: $O(m*n)$
Approach: Dynamic programming

Approach: dynamic programming

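Consider $X=\langle x_{1},x_{2},\ldots ,x_{m}\rangle$ and $Y=\langle y_{1},y_{2},\ldots ,y_{n}\rangle$ , and let $Z=\langle z_{1},z_{2},\ldots ,z_{k}\rangle$ be an LCS of $X$ and $Y$ . Write $X_{i}=\langle x_{1},\ldots ,x_{i}\rangle$ for the prefix of $X$ of length $i$ (so $X_{0}$ is empty), and likewise $Y_{j}$ and $Z_{k-1}$ .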

Property 1

If $x_{m}=y_{n}$ , then $z_{k}=x_{m}=y_{n}$ and $Z_{k-1}$ is an LCS of $X_{m-1}$ and $Y_{n-1}$ .

Property 2

If $x_{m}\neq y_{n}$ and $z_{k}\neq x_{m}$ , then $Z$ is an LCS of $X_{m-1}$ and $Y$ .

Property 3

If $x_{m}\neq y_{n}$ and $z_{k}\neq y_{n}$ , then $Z$ is an LCS of $X$ and $Y_{n-1}$ .

Subproblems

Let $c[i,j]$ be the length of an LCS of the prefixes $X_{i}$ and $Y_{j}$ . Then

Property 4

$c[i,j]={\begin{cases}0&{\text{if }}i=0{\text{ or }}j=0\\c[i-1,j-1]+1&{\text{if }}i,j>0{\text{ and }}x_{i}=y_{j}\\\max(c[i-1,j],c[i,j-1])&{\text{if }}i,j>0{\text{ and }}x_{i}\neq y_{j}\end{cases}}$

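The first line holds because an empty prefix has only the empty sequence in common with anything. The second line stems from property 1: when the last elements match, an LCS of $X_{i}$ and $Y_{j}$ is an LCS of $X_{i-1}$ and $Y_{j-1}$ extended by that element.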

The third line stems from properties 2 and 3. When $x_{i}\neq y_{j}$ , an LCS $Z$ of $X_{i}$ and $Y_{j}$ cannot end in both $x_{i}$ and $y_{j}$ , so depending on its last element $z_{k}$ it must be either an LCS of $X_{i-1}$ and $Y_{j}$ or an LCS of $X_{i}$ and $Y_{j-1}$ . We don't know which one it is, so we compute both and take the longer.
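The three cases of the recurrence translate almost line by line into a memoized recursion. The following is a sketch for checking the recurrence on small inputs, using the standard form in which the third case compares $c[i-1,j]$ with $c[i,j-1]$ . The name `lcs_length_recursive` is an assumption made for illustration, and the code is 0-indexed, so $x_{i}$ is `x[i - 1]`.

```python
from functools import lru_cache


def lcs_length_recursive(x, y):
    """Length of an LCS of x and y, computed top-down from the recurrence."""

    @lru_cache(maxsize=None)
    def c(i, j):
        # Case 1: an empty prefix shares only the empty subsequence.
        if i == 0 or j == 0:
            return 0
        # Case 2: the last elements match, so extend the LCS of the shorter prefixes.
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        # Case 3: drop the last element of one prefix or the other, keep the better.
        return max(c(i - 1, j), c(i, j - 1))

    return c(len(x), len(y))
```

Memoization keeps the number of distinct calls at most $(m+1)(n+1)$ , but the recursion depth grows with $m+n$ , so this form is only suited to short sequences.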

Implementation

Based on property 4, we have the following DP algorithm, which fills the table bottom-up:

LCS-Length(X, Y):
    m = X.length
    n = Y.length
    let c[0...m, 0...n] be a new table  // cache; c[col, row] = LCS length of X[1..col] and Y[1..row]
    // Initialize c (property 4.1)
    for row = 0 to n:
        c[0, row] = 0
    for col = 0 to m:
        c[col, 0] = 0

    let traceback[0...m, 0...n] be a new table  // predecessor cell of each entry

    // Construct c top -> down, left -> right
    for row = 1 to n:
        for col = 1 to m:
            if (X[col] == Y[row]):
                // Property 4.2: matching elements extend the diagonal neighbor
                c[col, row] = c[col - 1, row - 1] + 1
                traceback[col, row] = (col - 1, row - 1)
            else:
                // Property 4.3: take the larger of the left and top neighbors
                left = c[col - 1, row]
                top = c[col, row - 1]
                if (left > top):
                    c[col, row] = left
                    traceback[col, row] = (col - 1, row)
                else:
                    c[col, row] = top
                    traceback[col, row] = (col, row - 1)

    return (c, traceback)
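For reference, here is one way the pseudocode might look in Python. It is a sketch rather than a canonical implementation: it keeps the same `c[col][row]` layout and tie-breaking rule, but Python sequences are 0-indexed, so `X[col]` becomes `x[col - 1]`. The function name `lcs_length` is an assumption.

```python
def lcs_length(x, y):
    """Return (c, traceback) where c[col][row] is the LCS length of x[:col] and y[:row]."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0 (property 4.1).
    c = [[0] * (n + 1) for _ in range(m + 1)]
    # traceback[col][row] holds the predecessor cell; border cells stay None.
    traceback = [[None] * (n + 1) for _ in range(m + 1)]

    for row in range(1, n + 1):
        for col in range(1, m + 1):
            if x[col - 1] == y[row - 1]:
                # Property 4.2: a match extends the diagonal neighbor.
                c[col][row] = c[col - 1][row - 1] + 1
                traceback[col][row] = (col - 1, row - 1)
            elif c[col - 1][row] > c[col][row - 1]:
                # Property 4.3: the left neighbor is strictly larger.
                c[col][row] = c[col - 1][row]
                traceback[col][row] = (col - 1, row)
            else:
                # Property 4.3: otherwise take the top neighbor.
                c[col][row] = c[col][row - 1]
                traceback[col][row] = (col, row - 1)
    return c, traceback
```

The length of an LCS is then `c[len(x)][len(y)]`.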

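To print the subsequence itself, start at traceback[m, n] and follow the stored cells back until a row or column index reaches 0. Each diagonal step, from (col, row) to (col - 1, row - 1), corresponds to a match X[col] = Y[row]; recording those elements and reversing the list at the end gives an LCS.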
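The walk back through the traceback table can be sketched as follows. The function name `read_lcs` is hypothetical, and the sketch assumes a table laid out as in LCS-Length: `traceback[col][row]` stores the predecessor cell as a `(col, row)` pair, with 0-indexed access into `x`.

```python
def read_lcs(x, traceback):
    """Recover one LCS (as a list) by following predecessor cells from the last cell."""
    col = len(traceback) - 1      # m
    row = len(traceback[0]) - 1   # n
    out = []
    while col > 0 and row > 0:
        prev_col, prev_row = traceback[col][row]
        # A diagonal step means x[col - 1] matched and belongs to the LCS.
        if (prev_col, prev_row) == (col - 1, row - 1):
            out.append(x[col - 1])
        col, row = prev_col, prev_row
    # Elements were collected from the end of the LCS to its start.
    return out[::-1]
```

The walk takes at most $m+n$ steps, since every step decreases at least one of the two indices.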

Optimizations

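If only the length is needed, we can reduce the asymptotic space requirement. Each entry of $c$ depends only on entries in the current and the previous row, so keeping two rows at a time is enough, and placing the shorter sequence along the row brings the space down to $O(\min(m,n))$ . The cost is that the full table is no longer available, which stops us from reconstructing the actual subsequence.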
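A minimal sketch of the two-row idea, assuming only the length is wanted. The name `lcs_length_two_rows` is an illustrative choice, not taken from the original notes.

```python
def lcs_length_two_rows(x, y):
    """LCS length using two rows of the table instead of the full m-by-n grid."""
    # Put the shorter sequence along the row so each row has min(m, n) + 1 entries.
    if len(y) > len(x):
        x, y = y, x
    prev = [0] * (len(y) + 1)          # row for the empty prefix of x
    for xi in x:
        curr = [0]                     # column 0 is always 0
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)               # diagonal neighbor
            else:
                curr.append(max(prev[j], curr[j - 1]))     # top or left neighbor
        prev = curr
    return prev[-1]
```

The running time is unchanged at $O(m*n)$ ; only the memory footprint shrinks.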

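We can also remove the entire traceback array. During reconstruction, the step taken at $c[i,j]$ can be re-derived in constant time by checking whether $x_{i}=y_{j}$ and, if not, comparing the two neighboring values that were candidates in the $c$ calculation. This roughly halves the space requirement but doesn't change anything asymptotically.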
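Reconstruction without a traceback table might look like the sketch below. The function name `lcs_from_table` is hypothetical; the code builds $c$ and then re-derives each step from the table values, using the same tie-breaking as LCS-Length (move along the second index unless the other neighbor is strictly larger).

```python
def lcs_from_table(x, y):
    """Return one LCS of x and y as a list, using only the length table c."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            # The match was used to compute c[i][j]: record it, move diagonally.
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] > c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```

When several longest common subsequences exist, which one is returned depends on the tie-breaking rule in the last two branches.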
