Longest Common Subsequence
The Longest Common Subsequence (LCS) problem is a classic computer science problem for finding the length of longest possible common subsequence from a set of subsequence to the given two sequences.
Longest Common Subsequence (LCS) problem can be solved by many different approaches.But the most efficient approach to solve the Longest Common Subsequence (LCS) problem is Dynamic programming approach which takes the O(N * M) time complexity, where N and M are the sizes of the given sequences.
Problem Statement
Given two sequences, and the task is to find the length of the longest subsequence that is present in subsequence of the two sequences.
Longest Common Subsequence Example
Below is the example of Longest Common Subsequence Problem with input- output constraint and the solution for the example using the Dynamic programming approach.
Input – Output Data for the Algorithm
- Subs_1 : This contains the first sequence.
- Subs_2 : This contains the second sequence.
- N : This contains the size of the first sequence.
- M : This contains the size of the second sequence.
- Table[] : This contains the Array to solve the problem.
- Solution : This is used to store the number required.
Input and Output of the Example
Given two sequence, Subs_1 = “ABC” and Subs_2 = “BCD” of size N = 3 and and M = 3, the task is to find the length of Longest Common Subsequence possible.
Answer: 2
A Longest Common Subsequence is BC
Steps to solve the problem
There are two cases, the last characters match or the last characters do not match.
- If it matches, then increment the length of LCS[i][j] by 1 and process Subs_1[i-1] and Subs_2[j-1].
- If it is not match, then find the max of L[i-1][j] and L[i][j-1].
Stepwise Solution of the Longest Common Subsequence Problem Example
Solution of the above example using the Dynamic programming approach.
Given data are
Sequence_1 : | A | B | C |
---|
Sequence_2 : | B | C | D |
---|
1.Create a LCS[N+1][M+1] where First column represents the String 1 and First Row represents the String 2 with additional Value( empty value) in both. And fill up in bottom up manner.
String 1 \ String 2 | Φ | A | B | C |
---|---|---|---|---|
Φ | 0 | 0 | 0 | 0 |
B | 0 | |||
C | 0 | |||
D | 0 |
2.For first element of Subs_2 = B, do the matching with each element of Subs_1. An follow the steps that are given below.
- B does not match with Subs_1[1] = A, then max of left and right cell is 0.
- B matches with Subs_1[2] = B, then 1 + diagonal cell is 1.
- B not match with Subs_1[3] = C, then max of left and right cell is 1.
String 1 \ String 2 | Φ | A | B | C |
---|---|---|---|---|
Φ | 0 | 0 | 0 | 0 |
B | 0 | 0 | 1 | 1 |
C | 0 | |||
D | 0 |
3.For element of Subs_2 = C, do the matching with each element of Subs_1. An follow the steps that are given below.
- C does not match with Subs_1[1] = A, then max of left and right cell is 0.
- C not match with Subs_1[2] = B, then max of left and right cell is 1.
- C matches with Subs_1[3] = C, then 1 + diagonal cell is 2.
String 1 \ String 2 | Φ | A | B | C |
---|---|---|---|---|
Φ | 0 | 0 | 0 | 0 |
B | 0 | 0 | 1 | 1 |
C | 0 | 0 | 1 | 2 |
D | 0 |
4.For element of Subs_3 = D, do the matching with each element of Subs_1. An follow the steps that are given below.
- D does not match with Subs_1[1] = A, then max of left and right cell is 0.
- D not match with Subs_1[2] = B, then max of left and right cell is 1.
- D not match with Subs_1[3] = C, then max of left and right cell is 2.
String 1 \ String 2 | Φ | A | B | C |
---|---|---|---|---|
Φ | 0 | 0 | 0 | 0 |
B | 0 | 0 | 1 | 1 |
C | 0 | 0 | 1 | 2 |
D | 0 | 0 | 1 | 2 |
Pseudocode for Longest Common Subsequence Problem using Dynamic programming
Here’s Pseudocode for solving the Longest Common Subsequence Problem using Dynamic programming.
function Longest_Common_Subsequence_Length(Subs_1[1..m], Subs_2[1..n]) Table = array(0..m, 0..n) for i := 0..m Table[i,0] = 0 for j := 0..n Table[0,j] = 0 for i := 1..m for j := 1..n if Subs_1[i] = Subs_2[j] //i-1 and j-1 if reading Subs_1 & Subs_2 from zero Table[i,j] := Table[i-1,j-1] + 1 else Table[i,j] := max(Table[i,j-1], Table[i-1,j]) return Table[m,n]
Implementation of Longest Common Subsequence Problem using Dynamic programming in Python
# Implementation of Longest_Common_Subsequence problem def Longest_Common_Subsequence(Subs_1, Subs_2, M, N): # 2array to store DP value Table = [[None]*(N+1) for i in range(M+1)] for i in range(M+1): for j in range(N+1): if i == 0 or j == 0 : Table[i][j] = 0 elif Subs_1[i-1] == Subs_2[j-1]: Table[i][j] = 1 + Table[i-1][j-1] else: Table[i][j] = max(Table[i][j-1], Table[i-1][j]) return Table[M][N] # Given two sequence and their size Subs_1 = "PREPTECH" Subs_2 = "PREPFORTECH" N = len(Subs_1) M = len(Subs_2) # function calling Solution = Longest_Common_Subsequence(Subs_1, Subs_2, N, M) # Printing the Solution print(Solution)