Given two strings str1 and str2, find the shortest common supersequence.
The shortest common supersequence is the shortest string that contains both str1 and str2 as subsequences.
Input: str1 = "mno", str2 = "nop"
Output: "mnop"
Explanation: The shortest common supersequence is "mnop". It contains "mno" as the first three characters and "nop" as the last three characters, thus including both strings as subsequences.
Input: str1 = "dynamic", str2 = "program"
Output: "dynprogramic"
Explanation: The shortest common supersequence is "dynamprogramic". It includes all characters from both "dynamic" and "program", with minimal overlap. For example, "dynamic" appears as "dynam...ic" and "program" appears as "...program..." within "dynamprogramic".
Input: str1 = "apple", str2 = "orange"
If we keep the “shortest” criteria aside, what can be a way to generate a supersequence given two strings. One easy way is to concat the given strings (write one after the other), this will always give us a supersequence for any pair of given strings.
This can be said as the worst case with time complexity of O(n+m), where n and are the lengths of strings str1 and str2 respectively.
If we observe, there are some common characters that we can avoid writing for both the strings separately. These common characters can’t be all the common characters. They are the characters that are common and come in the same order. In other words, they are the characters of the longest common subsequence.
In an optimum solution, the characters of the longest common subsequence are written only once and other characters are placed around them. For every character that belongs to the longest common subsequence, the non-lcs characters coming before them in the strings S1 and S2 are placed before the lcs-character in the answer string. The below figure explains this:
From the explanation above, we can see that characters of lcs need to be covered only once. Therefore, the length of the shortest Common supersequence = n + m -k, where (n and m are lengths of str1 and str2, and k is the length of the lcs string).
Now, instead of the length, we are interested in finding the shortest supersequence string itself.
When we form the DP table to calculate the longest common subsequence, we have all the information of characters that are coming in the lcs string and characters that don’t. We use this same DP table to form the shortest common supersequence.
To frame the string, we need to understand how the dp table was formed and work in the reverse process.
Now, let us see what were the conditions that we used while forming the dp array:
We will start from the right-most cell of the dp array, initially i=n and j=m. To form the string, we will work in a reverse manner.
In the beginning i=5 and j=5.
#include <bits/stdc++.h>
using namespace std;
class Solution{
public:
//Function to fund the shortest common supersequence
string shortestCommonSupersequence(string str1, string str2) {
int n = str1.size();
int m = str2.size();
// Create a DP table with dimensions (n+1) x (m+1) initialized to 0
vector<vector<int>> dp(n + 1, vector<int>(m + 1, 0));
// Initialize the first row and column of the DP table
for (int i = 0; i <= n; i++) {
dp[i][0] = 0;
}
for (int i = 0; i <= m; i++) {
dp[0][i] = 0;
}
// Fill the DP table
for (int ind1 = 1; ind1 <= n; ind1++) {
for (int ind2 = 1; ind2 <= m; ind2++) {
if (str1[ind1 - 1] == str2[ind2 - 1])
dp[ind1][ind2] = 1 + dp[ind1 - 1][ind2 - 1];
else
dp[ind1][ind2] = max(dp[ind1 - 1][ind2], dp[ind1][ind2 - 1]);
}
}
// Reconstruct the shortest supersequence from the DP table
int len = dp[n][m];
int i = n;
int j = m;
int index = len - 1;
string ans = "";
// Build the shortest supersequence by backtracking
while (i > 0 && j > 0) {
if (str1[i - 1] == str2[j - 1]) {
ans += str1[i - 1];
index--;
i--;
j--;
} else if (dp[i - 1][j] > dp[i][j - 1]) {
ans += str1[i - 1];
i--;
} else {
ans += str2[j - 1];
j--;
}
}
// Add remaining characters from str1 or str2
while (i > 0) {
ans += str1[i - 1];
i--;
}
while (j > 0) {
ans += str2[j - 1];
j--;
}
// Reverse the result since we built it backwards
reverse(ans.begin(), ans.end());
return ans;
}
};
int main() {
string s1 = "brute";
string s2 = "groot";
// Create an instance of Solution class
Solution sol;
// Print the result
cout << "The Longest Common Supersequence is " << sol.shortestCommonSupersequence(s1, s2);
}
import java.util.*;
class Solution {
// Function to find the shortest common supersequence
public String shortestCommonSupersequence(String str1, String str2) {
int n = str1.length();
int m = str2.length();
// Create a DP table initialized to 0
int[][] dp = new int[n + 1][m + 1];
// Initialize the first row and column of DP table
for (int i = 0; i <= n; i++) {
dp[i][0] = 0;
}
for (int i = 0; i <= m; i++) {
dp[0][i] = 0;
}
// Fill the DP table
for (int ind1 = 1; ind1 <= n; ind1++) {
for (int ind2 = 1; ind2 <= m; ind2++) {
if (str1.charAt(ind1 - 1) == str2.charAt(ind2 - 1))
dp[ind1][ind2] = 1 + dp[ind1 - 1][ind2 - 1];
else
dp[ind1][ind2] = Math.max(dp[ind1 - 1][ind2], dp[ind1][ind2 - 1]);
}
}
// Reconstruct the shortest supersequence from DP table
int len = dp[n][m];
int i = n;
int j = m;
int index = len - 1;
StringBuilder ans = new StringBuilder();
// Build the shortest supersequence by backtracking
while (i > 0 && j > 0) {
if (str1.charAt(i - 1) == str2.charAt(j - 1)) {
ans.append(str1.charAt(i - 1));
index--;
i--;
j--;
} else if (dp[i - 1][j] > dp[i][j - 1]) {
ans.append(str1.charAt(i - 1));
i--;
} else {
ans.append(str2.charAt(j - 1));
j--;
}
}
// Add remaining characters from str1 or str2
while (i > 0) {
ans.append(str1.charAt(i - 1));
i--;
}
while (j > 0) {
ans.append(str2.charAt(j - 1));
j--;
}
// Reverse the result since we built it backwards
ans.reverse();
return ans.toString();
}
public static void main(String[] args) {
String s1 = "brute";
String s2 = "groot";
// Create an instance of Solution class
Solution sol = new Solution();
// Print the result
System.out.println("The Longest Common Supersequence is " + sol.shortestCommonSupersequence(s1, s2));
}
}
class Solution:
# Function to find the shortest common supersequence
def shortestCommonSupersequence(self, str1, str2):
n = len(str1)
m = len(str2)
# Create a DP table initialized to 0
dp = [[0] * (m + 1) for _ in range(n + 1)]
# Initialize the first row and column of the DP table
for i in range(n + 1):
dp[i][0] = 0
for i in range(m + 1):
dp[0][i] = 0
# Fill the DP table
for ind1 in range(1, n + 1):
for ind2 in range(1, m + 1):
if str1[ind1 - 1] == str2[ind2 - 1]:
dp[ind1][ind2] = 1 + dp[ind1 - 1][ind2 - 1]
else:
dp[ind1][ind2] = max(dp[ind1 - 1][ind2], dp[ind1][ind2 - 1])
# Reconstruct the shortest supersequence from DP table
len_superseq = dp[n][m]
i = n
j = m
index = len_superseq - 1
ans = []
# Build the shortest supersequence by backtracking
while i > 0 and j > 0:
if str1[i - 1] == str2[j - 1]:
ans.append(str1[i - 1])
index -= 1
i -= 1
j -= 1
elif dp[i - 1][j] > dp[i][j - 1]:
ans.append(str1[i - 1])
i -= 1
else:
ans.append(str2[j - 1])
j -= 1
# Add remaining characters from str1 or str2
while i > 0:
ans.append(str1[i - 1])
i -= 1
while j > 0:
ans.append(str2[j - 1])
j -= 1
# Reverse the result since we built it backwards
ans.reverse()
return ''.join(ans)
s1 = "brute"
s2 = "groot"
# Create an instance of Solution class
sol = Solution()
# Print the result
print(f"The Longest Common Supersequence is {sol.shortestCommonSupersequence(s1, s2)}")
class Solution {
// Function to find the shortest common supersequence
shortestCommonSupersequence(str1, str2) {
let n = str1.length;
let m = str2.length;
// Create a DP table initialized to 0
let dp = Array.from({ length: n + 1 }, () => Array(m + 1).fill(0));
// Initialize the first row and column of DP table
for (let i = 0; i <= n; i++) {
dp[i][0] = 0;
}
for (let i = 0; i <= m; i++) {
dp[0][i] = 0;
}
// Fill the DP table
for (let ind1 = 1; ind1 <= n; ind1++) {
for (let ind2 = 1; ind2 <= m; ind2++) {
if (str1[ind1 - 1] === str2[ind2 - 1])
dp[ind1][ind2] = 1 + dp[ind1 - 1][ind2 - 1];
else
dp[ind1][ind2] = Math.max(dp[ind1 - 1][ind2], dp[ind1][ind2 - 1]);
}
}
// Reconstruct the shortest supersequence from DP table
let len = dp[n][m];
let i = n;
let j = m;
let index = len - 1;
let ans = "";
// Build the shortest supersequence by backtracking
while (i > 0 && j > 0) {
if (str1[i - 1] === str2[j - 1]) {
ans += str1[i - 1];
index--;
i--;
j--;
} else if (dp[i - 1][j] > dp[i][j - 1]) {
ans += str1[i - 1];
i--;
} else {
ans += str2[j - 1];
j--;
}
}
// Add remaining characters from str1 or str2
while (i > 0) {
ans += str1[i - 1];
i--;
}
while (j > 0) {
ans += str2[j - 1];
j--;
}
// Reverse the result since we built it backwards
return ans.split('').reverse().join('');
}
}
let s1 = "brute";
let s2 = "groot";
// Create an instance of Solution class
let sol = new Solution();
// Print the result
console.log("The Longest Common Supersequence is " + sol.shortestCommonSupersequence(s1, s2));
Q: How do we construct the actual Shortest Common Supersequence?
A: Use LCS-based backtracking: If str1[i-1] == str2[j-1], add it to SCS. Otherwise, append the character from str1 or str2 that leads to the LCS.
Q: How would you extend this problem for k strings instead of just two?
A: Use Generalized LCS DP to find the common subsequence across k strings, then construct the SCS.
Q: How would you modify the problem to allow multiple optimal SCS outputs?
A: Track multiple paths in dp[], then backtrack to generate all possible sequences.