Similar Problems

Similar Problems not available

Find The Shortest Superstring - Leetcode Solution

Companies:

LeetCode:  Find The Shortest Superstring Leetcode Solution

Difficulty: Hard

Topics: string dynamic-programming bit-manipulation array  

Problem Statement:

Given a set of n strings arr[], the task is to find the shortest string containing each string in the set as substring.

Example 1:

Input: arr[] = {"abcd", "cdef", "fgh", "de"} Output: "abcdefgh" Explanation: "abcd" + "cdef" -> "abcdef" "abcdef" + "fgh" -> "abcdefgh"

Example 2:

Input: arr[] = {"abcdef", "efgh", "ghij"} Output: "abcdefghij" Explanation: "abcdef" + "efgh" -> "abcdefgh" "abcdefgh" + "ghij" -> "abcdefghij"

Approach:

  • We start by choosing the two strings with the highest overlap
  • We merge the two strings using the overlap and add the merged string to the remaining strings
  • We repeat this process iteratively until we get a single string with all the substrings
  • We calculate the length of the final string and return it as the answer.

Algorithm:

  1. Create a 2D Matrix of size n*n to store overlaps of all the strings. matrix[i][j] will store overlap of ith and jth string.
  2. For each pair of strings(i, j), calculate the overlap matrix[i][j].
  3. Find pair of strings(i,j) with maximum overlap.
  4. Merge strings i and j using the overlapping part.
  5. Add the merged string to remaining strings array.
  6. Repeat steps 2-5 until we have only one string in the remaining strings array.
  7. Calculate the length of the final string and return it as the answer.

Python Implementation:

import sys

def find_overlaps(s1, s2): n1, n2 = len(s1), len(s2) best = (-sys.maxsize, -1) # (overlap, overlap_start)

# compare all suffixes of s1 with all prefixes of s2
for i in range(1, min(n1, n2) + 1):
    if s1[n1 - i:] == s2[:i]:
        if best[0] < i:
            best = (i, n1 - i)

# return both the overlap length and start index
return best

def shortestSuperstring(strings): N = len(strings)

overlaps = [[0] * N for _ in range(N)]

# calculate overlaps of all string pairs
for i in range(N):
    for j in range(N):
        if i == j: 
            continue
        overlaps[i][j], _ = find_overlaps(strings[i], strings[j])

# While there are more than one string in strings
while len(strings) > 1:
    # Find the pair with maximum overlap
    max_overlap = -sys.maxsize
    p, q = 0, 0
    for i in range(N):
        for j in range(N):
            if overlaps[i][j] > max_overlap:
                max_overlap = overlaps[i][j]
                p, q = i, j
    
    # Merge strings p and q using overlap
    merged = strings[p] + strings[q][max_overlap:]
    
    # Update strings and overlaps arrays
    strings = strings[:p] + [merged] + strings[p+1:q] + strings[q+1:]
    N -= 1
    overlaps = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i == j: 
                continue
            overlaps[i][j], _ = find_overlaps(strings[i], strings[j])
            
return strings[0]

Time Complexity:

The time complexity of this algorithm is O(n^3) because we need to calculate overlaps of all the string pairs which takes O(n^2) time and for each string pair calculation takes O(n) time.

Space Complexity:

The space complexity of this algorithm is also O(n^2) because we are storing overlap values between each pair of strings in a 2D matrix of size n*n.

Find The Shortest Superstring Solution Code

1