Similar Problems

Similar Problems not available

Repeated Dna Sequences - Leetcode Solution

Companies:

LeetCode:  Repeated Dna Sequences Leetcode Solution

Difficulty: Medium

Topics: string sliding-window bit-manipulation hash-table  

Problem statement:

The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'.

For example, "ACGAATTCCG" is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT" Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Example 2:

Input: s = "AAAAAAAAAAAAA" Output: ["AAAAAAAAAA"]

Approach:

We need to find all the 10-letter-long substrings that occur more than once in the DNA sequence. For this, we can follow the below steps:

  1. Initialize an empty dictionary/dictionary to keep track of the number of occurrences of each substring.

  2. Loop through the given DNA sequence s.

  3. For each 10-letter-long substring in s, check if it is already present in the dictionary. If it is present, increment its count. If it is not present, add it to the dictionary with a count of 1.

  4. After looping through the entire sequence, loop through the dictionary and add all the substrings with a count greater than 1 to the result list.

  5. Return the result list.

Code:

Time Complexity: O(N), where N is the length of the DNA sequence s. We loop through the entire sequence twice, but both the loops are of O(N) time complexity.

Space Complexity: O(N), where N is the length of the DNA sequence s. We store all the substrings in the dictionary and the result list.

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        substrings = {}
        result = []
        for i in range(len(s)-9):
            substring = s[i:i+10]
            if substring in substrings:
                substrings[substring] += 1
            else:
                substrings[substring] = 1
        for substring in substrings:
            if substrings[substring] > 1:
                result.append(substring)
        return result

Repeated Dna Sequences Solution Code

1