Hackerrank - Sherlock and Anagrams Solution
Two strings are anagrams of each other if the letters of one string can be rearranged to form the other string. Given a string, find the number of pairs of substrings of the string that are anagrams of each other.
For example , the list of all anagrammatic pairs is at positions respectively.
Function Description
Complete the function sherlockAndAnagrams in the editor below. It must return an integer that represents the number of anagrammatic pairs of substrings in .
sherlockAndAnagrams has the following parameter(s):
- s: a string .
Input Format
The first line contains an integer , the number of queries.
Each of the next lines contains a string to analyze.
Constraints
String contains only lowercase letters ascii[a-z].
Output Format
For each query, return the number of unordered anagrammatic pairs.
Sample Input 0
2
abba
abcd
Sample Output 0
4
0
Explanation 0
The list of all anagrammatic pairs is and at positions and respectively.
No anagrammatic pairs exist in the second query as no character repeats.
Sample Input 1
2
ifailuhkqq
kkkk
Sample Output 1
3
10
Explanation 1
For the first query, we have anagram pairs and at positions and respectively.
For the second query:
There are 6 anagrams of the form at positions and .
There are 3 anagrams of the form at positions and .
There is 1 anagram of the form at position .
Sample Input 2
1
cdcd
Sample Output 2
5
Explanation 2
There are two anagrammatic pairs of length : and .
There are three anagrammatic pairs of length : at positions respectively.
Solution in Python
from collections import Counter
def sherlockAndAnagrams(s):
count = Counter(("".join(sorted(s[j:j+i])) for i in range(1,len(s)) for j in range(0,len(s)-i+1) ))
return sum(sum(range(i)) for i in count.values())
for _ in range(int(input())):
s = input()
print(sherlockAndAnagrams(s))
Answer breakdown
Let our string be s = "ifailuhkqq"
First we find all possible substrings of s.
for i in range(1,len(s)):
for j in range(0,len(s)-i+1):
print(s[j:j+i],end=", ")
We will get the following output:
i, f, a, i, l, u, h, k, q, q, if, fa, ai, il, lu, uh, hk, kq, qq, ifa, fai, ail, ilu, luh, uhk, hkq, kqq, ifai, fail, ailu, iluh, luhk, uhkq, hkqq, ifail, failu, ailuh, iluhk, luhkq, uhkqq, ifailu, failuh, ailuhk, iluhkq, luhkqq, ifailuh, failuhk, ailuhkq, iluhkqq, ifailuhk, failuhkq, ailuhkqq, ifailuhkq, failuhkqq,
Above is all the possible substrings of our string s.
We have to find anagram substrings, which means order of our string doesn't matter. Therefore we will sort each of our substring so that we group substrings having the same characters.
Example: ifa and fai are anagrams. After sorting, both will become afi after sorting.
for i in range(1,len(s)):
for j in range(0,len(s)-i+1):
print("".join(sorted(s[j:j+i])),end=", ")
Output
i, f, a, i, l, u, h, k, q, q, fi, af, ai, il, lu, hu, hk, kq, qq, afi, afi, ail, ilu, hlu, hku, hkq, kqq, afii, afil, ailu, hilu, hklu, hkqu, hkqq, afiil, afilu, ahilu, hiklu, hklqu, hkqqu, afiilu, afhilu, ahiklu, hiklqu, hklqqu, afhiilu, afhiklu, ahiklqu, hiklqqu, afhiiklu, afhiklqu, ahiklqqu, afhiiklqu, afhiklqqu,
Let arr = list of all possible substrings
Now we will use Counter to group and count all the sorted substrings.
>>> print(Counter(arr))
Counter({'i': 2, 'q': 2, 'afi': 2, 'f': 1, 'a': 1, 'l': 1, 'u': 1, 'h': 1, 'k': 1, 'fi': 1, 'af': 1, 'ai': 1, 'il': 1, 'lu': 1, 'hu': 1, 'hk': 1, 'kq': 1, 'qq': 1, 'ail': 1, 'ilu': 1, 'hlu': 1, 'hku': 1, 'hkq': 1, 'kqq': 1, 'afii': 1, 'afil': 1, 'ailu': 1, 'hilu': 1, 'hklu': 1, 'hkqu': 1, 'hkqq': 1, 'afiil': 1, 'afilu': 1, 'ahilu': 1, 'hiklu': 1, 'hklqu': 1, 'hkqqu': 1, 'afiilu': 1, 'afhilu': 1, 'ahiklu': 1, 'hiklqu': 1, 'hklqqu': 1, 'afhiilu': 1, 'afhiklu': 1, 'ahiklqu': 1, 'hiklqqu': 1, 'afhiiklu': 1, 'afhiklqu': 1, 'ahiklqqu': 1, 'afhiiklqu': 1, 'afhiklqqu': 1})
Great!, now we have got the count of our every possible sorted substrings. All the substrings that have atleast 2 counts are the substrings which can make an anagram pair.
>>> count = {k:v for k,v in Counter(arr).items() if v>1}
>>> count
{'i': 2, 'q': 2, 'afi': 2}
So "i',"q" and "afi" are the substrings which appears more than once
Before moving to the last part. I would like to tell you that. If a substring appears k times, then the total possible anagrams of that substring will be 1+2+3+......+(k-1)
Example in string "kkkk".
Total possible anagrams of ["k","k"] will be 1+2+3 = 6, as there are 4 substrings of "k" in "kkkk".
Total possible anagrams of "kk" will be 1+2 = 3, as there are 3 substrings of "kk" in "kkkk".
Total possible anagrams of "kkk" will be 1 , as there are 2 substrings of "kkk" in "kkkk".
Total anagrams of the string "kkkk" = 6+3+1 = 10.
Notice that 1+2+3 ,1+2, 1 can be written as sum(range(4)), sum(range(3)), sum(range(2)) i.e sum(range(countOfString))
Back to our original problem.
Final part
>>> count.values()
[2,2,2]
Just add the range sum of each of these counts.
total = 0
for i in count.values():
total += sum(range(i))
print(total)
Output: 3
In short it can be written as
>>> sum(sum(i) for i in count.values())
3
That's it. If you have any doubt feel free to leave a comment below.