30.Substring with Concatenation of All Words
串联所有单词的子串
题干:
You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.
解释:
题意的描述比较简单,就是找到s中的字串集合中所有由words中每个单词都出现一次组成的字符串,返回这些字串的起始位置。 有两个需要注意的条件: 1,所有的word长度都相同 2,每个单词都只出现一次,并且不可交叉
思考:
首先既然是找所有字串,dfs肯定是可以的,但是一般不是最优的,要想在解决字符串问题中实现最优无非就是dp或者滑窗。降低时间复杂度的关键就在于滑窗中包含的字符串无需重复统计,只需要更新每次出队的即可。 我们可以根据k的余数来进行滑窗分析,每次滑动的长度是单词的长度,这样最后就一定可以不重复的找到所有的字串。
答案:
from collections import defaultdict
def findSubstring(s, words):
"""
:type s: str
:type words: List[str]
:rtype: List[int]
"""
if len(s) ==0 or len(words)==0:
return []
wordLength = len(words[0])
winSize = len(words)*wordLength
wordsSet=defaultdict(int)
res=[]
for word in words:
wordsSet[word]+=1
for i in range(wordLength):
#for each position we need to traverse the whole str
#first to get the init set of this loop
left,right = i,i+winSize
tmpSet=defaultdict(int)
for j in range(left,right,wordLength):
tmp =s[j:j+wordLength]
tmpSet[tmp]+=1
while right<=len(s):
#比较两个字典是否完全相同
if check(tmpSet,wordsSet):
res.append(left)
if right+wordLength>len(s):
break
tmpSet[s[left:left+wordLength]]-=1
tmpSet[s[right:right+wordLength]]+=1
left+=wordLength
right+=wordLength
return res
def check(tmpSet,wordsSet):
for word in wordsSet:
if word not in tmpSet:
return False
elif tmpSet[word] != wordsSet[word]:
return False
return True
答案补充:
最终结果是击败了93%,还有优化空间不过也差不多了。字串匹配还是要往滑窗上想呀,