Given a string with nested brackets, such as "[ this is [ hello [ who ] [what ] from the other side ] slim shady ]", how can the strings inside the nested brackets be extracted?
2. Solution
Method 1: Stack
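Use a stack of buffers, one for each bracket pair that is currently open. On "[", push a new empty buffer. On "]", pop the top buffer and emit its contents as one extracted string. Any other character is appended to the buffer on top of the stack. Each extracted string therefore contains only the text directly inside its own pair, and inner strings are emitted before the outer strings that enclose them.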
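def parse(text):
    stack = []
    for char in text:
        if char == '[':
            # A new pair opens: start an empty buffer for its text.
            stack.append([])
        elif char == ']':
            # The innermost pair closes: emit the text collected for it.
            yield ''.join(stack.pop())
        elif stack:
            # Ordinary character: it belongs to the innermost open pair.
            # Characters outside every pair are ignored.
            stack[-1].append(char)

text = '[ this is [ hello [ who ] [what ] from the other side ] slim shady ]'
print(tuple(parse(text)))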
Output:
(' who ', 'what ', ' hello   from the other side ', ' this is  slim shady ')
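The generator above takes balanced input for granted. As a minimal sketch of one possible extension, the same stack can also report how deeply each string is nested and reject unbalanced brackets. The name parse_with_depth and the choice of ValueError are assumptions made for illustration; they are not part of the original answer.

```python
def parse_with_depth(text):
    """Yield (depth, string) for every bracket pair; depth 1 is the outermost."""
    stack = []
    for pos, char in enumerate(text):
        if char == '[':
            stack.append([])
        elif char == ']':
            if not stack:
                raise ValueError(f"unmatched ']' at index {pos}")
            # Depth is the number of pairs open right now, including this one.
            depth = len(stack)
            yield depth, ''.join(stack.pop())
        elif stack:
            stack[-1].append(char)
    if stack:
        raise ValueError(f"{len(stack)} unclosed '[' at end of input")


pairs = list(parse_with_depth('[a[b]c[d[e]]]'))
print(pairs)  # [(2, 'b'), (3, 'e'), (2, 'd'), (1, 'ac')]
```

Because this is a generator, the ValueError only surfaces once the results are consumed, for example by list().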
Method 2: Regular expressions
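A regular expression can also extract the strings. The pattern r'\[([^\[\]]*)\]' matches a bracket pair that has no brackets inside it, that is, an innermost pair. Applying the pattern repeatedly, each time collecting the matches and then deleting them from the string, works outward until every nested pair has been added to the result list.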
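import re

s = '[ this is [ hello [ who ] [what ] from the other [side] ] slim shady ][oh my [g[a[w[d]]]]]'
result = []
# One innermost pair: '[', then text containing no brackets, then ']'.
pattern = r'\[([^\[\]]*)\]'
# Peel off the innermost pairs until no complete pair is left.
while re.search(pattern, s):
    result.extend(re.findall(pattern, s))
    s = re.sub(pattern, '', s)
# Trim surrounding whitespace and drop empty matches.
result = [t.strip() for t in result if t.strip()]
print(result)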
Output:
['who', 'what', 'side', 'd', 'hello   from the other', 'w', 'this is  slim shady', 'a', 'g', 'oh my']
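To make the repeated passes visible, the loop can be wrapped in a function that keeps the matches of each pass separate. This is only a sketch, and the names PAIR and extract_layers are hypothetical. The first pass holds the pairs with nothing nested inside them; each later pass holds pairs whose nested pairs have already been removed.

```python
import re

# One innermost pair: '[', then text containing no brackets, then ']'.
PAIR = re.compile(r'\[([^\[\]]*)\]')


def extract_layers(s):
    """Return one list of stripped matches per pass, innermost pairs first."""
    layers = []
    while PAIR.search(s):
        # Empty pairs such as '[]' are kept here as empty strings.
        layers.append([m.strip() for m in PAIR.findall(s)])
        s = PAIR.sub('', s)
    return layers


layers = extract_layers('[a [b] [c [d]]]')
print(layers)  # [['b', 'd'], ['c'], ['a']]
```

Note that the pass number reflects how many levels are nested inside a pair, not how deep the pair itself sits: 'b' and 'd' land in the same pass although 'd' is nested one level deeper.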
Method 3: Tree structure
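Represent the string as a tree in which each node stands for one bracket pair and its children are the pairs nested directly inside it. Traversing the tree then yields the string of every pair.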
class BracketMatch:
    def __init__(self, refstr, parent=None, start=-1, end=-1):
        self.parent = parent
        self.start = start  # index of '[' (-1 for the root)
        self.end = end      # index of the matching ']' (-1 until it is found)
        self.refstr = refstr
        self.nested_matches = []

    def __str__(self):
        # The root and unclosed pairs have no text of their own.
        if self.start == -1 or self.end == -1:
            return ""
        cur_index = self.start + 1
        result = ""
        for child_match in self.nested_matches:
            # Copy the text before each closed child, then skip over the child.
            if child_match.start != -1 and child_match.end != -1:
                result += self.refstr[cur_index:child_match.start]
                cur_index = child_match.end + 1
        result += self.refstr[cur_index:self.end]
        return result

haystack = '[ this is [ hello [ who ] [what ] from the other side ] slim shady ]'
root = BracketMatch(haystack)
cur_match = root
for i, char in enumerate(haystack):
    if char == '[':
        # Open a child of the current pair and descend into it.
        new_match = BracketMatch(haystack, cur_match, i)
        cur_match.nested_matches.append(new_match)
        cur_match = new_match
    elif char == ']' and cur_match is not root:
        # Close the current pair and return to its parent.
        # A stray ']' outside every pair is ignored.
        cur_match.end = i
        cur_match = cur_match.parent

# Breadth-first traversal: outer pairs are printed before the pairs inside them.
nodes_list = list(root.nested_matches)
while nodes_list:
    node = nodes_list.pop(0)
    nodes_list.extend(node.nested_matches)
    print("Match: " + str(node).strip())
Output:
Match: this is  slim shady
Match: hello   from the other side
Match: who
Match: what
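The traversal above is breadth-first, so outer strings are printed before inner ones. As a sketch of the alternative, the self-contained example below builds a smaller tree and walks it depth-first, reporting each string together with its nesting depth. The Node class and the helpers build_tree and walk are assumptions made for illustration and are not the BracketMatch class above; the sketch also assumes the brackets are balanced.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    chars: list = field(default_factory=list)     # text directly inside this pair
    children: list = field(default_factory=list)  # nested pairs, in order


def build_tree(text):
    """Build a tree of bracket pairs; assumes the brackets are balanced."""
    root = Node()
    path = [root]  # nodes from the root down to the innermost open pair
    for char in text:
        if char == '[':
            node = Node()
            path[-1].children.append(node)
            path.append(node)
        elif char == ']':
            path.pop()
        elif len(path) > 1:
            # Text outside every pair (directly under the root) is ignored.
            path[-1].chars.append(char)
    return root


def walk(node, depth=0):
    """Yield (depth, text) for every pair below node, parents before children."""
    for child in node.children:
        yield depth + 1, ''.join(child.chars).strip()
        yield from walk(child, depth + 1)


tree = build_tree('[ a [ b [ c ] ] [ d ] ]')
for depth, text in walk(tree):
    # Indent each string by its nesting depth.
    print('  ' * (depth - 1) + text)
```

Storing the characters in each node, instead of start and end indices into the original string, keeps the sketch short at the cost of copying the text.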