Matching Strings with Placeholders

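Here is a Python problem: how do you write a function match_string(input, pattern, valid_words, date_format) that matches the string input against the string pattern, where pattern contains placeholders for numbers, strings, and dates? In the solution's example calls, # marks a number, ## marks a word that must appear in valid_words, and ### marks a date written in date_format. When the numbers, strings, and dates in input correctly match the placeholders in pattern, the function returns True. The function must also give detailed error messages: when input and pattern do not match, it should say whether a number, a string, or a date failed to match.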
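To make the placeholder idea concrete, here is a minimal sketch of how a pattern can be turned into a regular expression. It assumes the convention used in the example calls later in this post (# for a number, ## for a word, ### for a date); the variable names are chosen for illustration only.

```python
import re

# Example pattern and input, using the #, ##, ### placeholder convention
pattern = '# is a number ## is a string ### is a date'
text = '1 is a number foo is a string 12-1-2013 is a date'

# Replace every run of one to three '#' characters with a lazy capture group
regex_pattern = re.sub(r'#{1,3}', '(.+?)', pattern)

# fullmatch requires the whole input to fit the pattern, not just a prefix
captured = re.fullmatch(regex_pattern, text).groups()
print(captured)  # ('1', 'foo', '12-1-2013')
```

At this stage every placeholder accepts any text; deciding whether each captured piece really is a number, a valid word, or a date is a separate step.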

Solution

from re import fullmatch, sub


def match_string(input, pattern, valid_words, date_format):
  errors = []

  # Check that `input` has the same overall shape as `pattern`;
  # fullmatch makes the whole input fit, not just a prefix of it
  regex_pattern = sub(r'#{1,3}', '(.+?)', pattern)
  if not fullmatch(regex_pattern, input):
    return "Error: Input doesn't match pattern!"

  # Convert the date format into a regular expression
  # (str.replace is used because re.sub rejects `\d` in a replacement string)
  date_regex = date_format.replace('%d', r'(?P<day>\d+)')
  date_regex = date_regex.replace('%m', r'(?P<month>\d+)')
  date_regex = date_regex.replace('%y', r'(?P<year>\d+)')

  # Extract the dates: only ### becomes a capturing group
  regex_pattern = sub(r'###', '(.+?)', pattern)
  regex_pattern = sub(r'##', '(?:.+?)', regex_pattern)
  regex_pattern = sub(r'#', '(?:.+?)', regex_pattern)
  for date in fullmatch(regex_pattern, input).groups():
    m = fullmatch(date_regex, date)
    if not m:
      errors.append('Error: %s is not a valid date!' % date)
    else:
      if int(m.group('day')) < 1 or 31 < int(m.group('day')):
        errors.append('Error: %s is not a valid day!' % m.group('day'))
      if int(m.group('month')) < 1 or 12 < int(m.group('month')):
        errors.append('Error: %s is not a valid month!' % m.group('month'))

  # Extract the plain strings: only ## becomes a capturing group
  regex_pattern = sub(r'###', '(?:.+?)', pattern)
  regex_pattern = sub(r'##', '(.+?)', regex_pattern)
  regex_pattern = sub(r'#', '(?:.+?)', regex_pattern)
  for word in fullmatch(regex_pattern, input).groups():
    if word.strip() not in valid_words:
      errors.append('Error: %s is not a valid word!' % word)

  # Extract the numbers: only # becomes a capturing group
  regex_pattern = sub(r'###', '(?:.+?)', pattern)
  regex_pattern = sub(r'##', '(?:.+?)', regex_pattern)
  regex_pattern = sub(r'#', '(.+?)', regex_pattern)
  for number in fullmatch(regex_pattern, input).groups():
    # The whole captured text must consist of digits
    if not fullmatch(r'\d+', number):
      errors.append('Error: %s is not a valid number!' % number)

  # Return True on success, otherwise one error message per line
  if len(errors) == 0:
    return True
  else:
    return '\n'.join(errors)
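The date check in the solution only tests that the day is between 1 and 31 and the month between 1 and 12, so a date such as February 30 would pass. As a possible alternative, not part of the original solution, the standard library's datetime.strptime can validate real calendar dates. The helper name is_valid_date is hypothetical. Note also that in strptime %y means a two-digit year, so a four-digit year such as 2013 needs %Y; this differs from the solution's own handling of %y, which accepts any run of digits.

```python
from datetime import datetime


def is_valid_date(text, date_format):
    """Return True if `text` parses as a real calendar date in `date_format`.

    Hypothetical helper; `date_format` uses strptime codes (%Y = 4-digit year).
    """
    try:
        datetime.strptime(text, date_format)
    except ValueError:
        # Raised for a wrong shape and for impossible dates alike
        return False
    return True


print(is_valid_date('12-1-2013', '%m-%d-%Y'))  # True
print(is_valid_date('2-30-2013', '%m-%d-%Y'))  # False: February has no 30th
print(is_valid_date('January', '%m-%d-%Y'))    # False: wrong shape
```

The trade-off is that strptime reports only that parsing failed, so separate "invalid day" and "invalid month" messages would need extra work.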

Code examples

Here are some example calls:

print(match_string('1 is a number foo is a string 12-1-2013 is a date', '# is a number ## is a string ### is a date', ['foo'], '%m-%d-%y'))
print()
print(match_string('1 is a number foo is a string 12-1-2013 is a date', '# is a number ## is a string ### is a date', ['foo'], '%m-%d-%y'))
print()
print(match_string('foo is a number bar is a string 12-1-2013 is a date', '# is a number ## is a string ### is a date', ['foo'], '%m-%d-%y'))
print()
print(match_string('1 is a number bar is a string 12-1-2013 is a date', '# is a number ## is a string ### is a date', ['foo'], '%m-%d-%y'))
print()
print(match_string('1 is a number foo is a string January is a date', '# is a number ## is a string ### is a date', ['foo'], '%m-%d-%y'))

Output

True

True

Error: bar is not a valid word!
Error: foo is not a valid number!

Error: bar is not a valid word!

Error: January is not a valid date!
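One thing to watch when calling the function: it returns either True or a non-empty error string, and a non-empty string is also truthy in Python, so a plain `if match_string(...)` test cannot tell success from failure. A minimal sketch of one way to handle this follows; the helper split_result is hypothetical, and the strings passed to it are written by hand in the message format the solution uses.

```python
def split_result(result):
    """Turn a True-or-error-string result into (ok, list_of_error_lines)."""
    # Compare with `is True`: an error string would also pass a truthiness test
    if result is True:
        return True, []
    return False, result.split('\n')


# Hand-written stand-ins for what match_string could return
print(split_result(True))
print(split_result('Error: bar is not a valid word!\nError: foo is not a valid number!'))
```

The first call prints `(True, [])`, and the second prints `False` together with a list of the two error messages.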