Dachui's Python Diary (A Summary of Common Regular Expression Methods)

Common Regular Expression Methods

  1. .: Matches any single character except a newline

    • Example:
      import re
      
      pattern = r".at"
      text = "cat hat sat bat"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['cat', 'hat', 'sat', 'bat']
      
  2. []: Matches any single character listed inside the square brackets

    • Example:
      import re
      
      pattern = r"[bh]at"
      text = "cat hat sat bat"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['hat', 'bat']
      
  3. \w: Matches any letter, digit, or underscore

    • Example:
      import re
      
      pattern = r"\w\w\w"
      text = "cat hat sat bat"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['cat', 'hat', 'sat', 'bat']
      
  4. \s: Matches any whitespace character, including spaces, tabs, and newlines

    • Example:
      import re
      
      pattern = r"\s\w\w\w"
      text = "cat hat\nsat\tbat"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: [' hat', '\nsat', '\tbat']
      
  5. \d: Matches any decimal digit

    • Example:
      import re
      
      pattern = r"\d+"
      text = "my age is 27"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['27']
      
  6. [^]: Matches any single character that is not listed inside the brackets

    • Example:
      import re
      
      pattern = r"[^bh]at"
      text = "cat hat sat bat"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['cat', 'sat']
      
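  7. \W: Matches any character that is not a letter, digit, or underscore

    • Example:
      import re
      
      pattern = r"\W\w+"
      text = "cat! hat sat_bat"
      matches = re.findall(pattern, text)
      
      # "!" is followed by a space rather than a word character, so it starts no match;
      # "_" counts as \w, so "sat_bat" is captured as a whole.
      print(matches)  # Output: [' hat', ' sat_bat']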
      
  8. \S: Matches any non-whitespace character

    • Example:
      import re
      
      pattern = r"\S{3}"
      text = "cat hat\nsat\tbat"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['cat', 'hat', 'sat', 'bat']
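
A few edge cases of the character classes above are easy to miss. The following is only a minimal sketch: the sample strings and variable names are made up for illustration, and the use of the re.DOTALL flag is an addition that the list above does not cover.

```python
import re

# "." skips newlines by default; the re.DOTALL flag lets it match them too.
dot_default = re.findall(r"c.t", "cat c\nt")
dot_all = re.findall(r"c.t", "cat c\nt", flags=re.DOTALL)
print(dot_default)  # ['cat']
print(dot_all)      # ['cat', 'c\nt']

# \d matches one digit at a time; \d+ groups consecutive digits together.
single_digits = re.findall(r"\d", "my age is 27")
digit_runs = re.findall(r"\d+", "my age is 27")
print(single_digits)  # ['2', '7']
print(digit_runs)     # ['27']

# The underscore counts as a word character, so \W does not match it.
non_word = re.findall(r"\W", "cat! hat sat_bat")
print(non_word)  # ['!', ' ', ' ']
```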
      

Matching Multiple Characters

  1. *: Matches the preceding expression zero or more times
    • Example:
      import re
      
      pattern = r"ab*c"
      text = "ac abc abbc abbbc"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['ac', 'abc', 'abbc', 'abbbc']
      
  2. +: Matches the preceding expression one or more times
    • Example:
      import re
      
      pattern = r"a+b+c"
      text = "ac abc abbc abbbc"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['abc', 'abbc', 'abbbc']
      
  3. ?: Matches the preceding expression zero or one time
    • Example:
      import re
      
      pattern = r"ab?c"
      text = "ac abc abbc abbbc"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['ac', 'abc']
      
  4. {n}: Matches the preceding expression exactly n times
    • Example:
      import re
      
      pattern = r"ab{2}c"
      text = "ac abc abbc abbbc"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['abbc']
      
  5. {n,}: Matches the preceding expression at least n times
    • Example:
      import re
      
      pattern = r"ab{2,}c"
      text = "ac abc abbc abbbc"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['abbc', 'abbbc']
      
  6. {n,m}: Matches the preceding expression at least n and at most m times
    • Example:
      import re
      
      pattern = r"ab{1,2}c"
      text = "ac abc abbc abbbc"
      matches = re.findall(pattern, text)
      
      print(matches)  # Output: ['abc', 'abbc']
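
To compare the quantifiers above side by side, here is a small sketch that runs each of them against the same text. The patterns list and results dictionary are illustrative names, and ab+c is used here as a simplified variant of the a+b+c pattern from item 2.

```python
import re

text = "ac abc abbc abbbc"

# Each pattern differs only in the quantifier applied to "b".
patterns = [r"ab*c", r"ab+c", r"ab?c", r"ab{2}c", r"ab{2,}c", r"ab{1,2}c"]

# Map every pattern to the list of substrings it finds in the text.
results = {p: re.findall(p, text) for p in patterns}

for pattern, matches in results.items():
    print(f"{pattern:10} {matches}")
```

Because the text stays the same, any difference between the printed lists comes from the quantifier alone.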