Regex to Match no space or one space in Python

**Need a regex that matches "no character or one character", for example "zero or one space"?** If so, you can use the following syntax to match such patterns:

[ ]{0,1} - matches no space or 1 space
[-]? - matches no character or a single hyphen
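
As a quick check of the two notations, the sketch below shows that `?` and `{0,1}` are interchangeable quantifiers. The sample strings and variable names are made up for illustration and are not part of the tutorial's data.

```python
import re

# Hypothetical sample strings, chosen only to exercise the quantifiers.
samples = ["ab", "a b", "a  b", "a-b", "a--b"]

# "?" is shorthand for "{0,1}": both mean "zero or one of the preceding item".
space_braces = re.compile(r"a[ ]{0,1}b")
space_qmark = re.compile(r"a ?b")
hyphen_qmark = re.compile(r"a[-]?b")

for s in samples:
    print(
        repr(s),
        bool(space_braces.fullmatch(s)),  # zero or one space between a and b
        bool(space_qmark.fullmatch(s)),   # same rule, shorter spelling
        bool(hyphen_qmark.fullmatch(s)),  # zero or one hyphen between a and b
    )
```

Here fullmatch is used so that the whole string has to fit the pattern; two spaces or two hyphens are rejected.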

Let's demonstrate their usage with a few examples.

Example 1: Match zero or one space in a string

Suppose you have a list of usernames such as:

@ user_1
@John  Doe 1969@
@-Peter-Parker123@
@123any_other_user2@
more users33
more users2
@more@
@last standing@

If you want to match @ followed by zero or one space, you can use the regex (@[ ]{0,1}[A-Za-z0-9 ]+) as follows:

import re
texts = ['@ user_1  ', '@John  Doe 1969@', '@-Peter-Parker123@', '@123any_other_user2@', 'more users33', 'more users2' , '@more@', '@last standing@']
for text in texts:
    print(re.findall(r"(@[ ]{0,1}[A-Za-z0-9 ]+)", text))

Result:

['@ user']
['@John  Doe 1969']
[]
['@123any']
[]
[]
['@more']
['@last standing']

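How does it work?

() - a capturing group; everything matched inside it is extracted.
@ - matches the literal @ character.
[ ]{0,1} - matches no space or 1 space.
[A-Za-z0-9 ]+ - matches one or more characters from the following set:
- A-Z (range), matches a character in the range "A" to "Z" (character codes 65 to 90).
- a-z (range), matches a character in the range "a" to "z" (character codes 97 to 122).
- 0-9 (range), matches a character in the range "0" to "9" (character codes 48 to 57).
- a literal space (character code 32).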
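
One detail worth checking: the class [A-Za-z0-9 ] itself contains a space, so the pattern also accepts more than one space right after @. The sketch below compares it with a stricter variant; the `strict` pattern is an assumption added here for illustration, not the tutorial's pattern.

```python
import re

# The tutorial's pattern.
original = re.compile(r"(@[ ]{0,1}[A-Za-z0-9 ]+)")

# Hypothetical stricter variant: at most one space after "@",
# then the name must start with a letter or digit.
strict = re.compile(r"(@ ?[A-Za-z0-9][A-Za-z0-9 ]*)")

text = "@  user"  # two spaces after "@"

print(original.findall(text))  # ['@  user']
print(strict.findall(text))    # []
```

If two or more spaces after @ should be rejected, a variant along these lines may be closer to the intent of "zero or one space".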

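Example 2: Match zero or one letter s in a URL

Suppose you have a list of URLs and you want to extract only those that start with http, followed by either one letter s or no s at all.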

import re
texts = [
    'https://en.wikipedia.org/wiki/Main_Page/',
    'http://en.wikipedia.org/wiki/National_Park_Service/',
    'https://en.wikipedia.org/wiki/Hoover_Dam/',
    'http://en.wikipedia.org/wiki/United_States_Bureau_of_Reclamation/',
    'https://en.wikipedia.org/wiki/Central_African_Republic/',
    'en.wikipedia.org/wiki/Africa/',
    'ftp://en.wikipedia.org/wiki/Central_African_Republic/',
]
for text in texts:
    print(re.findall(r"(http[s]{0,1}.*)", text))

This results in:

['https://en.wikipedia.org/wiki/Main_Page/']
['http://en.wikipedia.org/wiki/National_Park_Service/']
['https://en.wikipedia.org/wiki/Hoover_Dam/']
['http://en.wikipedia.org/wiki/United_States_Bureau_of_Reclamation/']
['https://en.wikipedia.org/wiki/Central_African_Republic/']
[]
[]
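
Note that re.findall searches anywhere in the string, so the pattern above would also pick up an "http" that appears in the middle of a text. A minimal sketch of a stricter check, assuming the goal is to keep only strings that begin with http:// or https://, could use re.match, which anchors at the start. The `scheme` name and the last sample string are hypothetical.

```python
import re

urls = [
    "https://en.wikipedia.org/wiki/Hoover_Dam/",
    "http://en.wikipedia.org/wiki/National_Park_Service/",
    "ftp://en.wikipedia.org/wiki/Central_African_Republic/",
    "see http://example.com",  # hypothetical: "http" appears mid-string
]

# "s?" means zero or one "s"; re.match only matches at the start of the string.
scheme = re.compile(r"https?://")

starts_with_http = [u for u in urls if scheme.match(u)]
print(starts_with_http)
```

Only the first two entries are kept; the ftp URL and the string with "http" in the middle are filtered out.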

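Example 3: Count the occurrences of a character in a string

Finally, if you want to find out how many times a space (or any other character) occurs in a string, you can collect every occurrence with re.findall(r"([_])", text) and take the length of the result.

So let's count the number of _ characters in the following URLs: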

import re
texts = [
'https://en.wikipedia.org/wiki/Main_Page/',
'http://en.wikipedia.org/wiki/National_Park_Service/',
'https://en.wikipedia.org/wiki/Hoover_Dam/',
'http://en.wikipedia.org/wiki/United_States_Bureau_of_Reclamation/',
'https://en.wikipedia.org/wiki/Central_African_Republic/',
'en.wikipedia.org/wiki/Africa/',
'ftp://en.wikipedia.org/wiki/Central_African_Republic/',
]
for text in texts:
    print(len(re.findall(r"([_])", text)), end=' - ')
    print(re.findall(r"([_])", text))

Result:

1 - ['_']
2 - ['_', '_']
1 - ['_']
4 - ['_', '_', '_', '_']
2 - ['_', '_']
0 - []
2 - ['_', '_']

Of course, Python offers a faster way to simply count a single character:

text.count('_')

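The advantage of regex is that it is customizable. You can count not only a single character but also a set of characters or a pattern.

So if you want to count how many times _, / or a space occurs in a string, you can use: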

import re
texts = [
'https://en.wikipedia.org/wiki/Main_Page/',
'http://en.wikipedia.org/wiki/National_Park_Service/',
'https://en.wikipedia.org/wiki/Hoover_Dam/',
'http://en.wikipedia.org/wiki/United_States_Bureau_of_Reclamation/',
'https://en.wikipedia.org/wiki/Central_African_Republic/',
'en.wikipedia.org/wiki/Africa/',
'ftp://en.wikipedia.org/wiki/Central_African_Republic/',
]
for text in texts:
    print(len(re.findall(r"([_/ ])", text)), end=' - ')
    print(re.findall(r"([_/ ])", text))

Result:

6 - ['/', '/', '/', '/', '_', '/']
7 - ['/', '/', '/', '/', '_', '_', '/']
6 - ['/', '/', '/', '/', '_', '/']
9 - ['/', '/', '/', '/', '_', '_', '_', '_', '/']
7 - ['/', '/', '/', '/', '_', '_', '/']
3 - ['/', '/', '/']
7 - ['/', '/', '/', '/', '_', '_', '/']
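
If a per-character tally is more useful than a single total, one possible approach is to feed the re.findall result into collections.Counter. This is a sketch rather than part of the original tutorial; the `counts` name is chosen here for illustration.

```python
import re
from collections import Counter

text = "https://en.wikipedia.org/wiki/United_States_Bureau_of_Reclamation/"

# Tally each matched character separately instead of only counting the total.
counts = Counter(re.findall(r"[_/ ]", text))

print(counts["/"], counts["_"], counts[" "])  # 5 4 0
print(sum(counts.values()))                   # 9
```

A Counter returns 0 for characters that never occur, so the space lookup is safe even though this URL contains no spaces.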