Dart regex in practice


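Today let's talk about regular expressions in Dart and how to use them. The material is fairly simple, so the code and its comments do most of the talking.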

void main() {
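  // 1. Dart models regular expressions with the `RegExp` class; see its doc comments for the details.

  // 2. `hasMatch()` checks whether the input contains a token: true if a match is found, false otherwise.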
  var reg1 = RegExp(r'&token=(\w+)');
  var expectFound = reg1.hasMatch('func=login&token=12345');
  var expectNotFound = reg1.hasMatch('abcdef');
  assert(expectFound == true);
  assert(expectNotFound == false);

  // 3. `firstMatch()` returns the first match, or null if nothing matches.
  var expectFirstMatchFound1 = reg1.firstMatch('func=login&token=12345');
  var expectFirstMatchFound2 =
      reg1.firstMatch('func=login&token=12345&token=678910');
  var expectFirstMatchNotFound = reg1.firstMatch('func=login&token1=12345');
  // `firstMatch()` is nullable, so use `?.` to read the group under null safety.
  assert(expectFirstMatchFound1?.group(1) == '12345');
  assert(expectFirstMatchFound2?.group(1) == '12345');
  assert(expectFirstMatchNotFound == null);

  // 4. `allMatches()` returns every match as an Iterable. It is never null:
  // when nothing matches, the Iterable is simply empty.
  // Read results with a for loop, `elementAt`, or `first`/`last`.
  var expectAllMatchFound1 = reg1.allMatches('func=login&token=12345');
  var expectAllMatchFound2 =
      reg1.allMatches('func=login&token=12345&token=678910');
  var expectAllMatchNotFound = reg1.allMatches('func=login&token1=12345');
  assert(expectAllMatchFound1.length == 1);
  assert(expectAllMatchFound1.elementAt(0).group(1) == '12345');
  assert(expectAllMatchFound2.length == 2);
  assert(expectAllMatchFound2.elementAt(0).group(1) == '12345');
  assert(expectAllMatchFound2.elementAt(1).group(1) == '678910');
  assert(expectAllMatchNotFound.isEmpty);

  /// Example 1: strip all HTML tags.
  /// `replaceAll()`: finds every match in the input string and replaces it with the given string, producing a new string.
  RegExp tagExp = RegExp(r"<[^>]*>", multiLine: true, caseSensitive: true);
  var html = '''
<html>
<head>
<body>
Hello World
</body>
</head>
</html>
''';
  // Removing the tags leaves the line breaks behind, so trim before comparing.
  var removeTagResult = html.replaceAll(tagExp, '').trim();
  assert(removeTagResult == 'Hello World');

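  /// Example 2: replace the token.
  /// `replaceAllMapped()`: finds every match in the input string and replaces it with the string returned by the closure, producing a new string.
  /// From the docs: replaces all substrings that match [from] by a string computed from the match.
  var value = 'func=login&token=12345';
  // The pattern must be a RegExp; a plain String pattern is matched literally.
  value = value.replaceAllMapped(RegExp(r'token=([a-zA-Z0-9:_]+)'), (match) {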
    return 'token=new_token';
  });
  assert(value == 'func=login&token=new_token');
}
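
As a compact recap of `allMatches()` and `replaceAllMapped()`, here is a minimal sketch of two small helpers. The names `extractTokens` and `maskTokens` are made up for illustration and are not part of any library; the masking rule (keep the first two characters) is likewise just an assumption.

```dart
/// Hypothetical helper: collects every `token` value found in [query].
List<String> extractTokens(String query) {
  final tokenExp = RegExp(r'&token=(\w+)');
  // allMatches never returns null; with no match the Iterable is just empty.
  return tokenExp.allMatches(query).map((m) => m.group(1)!).toList();
}

/// Hypothetical helper: masks each token value, keeping its first two characters.
String maskTokens(String query) {
  final tokenExp = RegExp(r'token=(\w+)');
  return query.replaceAllMapped(tokenExp, (m) {
    // Group 1 always participates when this pattern matches, so `!` is safe.
    final token = m.group(1)!;
    final visible = token.length <= 2 ? token : token.substring(0, 2);
    return 'token=$visible***';
  });
}

void main() {
  print(extractTokens('func=login&token=12345&token=678910')); // [12345, 678910]
  print(extractTokens('func=login&token1=12345')); // []
  print(maskTokens('func=login&token=12345')); // func=login&token=12***
}
```

Because the closure receives the `Match`, the replacement can depend on the captured text, which a plain `replaceAll()` cannot do.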

One design choice deserves a special mention, because I find it downright baffling.

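// Use a raw string so that `\w` reaches the regex engine intact.
var reg2 = RegExp(r'func=(\w+)&token=(\w+)');
// `!` asserts that the match is non-null (required under null safety).
var match = reg2.firstMatch('func=login&token=123')!;
assert(match.groupCount == 2);
assert(match.group(0) == 'func=login&token=123');
assert(match.group(1) == 'login');
assert(match.group(2) == '123');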

This is thoroughly misleading: why is group(2) still valid when groupCount is 2? Take a look at the doc comment:

  int get groupCount
  dart:core

  Returns the number of captured groups in the match.

  Some patterns may capture parts of the input that was used to compute the full match. 
  This is the number of captured groups, which is also the maximal allowed argument to the [group] method.

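You might expect the full match (group(0)) to be included in groupCount. Dart keeps group(0) as the full match but leaves it out of the count, which contradicts itself, so watch out when using it: valid indices run from 0 to groupCount inclusive. Also, firstMatch returns null when nothing matches, so any non-null result means a match was found.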
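
To make the indexing rule concrete, here is a minimal sketch that walks every group of a match. The helper name `allGroups` is made up for illustration; the point is only that the loop bound is `<=`, not `<`.

```dart
/// Hypothetical helper: returns group 0 (the full match) followed by every
/// capture group, or an empty list when [input] does not match.
List<String?> allGroups(RegExp exp, String input) {
  final match = exp.firstMatch(input);
  if (match == null) return const [];
  // Valid indices are 0..groupCount inclusive, so the loop uses `<=`.
  return [for (var i = 0; i <= match.groupCount; i++) match.group(i)];
}

void main() {
  final exp = RegExp(r'func=(\w+)&token=(\w+)');
  print(allGroups(exp, 'func=login&token=123')); // [func=login&token=123, login, 123]
  print(allGroups(exp, 'no match here')); // []
}
```

The returned list therefore has groupCount + 1 entries for a successful match.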

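References

  1. The doc comments on RegExp, highly recommended
  2. The official documentation (just the doc comments rendered as docs): api.flutter.dev/flutter/dar…
  3. If you have forgotten regex syntax, Baidu Baike has a refresher
  4. Online regex tester (for checking regex syntax): tool.oschina.net/regex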