Regular Expressions

Regular expressions are a powerful tool for working with strings, and JavaScript provides full regular expression support through the RegExp object.

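I. Regular Expression Basics

1. Creating a Regular Expression

JavaScript offers two ways to create a regular expression:

// Literal form (preferred when the pattern is known in advance)
const regex1 = /pattern/gi;

// Constructor form (useful when the pattern is built at runtime)
const regex2 = new RegExp('pattern', 'gi');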
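
As a small sketch of when the constructor form is useful, the snippet below builds a pattern from a value that is only known at runtime. The helper name `makeWordMatcher` is hypothetical and exists only for illustration, and it assumes the word contains no regex metacharacters. Note that backslashes must be doubled inside a string literal.

```javascript
// Hypothetical helper: builds a case-insensitive whole-word matcher at runtime.
// Assumption: `word` contains no regex metacharacters (escape it first otherwise).
function makeWordMatcher(word) {
  // In a string literal, '\\b' produces the two characters \b that the regex engine sees.
  return new RegExp('\\b' + word + '\\b', 'i');
}

const catMatcher = makeWordMatcher('cat');
console.log(catMatcher.test('A Cat sat'));   // true
console.log(catMatcher.test('concatenate')); // false ('cat' is inside a longer word)
console.log(catMatcher.source);              // \bcat\b
```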

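2. Common Flags

Flag	Description
`i`	Case-insensitive matching (ignoreCase)
`g`	Global matching: find every match instead of stopping at the first (global)
`m`	Multiline mode: `^` and `$` also match at line breaks (multiline)
`s`	Lets `.` match line terminators (dotAll)
`u`	Enables Unicode mode (unicode)
`y`	Sticky matching: match only at `lastIndex` (sticky)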
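
To make the table concrete, here is a minimal sketch that runs the same pattern with different flags. The sample string and variable names are made up for illustration.

```javascript
const fruitText = 'Apple\napple\nAPPLE';

// No flags: case-sensitive, first match only
const firstIndex = fruitText.match(/apple/).index;
console.log(firstIndex); // 6

// i + g: ignore case and collect every match
const allApples = fruitText.match(/apple/gi);
console.log(allApples); // ['Apple', 'apple', 'APPLE']

// m: ^ now matches at the start of every line, not only the start of the string
const lineStarts = fruitText.match(/^a/gim);
console.log(lineStarts); // ['A', 'a', 'A']

// Without m, ^ matches only at index 0
const stringStart = fruitText.match(/^a/gi);
console.log(stringStart); // ['A']
```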

II. Regular Expression Syntax

1. Basic Matching

const str = 'Hello World';
const regex = /Hello/;
regex.test(str); // true

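2. Character Classes

Pattern	Description
`[abc]`	Matches a, b, or c
`[^abc]`	Matches any character other than a, b, or c
`[a-z]`	Matches any lowercase letter from a to z
`[A-Z]`	Matches any uppercase letter from A to Z
`[0-9]`	Matches any digit from 0 to 9
`\d`	Matches a digit; equivalent to `[0-9]`
`\D`	Matches a non-digit
`\w`	Matches a word character; equivalent to `[a-zA-Z0-9_]`
`\W`	Matches a non-word character
`\s`	Matches a whitespace character (space, tab, line break, and so on)
`\S`	Matches a non-whitespace character
`.`	Matches any character except a line terminator (unless the `s` flag is set)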
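
The classes above can be tried out on a single sample string. This is only an illustrative sketch; the sample text is invented.

```javascript
const sample = 'Order #42: 3 items, total_7 USD';

const digits = sample.match(/\d/g);
console.log(digits); // ['4', '2', '3', '7']

const words = sample.match(/\w+/g);
console.log(words); // ['Order', '42', '3', 'items', 'total_7', 'USD']

const upper = sample.match(/[A-Z]/g);
console.log(upper); // ['O', 'U', 'S', 'D']

// Negated class: anything that is neither a word character nor whitespace
const punctuation = sample.match(/[^\w\s]/g);
console.log(punctuation); // ['#', ':', ',']

// By default . does not match a line break
const dotMatchesNewline = /a.b/.test('a\nb');
console.log(dotMatchesNewline); // false
```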

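3. Quantifiers

Quantifier	Description
`*`	Matches 0 or more times
`+`	Matches 1 or more times
`?`	Matches 0 or 1 time
`{n}`	Matches exactly n times
`{n,}`	Matches at least n times
`{n,m}`	Matches between n and m times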
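
A short sketch of the quantifiers in use follows. The pattern names are illustrative only.

```javascript
// ? makes the preceding 'u' optional
const colorPattern = /^colou?r$/;
console.log(colorPattern.test('color'));   // true
console.log(colorPattern.test('colour'));  // true
console.log(colorPattern.test('colouur')); // false

// {n}: exactly five digits
const fiveDigits = /^\d{5}$/;
console.log(fiveDigits.test('12345'));  // true
console.log(fiveDigits.test('1234'));   // false

// {n,m} is greedy: it takes as many repetitions as allowed
const greedyRun = 'aaaa'.match(/a{2,3}/)[0];
console.log(greedyRun); // 'aaa'

// * accepts zero repetitions, so it can match the empty string
const emptyMatch = 'b'.match(/a*/)[0];
console.log(emptyMatch === ''); // true
```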

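4. Boundaries

Boundary	Description
`^`	Matches the start of the string (or of each line with the `m` flag)
`$`	Matches the end of the string (or of each line with the `m` flag)
`\b`	Matches a word boundary
`\B`	Matches a non-word boundary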
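
The difference between `\b` and `\B` is easiest to see side by side. The sentence below is an invented example.

```javascript
const sentence = 'cat concat cats';

// \b on both sides: only the standalone word
const wholeWord = sentence.match(/\bcat\b/g);
console.log(wholeWord.length); // 1

// \b only in front: words that start with 'cat' ('cat' and 'cats')
const wordStart = sentence.match(/\bcat/g);
console.log(wordStart.length); // 2

// \B in front: 'cat' preceded by another word character ('concat')
const insideWord = sentence.match(/\Bcat/g);
console.log(insideWord.length); // 1

// ^ and $ anchor to the ends of the string
console.log(/^cat/.test(sentence));  // true
console.log(/cats$/.test(sentence)); // true
console.log(/^cats/.test(sentence)); // false
```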

5. Grouping and Capturing

// Capturing groups
const regex = /(\d{4})-(\d{2})-(\d{2})/;
const match = regex.exec('2023-05-15');
// match[1] = '2023', match[2] = '05', match[3] = '15'

// Non-capturing group (?:...): groups the pattern without storing the match
const nonCapturing = /(?:https?):\/\/([^/]+)/;

6. Alternation

const regex = /cat|dog/;
regex.test('I have a cat'); // true
regex.test('I have a dog'); // true

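7. Lookahead and Lookbehind

Pattern	Name	Description
`(?=...)`	Positive lookahead	Matches a position that is followed by ...
`(?!...)`	Negative lookahead	Matches a position that is not followed by ...
`(?<=...)`	Positive lookbehind	Matches a position that is preceded by ...
`(?<!...)`	Negative lookbehind	Matches a position that is not preceded by ...

// Match a number that is followed by a dollar sign
const lookahead = /\d+(?=\$)/;
lookahead.exec('100$')[0]; // '100'

// Match a number that is preceded by a dollar sign
const lookbehind = /(?<=\$)\d+/;
lookbehind.exec('$100')[0]; // '100'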

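III. JavaScript Regex Methods

1. RegExp Object Methods

Method	Description
`test()`	Tests whether the pattern matches; returns a boolean
`exec()`	Runs a search; returns a match array, or `null` if nothing matches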

const regex = /\d+/;
regex.test('abc123'); // true

const result = regex.exec('abc123');
// result[0] = '123', result.index = 3, result.input = 'abc123'
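
With the `g` flag, `exec()` and `test()` become stateful: each call resumes from the regex's `lastIndex`. The following sketch, with invented variable names, shows both the useful side of this (iterating over matches) and a common pitfall.

```javascript
const numberPattern = /\d+/g;
const numberInput = 'a1b22c333';
const found = [];

// Each exec() call continues where the previous match ended
let execResult;
while ((execResult = numberPattern.exec(numberInput)) !== null) {
  found.push({ value: execResult[0], index: execResult.index });
}
console.log(found);
// [{ value: '1', index: 1 }, { value: '22', index: 3 }, { value: '333', index: 6 }]

// After a failed match, lastIndex is reset to 0
console.log(numberPattern.lastIndex); // 0

// Pitfall: reusing a global regex with test() gives alternating results
const globalA = /a/g;
const firstTest = globalA.test('a');  // true, lastIndex is now 1
const secondTest = globalA.test('a'); // false, the search resumed at index 1
console.log(firstTest, secondTest);
```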

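2. Using Regular Expressions with String Methods

Method	Description
`match()`	Returns an array of matches
`matchAll()`	Returns an iterator over all matches
`search()`	Returns the index of the first match
`replace()`	Replaces matched substrings
`split()`	Splits the string using a regular expression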

'abc123def456'.match(/\d+/g); // ['123', '456']

'Hello World'.search(/World/); // 6

'2023-05-15'.replace(/-/g, '/'); // '2023/05/15'

'a,b, c'.split(/\s*,\s*/); // ['a', 'b', 'c']
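
`matchAll()` is listed in the table but not shown above, so here is a minimal sketch. It requires the `g` flag and, unlike `match()` with `g`, keeps the capture groups of every match. The sample string is invented.

```javascript
const assignments = 'x=10, y=25, z=7';

// matchAll() yields one full match object (with groups and index) per match
const pairs = [];
for (const m of assignments.matchAll(/(\w)=(\d+)/g)) {
  pairs.push([m[1], Number(m[2])]);
}
console.log(pairs); // [['x', 10], ['y', 25], ['z', 7]]

// match() with g returns only the matched text, without groups
const textOnly = assignments.match(/(\w)=(\d+)/g);
console.log(textOnly); // ['x=10', 'y=25', 'z=7']

// match() without g returns the first match with its groups
const firstPair = assignments.match(/(\w)=(\d+)/);
console.log(firstPair[1], firstPair[2], firstPair.index); // 'x' '10' 0
```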

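IV. Practical Use Cases

1. Form Validation

// Validate an email address (a loose shape check, not full RFC validation)
function isValidEmail(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Validate password strength (at least 8 characters, with a lowercase letter, an uppercase letter, and a digit)
function isStrongPassword(password) {
  return /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$/.test(password);
}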

2. Data Extraction

// Extract the domain from a URL
function getDomain(url) {
  const match = url.match(/^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:/\n?]+)/im);
  return match ? match[1] : null;
}

// Extract every dollar amount from a piece of text
function extractAmounts(text) {
  return text.match(/\$\d+(?:\.\d{1,2})?/g) || [];
}

3. Text Processing

// Convert camelCase to kebab-case
function camelToKebab(str) {
  return str.replace(/[A-Z]/g, match => `-${match.toLowerCase()}`);
}

// Format a number with thousands separators (intended for integers)
function formatNumber(num) {
  return num.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

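4. Advanced Usage

// Template string substitution: {{key}} is replaced with data[key]
function render(template, data) {
  // ?? keeps falsy values such as 0; only missing keys become an empty string
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) => data[key] ?? '');
}

// Parse a simple query string (values are not URL-decoded)
function parseQuery(query) {
  return Array.from(query.matchAll(/([^&=]+)=([^&]*)/g))
    .reduce((acc, [_, key, value]) => ({ ...acc, [key]: value }), {});
}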

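V. Performance and Best Practices

1. Precompile Regular Expressions

// Wasteful: builds a new RegExp from a string on every call
function testSomething(str) {
  return new RegExp('somePattern').test(str);
}

// Better: cache each compiled RegExp by its pattern string and reuse it
const regexCache = {};
function getRegex(pattern) {
  if (!regexCache[pattern]) {
    regexCache[pattern] = new RegExp(pattern);
  }
  return regexCache[pattern];
}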

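2. Avoid Catastrophic Backtracking

// Dangerous: the nested quantifiers can cause catastrophic backtracking
const dangerousRegex = /(a+)+$/;

// Improved: a single quantifier accepts the same strings
const safeRegex = /a+$/;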
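
The following sketch shows one way to observe the difference. The helper `timeMatch` is hypothetical, and the input length of 20 is an arbitrary choice that keeps the run short; no particular timings are claimed, since they depend on the engine and the machine. The slow case is an input that almost matches, which forces the engine to try every way of splitting the run of `a` characters between the inner and outer quantifier.

```javascript
// Hypothetical helper: runs regex.test() and reports the elapsed wall-clock time.
function timeMatch(regex, input) {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
}

// A run of 'a' followed by a character that makes the final $ fail
const almostMatching = 'a'.repeat(20) + 'b';

const nested = timeMatch(/(a+)+$/, almostMatching);
const flat = timeMatch(/a+$/, almostMatching);

// Both report "no match", but the nested version does far more work to get there.
// Each extra 'a' in the input roughly doubles the work for the nested pattern.
console.log('nested:', nested.matched, nested.ms + ' ms');
console.log('flat:  ', flat.matched, flat.ms + ' ms');
```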

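3. Use Non-Greedy Matching

// Greedy (default): .* runs on to the last </div>
'<div>a</div><div>b</div>'.match(/<div>.*<\/div>/)[0]; // '<div>a</div><div>b</div>'

// Non-greedy: .*? stops at the first </div>
'<div>a</div><div>b</div>'.match(/<div>.*?<\/div>/)[0]; // '<div>a</div>'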

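4. Use Anchors Where Appropriate

// Without anchors, the engine searches the whole string for a match
/\d+/.test('abc123'); // true

// With ^ and $, the pattern must match the entire string
/^\d+$/.test('123'); // true
/^\d+$/.test('abc123'); // false

// Check whether the string starts with a digit
/^\d/.test('123abc'); // true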

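VI. Features Added in ES6 and Later

1. The `u` Flag (Unicode)

// Matching Unicode characters
/^\uD83D/u.test('\uD83D\uDC2A'); // false (the surrogate pair is treated as one character)
/^\uD83D/.test('\uD83D\uDC2A'); // true (wrongly matches half of the surrogate pair)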
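
A couple of further effects of the `u` flag can be sketched with the same character. The variable names are illustrative.

```javascript
// U+1F42A is outside the Basic Multilingual Plane, so it is stored as two UTF-16 code units
const camel = '\u{1F42A}';
console.log(camel.length); // 2

// Without u, . matches a single code unit, so one . cannot cover the whole character
const dotWithoutU = /^.$/.test(camel);
console.log(dotWithoutU); // false

// With u, . matches a whole code point
const dotWithU = /^.$/u.test(camel);
console.log(dotWithU); // true

// The \u{...} code point escape is only available in patterns with the u flag
const codePointEscape = /\u{1F42A}/u.test(camel);
console.log(codePointEscape); // true
```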

2. The `y` Flag (Sticky Matching)

const str = 'foo bar foo';
const regex = /foo/y;

regex.lastIndex = 0;
regex.test(str); // true (matches the first 'foo' at index 0)

regex.lastIndex = 4;
regex.test(str); // false (no 'foo' starts exactly at index 4)
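
Sticky matching is a natural fit for tokenizers, because each token must begin exactly where the previous one ended. Below is a minimal sketch under stated assumptions: the `tokenize` function and its tiny grammar (integers, `+`, and `-`) are hypothetical and chosen only to illustrate the flag.

```javascript
// Hypothetical tokenizer: splits a string into integers and the operators + and -.
function tokenize(source) {
  // Optional whitespace, then one token; y forces the match to start at lastIndex
  const tokenPattern = /\s*(\d+|[+-])/y;
  const trimmed = source.trimEnd();
  const tokens = [];
  let position = 0;

  while (position < trimmed.length) {
    tokenPattern.lastIndex = position;
    const match = tokenPattern.exec(trimmed);
    if (match === null) {
      // Without y, the engine would silently skip ahead to the next valid token
      throw new SyntaxError(`Unexpected input near index ${position}`);
    }
    tokens.push(match[1]);
    position = tokenPattern.lastIndex;
  }
  return tokens;
}

const tokens = tokenize('12 + 3-4');
console.log(tokens); // ['12', '+', '3', '-', '4']
```

Calling `tokenize('1 * 2')` throws a `SyntaxError`, because `*` is not part of this toy grammar and the sticky match cannot skip over it.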

3. The `s` Flag (dotAll)

// By default, . does not match a line break
/foo.bar/.test('foo\nbar'); // false

// With the s flag, . also matches a line break
/foo.bar/s.test('foo\nbar'); // true

4. Named Capture Groups

const regex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = regex.exec('2023-05-15');

console.log(match.groups.year); // '2023'
console.log(match.groups.month); // '05'
console.log(match.groups.day); // '15'
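
Named groups can also be referenced in replacement strings with `$<name>` and inside the pattern itself with `\k<name>`. A brief sketch, with illustrative variable names:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// $<name> in the replacement string refers to a named group
const reordered = '2023-05-15'.replace(isoDate, '$<day>/$<month>/$<year>');
console.log(reordered); // '15/05/2023'

// groups works well with destructuring
const { year, month } = isoDate.exec('2023-05-15').groups;
console.log(year, month); // '2023' '05'

// \k<name> is a backreference to a named group: here it detects a repeated word
const doubledWord = /\b(?<word>\w+)\s+\k<word>\b/;
const hasDouble = doubledWord.test('it is is fine');
const noDouble = doubledWord.test('it is fine');
console.log(hasDouble, noDouble); // true false
```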

5. Lookbehind Assertions

// Match digits that are preceded by $
/(?<=\$)\d+/.exec('$100')[0]; // '100'

// Match digits that are not preceded by $
/(?<!\$)\d+/.exec('€100')[0]; // '100'

VII. Common Problems and Solutions

1. Escaping Special Characters

function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const search = 'file.txt';
const regex = new RegExp(escapeRegExp(search), 'gi');
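
To see why escaping matters, compare the escaped and unescaped forms of the same search text. The function is repeated here only so that the sketch is self-contained.

```javascript
// Same helper as above, repeated so this snippet runs on its own
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Unescaped: the . is a wildcard, so unrelated text matches
const unescapedPattern = new RegExp('file.txt');
console.log(unescapedPattern.test('fileXtxt')); // true

// Escaped: the . only matches a literal dot
const escapedPattern = new RegExp(escapeRegExp('file.txt'));
console.log(escapedPattern.test('fileXtxt'));    // false
console.log(escapedPattern.test('my file.txt')); // true

// Each metacharacter gets a backslash in front of it
const escapedText = escapeRegExp('1+1=2?');
console.log(escapedText); // 1\+1=2\?
```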

2. Multiline Matching

const multilineText = `Line 1
Line 2
Line 3`;

// Match "Line" at the start of every line
multilineText.match(/^Line/gm); // ['Line', 'Line', 'Line']

3. Complex Replacements

// Convert Markdown links to HTML
const markdown = 'Visit [Google](https://google.com)';
markdown.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '<a href="$2">$1</a>');
// 'Visit <a href="https://google.com">Google</a>'
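
When the replacement has to be computed rather than rearranged, `replace()` also accepts a callback that receives the match and its capture groups. A minimal sketch, using an invented temperature-conversion example:

```javascript
const report = 'Low 32F, high 212F';

// The callback receives the full match first, then each capture group
const converted = report.replace(/(\d+)F\b/g, (fullMatch, degrees) => {
  const celsius = Math.round((Number(degrees) - 32) * 5 / 9);
  return `${celsius}C`;
});
console.log(converted); // 'Low 0C, high 100C'

// In a replacement string, $& stands for the whole match
const bracketed = 'a-b'.replace(/-/, '[$&]');
console.log(bracketed); // 'a[-]b'
```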
