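What is a regular expression
A regular expression is a way of processing strings. It works line by line and, with the help of a few special symbols, lets the user easily search for, delete, or replace a specific string.
Basic regular expressions
Locale data that affects character ordering also affects regular expression results. A regular expression also needs a supporting tool to do anything, and here that tool is grep.
How the locale affects regular expressions
Different locales encode and order characters differently.
With LANG=C: 0 1 2 3 4 ... A B C D ... Z a b c d ... z
With LANG=zh_TW: 0 1 2 3 4 ... a A b B c C d D ... z Z
With the second ordering, a range such as [a-z] may also pick up uppercase letters. To keep the selected data from changing when the locale changes, there are special character classes, such as [:lower:], that can be used instead of ranges.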
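As a minimal sketch of the point above, assuming a POSIX shell and grep: the two sample words are made up for illustration. Pinning the locale with LC_ALL=C makes the range [A-Z] follow byte order, while the class [[:upper:]] does not depend on the ordering at all.

```shell
# Hypothetical two-line sample: one lowercase word, one capitalized word.
sample=$(printf 'apple\nBanana\n')

# With the C locale pinned, the range A-Z holds only uppercase ASCII letters.
by_range=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[A-Z]')

# The POSIX class [:upper:] selects uppercase letters without relying on ordering.
by_class=$(printf '%s\n' "$sample" | grep '^[[:upper:]]')

echo "range: $by_range"
echo "class: $by_class"
```

Both commands should select only the capitalized line here; the class form is the one that stays stable if the script is later run under a different locale.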
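Advanced grep options
[root@clay ~]# grep [-A] [-B] [--color=auto] 'search string' filename
Options and arguments:
-A: may be followed by a number n (after); besides the matching line, the n lines after it are also printed
-B: may be followed by a number n (before); besides the matching line, the n lines before it are also printed
Example 1: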
[root@clay ~]# cat /etc/passwd | grep -n -A3 -B2 'root'
1:root:x:0:0:root:/root:/bin/bash
2-bin:x:1:1:bin:/bin:/sbin/nologin
3-daemon:x:2:2:daemon:/sbin:/sbin/nologin
4-adm:x:3:4:adm:/var/adm:/sbin/nologin
--
8-halt:x:7:0:halt:/sbin:/sbin/halt
9-mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10:operator:x:11:0:operator:/root:/sbin/nologin
11-games:x:12:100:games:/usr/games:/sbin/nologin
12-ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
13-nobody:x:99:99:Nobody:/:/sbin/nologin
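When grep looks for a string in the data, it selects output in units of whole lines. In the listing above, matching lines have ":" after the line number, context lines have "-", and "--" separates groups that are not adjacent.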
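The context options can be tried on a tiny input. This is only a sketch with made-up data (one letter per line), asking for one line of context on each side of the match:

```shell
# Hypothetical five-line input, one letter per line.
# -n adds line numbers, -B1/-A1 add one line before and after the match.
context=$(printf 'a\nb\nc\nd\ne\n' | grep -n -A1 -B1 'c')
echo "$context"
```

The matching line is printed as 3:c, and the two context lines as 2-b and 4-d.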
Basic regular expression practice
Preparing the environment
[root@clay ~]# cat regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
GNU is free air not free beer.^M
Her hair is very beauty.^M
I can't finish the test.^M
Oh!The soup taste good.^M
motorcycle is cheap than cat .
This window is clear.
the symbol '*' is represented as start.
Oh! My god!
The gd software is a library for drafting programs.^M
You are the best is mean you are the no.1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
# I am VBird
Exercise 1: searching for a specific string
[root@clay ~]# grep -n 'the' regular_express.txt
8:I can't finish the test.^M
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
-n: show line numbers
Inverse selection
[root@clay ~]# grep -vn 'the' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
9:Oh!The soup taste good.^M
10:motorcycle is cheap than cat .
11:This window is clear.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
17:I like dog.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:
-v: invert the match (print the lines that do not match)
Ignoring case
[root@clay ~]# grep -in 'the' regular_express.txt
8:I can't finish the test.^M
9:Oh!The soup taste good.^M
12:the symbol '*' is represented as start.
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
Exercise 2: using brackets [] to search for a set of characters
When the strings to search for share common characters:
[root@clay ~]# grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh!The soup taste good.
Find lines containing the string 'oo':
[root@clay ~]# grep -n 'oo' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh!The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
To exclude an 'oo' that is preceded by 'g':
[root@clay ~]# grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
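[^]: negated set (any character not in the list)
Why are lines 18 and 19 still listed? Line 18 contains "tools", where 'oo' is preceded by 't'. Line 19 is listed because in "goooooogle" an 'oo' can be preceded by another 'o' rather than by 'g', so 'ooo' matches.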
[root@clay ~]# grep -no '[^g]oo' regular_express.txt
2:foo
3:Foo
18:too
19:ooo
19:ooo
-o: print only the matched part of each line, which shows exactly what the pattern matched
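The line 19 case can be reproduced in isolation. A small sketch, feeding the same string through a pipe instead of reading the practice file:

```shell
# '[^g]oo' needs one non-g character followed by "oo".
# In "goooooogle" the six o's yield two non-overlapping matches, each "ooo".
matches=$(printf 'goooooogle yes!\n' | grep -o '[^g]oo')
echo "$matches"
```

The leading g can never serve as the first character of a match, so the first match starts at the first o and consumes three of them, and the second match consumes the remaining three.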
To require that 'oo' is not preceded by a lowercase letter:
[root@clay ~]# grep -n '[^a-z]oo' regular_express.txt
3:Football game is not use feet only.
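The special character classes mentioned earlier can also be used here, for example '[^[:lower:]]oo'.
Exercise 3: line start and line end characters ^ and $
Lines that start with "the":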
[root@clay ~]# grep -n '^the' regular_express.txt
12:the symbol '*' is represented as start.
Lines that start with a lowercase letter:
[root@clay ~]# grep -n '^[[:lower:]]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than cat .
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
[root@clay ~]# grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than cat .
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
Lines that do not start with an English letter:
[root@clay ~]# grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird
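Note: ^ means different things inside and outside []. Outside the brackets it anchors the match to the start of the line; as the first character inside the brackets it negates the set.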
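The two meanings of ^ can be compared side by side. The three sample lines below are hypothetical, and LC_ALL=C is set so that the range a-z means exactly the lowercase ASCII letters:

```shell
# Hypothetical sample: a lowercase start, an uppercase start, a comment.
sample=$(printf 'the cat\nThe dog\n# note\n')

# ^ outside brackets: the line must begin with "the".
starts_with_the=$(printf '%s\n' "$sample" | grep '^the')

# ^ outside anchors to line start; ^ inside negates the set a-z.
not_lower_start=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[^a-z]')

echo "$starts_with_the"
echo "$not_lower_start"
```

The first command keeps only "the cat"; the second keeps the two lines whose first character is not a lowercase letter.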
Lines that end with a period (the . must be escaped as \. to be taken literally):
[root@clay ~]# grep -n '\.$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
8:I can't finish the test.
9:Oh!The soup taste good.
10:motorcycle is cheap than cat .
11:This window is clear.
12:the symbol '*' is represented as start.
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.
Searching for blank lines:
[root@clay ~]# grep -n '^$' regular_express.txt
22:
View the effective content of /etc/rsyslog.conf (lines that are neither blank nor comments):
[root@clay ~]# grep -v '^$' /etc/rsyslog.conf | grep -vn '^#'
6:$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
7:$ModLoad imjournal # provides access to the systemd journal
18:$WorkDirectory /var/lib/rsyslog
20:$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
25:$IncludeConfig /etc/rsyslog.d/*.conf
28:$OmitLocalLogging on
30:$IMJournalStateFile imjournal.state
37:*.info;mail.none;authpriv.none;cron.none /var/log/messages
39:authpriv.* /var/log/secure
41:mail.* -/var/log/maillog
43:cron.* /var/log/cron
45:*.emerg :omusrmsg:*
47:uucp,news.crit /var/log/spooler
49:local7.* /var/log/boot.log
-v: invert the match
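The two filters can also be given to a single grep with repeated -e options. This is a sketch on made-up configuration lines, not on the real /etc/rsyslog.conf. Note that in the pipeline above -n is applied by the second grep, so its numbers count lines after blank lines were already removed; a single grep -n would report the original line numbers instead.

```shell
# Hypothetical configuration text with comments and blank lines.
active=$(printf '%s\n' \
  '# a comment' \
  '' \
  '$WorkDirectory /var/lib/rsyslog' \
  '# another comment' \
  'cron.* /var/log/cron' |
  grep -v -e '^$' -e '^#')   # drop blank lines and lines starting with #
echo "$active"
```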
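Exercise 4: any single character . and the repetition character *
The * in a regular expression is not the same as the * in bash; it is not a wildcard.
. (dot): stands for exactly one arbitrary character;
* (asterisk): means "repeat the previous character zero or more times", so it is always used in combination with the character before it
Any single character: find strings of the form g..g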
[root@clay ~]# grep -n 'g..g' regular_express.txt
18:google is the best tools for search keyword.
Matching zero or more occurrences of the previous character:
[root@clay ~]# grep -n 'o*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
8:I can't finish the test.
9:Oh!The soup taste good.
10:motorcycle is cheap than cat .
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:
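Because * matches zero or more occurrences of the previous character, 'o*' also matches the empty string, so every line of the file is printed.
'oo*' means one o followed by zero or more o, that is, at least one o: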
[root@clay ~]# grep -n 'oo*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
9:Oh!The soup taste good.
10:motorcycle is cheap than cat .
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.
15:You are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
By the same reasoning, 'ooo*' means oo followed by zero or more o, that is, at least two o:
[root@clay ~]# grep -n 'ooo*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh!The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
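The difference between 'o*' and 'oo*' is easiest to see on a line that contains no o at all. A small sketch using line 7 of the practice file as literal input:

```shell
# Line 7 of the practice file contains no "o".
line='Her hair is very beauty.'

# 'o*' can match zero o's, so the line still counts as a match.
zero_or_more=$(printf '%s\n' "$line" | grep -c 'o*' || true)

# 'oo*' needs at least one o, so nothing matches (grep exits non-zero).
one_or_more=$(printf '%s\n' "$line" | grep -c 'oo*' || true)

echo "o*  matched lines: $zero_or_more"
echo "oo* matched lines: $one_or_more"
```

The `|| true` only keeps the sketch from aborting in shells run with error checking, since grep returns a non-zero status when nothing matches.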
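Now suppose we need a string that starts with g and ends with g. It cannot be 'g*g', because that means zero or more g followed by one g. Combining * with the other character, ., gives 'g.*g', where .* stands for zero or more arbitrary characters.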
[root@clay ~]# grep -n 'g.*g' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
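Adding -o shows how much of a line 'g.*g' actually covers. A sketch on the text of line 19, fed through a pipe:

```shell
# .* takes as many characters as it can, so the match runs from the
# first g to the last g on the line.
greedy=$(printf 'goooooogle yes!\n' | grep -o 'g.*g')
echo "$greedy"
```

The match is "goooooog": everything from the first g up to and including the last g, with the trailing "le yes!" left out.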
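Exercise 5: limiting the repetition count with {}
To limit how many times a character repeats, use {}. In a basic regular expression the braces must be escaped with backslashes, as \{ and \}. Suppose we want to find strings containing two consecutive o characters: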
[root@clay ~]# grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh!The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
For g followed by two to five o and then a final g:
[root@clay ~]# grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.
For two or more o between the two g, there are two ways to write it:
[root@clay ~]# grep -n 'gooo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
[root@clay ~]# grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
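For comparison, the same interval can be written without backslashes when grep is switched to extended regular expressions with -E. This is a sketch on three made-up words, not part of the practice file:

```shell
# Hypothetical word list with 1, 2 and 6 o's between the two g's.
words=$(printf 'gog\ngoogle\ngoooooogle\n')

# Basic regular expression: braces are escaped.
bre=$(printf '%s\n' "$words" | grep 'go\{2,5\}g')

# Extended regular expression (-E): braces are written plainly.
ere=$(printf '%s\n' "$words" | grep -E 'go{2,5}g')

echo "BRE: $bre"
echo "ERE: $ere"
```

Both forms select only "google": "gog" has too few o's, and "goooooogle" has six, which is outside the 2 to 5 range.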
Summary of basic regular expression characters
Exercise: use ls -l together with grep to list the files under /etc/ whose type is symbolic link
[root@clay ~]# ls -l /etc/ | grep -n '^l'
39:lrwxrwxrwx. 1 root root 56 Jul 12 17:51 favicon.png -> /usr/share/icons/hicolor/16x16/apps/fedora-logo-icon.png
52:lrwxrwxrwx. 1 root root 22 Jul 12 17:51 grub2.cfg -> ../boot/grub2/grub.cfg
62:lrwxrwxrwx. 1 root root 11 Jul 12 17:51 init.d -> rc.d/init.d
79:lrwxrwxrwx. 1 root root 35 Jul 12 17:53 localtime -> ../usr/share/zoneinfo/Asia/Shanghai
92:lrwxrwxrwx. 1 root root 17 Jul 12 17:50 mtab -> /proc/self/mounts
101:lrwxrwxrwx. 1 root root 21 Jul 12 17:50 os-release -> ../usr/lib/os-release
119:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc0.d -> rc.d/rc0.d
120:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc1.d -> rc.d/rc1.d
121:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc2.d -> rc.d/rc2.d
122:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc3.d -> rc.d/rc3.d
123:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc4.d -> rc.d/rc4.d
124:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc5.d -> rc.d/rc5.d
125:lrwxrwxrwx. 1 root root 10 Jul 12 17:51 rc6.d -> rc.d/rc6.d
127:lrwxrwxrwx. 1 root root 13 Jul 12 17:51 rc.local -> rc.d/rc.local
128:lrwxrwxrwx. 1 root root 14 Jul 12 17:50 redhat-release -> centos-release
162:lrwxrwxrwx. 1 root root 14 Jul 12 17:50 system-release -> centos-release
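The sed tool
[root@clay ~]# sed [-nefir] [action]
-n: silent mode. Normally sed prints every line it reads from STDIN to the screen. With -n, only the lines specially processed by sed (for example by the p action) are printed.
-e: give the sed action directly on the command line
-f: read the sed actions from a file; -f filename runs the sed actions stored in filename
-i: modify the file that is read in place, instead of printing to the screen
-r: make the sed action use extended regular expression syntax (the default is basic regular expression syntax)
Action syntax: [n1[,n2]]function
n1, n2: optional; they usually select the lines the action applies to. For example, to act on lines 10 through 20, write "10,20[action]".
function is one of the following:
a: append; a may be followed by a string, which appears on a new line after the current line
c: change; c may be followed by a string, which replaces the lines n1 through n2
d: delete; d normally takes no argument
i: insert; i may be followed by a string, which appears on a new line before the current line
p: print the selected data; p is usually used together with sed -n
s: substitute; s performs replacement directly and is usually combined with regular expressions, for example 1,20s/old/new/g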
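Three of these functions can be tried on a small input before moving to /etc/passwd. The four-line input below is made up for illustration:

```shell
# Hypothetical four-line input.
input=$(printf 'one\ntwo\nthree\nfour\n')

# p with -n: print only lines 2 to 3.
printed=$(printf '%s\n' "$input" | sed -n '2,3p')

# d: delete lines 2 to 3 and print the rest.
deleted=$(printf '%s\n' "$input" | sed '2,3d')

# s restricted to lines 1 to 2: replace every "o" with "0".
replaced=$(printf '%s\n' "$input" | sed '1,2s/o/0/g')

printf '%s\n---\n%s\n---\n%s\n' "$printed" "$deleted" "$replaced"
```

In the last command the address 1,2 limits the substitution, so "four" on line 4 keeps its o.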
Adding and deleting by line
Example 1: list /etc/passwd with line numbers and delete lines 2-5
[root@clay ~]# nl /etc/passwd | sed '2,5d'
1 root:x:0:0:root:/root:/bin/bash
6 sync:x:5:0:sync:/sbin:/bin/sync
7 shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
8 halt:x:7:0:halt:/sbin:/sbin/halt
9 mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10 operator:x:11:0:operator:/root:/sbin/nologin
11 games:x:12:100:games:/usr/games:/sbin/nologin
12 ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
13 nobody:x:99:99:Nobody:/:/sbin/nologin
14 systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
15 dbus:x:81:81:System message bus:/:/sbin/nologin
16 polkitd:x:999:998:User for polkitd:/:/sbin/nologin
17 tss:x:59:59:Account used by the trousers package to sandbox the tcsd daemon:/dev/null:/sbin/nologin
18 sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
19 postfix:x:89:89::/var/spool/postfix:/sbin/nologi
Delete from line 2 through the last line ($ stands for the last line):
[root@clay ~]# nl /etc/passwd | sed '2,$d'
1 root:x:0:0:root:/root:/bin/bash
Example 2: inserting (i) and appending (a) a line
[root@clay ~]# nl /etc/passwd | sed '2i helloworld'
1 root:x:0:0:root:/root:/bin/bash
helloworld
2 bin:x:1:1:bin:/bin:/sbin/nologin
[root@clay ~]# nl /etc/passwd | sed '2a helloworld'
1 root:x:0:0:root:/root:/bin/bash
2 bin:x:1:1:bin:/bin:/sbin/nologin
helloworld
Example 3: add two lines after line 2, for example "helloworld" and "flyhigh"
[root@clay ~]# nl /etc/passwd | sed '2a helloworld \
> flyhigh'
1 root:x:0:0:root:/root:/bin/bash
2 bin:x:1:1:bin:/bin:/sbin/nologin
helloworld
flyhigh
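The one-line forms '2a text' and '2i text' used above are accepted by GNU sed. The sketch below uses the more portable spelling, with a backslash and newline after the function letter; the three input lines are hypothetical stand-ins for the nl output.

```shell
# Hypothetical numbered input standing in for "nl /etc/passwd".
input=$(printf '1 root\n2 bin\n3 daemon\n')

# Append two lines after line 2; each text line except the last
# ends with a backslash.
appended=$(printf '%s\n' "$input" | sed '2a\
helloworld\
flyhigh')

# Insert one line before line 2.
inserted=$(printf '%s\n' "$input" | sed '2i\
helloworld')

printf '%s\n---\n%s\n' "$appended" "$inserted"
```

Placing the backslash directly after the text, with no space before it, avoids leaving a trailing blank at the end of the added line.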
Example 4: replacing and displaying by line
[root@clay ~]# nl /etc/passwd | sed '2,5c hello world'
1 root:x:0:0:root:/root:/bin/bash
hello world
6 sync:x:5:0:sync:/sbin:/bin/sync
Example 5: print only lines 2-5 of /etc/passwd
[root@clay ~]# cat -n /etc/passwd | sed -n '2,5p'
2 bin:x:1:1:bin:/bin:/sbin/nologin
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
4 adm:x:3:4:adm:/var/adm:/sbin/nologin
5 lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
Searching and replacing part of a line
[root@clay ~]# ip a|grep '^.*inet .*ens33'
inet 10.0.0.202/24 brd 10.0.0.255 scope global noprefixroute ens33
[root@clay ~]# ip a|grep '^.*inet .*ens33' | sed 's/^.*inet/ //g'
sed: -e expression #1, char 13: unknown option to `s'
[root@clay ~]# ip a|grep '^.*inet .*ens33' | sed 's/^.*inet//g'
10.0.0.202/24 brd 10.0.0.255 scope global noprefixroute ens33
[root@clay ~]# ip a|grep '^.*inet .*ens33' | sed 's/^.*inet //g'
10.0.0.202/24 brd 10.0.0.255 scope global noprefixroute ens33
[root@clay ~]# ip a|grep '^.*inet .*ens33' | sed 's/^.*inet //g' | sed 's/b.*3'
sed: -e expression #1, char 6: unterminated `s' command
[root@clay ~]# ip a|grep '^.*inet .*ens33' | sed 's/^.*inet //g' | sed 's/b.*3'//g
10.0.0.202/24
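Two of the attempts above fail for syntax reasons. In 's/^.*inet/ //g' there is one slash too many, so sed reads "/g" as flags and rejects it. In 's/b.*3' the closing delimiters are missing altogether. The last form works only because the shell joins the quoted 's/b.*3' with the unquoted //g into one argument. A tidier sketch does both substitutions in a single sed call; the input line is copied from the output above rather than read from ip a.

```shell
# Sample line copied from the grep output shown earlier.
line='    inet 10.0.0.202/24 brd 10.0.0.255 scope global noprefixroute ens33'

# First s: drop everything up to and including "inet ".
# Second s: drop everything from " brd" to the end of the line.
addr=$(printf '%s\n' "$line" | sed 's/^.*inet //; s/ brd.*$//')
echo "$addr"
```

Anchoring the second pattern on " brd" is an assumption about the output format of ip a; it avoids depending on the interface name ending in 3.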
[root@clay ~]# cat /etc/man_db.conf | grep 'MAN'|sed 's/^#.*$//g' |sed '/^$/d'
MANDATORY_MANPATH /usr/man
MANDATORY_MANPATH /usr/share/man
MANDATORY_MANPATH /usr/local/share/man
MANPATH_MAP /bin /usr/share/man
MANPATH_MAP /usr/bin /usr/share/man
MANPATH_MAP /sbin /usr/share/man
MANPATH_MAP /usr/sbin /usr/share/man
MANPATH_MAP /usr/local/bin /usr/local/man
MANPATH_MAP /usr/local/bin /usr/local/share/man
MANPATH_MAP /usr/local/sbin /usr/local/man
MANPATH_MAP /usr/local/sbin /usr/local/share/man
MANPATH_MAP /usr/X11R6/bin /usr/X11R6/man
MANPATH_MAP /usr/bin/X11 /usr/X11R6/man
MANPATH_MAP /usr/games /usr/share/man
MANPATH_MAP /opt/bin /opt/man
MANPATH_MAP /opt/sbin /opt/man
MANDB_MAP /usr/man /var/cache/man/fsstnd
MANDB_MAP /usr/share/man /var/cache/man
MANDB_MAP /usr/local/man /var/cache/man/oldlocal
MANDB_MAP /usr/local/share/man /var/cache/man/local
MANDB_MAP /usr/X11R6/man /var/cache/man/X11R6
MANDB_MAP /opt/man /var/cache/man/opt
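The same filtering can be sketched as one sed command instead of grep plus two sed calls. The configuration lines below are hypothetical and only imitate the shape of /etc/man_db.conf:

```shell
# Hypothetical lines in the style of man_db.conf.
conf=$(printf '%s\n' \
  '# MANDATORY_MANPATH is documented here' \
  'MANDATORY_MANPATH /usr/man' \
  'SECTION 1 n l 8' \
  'MANPATH_MAP /bin /usr/share/man')

# -n suppresses default output; comment lines are deleted first,
# then only the remaining lines that contain MAN are printed.
man_lines=$(printf '%s\n' "$conf" | sed -n '/^#/d; /MAN/p')
echo "$man_lines"
```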
Modifying file contents directly
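A minimal sketch of the -i option described earlier, run against a temporary file so that no real configuration is touched. The file content is made up, and the .bak suffix asks sed to keep a backup copy of the original.

```shell
# Work on a throwaway file created with mktemp.
tmpfile=$(mktemp)
printf 'hello world\nhello sed\n' > "$tmpfile"

# -i.bak edits the file in place and saves the original as "$tmpfile.bak".
sed -i.bak 's/hello/goodbye/' "$tmpfile"

result=$(cat "$tmpfile")
backup=$(cat "$tmpfile.bak")
rm -f "$tmpfile" "$tmpfile.bak"

echo "$result"
```

Because -i changes the file itself, it is worth testing the action without -i first and keeping a backup as done here.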
Extended regular expressions
File formatting and related processing
Formatted printing: printf
[root@clay ~]# printf '%s\t %s\t %s\t %s\t %s\t \n' $(cat printf.txt)
name Chinese English Math Average
DmTsai 80 60 92 77.33
VBird 75 55 80 70.00
Ken 60 90 70 73.33
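The command above works because printf reuses its format string until all arguments are consumed. A small sketch with a two-column format and a few values taken from the table, passed as literal arguments instead of being read from printf.txt:

```shell
# The format has two conversions, so every pair of arguments makes one row.
# %-8s left-aligns in 8 columns, %5s right-aligns in 5 columns.
table=$(printf '%-8s %5s\n' name Math DmTsai 92 VBird 80)
echo "$table"
```

Using fixed widths instead of \t keeps the columns aligned even when the names differ in length.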