Shell knowledge summary

1.Regular expressions

// Match (xxx) xxx-xxxx or xxx-xxx-xxxx (x stands for one digit)
grep -E '^([0-9]{3}-|\([0-9]{3}\) )[0-9]{3}-[0-9]{4}$' file.txt

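^ matches the start of the line.

$ matches the end of the line.

[0-9]{n} matches exactly n digits.

| separates two alternatives. To match a literal |, use \|.

() marks the start and end of a subexpression. To match literal parentheses, escape them as \( and \).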
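
As a quick check of the pattern above, the sketch below feeds a few made-up phone numbers to grep -E and keeps only the lines that match. The sample numbers and the variable name matches are assumptions for illustration, not part of the original notes.

```shell
# Sample input: the first two lines are valid, the last two are not.
matches=$(printf '%s\n' \
    '987-123-4567' \
    '(123) 456-7890' \
    '123 456 7890' \
    '(123)456-7890' |
    grep -E '^([0-9]{3}-|\([0-9]{3}\) )[0-9]{3}-[0-9]{4}$')

# Print the lines that matched the pattern.
printf '%s\n' "$matches"
```

The last sample fails because the second alternative requires a space after the closing parenthesis.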

2.sed

// Print line 10
sed -n '10p' file.txt

// Print lines 100 through 200
sed -n '100,200p' file.txt

Format: sed [options] 'commands' filename

-n suppresses the default output, so only lines printed explicitly (for example with p) appear.
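
To see what -n changes, here is a minimal sketch that runs the same p command with and without it. The 20-line sample file and the variable names are assumptions made up for this example.

```shell
# Build a 20-line sample file (a hypothetical stand-in for file.txt).
sample=$(mktemp)
seq 1 20 > "$sample"

# Without -n, sed prints every line and 'p' prints line 10 a second time.
with_default=$(sed '10p' "$sample" | wc -l)

# With -n, only the addressed lines are printed.
single=$(sed -n '10p' "$sample")
range=$(sed -n '3,5p' "$sample")

echo "lines printed without -n: $with_default"
echo "line 10: $single"
echo "lines 3 to 5:"
echo "$range"

rm -f "$sample"
```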

3.awk

// Print line 10
awk 'NR==10' file.txt
awk 'NR==10{print $0}' file.txt

NR: the number of records (lines) read so far.

// Print every field on its own line, splitting on whitespace
awk '{for(i=1;i<=NF;i++){print $i}}' tmp.txt

NF: the number of fields in the current record.

// Transpose a file
// name age
// alice 21
// ryan 30
// becomes
// name alice ryan
// age 21 30
awk '{
    for(i=1;i<=NF;i++){
        if(NR==1){
            ret[i]=$i
        }
        else{
            ret[i]=ret[i]" "$i
        }
    }
}END{
    for(j=1;j<=NF;j++){
        print ret[j]
    }
}' file.txt

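Format: awk [options] 'command' files

A command has two parts:

  1. a pattern, which can be a regular expression or a logical expression

  2. { awk statements }, the block of code enclosed in braces

That is: awk [options] 'pattern1 {action1} pattern2 {action2} …' filename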
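
The pattern/action form can be tried out with a small sketch like the one below. The sample data and the variable names (people, older, report) are assumptions for illustration.

```shell
# Hypothetical sample data: a header line plus name and age columns.
people=$(mktemp)
printf '%s\n' 'name age' 'alice 21' 'ryan 30' > "$people"

# Pattern only: with no action, awk prints the matching record.
# NR>1 skips the header; $2>25 keeps ages above 25.
older=$(awk 'NR>1 && $2>25' "$people")

# Two pattern {action} pairs: a regular expression and a comparison,
# each with its own action.
report=$(awk '/^alice/ {print "found", $1} NR==1 {print "header:", $0}' "$people")

echo "$older"
echo "$report"

rm -f "$people"
```

Every record is tested against every pattern in order, so the output follows the input order, not the order of the rules.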

4.xargs

// Read data and reformat the output
➜  ~ cat aaa.txt 
111 aaa o
222 bbb p
333 ccc q
444 ddd o

➜  ~ cat aaa.txt|xargs    
111 aaa o 222 bbb p 333 ccc q 444 ddd o

➜  ~ cat aaa.txt |xargs -n2
111 aaa
o 222
bbb p
333 ccc
q 444
ddd o

5.wc, seq, for, awk, shell

// Get the number of columns in the file
➜  ~ columns=$(cat tmp.txt|head -n1|awk '{print NF}')
➜  ~ columns=$(cat tmp.txt|head -n1|wc -w)

wc -l counts lines

wc -w counts words

➜  ~ seq 1 $columns
1
2
3

seq usage:

seq [OPTION]... LAST

seq [OPTION]... FIRST LAST

seq [OPTION]... FIRST INCREMENT LAST
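
A short sketch of the three forms, with xargs joining each result onto one line for display. The variable names are made up for this example.

```shell
last_only=$(seq 3 | xargs)        # LAST only: counts up from 1
first_last=$(seq 2 4 | xargs)     # FIRST LAST
with_step=$(seq 1 2 7 | xargs)    # FIRST INCREMENT LAST

echo "$last_only"
echo "$first_last"
echo "$with_step"
```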

// for loop
➜  ~ for i in $(seq 1 $columns)
for> do
for> echo $i
for> done
1
2
3
➜  ~ for i in {1..3}    
do
echo $i
done
1
2
3

for loops

➜  ~ awk '{print a}' a=11 aaa.txt 
11
11
11
11
➜  ~ awk -v a=11 -v b=22 '{print a,b}'  aaa.txt 
11 22
11 22
11 22
11 22
➜  ~ awk '{print "'"$columns"'"}' aaa.txt
3
3
3
3
➜  ~ awk '{print $'"$columns"'}' aaa.txt
o
p
q
o

Passing external variables into awk

awk '{print a, b}' a=111 b=222 yourfile

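Note: the assignments must come before the file name; otherwise they have not been made yet when that file is read.

Also, variables passed this way are not available in BEGIN{}. The second method, shown next, solves that.

awk -v a=111 -v b=222 '{print a,b}' yourfile

Note: each variable needs its own -v.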
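
The difference between the two methods shows up in BEGIN. The following is a minimal sketch; the input file and the variable names (late, early) are assumptions for illustration.

```shell
input=$(mktemp)
printf '%s\n' 'one' 'two' > "$input"

# Assignment placed before the file name: it is performed when awk reaches
# that argument, so the variable is still empty in BEGIN.
late=$(awk 'BEGIN {print "BEGIN:[" a "]"} {print "main:[" a "]"}' a=111 "$input")

# -v assignment: performed before BEGIN runs.
early=$(awk -v a=111 'BEGIN {print "BEGIN:[" a "]"}')

echo "$late"
echo "$early"

rm -f "$input"
```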

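awk '{print "'"$LOGNAME"'"}' yourfile

To use an environment variable, splice it into the program as shown above, using:

"'"$LOGNAME"'"

The first ' closes the awk program string, "$LOGNAME" is expanded by the shell, and the next ' reopens the program; the pieces must be written without spaces between them. For exported variables, ENVIRON["LOGNAME"] inside the awk program is an alternative that needs no quote splicing.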

➜  ~ echo $tmp
aa a
➜  ~ echo "$tmp"
aa a
➜  ~ echo '$tmp'
$tmp

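Shell quoting

Inside single quotes ' ', everything is taken literally: variables and commands in the content are output exactly as written.

Inside double quotes " ", variables and command substitutions are expanded before output, instead of being printed literally.

When several quotes appear in a row, the first one opens a pair that runs to the next quote of the same kind; any other quote characters inside that pair are treated as ordinary characters.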
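
The quoting rules can be checked with a few assignments. This is only a sketch; the variable names and sample strings are made up for the example.

```shell
tmp='aa   a'

single='$tmp and $(date)'     # single quotes: nothing is expanded
double="value: $tmp"          # double quotes: $tmp is expanded, inner spaces kept
mixed="it's"' a "test"'       # two adjacent quoted pieces join into one word

echo "$single"
echo "$double"
echo "$mixed"
```

In mixed, the ' inside the double-quoted piece and the " inside the single-quoted piece are plain characters, which is the same trick used to splice a shell variable into an awk program.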

// Transpose a file (count the columns of the same file that is transposed)
➜  ~ columns=$(cat aaa.txt|head -n1|wc -w)
➜  ~ for i in $(seq 1 $columns)
do
awk '{print $'"$i"'}' aaa.txt | xargs
done
111 222 333 444
aaa bbb ccc ddd
o p q o

6.sort, uniq

// Count word frequencies
➜  ~ cat words.txt 
the day is sunny the the
the sunny is is

➜  ~ cat words.txt|xargs -n1|sort|uniq -c|sort -r|awk '{print $2,$1}' 
the 4
is 3
sunny 2
day 1

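uniq -c prefixes each line with the number of times that line occurred. It only merges adjacent duplicates, so sort the input first.

sort -r sorts in reverse order (add -n to compare the counts numerically).

sort [-k field1[,field2]] sorts by the given field(s), for example sort -k2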
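
Here is the word-frequency pipeline as a self-contained sketch with each stage annotated. It uses sort -rn instead of the plain sort -r shown above, so that counts are compared as numbers; the temp file and the variable name freq are assumptions for illustration.

```shell
words=$(mktemp)
printf '%s\n' 'the day is sunny the the' 'the sunny is is' > "$words"

# 1. xargs -n1 : put one word on each line
# 2. sort      : bring identical words together
# 3. uniq -c   : prefix each distinct word with its count
# 4. sort -rn  : order by count, numerically, largest first
# 5. awk       : swap the columns to "word count"
freq=$(xargs -n1 < "$words" | sort | uniq -c | sort -rn | awk '{print $2, $1}')

echo "$freq"
rm -f "$words"
```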

7.Special variables in shell scripts

➜  ~ cat shell.sh 
#! /bin/bash

echo '$0:' $0
echo '$1:' $1
echo '$2:' $2
echo '$$:' $$
echo '$#:' $#
echo '$@:' $@
echo '$?:' $?
echo '$*:' $*

➜  ~ ./shell.sh 1 2 3    
$0: ./shell.sh
$1: 1
$2: 2
$$: 26239
$#: 3
$@: 1 2 3
$?: 0
$*: 1 2 3

Difference between $* and $@

for arg in "$*"
do
	echo $arg
done
➜  ~ ./shell.sh 1 2 3
1 2 3

for arg in "$@"
do
	echo $arg
done

➜  ~ ./shell.sh 1 2 3
1
2
3

"$*" expands to all the arguments joined into one single string.

"$@" expands to each argument as its own separate string.
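
The same difference can be shown without a script file by counting how many words each form produces. The helper names count_args and demo are hypothetical, introduced only for this sketch.

```shell
# count_args reports how many arguments it received.
count_args() {
    echo $#
}

# demo passes its own arguments on in the two quoted forms.
demo() {
    star=$(count_args "$*")   # all arguments joined into one word
    at=$(count_args "$@")     # each argument kept as its own word
}

demo 'a b' c d
echo "\"\$*\" gives $star word, \"\$@\" gives $at words"
```

Note that "$@" also keeps 'a b' intact as one argument, which unquoted $@ or $* would not in bash.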

8.Reading a file line by line in shell

// IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
        echo "$line"
done < aaa.txt

9.for, while, and if in shell

for value in list
do
commands
done

while condition
do
commands
done

if [ condition1 ]
then
command1
elif [ condition2 ]
then
command2
else
command3
fi
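
The templates above, filled in as a small runnable sketch. The function name classify and the sample values are made up for illustration.

```shell
# classify prints whether a number is negative, zero, or positive.
classify() {
    if [ "$1" -lt 0 ]
    then
        echo "negative"
    elif [ "$1" -eq 0 ]
    then
        echo "zero"
    else
        echo "positive"
    fi
}

# for: iterate over a fixed list of values.
for n in -1 0 5
do
    echo "$n is $(classify "$n")"
done

# while: add up the numbers 1 to 3.
i=1
total=0
while [ "$i" -le 3 ]
do
    total=$((total + i))
    i=$((i + 1))
done
echo "sum of 1..3 = $total"
```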

10.awk if-else

➜  ~ cat aaa.txt 
111 aaa o
222 bbb p
333 ccc q
444 ddd o

➜  ~ awk '{if($1==111) print $2; else print $3}' aaa.txt
aaa
p
q
o

➜  ~ cat qq.tel 
12334:13510014336
12345:12334555666
12334:12343453453
12099:13598989899
12334:12345454545
12099:12343454544

➜  ~ cat qq.tel|sort -r|awk -F ":" '{if($1!=tmp) {tmp=$1;print "["$1"]";} else{print $2}}'
[12345]
[12334]
12345454545
12343453453
[12099]
12343454544

➜  ~ cat qq.tel|sort -r|awk -F ":" '{if($1!=tmp) {tmp=$1;print "["$1"]";} print $2}'      
[12345]
12334555666
[12334]
13510014336
12345454545
12343453453
[12099]
13598989899
12343454544

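Remember the semicolons: statements written on one line must be separated by ; (including the statement before else).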

11.tr

➜  ~ cat yuming.txt
http: //www . baidu. com/ index. html
http: / / www .baidu. com/1.html
http:/ / www . baidu. com/2. html
http: / /post . baidu. com/ index . html
http: / /mp3. baidu. com/ index. html
http:/ / www . baidu. com/3. html
http: / /post.baidu. com/2. html

cat yuming.txt|awk -F "/" '{print $3}'|tr -d " "
www.baidu.com
www.baidu.com
www.baidu.com
post.baidu.com
mp3.baidu.com
www.baidu.com
post.baidu.com

➜  ~ cat yuming.txt|awk -F "/" '{print $3}'|tr -d " "|tr "[a-z]" "[A-Z]"
WWW.BAIDU.COM
WWW.BAIDU.COM
WWW.BAIDU.COM
POST.BAIDU.COM
MP3.BAIDU.COM
WWW.BAIDU.COM
POST.BAIDU.COM

➜  ~ echo "hello world" | tr "hel" "xyz"
xyzzo worzd

➜  ~ echo "hello 123 world" | tr -c -d "0-9"
123
➜  ~ echo "hello 123 world" | tr -d "0-9"  
hello  world

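tr is mainly used to replace characters.

It works on its input one character at a time, replacing or deleting characters. It cannot replace whole words.

Commonly used options:

-c: operate on the complement of the given character set (that is, invert the match)
-d: delete characters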
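
A compact sketch of the three uses seen above: translating, deleting, and deleting the complement. The variable names are assumptions for illustration.

```shell
upper=$(echo "www.baidu.com" | tr 'a-z' 'A-Z')       # map each lowercase letter to uppercase
no_spaces=$(echo "www . baidu. com" | tr -d ' ')     # -d: delete every space
digits=$(echo "hello 123 world" | tr -cd '0-9')      # -c -d: delete everything that is not a digit

echo "$upper"
echo "$no_spaces"
echo "$digits"
```

With -cd the trailing newline is deleted too, since it is not in the set 0-9.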

12.cut

➜  ~ echo "hello world" | cut -b 1,3
hl
➜  ~ echo "hello world" | cut -b 1-3
hel
➜  ~ echo "hello world" | cut -b 2  
e


➜  ~ echo "hello world" | cut -c 2
e
➜  ~ echo "hello world" | cut -c 1,3
hl
➜  ~ echo "hello world" | cut -c 1-3
hel

➜  ~ echo "hello,world,ok" | cut -d , -f 1,3
hello,ok
➜  ~ echo "hello,world,ok" | cut -d , -f 1-3
hello,world,ok
➜  ~ echo "hello,world,ok" | cut -d , -f 2  
world

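cut is mainly used to slice strings.

It cuts pieces out of each input line and prints them. Three ways of cutting are supported:

By byte: cut -b LIST
By character: cut -c LIST
By field, using a given delimiter: cut -d 'DELIM' -f LIST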
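
A short sketch of LIST forms on one sample string, including an open-ended range. For plain ASCII input like this, -b and -c give the same result. The variable names are made up for the example.

```shell
line="hello,world,ok"

first_three=$(echo "$line" | cut -c 1-3)       # characters 1 through 3
picked=$(echo "$line" | cut -d , -f 1,3)       # fields 1 and 3, joined by the delimiter
from_second=$(echo "$line" | cut -d , -f 2-)   # field 2 through the end of the line

echo "$first_three"
echo "$picked"
echo "$from_second"
```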

13.sed examples

1) Substitution and deletion

➜  ~ cat yunm.txt 
http://www.baidu.com/more/
http://www.baidu.com/guding/more.htmlhttp://www.baidu.com/events/20060105/photomore.htmlhttp://hi.baidu.com/browse/http://www.sina.com.cn/head/www20021123am.shtmlhttp://www.sina.com.cn/head/www20041223am.shtml

// Substitute: sed 's/s1/s2/g'
➜  ~ cat yunm.txt| sed 's/http:\/\//\n/g'

www.baidu.com/more/

www.baidu.com/guding/more.html
www.baidu.com/events/20060105/photomore.html
hi.baidu.com/browse/
www.sina.com.cn/head/www20021123am.shtml
www.sina.com.cn/head/www20041223am.shtml

// Delete empty lines
➜  ~ cat yunm.txt| sed 's/http:\/\//\n/g'|sed '/^$/d'
www.baidu.com/more/
www.baidu.com/guding/more.html
www.baidu.com/events/20060105/photomore.html
hi.baidu.com/browse/
www.sina.com.cn/head/www20021123am.shtml
www.sina.com.cn/head/www20041223am.shtml
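
Substitution and deletion can also be combined in one sed call. The sketch below is a variation on the example above, not the same command: it strips the http:// prefix instead of replacing it with a newline, and uses | as the s delimiter so the slashes need no escaping. The sample file and the variable name hosts are assumptions.

```shell
urls=$(mktemp)
printf '%s\n' 'http://www.baidu.com/more/' '' 'http://hi.baidu.com/browse/' > "$urls"

# s|http://||g : remove every "http://"; | as delimiter avoids \/ escapes
# /^$/d        : delete lines that are empty
hosts=$(sed -e 's|http://||g' -e '/^$/d' "$urls")

echo "$hosts"
rm -f "$urls"
```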