The Linux Command Line-WILLIAM-Formatting Output


Commands for formatting text output:

  • nl —Number lines.
  • fold —Wrap each line to a specified length.
  • fmt —A simple text formatter.
  • pr —Format text for printing.
  • printf —Format and print data.
  • groff —A document formatting system.

Simple Formatting Tools

nl—Number Lines

The nl command numbers lines, similar to cat -n:

[me@linuxbox ~]$ nl distros.txt | head
	1 SUSE 		10.2 	12/07/2006
	2 Fedora 	10 		11/25/2008
	3 SUSE 		11.0 	06/19/2008
	4 Ubuntu 	8.04 	04/24/2008
	5 Fedora 	8 		11/08/2007
	6 SUSE 		10.3 	10/04/2007
	7 Ubuntu 	6.10 	10/26/2006
	8 Fedora 	7 		05/31/2007
	9 Ubuntu 	7.10 	10/18/2007
	10 Ubuntu 	7.04 	04/19/2007

Like cat, nl accepts either multiple file names as command-line arguments or standard input. However, nl supports more complex numbering.

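nl supports the concept of logical pages when numbering, which allows it to reset the numerical sequence. Options can set the starting number to a specific value and, to a limited extent, control its format. A logical page is further divided into a header, a body, and a footer. Within each of these sections, line numbering may be reset or assigned a different style. If nl is given multiple files, it treats them as a single stream of text. Sections in the text stream are indicated by the following markup, each element alone on its own line: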

Table 21-1: nl Markup

Markup    Meaning
\:\:\:    Start of logical-page header
\:\:      Start of logical-page body
\:        Start of logical-page footer

When nl processes one of the markup elements in the table above, it deletes it from the text stream.
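As a minimal sketch of the logical-page markup (the sample text is made up for illustration, and default nl options are assumed), the following feeds nl a one-page stream with a header, a two-line body, and a footer:

```shell
# Build a tiny logical page; single quotes keep the backslashes literal.
page=$(printf '%s\n' \
    '\:\:\:' 'Sample Header' \
    '\:\:'   'first body line' 'second body line' \
    '\:'     'Sample Footer')

# With default options only the body lines are numbered (header and
# footer numbering default to "none"), and the markup text itself is
# removed from the output.
numbered=$(printf '%s\n' "$page" | nl)
printf '%s\n' "$numbered"
```

Only the two body lines receive numbers here; the header and footer pass through unnumbered.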

Table 21-2: Common nl Options

Option       Meaning
-b style     Set body numbering to style, where style is one of the following:
             · a Number all lines.
             · t Number only non-blank lines. This is the default.
             · n None.
             · pregexp Number only lines matching basic regular expression regexp.
-f style     Set footer numbering to style. Default is n (none).
-h style     Set header numbering to style. Default is n (none).
-i number    Set page numbering increment to number. Default is 1.
-n format    Set numbering format to format, where format is one of the following:
             · ln Left justified, without leading zeros.
             · rn Right justified, without leading zeros. This is the default.
             · rz Right justified, with leading zeros.
-p           Do not reset page numbering at the beginning of each logical page.
-s string    Add string to the end of each line number to create a separator. Default is a single tab character.
-v number    Set first line number of each logical page to number. Default is 1.
-w width     Set width of the line number field to width. Default is 6.
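To see a few of these options working together, here is a small sketch. The three-line sample input is an assumption made for illustration; the options are the ones listed in the table.

```shell
# Three input lines, the middle one blank.
sample=$(printf '%s\n' 'alpha' '' 'beta')

# Default (-b t): the blank line is not numbered, so "beta" is line 2.
default_out=$(printf '%s\n' "$sample" | nl)

# -b a numbers every line, -n rz zero-pads the number, -w 3 sets the
# field width, and -s ': ' replaces the default tab separator.
custom_out=$(printf '%s\n' "$sample" | nl -b a -n rz -w 3 -s ': ')

printf '%s\n' "$default_out" "$custom_out"
```

With the custom options the last line comes out as "003: beta", whereas the default numbering labels the same line 2.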

We can add nl to the report pipeline to produce a numbered report. Change the contents of the distros-nl.sed script to the following:

# sed script to produce Linux distributions report
1 i\
\\:\\:\\:\
\
Linux Distributions Report\
\
Name Ver. Released\
---- ---- --------\
\\:\\:
s/\([0-9]\{2\}\)\/\([0-9]\{2\}\)\/\([0-9]\{4\}\)$/\3-\1-\2/
$ a\
\\:\
\
End Of Report

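The markup in the script above is written with doubled backslashes because sed normally interprets a backslash as an escape character. Run the following command to produce the report: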

[me@linuxbox ~]$ sort -k 1,1 -k 2n distros.txt | sed -f distros-nl.sed | nl

		Linux Distributions Report

		Name 	Ver. 	Released
		---- 	---- 	--------
	
	1 	Fedora 	5 		2006-03-20
	2 	Fedora 	6 		2006-10-24
	3 	Fedora 	7 		2007-05-31
	4 	Fedora 	8 		2007-11-08
	5 	Fedora 	9 		2008-05-13
	6 	Fedora 	10 		2008-11-25
	7 	SUSE 	10.1 	2006-05-11
	8 	SUSE 	10.2 	2006-12-07
	9 	SUSE 	10.3 	2007-10-04
	10 	SUSE 	11.0 	2008-06-19
	11 	Ubuntu 	6.06	2006-06-01
	12 	Ubuntu 	6.10	2006-10-26
	13 	Ubuntu 	7.04	2007-04-19
	14 	Ubuntu 	7.10	2007-10-18
	15 	Ubuntu 	8.04	2008-04-24
	16 	Ubuntu 	8.10	2008-10-30


		End Of Report

fold—Wrap Each Line to a Specified Length

[me@linuxbox ~]$ echo "The quick brown fox jumped over the lazy dog." | fold -w 12
The quick br
own fox jump
ed over the
lazy dog.

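The text sent by echo is broken into segments of the width given after -w, which here is 12 characters. If no width is specified, the default is 80 characters. Adding the -s option makes fold break each line at the last available space before the width is reached: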

[me@linuxbox ~]$ echo "The quick brown fox jumped over the lazy dog." | fold -w 12 -s
The quick
brown fox
jumped over
the lazy
dog.
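The difference between the two modes can be checked with a short sketch (the variable names are illustrative only). Because fold only inserts newlines and never drops characters, removing the newlines again restores the original text in both cases:

```shell
text="The quick brown fox jumped over the lazy dog."

hard=$(printf '%s\n' "$text" | fold -w 12)      # break exactly at column 12
soft=$(printf '%s\n' "$text" | fold -w 12 -s)   # break at the last space that fits

# Show both results, then the longest line of the -s version.
printf '%s\n' "$hard" "---" "$soft"
printf '%s\n' "$soft" | awk '{ if (length($0) > max) max = length($0) } END { print "longest:", max }'

# Stripping the newlines gives back the original sentence.
printf '%s\n' "$soft" | tr -d '\n'; echo
```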

fmt—A Simple Text Formatter

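fmt accepts files or standard input and performs paragraph formatting on the text stream. It fills and joins lines in text while preserving blank lines and indentation.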

The following description is taken from the fmt documentation at www.gnu.org/software/co…

fmt reads from the specified file arguments (or standard input if none are given), and writes to standard output.

By default, blank lines, spaces between words, and indentation are preserved in the output; successive input lines with different indentation are not joined; tabs are expanded on input and introduced on output.

fmt prefers breaking lines at the end of a sentence, and tries to avoid line breaks after the first word of a sentence or before the last word of a sentence. A sentence break is defined as either the end of a paragraph or a word ending in any of ‘.?!’, followed by two spaces or end of line, ignoring any intervening parentheses or quotes. Like TeX, fmt reads entire “paragraphs” before choosing line breaks; the algorithm is a variant of that given by Donald E. Knuth and Michael F. Plass in “Breaking Paragraphs Into Lines”, Software—Practice & Experience 11, 11 (November 1981), 1119–1184.

Copy this text into a file named fmt-info.txt. To reformat it to fit a column 50 characters wide, use the following command:

[me@linuxbox ~]$ fmt -w 50 fmt-info.txt | head
  fmt reads from the specified file arguments
  (or standard input if none are given), and
  writes to standard output.

  By default, blank lines, spaces between
  words, and indentation are preserved in the
  output; successive input lines with different
  indentation are not joined; tabs are expanded
  on input and introduced on output.

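fmt also provides the -c option (crown margin mode), which preserves the indentation of the first two lines of each paragraph: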

[me@linuxbox ~]$ fmt -cw 50 fmt-info.txt
  fmt reads from the specified file arguments
(or standard input if none are given), and writes
to standard output.
  By default, blank lines, spaces between words,
and indentation are preserved in the output;
successive input lines with different indentation
are not joined; tabs are expanded on input and
introduced on output.
  fmt prefers breaking lines at the end of a
  sentence, and tries to avoid line breaks after

Table 21-3: fmt Options

Option       Description
-c           Operate in crown margin mode. This preserves the indentation of the first two lines of a paragraph. Subsequent lines are aligned with the indentation of the second line.
-p string    Format only those lines beginning with the prefix string. After formatting, the contents of string are prefixed to each reformatted line. This option can be used to format text in source code comments. For example, any programming language or configuration file that uses a # character to delineate a comment could be formatted by specifying -p '# ' so that only the comments will be formatted. See the example below.
-s           Split-only mode. In this mode, lines will be split only to fit the specified column width. Short lines will not be joined to fill lines. This mode is useful when formatting text, such as code, where joining is not desired.
-u           Perform uniform spacing. This will apply traditional “typewriter-style” formatting to the text. This means a single space between words and two spaces between sentences. This mode is useful for removing justification, that is, forced alignment to both the left and right margins.
-w width     Format text to fit within a column width characters wide. The default is 75 characters. Note: fmt actually formats lines slightly shorter than the specified width to allow for line balancing.
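The effect of -s and -u is easiest to see side by side with the default behavior. The following is a small sketch with made-up input; the width of 60 is an arbitrary choice that is comfortably larger than the joined line.

```shell
lines=$(printf '%s\n' 'short line one' 'short line two')

# Default mode fills the paragraph: both lines fit within 60 columns,
# so they are joined into a single line.
filled=$(printf '%s\n' "$lines" | fmt -w 60)

# Split-only mode never joins lines, so both lines pass through unchanged.
split_only=$(printf '%s\n' "$lines" | fmt -s -w 60)

# Uniform spacing collapses the runs of spaces between words.
uniform=$(printf '%s\n' 'too    many   spaces' | fmt -u)

printf '%s\n' "$filled" "$split_only" "$uniform"
```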

The -p option above is particularly interesting. To demonstrate how it works, first create the file fmt-code.txt:

[me@linuxbox ~]$ cat > fmt-code.txt
# This file contains code with comments.

# This line is a comment.
# Followed by another comment line.
# And another.

This, on the other hand, is a line of code.
And another line of code.
And another.

The file contains four comment lines beginning with #. fmt can reformat just the comments:

[me@linuxbox ~]$ fmt -w 50 -p '# ' fmt-code.txt
# This file contains code with comments.

# This line is a comment. Followed by another
# comment line. And another.

This, on the other hand, is a line of code.
And another line of code.
And another.

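The adjoining comment lines have been joined, while the blank lines and the lines that do not begin with the specified prefix are preserved.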
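The same behavior can be checked in a script. This sketch writes a shortened version of the file to a temporary location (the temporary file and variable names are assumptions for illustration) and counts comment lines before and after formatting rather than relying on the exact spacing fmt produces:

```shell
# Recreate a small file similar to fmt-code.txt in a temporary location.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
# This line is a comment.
# Followed by another comment line.
# And another.

This, on the other hand, is a line of code.
And another line of code.
EOF

formatted=$(fmt -w 50 -p '# ' "$tmpfile")
printf '%s\n' "$formatted"

# Joining the comment lines should reduce their count, while lines
# without the prefix pass through untouched.
before=$(grep -c '^#' "$tmpfile")
after=$(printf '%s\n' "$formatted" | grep -c '^#')
echo "comment lines: $before -> $after"

rm -f "$tmpfile"
```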

pr—Format Text for Printing

pr paginates text, adding a header to each page:

[me@linuxbox ~]$ pr -l 15 -w 65 distros.txt


2012-12-11 18:27 distros.txt Page 1


SUSE 10.2 12/07/2006
Fedora 10 11/25/2008
SUSE 11.0 06/19/2008
Ubuntu 8.04 04/24/2008
Fedora 8 11/08/2007


2012-12-11 18:27 distros.txt Page 2


SUSE 10.3 10/04/2007
Ubuntu 6.10 10/26/2006
Fedora 7 05/31/2007
Ubuntu 7.10 10/18/2007
Ubuntu 7.04 04/19/2007

The command above uses -l (page length) and -w (page width) to define a page that is 65 characters wide and 15 lines long.
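A self-contained sketch of the same pagination, using ten generated lines instead of distros.txt. The -h option, which sets the header text, and the title 'Sample Numbers' are additions for illustration; the page count assumes pr's standard 5-line header and 5-line trailer.

```shell
# Ten input lines; with -l 15, the header and trailer take 5 lines each,
# which leaves 5 body lines per page.
paged=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | pr -l 15 -w 65 -h 'Sample Numbers')

printf '%s\n' "$paged"

# Each page header ends with "Page N", so counting those lines gives
# the number of pages produced.
pages=$(printf '%s\n' "$paged" | grep -c 'Page [0-9]')
echo "pages: $pages"
```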

printf—Format and Print Data

printf does not accept standard input, so it is not used in pipelines, but it is widely used nonetheless.

printf (print formatted) originated in the C programming language and has since been implemented in many languages, including the shell. In fact, printf is a bash builtin.

printf "format" arguments

For example:

[me@linuxbox ~]$ printf "I formatted the string: %s\n" foo
I formatted the string: foo

The conversion specifiers for the common data types are listed in the table below:

Table 21-4: Common printf Data-Type Specifiers

Specifier    Description
d            Format a number as a signed decimal integer.
f            Format and output a floating-point number.
o            Format an integer as an octal number.
s            Format a string.
x            Format an integer as a hexadecimal number using lowercase a–f where needed.
X            Same as x, but use uppercase letters.
%            Print a literal % symbol (i.e., specify “%%”).

For example, print 380 using each of the conversion specifiers:

[me@linuxbox ~]$ printf "%d, %f, %o, %s, %x, %X\n" 380 380 380 380 380 380
380, 380.000000, 574, 380, 17c, 17C

Because the format string contains six conversion specifiers, six matching arguments must follow it.
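When the counts do not match, the shell's printf does not fail. As a brief sketch of the behavior in bash (the sample arguments are made up): extra arguments cause the format to be reused, and a missing numeric argument is treated as zero.

```shell
# Two specifiers, four arguments: the format is applied twice.
pairs=$(printf '%s=%s\n' name fedora version 10)
printf '%s\n' "$pairs"

# Two specifiers, one argument: the missing %d argument falls back to 0.
missing=$(printf '%s:%d\n' only)
printf '%s\n' "$missing"
```

Format reuse is what makes it convenient to print a whole list with a single call, one item or one pair per line.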

Optional components can be added to a conversion specifier to adjust the output. The general form is:

%[flags][width][.precision]conversion_specification

Table 21-5: printf Conversion-Specification Components

Component     Description
flags         There are five different flags:
              · # Use the alternate format for output. This varies by data type. For o (octal number) conversion, the output is prefixed with 0 (zero). For x and X (hexadecimal number) conversions, the output is prefixed with 0x or 0X respectively.
              · 0 (zero) Pad the output with zeros. This means that the field will be filled with leading zeros, as in 000380.
              · - (dash) Left-align the output. By default, printf right-aligns output.
              · ' ' (space) Produce a leading space for positive numbers.
              · + (plus sign) Sign positive numbers. By default, printf signs only negative numbers.
width         A number specifying the minimum field width.
.precision    For floating-point numbers, specify the number of digits of precision to be output after the decimal point. For string conversion, precision specifies the number of characters to output.

The table below shows some examples:

Table 21-6: printf Conversion Specification Examples

Argument       Format       Result         Notes
380            "%d"         380            Simple formatting of an integer.
380            "%#x"        0x17c          Integer formatted as a hexadecimal number using the alternate format flag.
380            "%05d"       00380          Integer formatted with leading zeros (padding) and a minimum field width of five characters.
380            "%05.5f"     380.00000      Number formatted as a floating-point number with padding and 5 decimal places of precision. Since the specified minimum field width (5) is less than the actual width of the formatted number, the padding has no effect.
380            "%010.5f"    0380.00000     Increasing the minimum field width to 10 makes the padding visible.
380            "%+d"        +380           The + flag signs a positive number.
380            "%-d"        380            The - flag left-aligns the formatting.
abcdefghijk    "%5s"        abcdefghijk    A string is formatted with a minimum field width.
abcdefghijk    "%.5s"       abcde          By applying precision to a string, it is truncated.
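Several of these conversions can be tried directly in the shell. In this sketch, the field widths of 8 and 15 and the surrounding brackets are choices made here to make the alignment visible (they are not in the table), and LC_ALL=C is set on the assumption that a period is wanted as the decimal point.

```shell
# Pin the locale so that %f uses "." as the decimal point.
LC_ALL=C
export LC_ALL

alt_hex=$(printf '%#x' 380)             # alternate format adds the 0x prefix
zero_pad=$(printf '%05d' 380)           # zero padding to five characters
float_pad=$(printf '%010.5f' 380)       # padding plus five decimal places
signed=$(printf '%+d' 380)              # explicit plus sign
left=$(printf '[%-8d]' 380)             # left-aligned in an 8-character field
right=$(printf '[%15s]' abcdefghijk)    # right-aligned in a 15-character field
truncated=$(printf '%.5s' abcdefghijk)  # precision truncates the string

printf '%s\n' "$alt_hex" "$zero_pad" "$float_pad" "$signed" "$left" "$right" "$truncated"
```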

printf is mostly used in scripts to format tabular data. For example, to output fields separated by tab characters:

[me@linuxbox ~]$ printf "%s\t%s\t%s\n" str1 str2 str3
str1	str2	str3

To output neatly formatted numbers:

[me@linuxbox ~]$ printf "Line: %05d %15.3f Result: %+15d\n" 1071 3.14156295 32589
Line: 01071           3.142 Result:          +32589
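Putting field widths to work on tabular data, here is a minimal sketch of a row formatter. The function name print_row and the column widths (8, 6, and 12) are hypothetical choices for illustration; the sample rows reuse values from the distributions report.

```shell
# Print one row: name left-aligned in 8 columns, version and release
# date right-aligned in 6 and 12 columns.
print_row() {
    printf '%-8s %6s %12s\n' "$1" "$2" "$3"
}

print_row Name Ver. Released
print_row Fedora 10 2008-11-25
print_row Ubuntu 8.04 2008-04-24

# Every row has the same total width: 8 + 1 + 6 + 1 + 12 characters.
row=$(print_row SUSE 11.0 2008-06-19)
echo "row width: ${#row}"
```

Fixed widths keep the columns aligned regardless of how long each value is, as long as no value exceeds its field width.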