Common Linux Commands: sed

sed

NAME

  • sed - stream editor for filtering and transforming text
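  • Text stream editing: sed is a non-interactive, stream-oriented editor. It can process multiple files and many lines in a single run. By default it leaves the original file untouched and writes the result to the screen, either the whole file or only the lines that match a pattern. It can also modify the original file in place, in which case nothing is printed to the screen.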

SYNOPSIS

  • sed [OPTION]... {script-only-if-no-other-script} [input-file]...
  • Command form: sed [option] 'sed command' filename
  • Script-file form: sed [option] -f 'sed script' filename

DESCRIPTION

Sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed's ability to filter text in a pipeline which particularly distinguishes it from other types of editors.

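A simplified view of how sed processes its input:

  • Read the next line of input into the buffer (the pattern space);
  • Take the first instruction from the script and check whether its pattern (address) matches the buffered line;
  • If it does not match, skip that instruction's editing command and return to step 2 to take the next instruction;
  • If it matches, apply the editing command to the buffered line, then return to step 2 to take the next instruction;
  • Once every instruction has been applied, print the contents of the buffer and return to step 1 to read the next line;
  • When all lines have been processed, stop.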
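
The cycle described above can be observed with a small sketch. It assumes GNU sed and a POSIX shell; the three-line sample input and the variable names are invented for illustration. Every line is read into the pattern space, each command whose address matches is applied in order, and the result is printed automatically at the end of the cycle.

```shell
# Sample input, made up for illustration.
input=$(printf 'one\ntwo\nthree')

# Two commands are tried against every line, in order:
#   2s/two/TWO/  applies only when the current line is line 2
#   /e$/s/$/!/   applies only when the line ends in "e"
# After both commands have been tried, the pattern space is printed automatically.
cycle_out=$(printf '%s\n' "$input" | sed -e '2s/two/TWO/' -e '/e$/s/$/!/')
printf '%s\n' "$cycle_out"
# one!
# TWO
# three!
```

Line 2 is changed by the first command only; by the time the second command is tried, the pattern space holds "TWO", which no longer ends in a lowercase "e".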

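Actions

  • a : append. The text after a is added as a new line after the current line.
  • c : change. The text after c replaces the lines in the range n1,n2.
  • d : delete. Since it only deletes, d normally takes no argument.
  • i : insert. The text after i is added as a new line before the current line.
  • p : print. Prints the selected lines; p is usually combined with sed -n.
  • s : substitute. Performs search and replace directly, usually with a regular expression, for example 1,20s/old/new/g.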
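
A minimal sketch of the six actions side by side, assuming GNU sed (the one-line forms `2a text`, `2i text` and `2c text` are GNU extensions; POSIX requires a backslash and a newline after the command letter). The file name actions_demo.txt and its contents are invented for illustration.

```shell
# Sample file, invented for illustration.
printf 'one\ntwo\nthree\n' > actions_demo.txt

sed '2a appended' actions_demo.txt   # one, two, appended, three
sed '2i inserted' actions_demo.txt   # one, inserted, two, three
sed '2c replaced' actions_demo.txt   # one, replaced, three
sed '2d' actions_demo.txt            # one, three
sed -n '2p' actions_demo.txt         # two
sed '1,2s/o/0/g' actions_demo.txt    # 0ne, tw0, three
```

None of these commands changes actions_demo.txt itself, because -i is not used.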

-n, --quiet, --silent

  • suppress automatic printing of pattern space
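  • Quiet mode: the pattern space is not printed automatically. Output appears only where the script asks for it explicitly, for example with the p flag of s, which prints just the lines that were matched and changed.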
# A world
# nihao
echo -e 'hello world\nnihao' | sed 's/hello/A/'

# With -n alone, nothing is printed
echo -e 'hello world\nnihao' | sed -n 's/hello/A/'

# With -n plus the p flag, only the lines that were matched and changed are printed
# A world
echo -e 'hello world\nnihao' | sed -n 's/hello/A/p'

-e script, --expression=script

  • add the script to the commands to be executed
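  • When several operations are needed on the text, pass several sub-commands, one per -e option (or separate them with semicolons in a single script).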
# A B
echo -e 'hello world' | sed -e 's/hello/A/' -e 's/world/B/'

# A B
echo -e 'hello world' | sed 's/hello/A/;s/world/B/'

-f script-file, --file=script-file

  • add the contents of script-file to the commands to be executed
echo "s/hello/A/
s/world/B/" > sed.script
# Write the sub-commands to a script file, then point sed at it with the -f option
# A B
echo "hello world" | sed -f sed.script

--follow-symlinks

  • follow symlinks when processing in place

-i[SUFFIX], --in-place[=SUFFIX]

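By default sed reads each input line into the pattern space, which can be thought of as an in-memory buffer. The sed commands operate on the contents of the pattern space, not on the file itself. So after sed changes the pattern space, the result is printed to standard output rather than written back to the input file.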

  • edit files in place (makes backup if SUFFIX supplied)
  • To modify the input file itself, pass the -i option.
echo "hello world" > file.txt
# hello world
cat file.txt
# A world
sed 's/hello/A/' file.txt
# hello world
cat file.txt

# A world
sed -i 's/hello/A/' file.txt
# A world
cat file.txt

echo "hello world" > file.txt
if [ -f file.txt.bak ];then
    rm file.txt.bak
fi
# file.txt  sed.sh
ls .
# Save the change to file.txt and keep the original, unmodified content in file.txt.bak, so that a mistaken edit can still be undone.
sed -i.bak 's/hello/A/' file.txt
# file.txt  file.txt.bak    sed.sh
ls .
# A world
cat file.txt
# hello world
cat file.txt.bak

-l N, --line-length=N

  • specify the desired line-wrap length for the `l' command

--posix

  • disable all GNU extensions.

-r, --regexp-extended

  • use extended regular expressions in the script.
  • Enables extended regular expression (ERE) syntax, so grouping and alternation need no backslashes.
#A A
echo "hello world" | sed -r 's/(hello)|(world)/A/g'

-s, --separate

  • consider files as separate rather than as a single continuous long stream.
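
A small sketch of what -s changes, assuming GNU sed (the option is a GNU extension). The file names sep_a.txt and sep_b.txt and their contents are invented for illustration. The `$` address means "last line", so it shows clearly whether the inputs are treated as one stream or as separate ones.

```shell
# Two sample files, invented for illustration.
printf 'a1\na2\n' > sep_a.txt
printf 'b1\nb2\n' > sep_b.txt

# Without -s the files form one continuous stream: "$" is the last line of sep_b.txt only.
joined=$(sed -n '$p' sep_a.txt sep_b.txt)

# With -s each file is its own stream: "$" matches the last line of every file.
separate=$(sed -s -n '$p' sep_a.txt sep_b.txt)

printf 'without -s:\n%s\n' "$joined"     # b2
printf 'with -s:\n%s\n' "$separate"      # a2, then b2
```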

-u, --unbuffered

  • load minimal amounts of data from the input files and flush the output buffers more often

-z, --null-data

  • separate lines by NUL characters
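
A hedged sketch of -z, assuming GNU sed and a printf that understands `\0`. With NUL as the record separator, a newline becomes ordinary data inside the pattern space and can be matched with `\n`. The sample input is invented; tr is used only to make the NUL-separated output readable.

```shell
# Two NUL-terminated records; the first contains an embedded newline.
nul_out=$(printf 'a\nb\0c\0' | sed -z 's/\n/,/g' | tr '\0' '\n')
printf '%s\n' "$nul_out"
# a,b
# c
```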

--help

  • display this help and exit

--version

  • output version information and exit

If no -e, --expression, -f, or --file option is given, then the first non-option argument is taken as the sed script to interpret. All remaining arguments are names of input files; if no input files are specified, then the standard input is read.

COMMAND SYNOPSIS

This is just a brief synopsis of sed commands to serve as a reminder to those who already know sed; other documentation (such as the texinfo document) must be consulted for fuller descriptions.

Zero-address "commands"

: label

  • Label for b and t commands.

#comment

  • The comment extends until the next newline (or the end of a -e script fragment).

}

  • The closing bracket of a { } block.

Zero- or One- address commands

=

  • Print the current line number.
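
As a quick illustration, combining `=` with the `$` address and -n gives a line counter. This is only a sketch with made-up input.

```shell
# "$=" prints the line number of the last line, i.e. the number of lines.
line_count=$(printf 'a\nb\nc\n' | sed -n '$=')
printf '%s\n' "$line_count"   # 3
```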

a \
text

  • Append text, which has each embedded newline preceded by a backslash.
  • Appends new text after the addressed line.

i \
text

  • Insert text, which has each embedded newline preceded by a backslash.
  • Inserts new text before the addressed line.

q [exit-code]

  • Immediately quit the sed script without processing any more input, except that if auto-print is not disabled the current pattern space will be printed. The exit code argument is a GNU extension.

Q [exit-code]

  • Immediately quit the sed script without processing any more input. This is a GNU extension.

r filename

  • Append text read from filename.

R filename

  • Append a line read from filename. Each invocation of the command reads a line from the file. This is a GNU extension.
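
The difference between q and Q is easiest to see side by side. The sketch below assumes GNU sed (Q and the exit-code argument are GNU extensions) and uses seq to generate sample input.

```shell
# q prints the current pattern space before quitting; Q quits silently.
q_out=$(seq 5 | sed '2q')
Q_out=$(seq 5 | sed '2Q')
printf 'q: %s\n' "$(printf '%s' "$q_out" | tr '\n' ' ')"   # q: 1 2
printf 'Q: %s\n' "$Q_out"                                  # Q: 1

# The optional exit code becomes the exit status of sed.
status=0
seq 3 | sed '2q7' > /dev/null || status=$?
printf 'exit status: %s\n' "$status"                       # exit status: 7
```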

Commands which accept address ranges

{

  • Begin a block of commands (end with a }).

b label

  • Branch to label; if label is omitted, branch to end of script.

c \
text

  • Replace the selected lines with text, which has each embedded newline preceded by a backslash.

d

  • Delete pattern space. Start next cycle.
  • Deletes the addressed line.

D

  • If pattern space contains no newline, start a normal new cycle as if the d command was issued. Otherwise, delete text in the pattern space up to the first newline, and restart cycle with the resultant pattern space, without reading a new line of input.

h H

  • Copy/append pattern space to hold space.

g G

  • Copy/append hold space to pattern space.
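
A classic use of the hold space is printing a file in reverse line order. The following is a sketch with made-up three-line input, not the only way to do it.

```shell
# 1!G : on every line except the first, append the hold space to the pattern space
# h   : copy the (growing) pattern space back into the hold space
# $p  : on the last line, print the accumulated result
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
# 3
# 2
# 1
```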

l

  • List out the current line in a "visually unambiguous" form.

l width

  • List out the current line in a "visually unambiguous" form, breaking it at width characters. This is a GNU extension.

n N

  • Read/append the next line of input into the pattern space.
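
A short sketch contrasting n and N on made-up four-line input. N keeps the current line and appends the next one after a newline, while n replaces the pattern space with the next line (printing the old one first unless -n is in effect).

```shell
# N: join each pair of lines by replacing the embedded newline with a space.
joined_pairs=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')
printf '%s\n' "$joined_pairs"
# a b
# c d

# n: print the current line, load the next one, then delete it.
# The effect is that every second line is dropped.
odd_lines=$(printf 'a\nb\nc\nd\n' | sed 'n;d')
printf '%s\n' "$odd_lines"
# a
# c
```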

p

  • Print the current pattern space.

P

  • Print up to the first embedded newline of the current pattern space.

s/regexp/replacement/

  • Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
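
A brief sketch of & and the \1 to \9 back-references, using basic regular expression syntax (groups written as `\(...\)`). The sample strings are invented for illustration.

```shell
# & stands for the whole matched text.
amp_out=$(echo 'hello world' | sed 's/hello/[&]/')
printf '%s\n' "$amp_out"     # [hello] world

# \1 and \2 refer to the first and second parenthesised groups.
swap_out=$(echo 'John Smith' | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/')
printf '%s\n' "$swap_out"    # Smith, John
```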

t label

  • If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.

T label

  • If no s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script. This is a GNU extension.
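
The t command makes simple loops possible: repeat a substitution until it stops matching. The sketch below squeezes runs of spaces one step at a time; the label name again and the input string are invented for illustration. Separate -e fragments are used so the label ends cleanly.

```shell
# Replace one double space per pass and loop while a replacement was made.
squeezed=$(echo 'a    b  c' | sed -e ':again' -e 's/  / /' -e 't again')
printf '%s\n' "$squeezed"    # a b c
```

In practice `s/  */ /g` does the same job in one step; the loop form only illustrates how t branches after a successful s///.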

w filename

  • Write the current pattern space to filename.

W filename

  • Write the first line of the current pattern space to filename. This is a GNU extension.
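
A minimal sketch of w: lines selected by an address are written to a file while -n keeps standard output empty. The file name even_lines.txt is invented for illustration.

```shell
# Write every line containing 2, 4 or 6 to even_lines.txt; print nothing.
seq 6 | sed -n '/[246]/w even_lines.txt'
cat even_lines.txt
# 2
# 4
# 6
```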

x

  • Exchange the contents of the hold and pattern spaces.

y/source/dest/

  • Transliterate the characters in the pattern space which appear in source to the corresponding character in dest.
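
For example, y can upper-case text by mapping each lowercase letter to its uppercase counterpart. Unlike tr, y does not accept ranges such as a-z, so both lists are spelled out in this sketch.

```shell
upper=$(echo 'hello sed' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')
printf '%s\n' "$upper"    # HELLO SED
```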

Addresses

Sed commands can be given with no addresses, in which case the command will be executed for all input lines; with one address, in which case the command will only be executed for input lines which match that address; or with two addresses, in which case the command will be executed for all input lines which match the inclusive range of lines starting from the first address and continuing to the second address. Three things to note about address ranges: the syntax is addr1,addr2 (i.e., the addresses are separated by a comma); the line which addr1 matched will always be accepted, even if addr2 selects an earlier line; and if addr2 is a regexp, it will not be tested against the line that addr1 matched.

After the address (or address-range), and before the command, a ! may be inserted, which specifies that the command shall only be executed if the address (or address-range) does not match.
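
Two of the points above, sketched with seq as sample input: negating a range with !, and the rule that the line matched by addr1 is always accepted even when addr2 selects an earlier line.

```shell
# "!" inverts the address: print everything outside lines 2-4.
outside=$(seq 5 | sed -n '2,4!p')
printf '%s\n' "$outside"
# 1
# 5

# addr2 (line 2) is earlier than addr1 (line 4): only line 4 matches.
range_quirk=$(seq 5 | sed -n '4,2p')
printf '%s\n' "$range_quirk"   # 4
```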

The following address types are supported:

number

  • Match only the specified line number (which increments cumulatively across files, unless the -s option is specified on the command line).

first~step

  • Match every step'th line starting with line first. For example, "sed -n 1~2p" will print all the odd-numbered lines in the input stream, and the address 2~5 will match every fifth line, starting with the second. first can be zero; in this case, sed operates as if it were equal to step. (This is an extension.)

$

  • Match the last line.

/regexp/

  • Match lines matching the regular expression regexp.

\cregexpc

  • Match lines matching the regular expression regexp. The c may be any character.

GNU sed also supports some special 2-address forms:

0,addr2

  • Start out in "matched first address" state, until addr2 is found. This is similar to 1,addr2, except that if addr2 matches the very first line of input the 0,addr2 form will be at the end of its range, whereas the 1,addr2 form will still be at the beginning of its range. This works only when addr2 is a regular expression.

addr1,+N

  • Will match addr1 and the N lines following addr1.

addr1,~N

  • Will match addr1 and the lines following addr1 until the next line whose input line number is a multiple of N.
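
The GNU-specific address forms can be tried out as follows. This is a sketch assuming GNU sed; seq and the three-line "x" input are just convenient sample data.

```shell
odd=$(seq 10 | sed -n '1~2p')           # every 2nd line starting at 1: 1 3 5 7 9
plus=$(seq 10 | sed -n '2,+2p')         # line 2 and the 2 lines after it: 2 3 4
until_mult=$(seq 10 | sed -n '2,~4p')   # line 2 up to the next multiple of 4: 2 3 4

# 0,/x/ can end its range on line 1; 1,/x/ cannot, because addr2 is
# not tested against the line that addr1 matched.
zero_form=$(printf 'x\nx\nx\n' | sed '0,/x/s/x/y/')   # y x x
one_form=$(printf 'x\nx\nx\n' | sed '1,/x/s/x/y/')    # y y x

for v in "$odd" "$plus" "$until_mult" "$zero_form" "$one_form"; do
    printf '%s\n' "$v" | tr '\n' ' '
    printf '\n'
done
```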

Examples

#! /bin/bash

echo "1) A Storm of Swords, George R. R. Martin, 1216 
2) The Two Towers, J. R. R. Tolkien, 352 
3) The Alchemist, Paulo Coelho, 197 
4) The Fellowship of the Ring, J. R. R. Tolkien, 432 
5) The Pilgrimage, Paulo Coelho, 288 
6) A Game of Thrones, George R. R. Martin, 864" > books.txt

# 1) A Storm of Swords, George R. R. Martin, 1216 
# 2) The Two Towers, J. R. R. Tolkien, 352 
# 3) The Alchemist, Paulo Coelho, 197 
# 4) The Fellowship of the Ring, J. R. R. Tolkien, 432 
# 5) The Pilgrimage, Paulo Coelho, 288 
# 6) A Game of Thrones, George R. R. Martin, 864
sed '' books.txt

# Delete three lines; each -e option supplies one separate command
# 3) The Alchemist, Paulo Coelho, 197 
# 4) The Fellowship of the Ring, J. R. R. Tolkien, 432 
# 6) A Game of Thrones, George R. R. Martin, 864
sed -e '1d' -e '2d' -e '5d' books.txt

echo -e "1d\n2d\n5d" > commands.txt 
# 1d
# 2d
# 5d
cat commands.txt
# 3) The Alchemist, Paulo Coelho, 197 
# 4) The Fellowship of the Ring, J. R. R. Tolkien, 432 
# 6) A Game of Thrones, George R. R. Martin, 864
sed -f commands.txt books.txt



echo "A Storm of Swords
George R. R. Martin
The Two Towers
J. R. R. Tolkien
The Alchemist
Paulo Coelho
The Fellowship of the Ring
J. R. R. Tolkien
The Pilgrimage
Paulo Coelho
A Game of Thrones
George R. R. Martin" > books2.txt

#*************************************************
# A Storm of Swords, George R. R. Martin
# The Two Towers, J. R. R. Tolkien
# - The Alchemist, Paulo Coelho
# The Fellowship of the Ring, J. R. R. Tolkien
# - The Pilgrimage, Paulo Coelho
# A Game of Thrones, George R. R. Martin

sed -n '
h;n;H;x
s/\n/, /
/Paulo/!b Print
s/^/- /
:Print
p' books2.txt
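# The first line is h;n;H;x. Remember the hold space mentioned above? h copies the pattern space into the hold space, overwriting it; n reads the next input line, replacing the current pattern space; H appends the pattern space to the hold space; x swaps the pattern space and the hold space. The net effect is that two input lines at a time end up in the pattern space for the commands that follow.
# Next, s/\n/, / replaces the newline between the two lines with a comma and a space.
# The third command branches to the Print label when the text does not match /Paulo/; otherwise execution falls through to the fourth command.
# :Print is only a label; p is the print command.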


# Pattern space

str="pig cow cow pig"
# Replace pig with cow and cow with horse. The commands run in order on the same pattern space, so their order changes the result:
# horse cow cow pig
echo ${str} | sed 's/pig/cow/;s/cow/horse/'
# horse horse horse horse
echo ${str} | sed 's/pig/cow/g;s/cow/horse/g'
# cow horse cow pig
echo ${str} | sed 's/cow/horse/;s/pig/cow/'
# cow horse horse cow
echo ${str} | sed 's/cow/horse/g;s/pig/cow/g'

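# Address matching
# ${list} is left unquoted on purpose in the commands below: word splitting folds the real newlines into spaces, and echo -e then expands each \n into a line break (which is why most output lines keep a trailing space).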
list="Alice Ford, 22 East Broadway, Richmond VA
\nOrville Thomas, 11345 Oak Bridge Road, Tulsa OK
\nTerry Kalkas, 402 Lans Road, Beaver Falls PA
\nEric Adams, 20 Post Road, Sudbury MA
\nHubert Sims, 328A Brook Road, Roanoke VA
\nAmy Wilde, 334 Bayshore Pkwy, Mountain View CA
\nSal Carpenter, 73 6th Street, Boston MA"
# (no output: d without an address deletes every line)
echo -e ${list} | sed 'd'
# Orville Thomas, 11345 Oak Bridge Road, Tulsa OK 
# Eric Adams, 20 Post Road, Sudbury MA 
# Amy Wilde, 334 Bayshore Pkwy, Mountain View CA 
# Sal Carpenter, 73 6th Street, Boston MA
echo -e ${list} | sed '1d;3d;5d'

# Delete every line in the range
# Alice Ford, 22 East Broadway, Richmond VA 
echo -e ${list} | sed '2,$d'

# Use regular expressions as addresses: delete the lines containing MA or VA:
# Orville Thomas, 11345 Oak Bridge Road, Tulsa OK 
# Terry Kalkas, 402 Lans Road, Beaver Falls PA 
# Amy Wilde, 334 Bayshore Pkwy, Mountain View CA 
echo -e ${list} | sed '/MA/d;/VA/d'

# Alice Ford, 22 East Broadway, Richmond ZING 
echo -e ${list} | sed '2,$d;s/VA/ZING/'

# With regex addresses, delete every line from the one containing Alice through the one containing Hubert
# Amy Wilde, 334 Bayshore Pkwy, Mountain View CA 
# Sal Carpenter, 73 6th Street, Boston MA
echo -e ${list} | sed '/Alice/,/Hubert/d'

# Line numbers and regex addresses can be mixed
# Alice Ford, 22 East Broadway, Richmond VA
# Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
echo -e ${list} | sed '3,$d' 
# Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
echo -e ${list} | sed '3,$d;/VA/d'
# Alice Ford, 22 East Broadway, Richmond VA 
echo -e ${list} | sed '3,$d;2d' 
# Alice Ford, 22 East Broadway, Richmond VA
echo -e ${list} | sed '3,$d;2,/VA/d' 
# Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
echo -e ${list} | sed '3,$d;2,/VA/!d' 

# Alice Ford, 22 East Broadway, Richmond VA
# Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
# Terry Kalkas, 402 Lans Road, Beaver Falls, Pennsylvania
# Eric Adams, 20 Post Road, Sudbury, Massachusetts
echo -e ${list} | sed -n '1,4{s/ MA/, Massachusetts/;s/ PA/, Pennsylvania/;p}'
