hexdump-format

Software Development · Published 2019-02-12 21:47:11


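hexdump lets you customize its output format, but you need to understand format units and a few related concepts to use it flexibly.

A format string is passed to hexdump like this: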

$ hexdump -e '<format string>' <filename>

format unit

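A format string is made up of format units, and each format unit consists of the following parts:

iteration count, optional: a positive integer giving the number of times the format unit is applied; defaults to 1
byte count, optional: a positive integer giving the number of bytes consumed by each iteration; when omitted, the default depends on the conversion in the format (1 for %c, 4 for %x; see the list at the end)
format, required: an fprintf-style string, which must be enclosed in double quotes

When an iteration count and/or a byte count is given, a slash (/) goes after the iteration count and/or before the byte count to tell them apart.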
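To make the three parts concrete, here is a minimal Python sketch that splits a single format unit into its pieces. The names `UNIT_RE` and `parse_unit` are hypothetical, and the regular expression is an assumption based only on the description above; it is not hexdump's actual parser.

```python
import re

# One format unit: [iteration count] [/ [byte count]] "format"
# (hypothetical pattern, written from the description above)
UNIT_RE = re.compile(r'\s*(\d+)?\s*(?:/\s*(\d+)?)?\s*"((?:[^"\\]|\\.)*)"\s*')


def parse_unit(text):
    """Split one format unit into (iteration_count, byte_count, format)."""
    match = UNIT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a format unit: {text!r}")
    count, size, fmt = match.groups()
    # A missing iteration count means 1. A missing byte count is returned as
    # None, because the real default depends on the conversion in the format.
    return (int(count) if count else 1, int(size) if size else None, fmt)


for unit in ('16/1 "%c"', '16/ "%c"', '16 "%c"', '"%c"'):
    print(unit, "->", parse_unit(unit))
```

Returning `None` for an omitted byte count keeps the sketch from guessing a default that hexdump itself derives from the conversion character.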

Example:

$ hexdump -n 16 /bin/ls -e '16/1 "%c"'
ELF
$ hexdump -n 16 /bin/ls -e '16/ "%c"'
ELF
$ hexdump -n 16 /bin/ls -e '16 "%c"'
ELF

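All three commands do the same thing and produce the same output: they read the first 16 bytes of /bin/ls and print each one as a character. Only ELF is visible because the remaining bytes are non-printing.

So when only the iteration count is given, both the / and the byte count can be omitted.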

iteration count & byte count

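Here is how I understand the iteration count and the byte count.

In pseudocode:

while there is data to process:
    for unit in format_string:
        for i: 1->unit.iteration_count:
            consume unit.byte_count bytes
            output them according to unit.format

The iteration count is the number of times the unit's format is applied.

The byte count is the number of bytes the unit's format consumes each time it is applied.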

Example:

$ hexdump -n 16 /bin/ls -e '4/1 "%c" 12/1 " %02X"'
ELF 02 01 01 00 00 00 00 00 00 00 00 00

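The format string in this example splits into two format units: 4/1 "%c" and 12/1 " %02X".

The first unit is processed first: take 1 byte at a time, print it as a character, and do this 4 times.

Then the second unit is processed: take 1 byte at a time, print it as a two-digit hexadecimal integer, and do this 12 times.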
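The loop described above can be sketched as runnable Python. This is a simplified model, not hexdump's implementation: `apply_units` and the `(iteration_count, byte_count, render)` tuples are hypothetical names, and the 16 input bytes are the start of /bin/ls as they appear in the dumps in this post.

```python
def apply_units(data, units):
    """Apply (iteration_count, byte_count, render) units until data runs out."""
    out = []
    pos = 0
    while pos < len(data):                      # while there is data to process
        for count, size, render in units:       # for unit in format_string
            for _ in range(count):              # iteration count
                chunk = data[pos:pos + size]    # consume byte_count bytes
                if not chunk:
                    break
                out.append(render(chunk))       # output according to the format
                pos += size
    return "".join(out)


# First 16 bytes of the ELF header shown in this post.
HEADER = bytes.fromhex("7f454c46020101000000000000000000")

UNITS = [
    (4, 1, lambda chunk: chr(chunk[0])),        # 4/1 "%c"
    (12, 1, lambda chunk: " %02X" % chunk[0]),  # 12/1 " %02X"
]

result = apply_units(HEADER, UNITS)
print(repr(result))
```

The result starts with the non-printing byte 0x7f, which is why a terminal shows only ELF followed by the twelve hex pairs.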

To get a better feel for the byte count, here is another example:

$ hexdump -n 16 /bin/ls -e '3/4 " %08x"'
 464c457f 00010102 00000000 00000000

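As the example shows, 3/4 does consume 4 bytes at a time and passes them to %08x as a single value. Because the value is read in host byte order (little-endian here), the original bytes 02 01 01 00 print as 00010102.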
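The byte-order effect can be checked with a few lines of Python. The sketch assumes a little-endian host, as in the example above, and `as_hex_word` is a hypothetical helper name.

```python
import struct


def as_hex_word(chunk):
    """Interpret 4 bytes as one little-endian unsigned integer, like %08x."""
    (value,) = struct.unpack("<I", chunk)  # "<I": little-endian, 4-byte unsigned
    return "%08x" % value


print(as_hex_word(bytes([0x02, 0x01, 0x01, 0x00])))  # 00010102
print(as_hex_word(b"\x7fELF"))                       # 464c457f
```

On a big-endian host the same bytes would read as 02010100, so the output of a multi-byte conversion is not portable across architectures.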

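But doesn't 3/4 mean an iteration count of 3? Why are there 4 groups in the output?

After the 3 iterations, 12 of the 16 bytes have been consumed and there is no other format unit left, so the format string is applied again from the start to the remaining 4 bytes.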
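A small Python model makes the second pass visible. It is only a sketch of the behaviour described here, assuming little-endian input; `dump_words` is a hypothetical name, and real hexdump pads a short final block rather than simply stopping.

```python
def dump_words(data, count=3, size=4):
    """Model of '3/4 " %08x"': returns the output and the number of passes."""
    parts = []
    passes = 0
    pos = 0
    while pos < len(data):
        passes += 1                      # the format string is applied once more
        for _ in range(count):           # at most `count` iterations per pass
            chunk = data[pos:pos + size]
            if not chunk:
                break
            parts.append(" %08x" % int.from_bytes(chunk, "little"))
            pos += size
    return "".join(parts), passes


HEADER = bytes.fromhex("7f454c46020101000000000000000000")
text, passes = dump_words(HEADER)
print(text)    # four groups
print(passes)  # 2
```

The first pass produces three groups from 12 bytes, and the second pass produces the fourth group from the last 4 bytes.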

Advanced usage

Multiple format strings

When several format strings are given, they are applied in order, and in each round every format string starts from the same offset:

$ hexdump -n 128 -e '16/1 " %02X" "\n"' -e '"offset: %_ad\n"' /bin/ls
 7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00
offset: 0
 02 00 3E 00 01 00 00 00 A0 49 40 00 00 00 00 00
offset: 16
 40 00 00 00 00 00 00 00 38 E7 01 00 00 00 00 00
offset: 32
 00 00 00 00 40 00 38 00 09 00 40 00 1D 00 1C 00
offset: 48
 06 00 00 00 05 00 00 00 40 00 00 00 00 00 00 00
offset: 64
 40 00 40 00 00 00 00 00 40 00 40 00 00 00 00 00
offset: 80
 F8 01 00 00 00 00 00 00 F8 01 00 00 00 00 00 00
offset: 96
 08 00 00 00 00 00 00 00 03 00 00 00 04 00 00 00
offset: 112
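The "same offset in each round" behaviour can be modeled as follows. This is a hedged sketch of the two format strings used above, not a general implementation; `dump_with_offsets` is a hypothetical name, and the 32 sample bytes are copied from the first two rows of the output.

```python
def dump_with_offsets(data, block=16):
    """Model of -e '16/1 " %02X" "\\n"' -e '"offset: %_ad\\n"'."""
    lines = []
    for offset in range(0, len(data), block):
        chunk = data[offset:offset + block]
        # First format string: every byte of the block as two hex digits.
        lines.append("".join(" %02X" % b for b in chunk))
        # Second format string: starts again at the same offset as the first.
        lines.append("offset: %d" % offset)
    return "\n".join(lines) + "\n"


SAMPLE = bytes.fromhex(
    "7F454C46020101000000000000000000"
    "02003E0001000000A049400000000000"
)
print(dump_with_offsets(SAMPLE), end="")
```

Both format strings see the same 16-byte block, which is why the offset line reports the start of the row printed just before it.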

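hexdump conversion specifiers

Besides the usual fprintf-style conversion specifiers, hexdump has several of its own.

%_a[dox]

Prints the offset of the current position from the start of the input; d, o, or x selects the base of the output (decimal, octal, or hexadecimal).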

$ hexdump -n 1 -s 40 -e '1/1 "%_ad"' /bin/ls 
40

%_A[dox]

Like %_a, except that it prints the offset reached once all the data has been processed.

$ hexdump -n 5 -s 40 -e '1/1 "%_Ad" 2/1 "%x" 2/1 " %02x"' /bin/ls
45

The format string above has 3 format units, yet there is only one piece of output. Another experiment:

$ hexdump -n 5 -s 40 -e '2/1 " %02x" 1/1 "%_Ad" 2/1 " %02x"' /bin/ls
 38 e745

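In these experiments, nothing from the %_Ad unit onward in the format string is output; what is printed instead is the offset reached after the format string has been processed.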

%_c

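Displays each byte as a character. Bytes that have a standard escape sequence are shown using it; for example, the byte with ASCII value 0 is shown as \0.

Other control characters are shown as three-digit octal numbers; ESC, for example, is shown as 033.

$ printf "\n" | hexdump -e '"%_c"'
\n
$ printf '\033' | hexdump -e '"%_c"'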
033
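The %_c rule can be sketched in Python as below. `ESCAPES` and `render_c` are hypothetical names, and treating 0x20 through 0x7e as the printable range is an assumption; hexdump itself decides printability from the locale.

```python
# Bytes that have a standard two-character escape notation.
ESCAPES = {
    0: "\\0", 7: "\\a", 8: "\\b", 9: "\\t",
    10: "\\n", 11: "\\v", 12: "\\f", 13: "\\r",
}


def render_c(byte):
    """Approximate %_c for a single byte value."""
    if byte in ESCAPES:
        return ESCAPES[byte]
    if 0x20 <= byte <= 0x7E:      # assumed printable range (plain ASCII)
        return chr(byte)
    return "%03o" % byte          # three-digit, zero-padded octal


print(render_c(0x0A))  # \n
print(render_c(0x1B))  # 033
```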

%_p

Displays each byte as a character; non-printing characters are shown as a dot (.).

$ printf "non-printing:\033\n" | hexdump -e '"%_p"'
non-printing:..
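The same substitution is easy to model in Python. `render_p` is a hypothetical name, and the printable range 0x20 through 0x7e is an assumption that covers plain ASCII only.

```python
def render_p(data):
    """Approximate %_p: printable ASCII as-is, everything else as a dot."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)


print(render_p(b"non-printing:\x1b\n"))  # non-printing:..
```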

%_u

Displays each byte as a character; control characters are shown by their lowercase abbreviated names, so \n is shown as lf (line feed).

$ printf "non-printing:\033\n" | hexdump -e '"%_u"'
non-printing:esclf
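A rough Python model of %_u follows. `CONTROL_NAMES` and `render_u` are hypothetical names; the table holds the standard ASCII control-character abbreviations, and the hex fallback for bytes of 0x80 and above is an assumption rather than documented behaviour.

```python
# Standard ASCII abbreviations for the control characters 0x00 to 0x1f.
CONTROL_NAMES = [
    "nul", "soh", "stx", "etx", "eot", "enq", "ack", "bel",
    "bs", "ht", "lf", "vt", "ff", "cr", "so", "si",
    "dle", "dc1", "dc2", "dc3", "dc4", "nak", "syn", "etb",
    "can", "em", "sub", "esc", "fs", "gs", "rs", "us",
]


def render_u(data):
    """Approximate %_u for a bytes object."""
    out = []
    for b in data:
        if b < 0x20:
            out.append(CONTROL_NAMES[b])
        elif b == 0x7F:
            out.append("del")
        elif b < 0x7F:
            out.append(chr(b))
        else:
            out.append("%02x" % b)  # assumption: non-ASCII bytes shown as hex
    return "".join(out)


print(render_u(b"non-printing:\x1b\n"))  # non-printing:esclf
```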

Byte counts by conversion

1 byte only: %_c, %_p, %_u, %c

4 bytes by default, with 1, 2, and 4 bytes supported: %d, %i, %o, %u, %X, %x

8 bytes by default, with 4 and 12 bytes supported: %E, %e, %f, %G, %g
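To see how the byte count changes what an integer conversion prints, here is a small Python sketch for %x with byte counts of 1, 2, and 4. It assumes a little-endian host; `render_x` is a hypothetical name, and the sample bytes 38 E7 01 00 are taken from offset 40 of the /bin/ls dump shown earlier.

```python
def render_x(chunk):
    """Approximate %x applied to a 1-, 2-, or 4-byte chunk (little-endian)."""
    return "%x" % int.from_bytes(chunk, "little")


DATA = bytes([0x38, 0xE7, 0x01, 0x00])

for size in (1, 2, 4):
    print(size, render_x(DATA[:size]))
# 1 38
# 2 e738
# 4 1e738
```

The same four bytes give three different values depending on how many of them each iteration consumes, which is why an omitted byte count can produce surprising output.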