1. 程式人生 > >Working with CSV files in Bash

Working with CSV files in Bash

V files are very common. Using them with Bash to aid in scripting can be very useful. Here are a some methods of using bash to work with CSV files. These are what I sometimes use when dealing with CSV files in bash, they may work for your CSV’s they may not. I accept no responsibility of the accuracy of these, I’m sure there are better ways of accomplishing the same tasks. If anybody finds other ways or knows better please comment and let me know. Don’t run these on production data, its always a good idea to make backups and test them before using in a production environment. Who knows how your precious data could become mangled due to a wrong number of quotes, terminations or bad regex. Jump for the post

How to count the number of elements within a CSV file.

[[email protected] bin]$ echo $(cat test.csv | sed ‘s/, /_ /g’ | tr ‘,’ ‘/n’ | wc -l)
9

Count the number of lines in a CSV file terminated with /n

[[email protected] bin]$ wc -l test.csv | cut –delimiter/=” ” -f 1
3

And another way:

[[email protected]

bin]$ echo $(expr `wc -l “test.csv” | awk ‘{print $1}’`)
3

Print the headers of a CSV file (useful if the headers contain the field names)

[[email protected] bin]$ head -1 test.csv
ID,Question,Answer

Read a line of the CSV (where the number next to the p is the line number) This is useful for looping over a file and getting the line.

[[email protected] bin]$ sed -n 2′p’ “test.csv”
1,Who Am I?,I am Owen
[[email protected] bin]$ sed -n 1′p’ “test.csv”
ID,Question,Answer
[[email protected] bin]$ sed -n 3′p’ “test.csv”
2,What blog is this from?,TheLinuxBlog

So, to loop over the CSV and read a line you can do something like the following:

[[email protected] bin]$ for i in $(seq 1 `wc -l “test.csv” | awk ‘{print $1}’`); do sed -n $i’p’ “test.csv”; done;
ID,Question,Answer
1,Who Am I?,I am Owen
2,What blog is this from?,TheLinuxBlog

Of course with the above you can offset the first line by changing seq 1 to seq 2 to start from line 2.

To split up each row (line) in the CSV at the comma, I use the following:

[[email protected] bin]$ for i in $(seq 1 `wc -l “test.csv” | awk ‘{print $1}’`); do echo $(sed -n $i’p’ “test.csv”) | tr ‘,’ ‘/n’; done;

This puts a backspace in place of a comma. The above is known not to work for fields with comma’s in them. If the data relies on comma’s in the fields you have to use some more advanced sed expressions.

Here is the CSV file I was using for this post:

[[email protected] bin]$ cat test.csv
ID,Question,Answer
1,Who Am I?,I am Owen
2,What blog is this from?,TheLinuxBlog

Let me know if this was helpful or not, as a whole this is not suppose to be a one to end all CSV tutorial but more of the techniques that can be used to aid in processing them. If any one has any thoughts or suggestions, needs any help or finds other methods of completing the same or other tasks then please comment!