Reading and Writing CSV and TSV Files with supercsv
阿新 • Published: 2018-12-23
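First, a quick note on how CSV and TSV files differ: for the purposes of this post, the only difference is the delimiter. CSV separates fields with commas, TSV separates them with tab characters.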
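To make the difference concrete, here is a minimal sketch in plain Java (no supercsv involved) of a record splitter that takes the delimiter as a parameter. The class and method names are made up for illustration, and it assumes each record sits on a single line with double quotes as the quote character; it is not a full CSV parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class DelimitedLineSketch {

    /**
     * Splits one single-line record on the given delimiter.
     * Double-quoted fields may contain the delimiter; "" inside quotes is an escaped quote.
     */
    static List<String> split(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // escaped quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false;  // closing quote
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;       // opening quote
            } else if (c == delimiter) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());  // last field
        return fields;
    }

    public static void main(String[] args) {
        // Same record, once comma-separated and once tab-separated.
        System.out.println(split("1,John,\"Mountain View, CA\"", ','));
        System.out.println(split("1\tJohn\tMountain View, CA", '\t'));
    }
}
```

Both calls yield the same three fields; only the delimiter argument changes.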
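My project needed to tidy up the data in some existing TSV files and produce new TSV files that are easier to work with (by adding a few columns), which means both reading and writing TSV. It would be simple enough to implement by hand, but since a ready-made library, supercsv, already exists, I decided to give it a try. Official site: http://supercsv.sourceforge.net/index.html
The documentation is clear, and there are plenty of examples online of parsing CSV files with supercsv. Given how CSV and TSV differ, though, a single set of code should be able to handle both: just swap the delimiter. supercsv does exactly that. First, the example from the official site: http://supercsv.sourceforge.net/examples_reading.html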
customerNo,firstName,lastName,birthDate,mailingAddress,married,numberOfKids,favouriteQuote,email,loyaltyPoints
1,John,Dunbar,13/06/1945,"1600 Amphitheatre Parkway Mountain View, CA 94043 United States",,,"""May the Force be with you."" - Star Wars",[email protected],0
2,Bob,Down,25/02/1919,"1601 Willow Rd. Menlo Park, CA 94025 United States",Y,0,"""Frankly, my dear, I don't give a damn."" - Gone With The Wind",[email protected],123456
3,Alice,Wunderland,08/08/1985,"One Microsoft Way Redmond, WA 98052-6399 United States",Y,0,"""Play it, Sam. Play ""As Time Goes By."""" - Casablanca",[email protected],2255887799
4,Bill,Jobs,10/07/1973,"2701 San Tomas Expressway Santa Clara, CA 95050 United States",Y,3,"""You've got to ask yourself one question: ""Do I feel lucky?"" Well, do ya, punk?"" - Dirty Harry",[email protected],36
The code that parses it with CsvMapReader:
import java.io.FileReader;
import java.util.Map;

import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvMapReader;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.prefs.CsvPreference;

/**
 * An example of reading using CsvMapReader.
 */
private static void readWithCsvMapReader() throws Exception {
    ICsvMapReader mapReader = null;
    try {
        // CSV_FILENAME is the path to the CSV file shown above (defined elsewhere in the class)
        mapReader = new CsvMapReader(new FileReader(CSV_FILENAME), CsvPreference.STANDARD_PREFERENCE);

        // the header columns are used as the keys to the Map
        final String[] header = mapReader.getHeader(true);
        final CellProcessor[] processors = getProcessors();

        Map<String, Object> customerMap;
        while( (customerMap = mapReader.read(header, processors)) != null ) {
            System.out.println(String.format("lineNo=%s, rowNo=%s, customerMap=%s", mapReader.getLineNumber(),
                mapReader.getRowNumber(), customerMap));
        }
    }
    finally {
        if( mapReader != null ) {
            mapReader.close();
        }
    }
}
// Cell processors come from org.supercsv.cellprocessor (Optional, ParseBool, ParseDate, ParseInt)
// and org.supercsv.cellprocessor.constraint (LMinMax, NotNull, StrRegEx, UniqueHashCode).

/**
 * Sets up the processors used for the examples. There are 10 CSV columns, so 10 processors are defined. Empty
 * columns are read as null (hence the NotNull() for mandatory columns).
 *
 * @return the cell processors
 */
private static CellProcessor[] getProcessors() {
    final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+"; // just an example, not very robust!
    StrRegEx.registerMessage(emailRegex, "must be a valid email address");

    final CellProcessor[] processors = new CellProcessor[] {
        new UniqueHashCode(), // customerNo (must be unique)
        new NotNull(), // firstName
        new NotNull(), // lastName
        new ParseDate("dd/MM/yyyy"), // birthDate
        new NotNull(), // mailingAddress
        new Optional(new ParseBool()), // married
        new Optional(new ParseInt()), // numberOfKids
        new NotNull(), // favouriteQuote
        new StrRegEx(emailRegex), // email
        new LMinMax(0L, LMinMax.MAX_LONG) // loyaltyPoints
    };

    return processors;
}
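The sample code could hardly be clearer. Only one point needs explaining: the delimiter is set through CsvPreference.STANDARD_PREFERENCE. To parse a TSV file, just replace it with CsvPreference.TAB_PREFERENCE.
For reference, here is the source of the predefined preferences in CsvPreference: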
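As a rough sketch of that swap, the snippet below reads tab-separated text with CsvMapReader and CsvPreference.TAB_PREFERENCE. The class name, the helper method, and the three-column sample data are assumptions made for illustration. It reads from a String rather than a file and skips cell processors, so every value comes back as a String. It needs the supercsv jar on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.supercsv.io.CsvMapReader;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.prefs.CsvPreference;

// Hypothetical class name, for illustration only.
final class TsvReadSketch {

    /** Reads tab-separated text and returns one map per data row, keyed by the header names. */
    static List<Map<String, String>> readTsv(String tsv) throws IOException {
        List<Map<String, String>> rows = new ArrayList<>();
        // Same reader class as the CSV example; only the preference differs.
        try (ICsvMapReader reader = new CsvMapReader(new StringReader(tsv), CsvPreference.TAB_PREFERENCE)) {
            String[] header = reader.getHeader(true);
            Map<String, String> row;
            while ((row = reader.read(header)) != null) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: a header line plus two rows.
        String tsv = "customerNo\tfirstName\tlastName\n"
                   + "1\tJohn\tDunbar\n"
                   + "2\tBob\tDown\n";
        for (Map<String, String> row : readTsv(tsv)) {
            System.out.println(row);
        }
    }
}
```

To read from a file instead, pass a FileReader in place of the StringReader, as in the official example.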
/**
 * Ready to use configuration that should cover 99% of all usages.
 */
public static final CsvPreference STANDARD_PREFERENCE = new CsvPreference.Builder('"', ',', "\r\n").build();

/**
 * Ready to use configuration for Windows Excel exported CSV files.
 */
public static final CsvPreference EXCEL_PREFERENCE = new CsvPreference.Builder('"', ',', "\n").build();

/**
 * Ready to use configuration for north European excel CSV files (columns are separated by ";" instead of ",")
 */
public static final CsvPreference EXCEL_NORTH_EUROPE_PREFERENCE = new CsvPreference.Builder('"', ';', "\n").build();

/**
 * Ready to use configuration for tab-delimited files.
 */
public static final CsvPreference TAB_PREFERENCE = new CsvPreference.Builder('"', '\t', "\n").build();
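
Coming back to the original task of adding a few columns to an existing TSV file, the following sketch pairs CsvMapReader with CsvMapWriter, both configured with TAB_PREFERENCE. The class name, the extra column "fullName", and the way it is computed are assumptions made for illustration; the real columns would depend on your data. It works on in-memory strings to stay self-contained and needs the supercsv jar on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.Map;

import org.supercsv.io.CsvMapReader;
import org.supercsv.io.CsvMapWriter;
import org.supercsv.io.ICsvMapReader;
import org.supercsv.io.ICsvMapWriter;
import org.supercsv.prefs.CsvPreference;

// Hypothetical class name, for illustration only.
final class TsvAddColumnSketch {

    /** Copies a TSV document and appends a derived "fullName" column (hypothetical) to every row. */
    static String addFullName(String tsvIn) throws IOException {
        StringWriter out = new StringWriter();
        try (ICsvMapReader reader = new CsvMapReader(new StringReader(tsvIn), CsvPreference.TAB_PREFERENCE);
             ICsvMapWriter writer = new CsvMapWriter(out, CsvPreference.TAB_PREFERENCE)) {

            String[] inHeader = reader.getHeader(true);

            // Output header = input header plus the new column.
            String[] outHeader = Arrays.copyOf(inHeader, inHeader.length + 1);
            outHeader[inHeader.length] = "fullName";
            writer.writeHeader(outHeader);

            Map<String, String> row;
            while ((row = reader.read(inHeader)) != null) {
                // Assumes the input has firstName and lastName columns.
                row.put("fullName", row.get("firstName") + " " + row.get("lastName"));
                writer.write(row, outHeader);
            }
        } // closing the writer flushes it into 'out'
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String tsv = "customerNo\tfirstName\tlastName\n1\tJohn\tDunbar\n";
        System.out.print(addFullName(tsv));
    }
}
```

For real files, swap the StringReader and StringWriter for a FileReader and FileWriter. A delimiter other than comma or tab can be configured the same way the constants above are built, through CsvPreference.Builder.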