
Reading CSV Files in C, Part 2

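In the previous article we covered the basics of parsing a CSV file with the strtok() function. This article goes one step further and loads tabular data stored in a CSV file into a two-dimensional array. The example works on a small CSV file, test_data.csv; as the output at the end of this article shows, it holds five rows of three numbers each.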


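This article adopts one convention: the first line of the CSV file is a header whose first number is the number of rows in the data proper and whose second number is the number of columns. Note that neither count includes the header line itself.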
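The header convention can be checked before any memory is allocated. The following is a minimal sketch, not the article's own method: parse_count() and parse_header() are hypothetical helper names introduced here, and the sketch assumes a header of exactly two non-negative decimal integers separated by a comma, with nothing but whitespace after the second one.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse one non-negative decimal count.
 * On success, store it in *out and return a pointer just past the digits;
 * on failure, return NULL. */
static const char *parse_count(const char *s, size_t *out)
{
    while (*s == ' ' || *s == '\t')
        s++;
    if (!isdigit((unsigned char)*s))      /* rejects "", "-5" and "abc" */
        return NULL;

    errno = 0;
    char *end;
    unsigned long long value = strtoull(s, &end, 10);
    if (errno == ERANGE || value > SIZE_MAX)
        return NULL;

    *out = (size_t)value;
    return end;
}

/* Hypothetical helper: parse a header line such as "5,3\n".
 * Returns 0 on success and -1 if the line is malformed. */
int parse_header(const char *line, size_t *rows, size_t *cols)
{
    assert(line != NULL && rows != NULL && cols != NULL);

    const char *p = parse_count(line, rows);
    if (p == NULL || *p != ',')
        return -1;

    p = parse_count(p + 1, cols);
    if (p == NULL)
        return -1;

    /* Only whitespace (typically the newline) may follow the second count. */
    while (isspace((unsigned char)*p))
        p++;
    return *p == '\0' ? 0 : -1;
}
```

Unlike atoi(), this approach reports a malformed header instead of silently turning it into 0, so the caller can stop before allocating the array.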

Below is a sample program that reads this CSV file:

#define _POSIX_C_SOURCE 200809L  /* ask the headers to declare getline() */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, const char * argv[]) {
    
    char file_name[] = "test_data.csv";
    FILE *fp;
    fp = fopen(file_name, "r");
    
    if (!fp) {
        fprintf(stderr, "failed to open file for reading\n");
        return 1;
    }
    
    
    int idx = 0;
    size_t len = 0;
    char * line = NULL;
    char * element;
    
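    /* The header line holds "<rows>,<columns>". */
    if (getline(&line, &len, fp) == -1) {
        fprintf(stderr, "failed to read the header line\n");
        free(line);
        fclose(fp);
        return 1;
    }
    char *rows_field = strtok(line, ",");
    char *cols_field = strtok(NULL, ",");
    if (!rows_field || !cols_field) {
        fprintf(stderr, "malformed header line\n");
        free(line);
        fclose(fp);
        return 1;
    }
    size_t lines_size = strtoul(rows_field, NULL, 10);
    size_t items_size = strtoul(cols_field, NULL, 10);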
    
    double **array;
    array = malloc(lines_size * sizeof(double*));
    if(NULL == array){
        fprintf(stderr, "failed to allocate memory\n");
        return 1;
    }

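    /* Read at most lines_size rows so that array[] is never overrun. */
    while ((size_t)idx < lines_size && getline(&line, &len, fp) != -1) {

        array[idx] = malloc(sizeof(double) * items_size);
        if (NULL == array[idx]) {
            fprintf(stderr, "failed to allocate memory\n");
            return 1;
        }
        element = strtok(line, ",");
        for (size_t j = 0; j < items_size; j++) {
            /* A short row leaves element NULL; store 0.0 rather than crash. */
            array[idx][j] = element ? atof(element) : 0.0;
            if (element)
                element = strtok(NULL, ",");
        }
        idx++;
    }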
    
    fclose (fp);
    free(line);    /* the buffer allocated by getline() belongs to the caller */
    line = NULL;
    
    //Test code: print only the rows that were actually read
    int row = 0;
    for (; row<idx; row++) {
        printf ("array[%d][] =", row);
        
        for (int col=0; col<items_size; col++)
            printf (" %f", array[row][col]);
        
        printf("\n");
    }
    
    //free every row that was actually allocated
    for (row = 0; row < idx; row++)
    {
        free(*(array + row));
        array[row] = NULL;
    }
    
    free(array);
    array = NULL;

    return 0;
}

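Two things in the code above deserve particular attention: 1) the dynamic creation and release of the two-dimensional array; 2) the use of getline() to read the file. getline() is not part of the ISO C standard library, which is why fgets() was traditionally used instead. It started as a GNU extension and was standardized in POSIX.1-2008. Its prototype is ssize_t getline(char **lineptr, size_t *n, FILE *stream), and a typical call looks like this: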

getline(&buffer,&size,stdin);

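The arguments mean the following:
  • &buffer is the address of a char * variable that points to the buffer where the input line is stored; it is not the address of the buffer itself. getline() needs this pointer-to-pointer (char **) because it may allocate or enlarge the buffer and must then update the caller's pointer. If buffer is NULL and size is 0, getline() allocates the buffer, and the caller later releases it with free().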

  • &size is the address of the size_t variable that holds the size of that buffer; getline() updates it whenever it resizes the buffer.

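  • stdin is the input stream (a FILE *). With stdin, getline() reads standard input; pass a stream returned by fopen() to read a line from a file instead. getline() returns the number of characters read, including the newline, or -1 on failure or end of file.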
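To make the roles of the three arguments concrete, here is a small sketch that assumes a POSIX system where getline() is available. count_lines() is a hypothetical helper introduced only for illustration: it reads a stream line by line, reuses one buffer for every call, and frees that buffer at the end.

```c
#define _POSIX_C_SOURCE 200809L  /* ask the headers to declare getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: count the lines in a stream and report the length of
 * the longest one (excluding its trailing newline) through *longest. */
size_t count_lines(FILE *stream, size_t *longest)
{
    assert(stream != NULL && longest != NULL);

    char *buffer = NULL;  /* NULL with size 0 lets getline() allocate for us */
    size_t size = 0;      /* getline() keeps this equal to the buffer capacity */
    ssize_t nread;
    size_t count = 0;

    *longest = 0;
    while ((nread = getline(&buffer, &size, stream)) != -1) {
        /* nread includes the newline when one was read; leave it out. */
        if (nread > 0 && buffer[nread - 1] == '\n')
            nread--;
        if ((size_t)nread > *longest)
            *longest = (size_t)nread;
        count++;
    }

    free(buffer);         /* the caller owns the buffer, even after a failure */
    return count;
}
```

Because the same buffer and size variables are passed on every iteration, getline() only reallocates when a line is longer than any seen so far.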

Finally, running the program above produces the following output:

array[0][] = 85.000000 67.000000 34.000000
array[1][] = 27.000000 82.000000 100.000000
array[2][] = 1.000000 23.000000 98.000000
array[3][] = 35.000000 62.000000 78.000000
array[4][] = 21.000000 45.000000 66.000000
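On the first point noted earlier, the dynamic creation and release of the two-dimensional array, the program calls malloc() once per row and later frees each row separately. One possible alternative, sketched here under the assumption that all rows have the same length, is to allocate the row-pointer table and a single contiguous data block. matrix_create() and matrix_destroy() are hypothetical names and do not appear in the program above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a rows x cols matrix as one table of row
 * pointers plus one contiguous, zero-filled block of doubles.
 * Returns NULL if a dimension is zero, the size overflows, or memory runs out. */
double **matrix_create(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0 || rows > SIZE_MAX / cols / sizeof(double))
        return NULL;

    double **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return NULL;

    double *data = calloc(rows * cols, sizeof *data);
    if (data == NULL) {
        free(m);
        return NULL;
    }

    /* Point each row at its slice of the shared block. */
    for (size_t r = 0; r < rows; r++)
        m[r] = data + r * cols;
    return m;
}

/* Release a matrix returned by matrix_create(); NULL is accepted. */
void matrix_destroy(double **m)
{
    if (m == NULL)
        return;
    free(m[0]);  /* the data block starts at row 0 */
    free(m);
}
```

Elements are still addressed as m[row][col], but cleanup takes two free() calls regardless of the number of rows, and a failed allocation cannot leave a half-built array behind.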

(End)