Serializing and Deserializing In-Memory C++ Data Structures to and from Binary Files

阿新 • Published: 2019-02-01

Use case

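Many backend retrieval servers build their in-memory index from files at startup. This step is often slow, so the server takes a long time to start, and the cost becomes more noticeable when there are many servers.
We can move index construction into a separate process that builds the index structure from the raw files and serializes it to a local binary file. At startup the server only needs to read that binary file to reconstruct the index, which can greatly reduce startup time.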

Example code

io.hpp wraps std::ifstream and std::ofstream and provides interfaces for serializing a vector to a binary file and deserializing a binary file back into a vector.

#ifndef IO_HPP
#define IO_HPP

#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

class FileReader
{
public:
    FileReader(const std::string& filename)
        : input_stream(filename,std::ios::binary)
    {
    }

    /* Read count objects of type T into pointer dest */
    template <typename T> void ReadInto(T *dest, const std::size_t count)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "bytewise reading requires trivially copyable type");

        if (count == 0)
            return;

        input_stream.read(reinterpret_cast<char *>(dest), count * sizeof(T));
        const std::size_t bytes_read = static_cast<std::size_t>(input_stream.gcount());

        // Report a failed or short read instead of silently returning.
        if (!input_stream || bytes_read != count * sizeof(T))
        {
            throw std::ios_base::failure("failed to read the requested bytes from file");
        }
    }

    template <typename T> void ReadInto(std::vector<T> &target)
    {
        ReadInto(target.data(), target.size());
    }

    template <typename T> void ReadInto(T &target) 
    {
         ReadInto(&target, 1); 
    }

    template <typename T> T ReadOne()
    {
        T tmp;
        ReadInto(tmp);
        return tmp;
    }

    std::uint32_t ReadElementCount32() 
    { 
        return ReadOne<std::uint32_t>(); 
    }
    std::uint64_t ReadElementCount64() 
    { 
        return ReadOne<std::uint64_t>(); 
    }

    template <typename T> void DeserializeVector(std::vector<T> &data)
    {
        const auto count = ReadElementCount64();
        data.resize(count);
        ReadInto(data.data(), count);
    }

private:
    std::ifstream input_stream;
};

class FileWriter
{
public:
    FileWriter(const std::string& filename)
        : output_stream(filename,std::ios::binary)
    {
    }

    /* Write count objects of type T from pointer src to output stream */
    template <typename T> void WriteFrom(const T *src, const std::size_t count)
    {
        static_assert(std::is_trivially_copyable<T>::value,
                      "bytewise writing requires trivially copyable type");

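        if (count == 0)
            return;

        output_stream.write(reinterpret_cast<const char *>(src), count * sizeof(T));

        // Report a failed write instead of discarding the stream state.
        if (!output_stream)
        {
            throw std::ios_base::failure("failed to write bytes to file");
        }
    }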

    template <typename T> void WriteFrom(const T &target) 
    { 
        WriteFrom(&target, 1); 
    }

    template <typename T> void WriteOne(const T tmp) 
    { 
        WriteFrom(tmp); 
    }

    void WriteElementCount32(const std::uint32_t count) 
    { 
        WriteOne<std::uint32_t>(count); 
    }
    void WriteElementCount64(const std::uint64_t count) 
    { 
        WriteOne<std::uint64_t>(count); 
    }

    template <typename T> void SerializeVector(const std::vector<T> &data)
    {
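        // The element count is always stored as 64 bits so the file layout
        // matches what DeserializeVector reads back.
        const std::uint64_t count = data.size();
        WriteElementCount64(count);
        WriteFrom(data.data(), data.size());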
    }

private:
    std::ofstream output_stream;
};

#endif

binary_io.cpp

#include "io.hpp"
#include <iostream>

struct Data
{
    int a;
    double b;

    friend std::ostream& operator<<(std::ostream& out,const Data& data)
    {
        out << data.a << "," << data.b;
        return out;
    }
};

template<typename T>
void printData(const std::vector<T>& data_vec)
{
    for (const auto &data : data_vec) // iterate by reference to avoid copying each element
    {
        std::cout << "{" << data << "} ";
    }
    std::cout << std::endl;
}
template<typename T>
void serializeVector(const std::string& filename,const std::vector<T>& data_vec)
{
    FileWriter file_writer(filename);
    file_writer.SerializeVector<T>(data_vec);
}

template<typename T>
void deserializeVector(const std::string& filename,std::vector<T>& data_vec)
{
    FileReader file_reader(filename);
    file_reader.DeserializeVector<T>(data_vec);
}

int main()
{
    std::vector<Data> vec1 = {{1,1.1},{2,2.2},{3,3.3},{4,4.4}};
    std::cout << "before write to binary file.\n";
    printData(vec1);
    const std::string filename = "vector_data";
    std::cout << "serialize vector to binary file.\n";
    serializeVector<Data>(filename,vec1);
    std::vector<Data> vec2;
    deserializeVector<Data>(filename,vec2);
    std::cout << "vector read from binary file.\n";
    printData(vec2);
    return 0;
}

Compile the code

g++ -std=c++11 binary_io.cpp -o binary_io

Run the program

./binary_io

Output

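The program writes the in-memory vector to a binary file, then deserializes the file into a new vector. The contents printed before serialization and after deserialization are identical.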
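The round trip can also be checked programmatically rather than by comparing printed output. The following is a minimal sketch that does not use io.hpp: the helper names WriteVector, ReadVector and SameValues are hypothetical and were chosen for this illustration, and the file layout (a 64-bit element count followed by the raw element bytes) is assumed to mirror what SerializeVector writes.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

struct Data
{
    int a;
    double b;
};

static_assert(std::is_trivially_copyable<Data>::value,
              "bytewise I/O requires a trivially copyable type");

// Write a 64-bit element count followed by the raw bytes of the elements.
bool WriteVector(const std::string &path, const std::vector<Data> &values)
{
    std::ofstream out(path, std::ios::binary);
    const std::uint64_t count = values.size();
    out.write(reinterpret_cast<const char *>(&count), sizeof(count));
    if (count > 0)
    {
        out.write(reinterpret_cast<const char *>(values.data()),
                  static_cast<std::streamsize>(count * sizeof(Data)));
    }
    return static_cast<bool>(out);
}

// Read the element count, resize the vector, then read the raw bytes back.
bool ReadVector(const std::string &path, std::vector<Data> &values)
{
    std::ifstream in(path, std::ios::binary);
    std::uint64_t count = 0;
    if (!in.read(reinterpret_cast<char *>(&count), sizeof(count)))
        return false;
    values.resize(static_cast<std::size_t>(count));
    if (count > 0)
    {
        in.read(reinterpret_cast<char *>(values.data()),
                static_cast<std::streamsize>(count * sizeof(Data)));
    }
    return static_cast<bool>(in);
}

// Compare member by member so padding bytes play no part in the result.
bool SameValues(const std::vector<Data> &lhs, const std::vector<Data> &rhs)
{
    if (lhs.size() != rhs.size())
        return false;
    for (std::size_t i = 0; i < lhs.size(); ++i)
    {
        if (lhs[i].a != rhs[i].a || lhs[i].b != rhs[i].b)
            return false;
    }
    return true;
}
```

Calling WriteVector and then ReadVector on the same path, and passing the original and the restored vector to SameValues, should yield true when the file was written and read on the same machine with the same compiler settings.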

Notes

Data structures serialized to a file this way must satisfy is_trivially_copyable. std::is_trivially_copyable was introduced in C++11. A TriviallyCopyable type has the following properties:

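Every copy constructor is trivial or deleted
Every move constructor is trivial or deleted
Every copy assignment operator is trivial or deleted
Every move assignment operator is trivial or deleted
At least one of the above is non-deleted
The destructor is trivial and non-deleted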
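To make these properties concrete, the sketch below checks a few types with std::is_trivially_copyable at compile time. The struct names are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Only built-in members and a fixed-size array: all special members are trivial.
struct PlainRecord
{
    int id;
    double score;
    char tag[8];
};

// std::string has non-trivial copy operations and a non-trivial destructor.
struct WithString
{
    int id;
    std::string name;
};

// std::vector owns heap memory, so it is not trivially copyable either.
struct WithVector
{
    std::vector<int> ids;
};

// A user-provided destructor is never trivial, even when its body is empty.
struct WithDestructor
{
    int id;
    ~WithDestructor() {}
};

static_assert(std::is_trivially_copyable<PlainRecord>::value,
              "PlainRecord can be read and written bytewise");
static_assert(!std::is_trivially_copyable<WithString>::value,
              "WithString must not be read or written bytewise");
static_assert(!std::is_trivially_copyable<WithVector>::value,
              "WithVector must not be read or written bytewise");
static_assert(!std::is_trivially_copyable<WithDestructor>::value,
              "a user-provided destructor makes the type non-trivial");
```

Because the checks are static_assert declarations, a type that breaks the requirement fails at compile time, which is the same mechanism ReadInto and WriteFrom rely on.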

What trivially copyable types guarantee is explained as follows:

Objects of trivially-copyable types are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read(). In general, a trivially copyable type is any type for which the underlying bytes can be copied to an array of char or unsigned char and into a new object of the same type, and the resulting object would have the same value as the original

Only objects of trivially copyable types are guaranteed to keep the same value after being serialized bytewise to a binary file and deserialized back into memory.
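Types such as std::string are not trivially copyable, so they cannot go through ReadInto or WriteFrom directly. One common workaround, which is not part of the original code, is to store a length prefix followed by the character bytes, much like SerializeVector stores an element count followed by the elements. The sketch below assumes this layout. WriteString and ReadString are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

// Write a 64-bit length prefix followed by the characters of the string.
bool WriteString(std::ofstream &out, const std::string &text)
{
    const std::uint64_t length = text.size();
    out.write(reinterpret_cast<const char *>(&length), sizeof(length));
    out.write(text.data(), static_cast<std::streamsize>(length));
    return static_cast<bool>(out);
}

// Read the length prefix, resize the string, then read the characters into it.
bool ReadString(std::ifstream &in, std::string &text)
{
    std::uint64_t length = 0;
    if (!in.read(reinterpret_cast<char *>(&length), sizeof(length)))
        return false;
    text.resize(static_cast<std::size_t>(length));
    if (length > 0)
    {
        // &text[0] points at contiguous, writable storage for a non-empty string.
        in.read(&text[0], static_cast<std::streamsize>(length));
    }
    return static_cast<bool>(in);
}
```

Both functions return false when the stream reports an error, so a caller can stop at the first failed read or write. The length prefix is trusted as-is here. Code that reads files from an untrusted source would want to validate it before resizing.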
