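• Our ideas:
    1. Can several hard disks be mapped into one logical disk, so that our programs no longer need to care about complicated addressing or about which device they are talking to? This is DM-raid. RAID stands for Redundant Array of Independent Disks.
    2. Encrypt the data in a given address range so that it can only be accessed in an authorized way, for example full-disk encryption (FDE). This is DM-crypt.
    3. When data on a storage medium is accessed, verify whether it has been tampered with. This is DM-verity.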

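To sum up: DM is short for Device-Mapper, which means all of the ideas above can be implemented on top of Device Mapper. Device Mapper is used for more than these, too, including LVM2, DM-multipath and so on.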

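  • What is Device Mapper?
    1. Device Mapper is a framework provided in the Linux 2.6 kernel for mapping physical block devices to virtual (logical) block devices. Within this framework, developers can conveniently implement their own storage management policies, such as filtering, I/O redirection, hash-tree verification as in dm-verity, or RAID-style management of multiple disks.
    2. Device Mapper is registered in the kernel as a block device. Physical devices mapped onto the Device Mapper framework have their I/O processed in the way that was selected when the mapping was created.
    3. In essence, Device Mapper forwards I/O requests from the logical device (the mapped device) to the corresponding target device, according to the mapping relationship and the I/O handling rules described by the target driver.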

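Example: if I map the system partition onto Device Mapper using the dm-verity target, then when a user program accesses system data, the request must pass through Device Mapper's rules before it is forwarded to the system partition.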
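To make the forwarding rule concrete, here is a minimal user-space sketch in C. It is not kernel code: `struct map_entry`, `struct remapped_io` and `remap_sector` are hypothetical names invented for illustration, and it only covers the simplest offset-based ("linear"-style) case, assuming each table row backs one sector range of the logical device with a range on one underlying device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* One row of a hypothetical mapping table: logical sectors
 * [begin, begin + len) are backed by `dev`, starting at `dev_start`. */
struct map_entry {
    sector_t begin;
    sector_t len;
    const char *dev;      /* illustrative device name */
    sector_t dev_start;
};

/* Where a logical request ends up after remapping. */
struct remapped_io {
    const char *dev;
    sector_t sector;
};

/* Forward a logical sector to its underlying device.
 * Returns 0 and fills *out on success, -1 if no row covers the sector. */
int remap_sector(const struct map_entry *table, size_t n,
                 sector_t sector, struct remapped_io *out)
{
    assert(table != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct map_entry *e = &table[i];

        if (sector >= e->begin && sector - e->begin < e->len) {
            out->dev = e->dev;
            out->sector = e->dev_start + (sector - e->begin);
            return 0;
        }
    }
    return -1;
}
```

For instance, with a row that maps logical sectors 0 to 999 onto an illustrative `/dev/sdb` starting at sector 2048, a request for logical sector 10 would be forwarded to sector 2058 of `/dev/sdb`. A real target such as dm-verity would do additional work (hash checking) at this point instead of a plain offset.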

 Where Device Mapper sits in the Linux storage stack:

 


A simplified view:

 

 

  • Theory:

Architecture of Device Mapper in the kernel:


 

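As the figure above shows, Device Mapper consists of three parts: the Mapped Device, the Mapping Table and the Target Device. Before walking through the figure, it is worth mentioning a piece of kernel design philosophy: the kernel usually implements a framework (the mechanism) and gives user space as few and as simple interfaces as possible for handing down policy; the kernel then runs the mechanism according to the policy user space supplied. Device Mapper is no exception: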

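Mapped Device: also abbreviated MD (note: not DM). An MD is a logical, abstract device that user space can operate on through ioctl. It is associated with Target Devices through the mapping relationships described in the Mapping Table.

Mapping Table: describes the mapping between Target Devices and the Mapped Device. Most importantly, it specifies which Target Driver each mapping uses.

Target Driver: strictly speaking, this is not part of the Device Mapper framework itself, because a Target Driver is plugged into a set of interfaces defined by the common Device Mapper framework. This lets developers customize their own I/O handling rules as needed. Target Drivers currently supported by Device Mapper include linear, raid, verity, multipath, snapshot, mirror, crypt, cache, era, thin and others.

Target Device: a target device represents the segment of physical space that a mapped device maps to. From the point of view of the logical device represented by the mapped device, it is the physical device that the logical device is mapped onto.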

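In Device Mapper, these three objects together with the target driver plugins form an iterable device tree. The root at the top of the tree is the mapped device that is finally exposed as a logical device, and the leaves are the underlying physical devices represented by target devices. The Device-Mapper mapping models are:

Single: one mapped device backed by a target device. Each target device is owned exclusively by its mapped device and can be used by only one mapped device.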


 

One-to-many: multiple target devices are mapped onto one mapped device.


 

Stacked: a mapped device can itself be used as a target device of a mapped device one level above it. In theory this layering can be repeated indefinitely within the Device Mapper architecture.


 

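To sum up:

A target device can be mapped to only one mapped device, never to two or more. Otherwise, when that target device is accessed, the Device Mapper framework would not know which mapped device to choose, and the system simply cannot resolve that ambiguity.

However, several target devices can be mapped onto one mapped device. When you access data on those different target devices, every request goes through the same I/O policy of the mapped device. The Device Mapper framework has no difficulty here: it can always find the right target by following your mapping relationships.

A logical device can also be mapped onto a mapped device. All Device Mapper sees is mapping relationships; it does whatever the mapping table tells it to do, regardless of whether the device underneath is logical or a real physical device.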
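The three mapping models can be pictured with a small user-space model in C. This is a sketch under stated assumptions, not how the kernel represents devices: `struct device`, `struct segment` and `resolve` are hypothetical names, a device with no segments stands for a physical disk, and every segment is assumed to remap by a simple offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

struct device;

/* A slice of a mapped device, backed by a slice of a lower device. */
struct segment {
    sector_t begin;               /* first logical sector of the slice */
    sector_t len;                 /* length of the slice in sectors */
    const struct device *target;  /* lower device (physical or mapped) */
    sector_t target_start;        /* where the slice starts on `target` */
};

struct device {
    const char *name;
    const struct segment *segs;   /* NULL for a physical (leaf) device */
    size_t nsegs;
};

/* Follow the mappings down the tree until a physical device is reached.
 * On success returns the leaf device and rewrites *sector to the sector
 * on that leaf. Returns NULL if some layer has no mapping for the sector. */
const struct device *resolve(const struct device *dev, sector_t *sector)
{
    assert(dev != NULL && sector != NULL);

    while (dev->segs != NULL) {
        const struct device *next = NULL;

        for (size_t i = 0; i < dev->nsegs; i++) {
            const struct segment *s = &dev->segs[i];

            if (*sector >= s->begin && *sector - s->begin < s->len) {
                *sector = s->target_start + (*sector - s->begin);
                next = s->target;
                break;
            }
        }
        if (next == NULL)
            return NULL;
        dev = next;   /* a mapped device used as a target: keep descending */
    }
    return dev;
}
```

A device with one segment corresponds to the single model, a device with several segments to the one-to-many model, and a segment whose `target` is itself a mapped device to the stacked model. The loop makes the last point visible: the same lookup is applied at each layer, whether the layer below is logical or physical.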

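  • Code implementation:

The Device-Mapper documentation can be found under Documentation/device-mapper in the kernel tree.

The implementation code can be found under drivers/md/ in the kernel tree.

The main data structures are as follows.

mapped_device: abstracts a Mapped Device as a device.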

struct mapped_device {
        struct srcu_struct io_barrier; // SRCU: http://www.wowotech.net/kernel_synchronization/linux2-6-23-RCU.html
        struct mutex suspend_lock;
        atomic_t holders;
        atomic_t open_count;

        /*
         * The current mapping.
         * Use dm_get_live_table{_fast} or take suspend_lock for
         * dereference.
         */
        struct dm_table *map; // Mapping Table

        struct list_head table_devices;
        struct mutex table_devices_lock;

        unsigned long flags;

        struct request_queue *queue;
        unsigned type; // Type of table and mapped_device's mempool
        /* Protect queue and type against concurrent access. */
        struct mutex type_lock;

        struct target_type *immutable_target_type;

        struct gendisk *disk;
        char name[16];

        void *interface_ptr;

        /*
         * A list of ios that arrived while we were suspended.
         */
        atomic_t pending[2];
        wait_queue_head_t wait;
        struct work_struct work;
        struct bio_list deferred;
        spinlock_t deferred_lock;

        /*
         * Processing queue (flush)
         */
        struct workqueue_struct *wq;

        /*
         * io objects are allocated from here.
         */
        mempool_t *io_pool;

        struct bio_set *bs;

        /*
         * Event handling.
         */
        atomic_t event_nr;
        wait_queue_head_t eventq;
        atomic_t uevent_seq;
        struct list_head uevent_list;
        spinlock_t uevent_lock; /* Protect access to uevent_list */

        /*
         * freeze/thaw support require holding onto a super block
         */
        struct super_block *frozen_sb;
        struct block_device *bdev;

        /* forced geometry settings */
        struct hd_geometry geometry;

        /* kobject and completion */
        struct dm_kobject_holder kobj_holder;

        /* zero-length flush that will be cloned and submitted to targets */
        struct bio flush_bio;

        struct dm_stats stats;
};

dm_table: the abstraction of the Mapping Table in Device Mapper.

struct dm_table {
        struct mapped_device *md;
        unsigned type;

        /* btree table */
        unsigned int depth;
        unsigned int counts[MAX_DEPTH]; /* in nodes */
        sector_t *index[MAX_DEPTH];

        unsigned int num_targets;
        unsigned int num_allocated;
        sector_t *highs;
        struct dm_target *targets;

        struct target_type *immutable_target_type;
        unsigned integrity_supported:1;
        unsigned singleton:1;

        /*
         * Indicates the rw permissions for the new logical
         * device.  This should be a combination of FMODE_READ
         * and FMODE_WRITE.
         */
        fmode_t mode;

        /* a list of devices used by this table */
        struct list_head devices;

        /* events get handed up using this callback */
        void (*event_fn)(void *);
        void *event_context;

        struct dm_md_mempools *mempools;

        struct list_head target_callbacks;
};

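The dm_target structure describes the mapping between a mapped_device and one particular target device. It records the start address and length of the logical region of the mapped device that the corresponding target device is mapped to, and it holds a pointer to the target_type structure that provides the operations for that kind of target device. Inside dm_table, these dm_target entries are organized as a B-tree so that lookups during I/O request mapping are fast.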

struct dm_target {
        struct dm_table *table;
        struct target_type *type; // the part of the target that developers can customize

        /* target limits */
        sector_t begin;
        sector_t len;

        /* If non-zero, maximum size of I/O submitted to a target. */
        uint32_t max_io_len;

        /*
         * A number of zero-length barrier bios that will be submitted
         * to the target for the purpose of flushing cache.
         *
         * The bio number can be accessed with dm_bio_get_target_bio_nr.
         * It is a responsibility of the target driver to remap these bios
         * to the real underlying devices.
         */
        unsigned num_flush_bios;

        /*
         * The number of discard bios that will be submitted to the target.
         * The bio number can be accessed with dm_bio_get_target_bio_nr.
         */
        unsigned num_discard_bios;

        /*
         * The number of WRITE SAME bios that will be submitted to the target.
         * The bio number can be accessed with dm_bio_get_target_bio_nr.
         */
        unsigned num_write_same_bios;

        /*
         * The minimum number of extra bytes allocated in each bio for the
         * target to use.  dm_per_bio_data returns the data location.
         */
        unsigned per_bio_data_size;

        /*
         * If defined, this function is called to find out how many
         * duplicate bios should be sent to the target when writing
         * data.
         */
        dm_num_write_bios_fn num_write_bios;

        /* target specific data */
        void *private; // the field in dm_target that identifies the concrete target device

        /* Used to provide an error string from the ctr */
        char *error;

        /*
         * Set if this target needs to receive flushes regardless of
         * whether or not its underlying devices have support.
         */
        bool flush_supported:1;

        /*
         * Set if this target needs to receive discards regardless of
         * whether or not its underlying devices have support.
         */
        bool discards_supported:1;

        /*
         * Set if the target required discard bios to be split
         * on max_io_len boundary.
         */
        bool split_discard_bios:1;

        /*
         * Set if this target does not return zeroes on discarded blocks.
         */
        bool discard_zeroes_data_unsupported:1;
};
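The dm_table structure keeps a `highs` array alongside its `targets`, and the lookup of the dm_target responsible for a given sector is what the tree organization is meant to speed up. The sketch below shows only the idea, using a flat binary search rather than the kernel's multi-level index (`index[]`, `depth`). It assumes `highs[i]` holds the last sector covered by target `i`, in ascending order; `find_target` is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Return the index of the first target whose last sector is >= `sector`,
 * i.e. the target that covers `sector` when the targets are contiguous.
 * Returns n when the sector lies beyond the end of the table. */
size_t find_target(const sector_t *highs, size_t n, sector_t sector)
{
    size_t lo = 0, hi = n;

    assert(highs != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (highs[mid] < sector)
            lo = mid + 1;   /* target `mid` ends before the sector */
        else
            hi = mid;       /* target `mid` may be the one we want */
    }
    return lo;
}
```

With three targets ending at sectors 99, 299 and 1023, sector 100 resolves to index 1 and sector 1024 resolves to 3, meaning no target covers it. The real table trades this flat search for a tree over the same keys, which serves the same purpose with better cache behavior on large tables.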

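target_type is the part of a target that developers can customize. The structure mainly holds pointers to the operations of a specific kind of target device: the name of the target driver plugin for that target device, the methods that construct and destroy target devices of that type, and the methods that remap I/O requests and complete I/O for that type of target device.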

struct target_type {
        uint64_t features;
        const char *name;
        struct module *module;
        unsigned version[3];
        dm_ctr_fn ctr;
        dm_dtr_fn dtr;
        dm_map_fn map;
        dm_map_request_fn map_rq;
        dm_endio_fn end_io;
        dm_request_endio_fn rq_end_io;
        dm_presuspend_fn presuspend;
        dm_postsuspend_fn postsuspend;
        dm_preresume_fn preresume;
        dm_resume_fn resume;
        dm_status_fn status;
        dm_message_fn message;
        dm_ioctl_fn ioctl;
        dm_merge_fn merge;
        dm_busy_fn busy;
        dm_iterate_devices_fn iterate_devices;
        dm_io_hints_fn io_hints;

        /* For internal device-mapper use. */
        struct list_head list;
};
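The plugin idea behind target_type can be tried out in user space. The following is a toy model, not the kernel interface: every `toy_`-prefixed name is hypothetical, the callback signatures are deliberately simplified compared with `dm_ctr_fn`, `dm_dtr_fn` and `dm_map_fn`, and the registry is a small fixed array. In the kernel, a target driver hands its target_type to the framework with `dm_register_target()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t sector_t;

struct toy_target;

/* Cut-down counterpart of target_type: a name plus three callbacks. */
struct toy_target_type {
    const char *name;
    int (*ctr)(struct toy_target *t, sector_t arg);               /* construct */
    void (*dtr)(struct toy_target *t);                            /* destroy */
    sector_t (*map)(const struct toy_target *t, sector_t sector); /* remap */
};

struct toy_target {
    const struct toy_target_type *type;
    sector_t begin;   /* start of this target within the logical device */
    sector_t offset;  /* private data of the toy "linear" target */
};

#define TOY_MAX_TYPES 8
static const struct toy_target_type *toy_registry[TOY_MAX_TYPES];
static size_t toy_registered;

const struct toy_target_type *toy_find_target_type(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < toy_registered; i++)
        if (strcmp(toy_registry[i]->name, name) == 0)
            return toy_registry[i];
    return NULL;
}

/* Returns 0 on success, -1 if the registry is full or the name is taken. */
int toy_register_target(const struct toy_target_type *tt)
{
    assert(tt != NULL && tt->name != NULL);
    if (toy_registered == TOY_MAX_TYPES || toy_find_target_type(tt->name))
        return -1;
    toy_registry[toy_registered++] = tt;
    return 0;
}

/* A toy "linear" plugin: remaps by a fixed offset on the lower device. */
static int linear_ctr(struct toy_target *t, sector_t arg)
{
    t->offset = arg;
    return 0;
}

static void linear_dtr(struct toy_target *t)
{
    t->offset = 0;
}

static sector_t linear_map(const struct toy_target *t, sector_t sector)
{
    return t->offset + (sector - t->begin);
}

const struct toy_target_type toy_linear = {
    .name = "linear",
    .ctr = linear_ctr,
    .dtr = linear_dtr,
    .map = linear_map,
};
```

The framework side only ever looks a type up by name and calls through the function pointers, so adding a new I/O policy means adding one more `toy_target_type` instance without touching the registry code. That separation is the point of the design described above.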

Relationships among the data structures:


 

  • Analysis of the DM device creation flow: