Monitoring Files on Linux with inotify

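1. Requirements

The project needs to monitor the files under a given directory and react whenever a file or subdirectory changes.

The naive approach is to walk the directory in a loop, but with many files each pass takes a long time, and polling cannot tell whether a file is still in use.

Fortunately, Linux has provided inotify since kernel 2.6 (it was merged in 2.6.13). It watches the file system and notifies you of changes as they happen, which replaces the old polling loop.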

2. API

2.1 API overview

1) int inotify_init(void);

Creates an inotify instance. The call returns a file descriptor, or -1 on failure; as with socket(), errno tells you what went wrong.

DESCRIPTION
       inotify_init()  initializes a new inotify instance and returns a file descriptor associated with a new inotify
       event queue.

       If flags is 0, then inotify_init1() is the same as inotify_init().  The following values can be  bitwise  ORed
       in flags to obtain different behavior:

       IN_NONBLOCK Set the O_NONBLOCK file status flag on the new open file description.  Using this flag saves extra
                   calls to fcntl(2) to achieve the same result.

       IN_CLOEXEC  Set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor.  See the  description  of  the
                   O_CLOEXEC flag in open(2) for reasons why this may be useful.

RETURN VALUE
       On  success,  these  system calls return a new file descriptor.  On error, -1 is returned, and errno is set to
       indicate the error.
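
The man page excerpt says that the inotify_init1() flags save extra fcntl(2) calls. A minimal sketch of that equivalence follows; the helper names open_inotify_flags, open_inotify_fcntl, and is_nonblocking_cloexec are made up for this illustration and are not part of the inotify API.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Variant A: one call; both flags are applied when the descriptor is created. */
int open_inotify_flags(void)
{
    return inotify_init1(IN_NONBLOCK | IN_CLOEXEC);   /* -1 on error, errno set */
}

/* Variant B: plain inotify_init() followed by the fcntl(2) calls that
 * IN_NONBLOCK and IN_CLOEXEC would have saved. */
int open_inotify_fcntl(void)
{
    int fd = inotify_init();
    if (fd == -1)
        return -1;

    int status_flags = fcntl(fd, F_GETFL);
    int fd_flags = fcntl(fd, F_GETFD);
    if (status_flags == -1 || fd_flags == -1 ||
        fcntl(fd, F_SETFL, status_flags | O_NONBLOCK) == -1 ||
        fcntl(fd, F_SETFD, fd_flags | FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if fd has both O_NONBLOCK and FD_CLOEXEC set, otherwise 0. */
int is_nonblocking_cloexec(int fd)
{
    assert(fd >= 0);
    return (fcntl(fd, F_GETFL) & O_NONBLOCK) && (fcntl(fd, F_GETFD) & FD_CLOEXEC);
}
```

Both variants should end up with the same flags; variant A simply avoids the window between creation and the fcntl calls. The caller closes the descriptor with close(2) when done.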

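2) int inotify_add_watch(int fd, const char *pathname, uint32_t mask);

Adds a directory to the inotify instance and sets the event mask to monitor. On success it returns the watch descriptor for that directory, which later identifies the directory in the events you read.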

DESCRIPTION
       inotify_add_watch() adds a new watch, or modifies an existing watch, for the file whose location is specified
       in pathname; the caller must have read permission for this file. The fd argument is a file descriptor
       referring to the inotify instance whose watch list is to be modified. The events to be monitored for pathname
       are specified in the mask bit-mask argument. See inotify(7) for a description of the bits that can be set in
       mask.

       A successful call to inotify_add_watch() returns the unique watch descriptor associated with pathname for this
       inotify instance. If pathname was not previously being watched by this inotify instance, then the watch
       descriptor is newly allocated. If pathname was already being watched, then the descriptor for the existing
       watch is returned.

       The watch descriptor is returned by later read(2)s from the inotify file descriptor. These reads fetch
       inotify_event structures (see inotify(7)) indicating file system events; the watch descriptor inside this
       structure identifies the object for which the event occurred.

RETURN VALUE
       On success, inotify_add_watch() returns a non-negative watch descriptor. On error -1 is returned and errno is
       set appropriately.
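
As a rough sketch of the call with error reporting, the hypothetical helper watch_dir below (the name and the choice of mask are assumptions for illustration) watches a directory and adds IN_ONLYDIR, a real flag from inotify(7) that makes the call fail when the path is not a directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical helper: watch one directory for create, delete, move, and
 * close-after-write events. Returns the watch descriptor, or -1 on error. */
int watch_dir(int fd, const char *path)
{
    assert(path != NULL);

    uint32_t mask = IN_CREATE | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO |
                    IN_CLOSE_WRITE | IN_ONLYDIR;

    int wd = inotify_add_watch(fd, path, mask);
    if (wd == -1)
        fprintf(stderr, "inotify_add_watch(%s): %s\n", path, strerror(errno));
    return wd;
}
```

Calling watch_dir twice on the same path with the same inotify instance returns the same watch descriptor, as the man page text above describes.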

3) int inotify_rm_watch(int fd, int wd);  Removes a watch.

DESCRIPTION
       inotify_rm_watch() removes the watch associated with the watch descriptor wd from the inotify instance
       associated with the file descriptor fd.

       Removing a watch causes an IN_IGNORED event to be generated for this watch descriptor. (See inotify(7).)

RETURN VALUE
       On success, inotify_rm_watch() returns zero, or -1 if an error occurred (in which case, errno is set
       appropriately).
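
The IN_IGNORED event mentioned above can be used to confirm that a watch is really gone. The sketch below assumes a hypothetical helper named remove_watch_and_confirm; it discards any other events it reads while waiting, which a real program would probably not want.

```c
#include <assert.h>
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: remove watch wd, then wait up to timeout_ms for the
 * matching IN_IGNORED event. Returns 1 if it was seen, 0 on timeout,
 * -1 on error. Other events read in the meantime are dropped. */
int remove_watch_and_confirm(int fd, int wd, int timeout_ms)
{
    /* Buffer aligned for struct inotify_event, as inotify(7) recommends. */
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));

    assert(fd >= 0);
    if (inotify_rm_watch(fd, wd) == -1)
        return -1;

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready <= 0)
            return ready;               /* 0 = timeout, -1 = poll error */

        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            return -1;

        /* Walk every event in this read: fixed header plus len name bytes. */
        for (char *p = buf; p < buf + n; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->wd == wd && (ev->mask & IN_IGNORED))
                return 1;
            p += sizeof(*ev) + ev->len;
        }
    }
}
```

After IN_IGNORED has been delivered the watch descriptor is no longer valid, so a second inotify_rm_watch on it fails.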

2.2 Event types

The header sys/inotify.h shows which mask flags can be set; they cover the usual create, delete, and modify operations:

/* Supported events suitable for MASK parameter of INOTIFY_ADD_WATCH.  */
#define IN_ACCESS    0x00000001 /* File was accessed. */
#define IN_MODIFY    0x00000002 /* File was modified. */
#define IN_ATTRIB    0x00000004 /* Metadata (file attributes) changed. */
#define IN_CLOSE_WRITE   0x00000008 /* A file opened for writing was closed. */
#define IN_CLOSE_NOWRITE 0x00000010 /* A file not opened for writing was closed. */
#define IN_CLOSE     (IN_CLOSE_WRITE | IN_CLOSE_NOWRITE) /* Close: both of the above. */
#define IN_OPEN      0x00000020 /* File was opened. */
#define IN_MOVED_FROM    0x00000040 /* File was moved out of the watched directory. */
#define IN_MOVED_TO      0x00000080 /* File was moved into the watched directory. */
#define IN_MOVE      (IN_MOVED_FROM | IN_MOVED_TO) /* Moves: both of the above. */
#define IN_CREATE    0x00000100 /* A file or directory was created in the watched directory. */
#define IN_DELETE    0x00000200 /* A file or directory was deleted from the watched directory. */
#define IN_DELETE_SELF   0x00000400 /* The watched directory itself was deleted. */
#define IN_MOVE_SELF     0x00000800 /* The watched directory itself was moved. */
/* All events which a program can wait on.  */
#define IN_ALL_EVENTS    (IN_ACCESS | IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE  \
              | IN_CLOSE_NOWRITE | IN_OPEN | IN_MOVED_FROM        \
              | IN_MOVED_TO | IN_CREATE | IN_DELETE           \
              | IN_DELETE_SELF | IN_MOVE_SELF)
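
Because mask is a bit field, one event can carry several flags at once. The following sketch turns a mask into readable text; mask_to_string and the FLAG macro are hypothetical names, and the table also includes IN_IGNORED, IN_Q_OVERFLOW, and IN_ISDIR, which the kernel sets in returned events but which are not part of the listing above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

#define FLAG(x) { x, #x }

/* Write the names of all known bits set in mask into out, joined by '|'.
 * Names that no longer fit are dropped. Returns out. */
char *mask_to_string(uint32_t mask, char *out, size_t outlen)
{
    static const struct { uint32_t bit; const char *name; } table[] = {
        FLAG(IN_ACCESS), FLAG(IN_MODIFY), FLAG(IN_ATTRIB),
        FLAG(IN_CLOSE_WRITE), FLAG(IN_CLOSE_NOWRITE), FLAG(IN_OPEN),
        FLAG(IN_MOVED_FROM), FLAG(IN_MOVED_TO), FLAG(IN_CREATE),
        FLAG(IN_DELETE), FLAG(IN_DELETE_SELF), FLAG(IN_MOVE_SELF),
        /* Bits set only by the kernel in returned events: */
        FLAG(IN_IGNORED), FLAG(IN_Q_OVERFLOW), FLAG(IN_ISDIR),
    };
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (!(mask & table[i].bit))
            continue;
        int n = snprintf(out + used, outlen - used, "%s%s",
                         used ? "|" : "", table[i].name);
        if (n < 0 || (size_t)n >= outlen - used) {
            out[used] = '\0';           /* drop the partially written name */
            break;
        }
        used += (size_t)n;
    }
    return out;
}
```

For instance, creating a subdirectory in a watched directory yields IN_CREATE together with IN_ISDIR, which this helper renders as IN_CREATE|IN_ISDIR.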

2.3 Event structure

/* Structure describing an inotify event.  */
struct inotify_event
{
  int wd;       /* Watch descriptor.  */
  uint32_t mask;    /* Watch mask.  */
  uint32_t cookie;  /* Cookie to synchronize two events.  */
  uint32_t len;     /* Length (including NULs) of name.  */
  char name __flexarr;  /* Name.  */
};

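Events are what you read from the inotify file descriptor once it becomes readable. Each record consists of the fixed-size header (wd, mask, cookie, len) followed by len bytes of name.

wd identifies the watch within the inotify instance, mask is the event mask, cookie links two related events, and len is the length of the name that follows (including its padding NULs).

Note that name is a flexible array member: a variable-length name is appended after the header.

For example:

An inotify instance (fd=1) watches directory a (wd=2) and directory b (wd=3).

If fd=1 delivers the event wd=2, mask=IN_CREATE, cookie=0, name="1", then file 1 was created under directory a.

If fd=1 delivers the event wd=3, mask=IN_MOVED_FROM, cookie=0x1234, name="2", followed by the event wd=3, mask=IN_MOVED_TO, cookie=0x1234, name="3",

then file 2 under directory b was renamed to file 3.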

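3. Example

An inotify instance is a file descriptor and behaves much like a socket, so on Linux (the only platform that supports inotify) you can multiplex it with epoll or select.

That also means it combines well with the libevent networking library: the built-in bufferevent mechanism takes care of buffering (see the article "Implementing a TCP proxy with bufferevent on Linux") and greatly reduces the amount of code.

First, an internal structure, struct string, which provides a buffer large enough to hold one inotify event: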

#define SIZE_IEVENT sizeof(struct inotify_event)

struct string
{
    char str[SIZE_IEVENT + 1024];
    size_t len;
};

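Next comes main(), which creates the base and bev instances, obtains an fd from the inotify API, and hands that fd over to bev.

bufferevent_setcb(bev, on_recv, NULL, NULL, &string); registers the callback for when the fd becomes readable: whenever events arrive, on_recv is called to handle them.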

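void on_recv(struct bufferevent *bev, void *args);   /* defined below */

int main(int argc, char *argv[])
{
    int fd = 0;
    int wd = 0;
    struct bufferevent *bev = NULL;
    struct event_base *base = NULL;

    struct string string = {{0}, 0};

    if ( argc < 2 ) {
        printf("%s <path>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    base = event_base_new();
    assert(base);

    /* libevent expects the descriptor given to a bufferevent to be non-blocking. */
    fd = inotify_init1(IN_NONBLOCK);
    if ( fd == -1 ) {
        perror("inotify_init1");
        exit(EXIT_FAILURE);
    }

    wd = inotify_add_watch(fd, argv[1],
            IN_CREATE |
            IN_DELETE |
            IN_MOVED_FROM |
            IN_MOVED_TO |
            IN_CLOSE_WRITE);
    if ( wd == -1 ) {
        perror("inotify_add_watch");
        exit(EXIT_FAILURE);
    }

    bev = bufferevent_socket_new(base, fd, 0);
    assert(bev);

    /* Do not call on_recv until at least one event header is buffered. */
    bufferevent_setwatermark(bev, EV_READ, SIZE_IEVENT, 0);
    bufferevent_setcb(bev, on_recv, NULL, NULL, &string);
    bufferevent_enable(bev, EV_READ);

    event_base_dispatch(base);

    return EXIT_SUCCESS;
}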

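Now the core function, on_recv. When events arrive on the fd (possibly several at once), bufferevent first collects the data into its input buffer and then triggers our on_recv callback; inside it we only need bufferevent_read to pull events from that buffer into pstr.

Because name is variable-length, the function loops in two steps: read the fixed header of sizeof(struct inotify_event) bytes (16 bytes), then read the name that follows according to pevent->len.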

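void display(struct inotify_event *pevent);   /* defined below */

void on_recv(struct bufferevent *bev, void *args)
{
    size_t length = 0;
    struct string *pstr = (struct string *)args;
    struct inotify_event *pevent = (struct inotify_event *)pstr->str;

    while ( 1 ) {
        /* Bytes currently waiting in the bufferevent's input buffer. */
        length = evbuffer_get_length(bufferevent_get_input(bev));
        if ( pstr->len == 0 ) {
            /* Step 1: the fixed-size header. */
            if ( length < SIZE_IEVENT ) {
                printf("Retry head\n");
                return;
            }
            pstr->len += bufferevent_read(bev, pevent, SIZE_IEVENT);
            assert(pstr->len == SIZE_IEVENT);
            /* The name must fit in the space reserved after the header. */
            assert(pevent->len <= sizeof(pstr->str) - SIZE_IEVENT);
            pevent->name[0] = '\0';   /* events with len == 0 carry no name */
        }
        else {
            /* Step 2: the variable-length name (pevent->len bytes, NUL-padded). */
            if ( length < pevent->len ) {
                printf("Retry body\n");
                return;
            }
            pstr->len += bufferevent_read(bev, pevent->name, pevent->len);
            assert(pstr->len == pevent->len + SIZE_IEVENT);
            pstr->len = 0;

            /* Done */
            display(pevent);
        }
    }
}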
void display(struct inotify_event *pevent)
{
#define __display(mask, type) if ( mask & (type) ) { \
    printf("%-15s, ", #type); \
}
    __display(pevent->mask, IN_CREATE);
    __display(pevent->mask, IN_DELETE);
    __display(pevent->mask, IN_CLOSE_WRITE);
    __display(pevent->mask, IN_MOVED_FROM);
    __display(pevent->mask, IN_MOVED_TO);

    printf("%s\n", pevent->name);
}
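The remaining dependencies are headers such as #include <sys/inotify.h>, #include <event2/event.h>, and #include <event2/bufferevent.h> (evbuffer_get_length additionally needs <event2/buffer.h>), which need no further explanation.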
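
The same header-then-name framing also works without libevent. As a sketch under that assumption, the hypothetical for_each_event below walks a buffer filled by a plain read(2) on the inotify descriptor; event_cb, struct summary, and summarize are likewise names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

typedef void (*event_cb)(const struct inotify_event *ev, void *ctx);

/* Walk n bytes of inotify data and call cb once per complete event.
 * buf must be aligned for struct inotify_event. Returns the number of
 * bytes consumed; an incomplete trailing event is left to the caller. */
size_t for_each_event(const char *buf, size_t n, event_cb cb, void *ctx)
{
    size_t off = 0;

    assert(buf != NULL && cb != NULL);
    while (n - off >= sizeof(struct inotify_event)) {
        struct inotify_event hdr;
        memcpy(&hdr, buf + off, sizeof hdr);        /* fixed 16-byte header */

        size_t total = sizeof hdr + hdr.len;        /* header plus padded name */
        if (n - off < total)
            break;                                  /* name not complete yet */

        cb((const struct inotify_event *)(buf + off), ctx);
        off += total;
    }
    return off;
}

/* Example callback: count events and remember the last name seen. */
struct summary {
    int  count;
    char last[256];
};

void summarize(const struct inotify_event *ev, void *ctx)
{
    struct summary *s = ctx;
    s->count++;
    snprintf(s->last, sizeof s->last, "%s", ev->len ? ev->name : "");
}
```

A caller would typically do n = read(fd, buf, sizeof buf) and then for_each_event(buf, n, summarize, &s). With a direct read(2) the kernel returns whole events only, so the incomplete-event branch matters mainly when the bytes come from a stream buffer.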

With the program running, a few simple commands create, rename, and delete entries in bulk:

mkdir a1 a2 a3 a4 a5; rename a b a*; rm -rf *

The run produced the following output:

[Figure: screenshot of the program output]

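4. Conclusion

inotify is clearly better than scanning when it comes to timeliness, but a few points need attention when using the API:

1) inotify_event gives you only the wd and the name of the affected entry; you have to maintain your own <wd, path> mapping to reconstruct full paths.

2) inotify_add_watch monitors a single directory level only (a sore point of the current API); to cover a directory tree you must call add_watch on every subdirectory as well.

3) For the same reason, test thoroughly with mkdir -p a/b/c/d/e/f: when you add_watch directory a, also scan it, because nested subdirectories can be created before their watches are in place and would otherwise be missed.

4) For renames, you usually have to check that cookie is non-zero and pair the two events (IN_MOVED_FROM and IN_MOVED_TO) yourself.

5) Under heavy load (hundreds of thousands of events in a burst), check the settings under /proc/sys/fs/inotify/max_* so that a full kernel queue does not drop events.

6) Miscellaneous (no other pitfalls come to mind, though handling of the fd itself also deserves some care).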

Operations staff who would rather not write C can look up how to use the inotifywait tool from the shell.
