Performs recursive glob with given suffix and rootdir, using os.walk(rootdir) and filename.endswith(s

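In the CityScapes dataset, the training set contains 18 subfolders holding images from 18 cities, and the matching label set for the training images also contains 18 subfolders. To load all of the training images, we walk the directory tree recursively and collect every image path.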

import os
def recursive_glob(rootdir='.', suffix=''):
    """Performs recursive glob with given suffix and rootdir
        :param rootdir is the root directory
        :param suffix is the suffix to be searched
    """
    return [os.path.join(looproot, filename)
        for looproot, _, filenames in os.walk(rootdir)
        for filename in filenames if filename.endswith(suffix)]

root='/home/zzp/SSD_ping/my-root-path/My-core-python/DATA/CityScapes'
split="train"
images_base = os.path.join(root, 'leftImg8bit', split)
files = {}  # maps each split name to its list of image paths
files[split] = recursive_glob(rootdir=images_base, suffix='.png')
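
To try the function without the dataset at hand, here is a minimal sketch. It builds a throwaway directory tree with two made-up city folders (the names city_a, city_b, and the file names are illustrative assumptions, not real CityScapes contents) and collects the .png files from it.

```python
import os
import tempfile


def recursive_glob(rootdir='.', suffix=''):
    """Return full paths of all files under rootdir whose names end with suffix."""
    return [os.path.join(looproot, filename)
            for looproot, _, filenames in os.walk(rootdir)
            for filename in filenames if filename.endswith(suffix)]


def build_demo_tree(base):
    """Create a tiny per-city layout (all names here are made up)."""
    layout = {'city_a': ['0001.png', 'notes.txt'],
              'city_b': ['0002.png']}
    for city, names in layout.items():
        os.makedirs(os.path.join(base, city))
        for name in names:
            # Create an empty placeholder file.
            open(os.path.join(base, city, name), 'w').close()


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    found = recursive_glob(rootdir=demo_root, suffix='.png')
    # Make the paths relative to the temporary root and sort them, because
    # the order in which the operating system lists entries is not guaranteed.
    demo_result = sorted(os.path.relpath(p, demo_root) for p in found)
    print(demo_result)
```

The .txt file is skipped, and the files from both subfolders end up in a single flat list, which is the behaviour needed for the 18 city folders.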

The os.walk(rootdir) function is documented as follows:

def walk(top, topdown=True, onerror=None, followlinks=False):
    """Directory tree generator.

    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 3-tuple

        dirpath, dirnames, filenames

    dirpath is a string, the path to the directory. (It is of type str.)
    dirnames is a list of the names of the subdirectories in dirpath (excluding '.' and '..'). (This list holds the folders directly under dirpath; if there are none, it is an empty list.)
    filenames is a list of the names of the non-directory files in dirpath. (That is, filenames only holds the files directly under dirpath; if there are none, it is an empty list.)
    Note that the names in the lists are just names, with no path components.
    To get a full path (which begins with top) to a file or directory in
    dirpath, do os.path.join(dirpath, name).

    Example:

    import os
    from os.path import join, getsize
    for root, dirs, files in os.walk('python/Lib/email'):
        print(root, "consumes", end="")
        print(sum([getsize(join(root, name)) for name in files]), end="")
        print("bytes in", len(files), "non-directory files")
        if 'CVS' in dirs:
            dirs.remove('CVS')  # don't visit CVS directories

    """
    pass

The three values in each tuple are all relative to dirpath. So how does dirpath change so that every file is eventually traversed (globbed)?

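It is quite simple. dirpath first takes the value of the top argument, that is, the root directory passed in above. os.walk then descends one level at a time: it visits each subdirectory listed in dirnames, and when it reaches a directory whose dirnames list is empty it moves on to the next directory. To see this for yourself, copy the code above, build a small directory tree, and run it; the traversal order is easy to read off the output.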
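
The traversal described above can be sketched as follows. The directory layout (a/a1 and b, with one file in each leaf) is a made-up example created in a temporary folder. Sorting dirnames in place is an extra step, added here only to make the visiting order reproducible.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Hypothetical layout: top/a/a1/x.png and top/b/y.png
    os.makedirs(os.path.join(top, 'a', 'a1'))
    os.makedirs(os.path.join(top, 'b'))
    open(os.path.join(top, 'a', 'a1', 'x.png'), 'w').close()
    open(os.path.join(top, 'b', 'y.png'), 'w').close()

    visited = []
    for dirpath, dirnames, filenames in os.walk(top):
        # With topdown=True (the default), editing dirnames in place
        # controls which subdirectories are visited next and in what order.
        dirnames.sort()
        rel = os.path.relpath(dirpath, top)
        visited.append((rel, list(dirnames), sorted(filenames)))
        print(rel, dirnames, filenames)
```

The top directory is yielded first, then a, then a/a1 (whose dirnames list is empty), and only after that does the walk move on to b.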

The filename.endswith(suffix) call works as follows:

filename is a str, and endswith is a built-in method of that type. It returns True when the string ends with the given suffix and False otherwise.
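
A few quick checks illustrate this behaviour. The file name below is a made-up example; the tuple form and the case-sensitivity note go slightly beyond what the post uses, but both are standard str.endswith behaviour.

```python
filename = 'frame_0001.png'  # made-up file name for illustration

ends_with_png = filename.endswith('.png')
ends_with_jpg = filename.endswith('.jpg')

# endswith also accepts a tuple of suffixes and is True if any of them matches.
is_image = filename.endswith(('.png', '.jpg'))

# The comparison is case-sensitive, so normalise the case first if needed.
upper_plain = 'FRAME_0001.PNG'.endswith('.png')
upper_lowered = 'FRAME_0001.PNG'.lower().endswith('.png')

# An empty suffix matches every string, which is why the default suffix=''
# in recursive_glob returns all files.
matches_empty = filename.endswith('')

print(ends_with_png, ends_with_jpg, is_image,
      upper_plain, upper_lowered, matches_empty)
```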

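So the purpose of the code above is:

to collect the full path of every file ending in .png under the folder a and return them as a list, where each element is the full path of one file. Written as an explicit loop: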

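import os
a='/home/zzp/SSD_ping/my-root-path/My-core-python/DATA/CityScapes/leftImg8bit'
b=[]  # will hold the full path of every .png file under a
for looproot, _, filenames in os.walk(a):
    for filename in filenames:
        if filename.endswith('.png'):
            # join the current directory with the file name to get a full path
            b.append(os.path.join(looproot, filename))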