Python Learning Diary 15: Getting File Paths and URL Paths, and Using Them to Load Fashion-MNIST

Reference documentation links
https://docs.python.org/3.5/library/filesys.html
https://docs.python.org/3.5/library/os.path.html
https://docs.python.org/3.5/library/urllib.request.html?highlight=url#urllib.request.pathname2url
https://docs.python.org/3.5/library/pathlib.html?highlight=iterdir#pathlib.Path.iterdir

import os
import urllib.request
from pathlib import Path

fpath = os.path.abspath(__file__)  # inside a module, __file__ is the path of the module's own file
print('filepath', fpath)

filepath D:\Julie1\keras\pathteset.py

pathname = os.path.dirname(fpath)
print('path.dirname', pathname)

path.dirname D:\Julie1\keras

print('path.cwd:', Path.cwd())
print('os.getcwd():', os.getcwd())

path.cwd: D:\Julie1\keras
os.getcwd(): D:\Julie1\keras
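
The two results above match the module's folder only because the script was started from that folder. `Path.cwd()` and `os.getcwd()` report the working directory of the process, while the directory derived from `__file__` always points at the folder that holds the module. Here is a minimal sketch of that difference, using a temporary file as a stand-in for a module; the helper name `module_dir` and the file name `pathtest_demo.py` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def module_dir(file_path):
    """Return the directory containing file_path (pass __file__ from a module)."""
    return Path(file_path).resolve().parent


# Create a stand-in "module" file inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    fake_module = Path(tmp) / "pathtest_demo.py"
    fake_module.write_text("# placeholder module\n")

    original_cwd = os.getcwd()
    found_dir = module_dir(fake_module)

    # Change the working directory, then compare both notions of "where am I".
    os.chdir(tempfile.gettempdir())
    same_after_chdir = module_dir(fake_module) == found_dir
    cwd_is_module_dir = Path.cwd().resolve() == found_dir
    os.chdir(original_cwd)

print("module dir stable:", same_after_chdir)
print("cwd equals module dir:", cwd_is_module_dir)
```

For locating data files that ship next to a script, the `__file__`-based directory is therefore usually the safer choice.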

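When downloading files, Keras relies on
from tensorflow.python.keras.utils import get_file
which is documented at
https://tensorflow.google.cn/api_docs/python/tf/keras/utils
get_file(…): Downloads a file from a URL if it not already in the cache.
Because get_file expects a URL, a local file path must first be converted to URL format.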

url = urllib.request.pathname2url(pathname)
# reverse conversion: pathname = urllib.request.url2pathname(url)
print('path url: ', url)

path url: ///D:/Julie1/keras
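
As an alternative to `pathname2url`, `pathlib` can produce the complete `file:` URL in one step with `Path.as_uri()`. The sketch below is one possible way to build such a base URL and convert it back. The helper name `dir_to_file_url` is hypothetical, and the temporary directory merely stands in for the folder that holds the data.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def dir_to_file_url(directory):
    """Return a file: URL for directory, ending in a slash so names can be appended."""
    # as_uri() only works on absolute paths, so resolve the path first.
    return Path(directory).resolve().as_uri() + "/"


base = dir_to_file_url(tempfile.gettempdir())
print(base)

# Converting back: take the path part of the URL, then undo the URL quoting.
round_trip = Path(url2pathname(urlparse(base).path))
print(round_trip)
```

A file name appended to `base` gives a URL of the same shape as the `file:///D:/...` form used later in this post.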

While working through the official TensorFlow tutorials, the first example classifies Fashion-MNIST with Keras. It begins by loading the data files.
https://tensorflow.google.cn/tutorials/keras/basic_classification

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

If the download succeeds, the files are saved in
C:\Users\username\.keras\datasets (where username is the Windows user name)
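
To check what has already been cached, the folder can be located from Python. This is only a sketch under the assumption that the default layout, a `.keras/datasets` folder below the user's home directory, is in use. The helper name `default_keras_dataset_dir` is hypothetical and not part of Keras.

```python
import os


def default_keras_dataset_dir():
    """Best-guess location of the Keras dataset cache for the current user."""
    # Assumption: the default layout <home>/.keras/datasets is in use.
    return os.path.join(os.path.expanduser("~"), ".keras", "datasets")


cache_dir = default_keras_dataset_dir()
print("expected cache dir:", cache_dir)

# List whatever has been downloaded so far, if the folder exists yet.
if os.path.isdir(cache_dir):
    for name in sorted(os.listdir(cache_dir)):
        print(" ", name)
else:
    print("no datasets cached yet")
```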

For example:
from tensorflow import keras
imdb = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels)=imdb.load_data(num_words=10000)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17465344/17464789 [==============================] - 7s 0us/step

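In this case, however,
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
failed, so I adapted the file loading myself.
Download the data files from https://github.com/zalandoresearch/fashion-mnist/tree/master/data and put them in a fashionMNIST_data folder next to the loading module.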
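
Before wiring the files into a loader, it can help to confirm that all four archives actually arrived. The following is a small sketch of such a check. The helper `missing_files` is hypothetical, and the demo runs against a temporary folder containing two empty placeholder files rather than the real data.

```python
import tempfile
from pathlib import Path

# The four archive names used by Fashion-MNIST.
EXPECTED_FILES = [
    "train-labels-idx1-ubyte.gz",
    "train-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
]


def missing_files(data_dir):
    """Return the expected archive names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]


# Demo: a temporary folder holding only the two training archives (empty placeholders).
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES[:2]:
        (Path(tmp) / name).write_bytes(b"")
    still_missing = missing_files(tmp)

print("missing:", still_missing)
```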

Write the loading module load_data.py:

from tensorflow.python.keras.utils import get_file
import gzip
import numpy as np
import os
import urllib.request


def load_data():
    # Build a file: URL for the fashionMNIST_data folder next to this module.
    fpath = os.path.abspath(__file__)
    pathname = os.path.dirname(fpath)
    url = urllib.request.pathname2url(pathname)
    base = "file:" + url + "/fashionMNIST_data/"
    # e.g. "file:///D:/fashionMNIST_data/"
    files = [
        'train-labels-idx1-ubyte.gz',
        'train-images-idx3-ubyte.gz',
        't10k-labels-idx1-ubyte.gz',
        't10k-images-idx3-ubyte.gz'
    ]
    paths = []
    for fname in files:
        # get_file fetches each file into the Keras cache and returns its local path.
        paths.append(get_file(fname, origin=base + fname))
    # Label files: skip the 8-byte header, then one byte per label.
    with gzip.open(paths[0], 'rb') as lbpath:
        y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)
    # Image files: skip the 16-byte header, then 28x28 bytes per image.
    with gzip.open(paths[1], 'rb') as imgpath:
        x_train = np.frombuffer(
            imgpath.read(), np.uint8, offset=16).reshape(len(y_train), 28, 28)
    with gzip.open(paths[2], 'rb') as lbpath:
        y_test = np.frombuffer(lbpath.read(), np.uint8, offset=8)
    with gzip.open(paths[3], 'rb') as imgpath:
        x_test = np.frombuffer(
            imgpath.read(), np.uint8, offset=16).reshape(len(y_test), 28, 28)
    return (x_train, y_train), (x_test, y_test)
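
The offsets 8 and 16 passed to `np.frombuffer` skip the IDX file headers. A label file starts with two big-endian 32-bit integers (magic number 2049, then the item count). An image file starts with four (magic number 2051, image count, rows, columns). The sketch below shows that layout using only the standard library. It writes a tiny made-up label file and reads it back; `read_idx_labels` is a hypothetical helper and not part of the module above.

```python
import gzip
import os
import struct
import tempfile


def read_idx_labels(path):
    """Read a gzipped IDX label file and return (count, list_of_labels)."""
    with gzip.open(path, "rb") as f:
        data = f.read()
    # Header: two big-endian unsigned 32-bit ints (magic number, item count).
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("not an IDX label file: magic=%d" % magic)
    labels = list(data[8:8 + count])  # one unsigned byte per label
    return count, labels


# Build a tiny fake label file with three labels (illustrative data only).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo-labels-idx1-ubyte.gz")
    with gzip.open(demo_path, "wb") as f:
        f.write(struct.pack(">II", 2049, 3) + bytes([9, 0, 3]))
    count, labels = read_idx_labels(demo_path)

print(count, labels)
```

Reading the header explicitly, rather than skipping it, also gives a cheap sanity check that a downloaded archive is the file it claims to be.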