Python NumPy Tutorial (5): Filtering, Sorting, Set Functions, and Reading and Writing Data

1. Filtering with boolean arrays

import numpy as np
import numpy.random
arr = np.random.randn(100)
arr
Output (100 random numbers):
array([-0.84570456, -2.21743968,  2.48971398,  1.57138679,  0.1645484 ,
       -0.00618139,  0.55144822,  0.70877084,  0.83862826, -1.47160326,
       -0.50499305, -1.52486585, -0.08403235, -0.48313017,  0.73283641,
        0.59872726,  0.05932988, -1.28312722,  1.37144712, -0.52774171,
        0.07949287, -1.25879195,  1.31256872,  0.31025061,  0.69700033,
       -1.37906378, -0.57683916, -0.66151576, -0.6215851 , -0.96214685,
       -1.97455008, -0.5725854 ,  1.54771953,  0.10434949,  1.18676295,
       -1.3877092 ,  0.97231658, -2.13417302,  0.07059074,  0.40872163,
        0.93872577, -0.62218374,  1.56875898,  1.50472097, -0.57749041,
       -0.83776864, -1.82338058, -0.95860292,  0.59427145,  0.02685388,
       -0.15122058, -0.28583306, -1.71298474,  0.01341369, -0.70516054,
        0.86404614, -0.42701139, -0.25847577, -0.78713731,  0.41052537,
        0.67961828, -1.18338025, -0.96648004, -2.22403128, -2.37807866,
        1.65531665,  0.93905314,  1.36454143,  0.55153089,  0.44957141,
       -0.78701216, -0.96467054,  0.53427677,  0.80850105,  1.87113103,
        0.0755421 ,  1.33436598, -0.82354346,  0.7945044 , -0.07721165,
       -0.07193151, -1.95614647,  0.13234494,  0.13054731, -2.10556319,
        0.40520846,  1.69259913,  0.27619833,  0.21597633,  0.33204544,
        2.60113181, -0.0873115 , -1.09422245, -0.84380081, -0.12965254,
        1.8090488 ,  1.12106681, -0.02869555,  0.45762089, -0.37615294])

Count the positive values:

(arr > 0).sum()  # number of positive values
Output:
51
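The comparison `arr > 0` produces a boolean array, and the same array can be used as an index to keep only the matching elements. The following is a minimal sketch; the seeded generator and the names `mask` and `positives` are assumptions made for illustration and are not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (an assumption, for reproducibility)
arr = rng.standard_normal(100)

mask = arr > 0                   # boolean array, True where the value is positive
positives = arr[mask]            # keeps only the elements where mask is True

# True counts as 1, so summing the mask gives the number of selected elements
print(mask.sum(), positives.size)
```

Both printed numbers should match, because the sum of the mask and the length of the filtered array count the same elements.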

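2. any and all are very useful for boolean arrays: any tests whether at least one value in the array is True, and all tests whether every value is True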
bools = np.array([False, False, True, False])
bools.any()
Output:
True
bools.all()
Output:
False
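Both methods also accept an axis argument, which can be handy for two-dimensional boolean arrays. The small array `grid` below is a made-up example, used only to sketch the behaviour.

```python
import numpy as np

grid = np.array([[True, False],
                 [True, True]])

print(grid.any(axis=0))  # per column: True, True
print(grid.all(axis=0))  # per column: True, False
print(grid.all(axis=1))  # per row: False, True

# Non-boolean arrays work too: non-zero values are treated as True
print(np.array([0, 2]).any())  # True
```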

3. Sorting in place with the sort method
arr = np.random.randn(8)
arr
Output:
array([ 1.24989935,  0.26355977, -0.50860306, -1.54681062,  0.28423382,
        1.37361039,  0.66252208,  0.96364101])
arr.sort()
arr
Output:
array([-1.54681062, -0.50860306,  0.26355977,  0.28423382,  0.66252208,
        0.96364101,  1.24989935,  1.37361039])
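Note that `arr.sort()` modifies the array in place. When the original order needs to be kept, the top-level `np.sort` function returns a sorted copy instead. A short sketch with made-up values:

```python
import numpy as np

arr = np.array([3.0, -1.0, 2.0])

sorted_copy = np.sort(arr)    # returns a new, sorted array
before = arr.tolist()         # the original is still 3, -1, 2 at this point

result = arr.sort()           # sorts in place and returns None
print(before, sorted_copy, arr, result)
```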

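4. The sort method can also sort along a given axis: axis 0 sorts the values within each column, and axis 1 sorts the values within each row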
arr = np.random.randn(5,3)
arr
Output:
array([[ 0.80755401, -0.54385431, -1.18145348],
       [ 0.69971235, -0.45852225, -1.71633618],
       [-0.45109238,  1.24928254,  0.23480012],
       [-0.05216242, -0.35804026,  0.03701942],
       [-0.42148283,  0.26845095, -0.45013768]])
arr.sort(1)
arr
Output:
array([[-1.18145348, -0.54385431,  0.80755401],
       [-1.71633618, -0.45852225,  0.69971235],
       [-0.45109238,  0.23480012,  1.24928254],
       [-0.35804026, -0.05216242,  0.03701942],
       [-0.45013768, -0.42148283,  0.26845095]])
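To make the difference between the two axes easier to see, here is a small sketch with integer values chosen for illustration. It uses `np.sort`, which returns copies, so both results can be compared side by side.

```python
import numpy as np

m = np.array([[3, 1, 2],
              [0, 5, 4]])

by_column = np.sort(m, axis=0)  # each column sorted top to bottom
by_row = np.sort(m, axis=1)     # each row sorted left to right

print(by_column)  # rows become [0, 1, 2] and [3, 5, 4]
print(by_row)     # rows become [1, 2, 3] and [0, 4, 5]
```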

5. Using sorting to pick the value at a specific position
large_arr = np.random.randn(1000)
large_arr.sort()
large_arr[int(0.05*len(large_arr))]  # 5% quantile
Output:
-1.4970312664301417
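The index-based approach is a quick approximation. NumPy also offers `np.quantile`, which by default interpolates linearly between neighbouring values, so the two results may differ slightly. The sketch below assumes a seeded generator so that it can be re-run; the names `by_index` and `by_quantile` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (an assumption, for reproducibility)
large_arr = rng.standard_normal(1000)

sorted_arr = np.sort(large_arr)
by_index = sorted_arr[int(0.05 * len(sorted_arr))]  # element at index 50

# np.quantile interpolates linearly between neighbours by default,
# so it can differ slightly from the index-based value
by_quantile = np.quantile(large_arr, 0.05)

print(by_index, by_quantile)
```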

6. Finding the sorted unique values with the unique function
names = np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])
names
Output:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], 
      dtype='<U4')
np.unique(names)
Output:
array(['Bob', 'Joe', 'Will'], 
      dtype='<U4')
ints = np.array([3,3,3,2,2,1,1,4,4])
np.unique(ints)
Output:
array([1, 2, 3, 4])
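`np.unique` can also report how often each value occurs through its `return_counts` argument. The sketch below reuses the `names` array and compares the result with a pure-Python equivalent.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

values, counts = np.unique(names, return_counts=True)
for name, count in zip(values, counts):
    print(name, count)   # Bob 2, Joe 3, Will 2

# Pure-Python equivalent of np.unique(names)
pure_python = sorted(set(names.tolist()))
print(pure_python)       # ['Bob', 'Joe', 'Will']
```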

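7. The np.in1d function tests whether each value in one array is a member of another array and returns a boolean array:
values = np.array([6,0,0,3,2,5,6])
np.in1d(values,[2,3,6])
Output:
array([ True, False, False,  True,  True, False,  True])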
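In recent NumPy releases `np.in1d` is deprecated in favour of `np.isin`, which performs the same membership test. A sketch of the equivalent call, where `mask` is an illustrative name:

```python
import numpy as np

values = np.array([6, 0, 0, 3, 2, 5, 6])

mask = np.isin(values, [2, 3, 6])
print(mask)          # True where the value is 2, 3 or 6
print(values[mask])  # the matching values: 6, 3, 2, 6
```

Because the result is a boolean array, it can be used directly as a filter, as in the last line.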

8. Besides unique and in1d, NumPy's set functions for arrays include intersect1d, union1d, setdiff1d, and setxor1d
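As a rough illustration of the remaining set functions, the sketch below applies them to two small arrays, `x` and `y`, that are made up for this example.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([3, 4, 5, 6])

common = np.intersect1d(x, y)    # elements present in both: 3, 4
combined = np.union1d(x, y)      # all distinct elements: 1 through 6
only_in_x = np.setdiff1d(x, y)   # in x but not in y: 1, 2
in_one_only = np.setxor1d(x, y)  # in exactly one of the two: 1, 2, 5, 6

print(common, combined, only_in_x, in_one_only)
```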


9. File input and output for arrays

arr = np.arange(10)
np.save('some_array',arr)
If the file path does not already end with the .npy extension, the extension is appended automatically.
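The automatic extension can be checked with a quick round trip. The sketch below writes into a temporary directory (an assumption, so that nothing is left behind in the working directory); `base_path`, `saved_files`, and `restored` are illustrative names.

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

# A temporary directory keeps the example from writing into the working directory
with tempfile.TemporaryDirectory() as tmp_dir:
    base_path = os.path.join(tmp_dir, 'some_array')  # no extension given
    np.save(base_path, arr)

    saved_files = os.listdir(tmp_dir)                # expected: ['some_array.npy']
    restored = np.load(base_path + '.npy')

print(saved_files)
print(restored)
```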

10. At this point you probably want to know where the data was saved. The following shows the current working directory:

import os
os.getcwd()
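A relative file name such as 'some_array.npy' is resolved against this working directory. One way to see the full path is `os.path.abspath`, sketched below; `file_name` and `full_path` are illustrative names.

```python
import os

file_name = 'some_array.npy'
full_path = os.path.abspath(file_name)   # working directory joined with the name

print(full_path)
print(os.path.exists(full_path))         # True only if the file has been saved there
```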

11. Load the data:
np.load('some_array.npy')
Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
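
12. The np.savez function saves multiple arrays into a single .npz archive. The archive is not compressed; np.savez_compressed writes a compressed one. Pass the arrays as keyword arguments: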
np.savez('array_archive.npz',a=arr,b=arr)
arch = np.load('array_archive.npz')
arch['b']
Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
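The object returned by `np.load` for an .npz file behaves like a dictionary and loads each array lazily when it is accessed. The sketch below, written against a temporary directory (an assumption for tidiness), saves the same arrays with `np.savez` and `np.savez_compressed` and lists the stored keys.

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, 'array_archive.npz')
    packed_path = os.path.join(tmp_dir, 'array_archive_compressed.npz')

    np.savez(plain_path, a=arr, b=arr)               # stored without compression
    np.savez_compressed(packed_path, a=arr, b=arr)   # stored with compression

    # np.load on an .npz file returns a lazy, dict-like object; the with
    # block closes the underlying file when done
    with np.load(packed_path) as arch:
        keys = sorted(arch.files)
        b_copy = arch['b']

print(keys)     # ['a', 'b']
print(b_copy)
```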


13. Create a file named array_ex.txt in the default working directory and fill it with the following data:

1,2,3,4
2,3,4,5
4,5,6,7
1,2,3,4

Read it with the np.loadtxt function:

arr = np.loadtxt('array_ex.txt',delimiter=',')
arr
Output:
array([[ 1.,  2.,  3.,  4.],
       [ 2.,  3.,  4.,  5.],
       [ 4.,  5.,  6.,  7.],
       [ 1.,  2.,  3.,  4.]])
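`np.loadtxt` returns floating-point values by default, which is why the output shows `1.` rather than `1`. The sketch below passes `dtype=int` to read integers instead and uses `np.savetxt` to write the array back out as text. To stay self-contained it reads from an `io.StringIO` buffer that stands in for array_ex.txt; that substitution is an assumption of this example.

```python
import io

import numpy as np

text = "1,2,3,4\n2,3,4,5\n4,5,6,7\n1,2,3,4\n"

# io.StringIO stands in for the array_ex.txt file
as_float = np.loadtxt(io.StringIO(text), delimiter=',')
as_int = np.loadtxt(io.StringIO(text), delimiter=',', dtype=int)

print(as_float.dtype, as_int.dtype)

# np.savetxt writes an array back out as delimited text
buffer = io.StringIO()
np.savetxt(buffer, as_int, fmt='%d', delimiter=',')
print(buffer.getvalue())
```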