Python NumPy Tutorial (5): Filtering, Sorting, Set Functions, and Reading and Saving Data
阿新 • Published: 2019-02-09
1. Filtering with boolean arrays
import numpy as np
import numpy.random
arr = np.random.randn(100)
arr
Output (100 random numbers):
array([-0.84570456, -2.21743968, 2.48971398, 1.57138679, 0.1645484 , -0.00618139, 0.55144822, 0.70877084, 0.83862826, -1.47160326, -0.50499305, -1.52486585, -0.08403235, -0.48313017, 0.73283641, 0.59872726, 0.05932988, -1.28312722, 1.37144712, -0.52774171, 0.07949287, -1.25879195, 1.31256872, 0.31025061, 0.69700033, -1.37906378, -0.57683916, -0.66151576, -0.6215851 , -0.96214685, -1.97455008, -0.5725854 , 1.54771953, 0.10434949, 1.18676295, -1.3877092 , 0.97231658, -2.13417302, 0.07059074, 0.40872163, 0.93872577, -0.62218374, 1.56875898, 1.50472097, -0.57749041, -0.83776864, -1.82338058, -0.95860292, 0.59427145, 0.02685388, -0.15122058, -0.28583306, -1.71298474, 0.01341369, -0.70516054, 0.86404614, -0.42701139, -0.25847577, -0.78713731, 0.41052537, 0.67961828, -1.18338025, -0.96648004, -2.22403128, -2.37807866, 1.65531665, 0.93905314, 1.36454143, 0.55153089, 0.44957141, -0.78701216, -0.96467054, 0.53427677, 0.80850105, 1.87113103, 0.0755421 , 1.33436598, -0.82354346, 0.7945044 , -0.07721165, -0.07193151, -1.95614647, 0.13234494, 0.13054731, -2.10556319, 0.40520846, 1.69259913, 0.27619833, 0.21597633, 0.33204544, 2.60113181, -0.0873115 , -1.09422245, -0.84380081, -0.12965254, 1.8090488 , 1.12106681, -0.02869555, 0.45762089, -0.37615294])
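Count the positive values. The comparison produces a boolean array, and sum() treats each True as 1:
(arr > 0).sum()  # number of positive values
Output: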
51
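The same boolean array can also be used as an index to keep only the matching elements. A minimal sketch, assuming a small hand-written array (`data`, `mask` and `positives` are illustrative names, not from the article):

```python
import numpy as np

# Illustrative values, chosen by hand so the result is easy to check
data = np.array([-1.5, 0.3, 2.0, -0.2, 0.0, 4.1])

mask = data > 0          # boolean array, True where the value is positive
positives = data[mask]   # boolean indexing keeps only the True positions

print(mask.sum())        # count of positive values
print(positives)         # the positive values themselves
```

Note that 0.0 is not counted, because the comparison is strictly greater than zero.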
2. any and all are very useful for boolean arrays: any tests whether at least one value is True, and all tests whether every value is True
bools = np.array([False, False, True, False])
bools.any()
Output:
True
bools.all()
Output:
False
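In practice any and all are usually applied to the result of a comparison, optionally along one axis. A small sketch, assuming an illustrative 2x3 array named `grid`:

```python
import numpy as np

grid = np.array([[1, -2, 3],
                 [4,  5, 6]])

has_negative = (grid < 0).any()              # is at least one element negative?
all_positive = (grid > 0).all()              # is every element positive?
rows_all_positive = (grid > 0).all(axis=1)   # one answer per row

print(has_negative, all_positive, rows_all_positive)
```

Here only the second row consists entirely of positive values, so the per-row result differs from the whole-array result.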
3. Sorting with the sort method, which sorts the array in place
arr = np.random.randn(8)
arr
Output:
array([ 1.24989935, 0.26355977, -0.50860306, -1.54681062, 0.28423382, 1.37361039, 0.66252208, 0.96364101])
arr.sort()
arr
Output:
array([-1.54681062, -0.50860306, 0.26355977, 0.28423382, 0.66252208, 0.96364101, 1.24989935, 1.37361039])
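Because arr.sort() modifies the array itself, it may help to contrast it with the top-level np.sort function, which returns a sorted copy. A brief sketch with an illustrative array named `values`:

```python
import numpy as np

values = np.array([3, 1, 2])

sorted_copy = np.sort(values)      # returns a new sorted array
unchanged = values.tolist()        # the original order is still intact here
descending = np.sort(values)[::-1] # reverse the sorted copy for descending order

result = values.sort()             # sorts in place and returns None
print(sorted_copy, descending, values, result)
```

Use np.sort when the original order still matters, and the sort method when it does not.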
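4. The sort method can also sort along a given axis: axis 0 sorts the values within each column, and axis 1 sorts the values within each row.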
arr = np.random.randn(5,3)
arr
Output:
array([[ 0.80755401, -0.54385431, -1.18145348],
       [ 0.69971235, -0.45852225, -1.71633618],
       [-0.45109238,  1.24928254,  0.23480012],
       [-0.05216242, -0.35804026,  0.03701942],
       [-0.42148283,  0.26845095, -0.45013768]])
arr.sort(1)
arr
Output:
array([[-1.18145348, -0.54385431,  0.80755401],
       [-1.71633618, -0.45852225,  0.69971235],
       [-0.45109238,  0.23480012,  1.24928254],
       [-0.35804026, -0.05216242,  0.03701942],
       [-0.45013768, -0.42148283,  0.26845095]])
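The difference between the two axes is easier to see on a small integer array. A sketch using np.sort (so the input stays unchanged) on an illustrative matrix `m`:

```python
import numpy as np

m = np.array([[3, 1, 2],
              [0, 5, 4]])

by_row = np.sort(m, axis=1)   # each row sorted left to right
by_col = np.sort(m, axis=0)   # each column sorted top to bottom

print(by_row)
print(by_col)
```

With axis=1 every row ends up in ascending order; with axis=0 the values move only within their own column.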
5. Use sorting to pick out the value at a specific position, such as a quantile:
large_arr = np.random.randn(1000)
large_arr.sort()
large_arr[int(0.05*len(large_arr))]  # 5% quantile
Output (the exact value changes from run to run, because the data is random):
-1.4970312664301417
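NumPy also provides np.quantile, which by default interpolates linearly between neighboring values, so its result can differ slightly from the index-based approach. A sketch comparing the two on deterministic data (`sample` is an illustrative array, not the article's random data):

```python
import numpy as np

# The values 0.0 .. 99.0 in reverse order, so sorting is actually needed
sample = np.arange(100.0)[::-1].copy()

sorted_sample = np.sort(sample)
manual = sorted_sample[int(0.05 * len(sorted_sample))]  # element at index 5
interpolated = np.quantile(sample, 0.05)                # linear interpolation

print(manual, interpolated)
```

For large samples the two values are close; the index-based version always returns an element that is actually present in the data.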
6. Get the sorted unique values with the unique function
names = np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])
names
Output:
array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')
np.unique(names)
Output:
array(['Bob', 'Joe', 'Will'], dtype='<U4')
ints = np.array([3,3,3,2,2,1,1,4,4])
np.unique(ints)
Output:
array([1, 2, 3, 4])
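np.unique can also report how often each value occurs through its return_counts parameter. A short sketch reusing the names array from above (`uniq`, `counts` and `pure_python` are illustrative names):

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

# return_counts=True adds a second array with the number of occurrences
uniq, counts = np.unique(names, return_counts=True)

# Pure-Python equivalent of the unique values, for comparison
pure_python = sorted(set(names.tolist()))

print(dict(zip(uniq.tolist(), counts.tolist())))
```

The unique values come back sorted, which is why they match the output of sorted(set(...)).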
7. Use np.in1d to test whether each value of one array is a member of another array; it returns a boolean array:
values = np.array([6,0,0,3,2,5,6])
np.in1d(values,[2,3,6])
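The article does not show the result of this call. As a sketch, the same membership test can be written with np.isin, which the NumPy documentation recommends over np.in1d for new code; the resulting mask can then be used for filtering (`mask` and `matched` are illustrative names):

```python
import numpy as np

values = np.array([6, 0, 0, 3, 2, 5, 6])

# True where the element of `values` appears in the second argument
mask = np.isin(values, [2, 3, 6])

# The boolean mask can be used directly to select the matching elements
matched = values[mask]

print(mask)
print(matched)
```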
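8. Besides unique and in1d, NumPy's set functions for arrays include np.intersect1d, np.union1d, np.setdiff1d, and np.setxor1d.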
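A brief sketch of the common one-dimensional set functions, assuming two small illustrative arrays `x` and `y`:

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([3, 4, 5, 6])

common = np.intersect1d(x, y)      # values present in both arrays
combined = np.union1d(x, y)        # values present in either array
only_in_x = np.setdiff1d(x, y)     # values in x that are not in y
in_exactly_one = np.setxor1d(x, y) # values in exactly one of the two arrays

print(common, combined, only_in_x, in_exactly_one)
```

All four functions return sorted, duplicate-free results.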
9. File input and output for arrays
arr = np.arange(10)
np.save('some_array',arr)
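If the file path does not already end in .npy, the extension is appended automatically.
10. At this point you probably want to know where the data was saved. The following shows the current working directory: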
import os
os.getcwd()
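To avoid depending on the working directory, an explicit path can be passed to np.save. The sketch below writes into a temporary directory (an assumption made here so the example cleans up after itself) and confirms that the .npy extension is appended:

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "some_array")  # note: no extension given
    np.save(target, arr)

    saved_files = os.listdir(tmp_dir)             # file names actually written
    loaded = np.load(target + ".npy")             # read the array back

print(saved_files)
print(loaded)
```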
11. Load the data:
np.load('some_array.npy')
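Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
12. Use np.savez to save several arrays into a single uncompressed .npz archive, passing the arrays as keyword arguments (np.savez_compressed writes a compressed archive instead):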
np.savez('array_archive.npz',a=arr,b=arr)
arch = np.load('array_archive.npz')
arch['b']
Output:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
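The object returned by np.load for an .npz file loads each array lazily by key and lists the stored names in its files attribute. A sketch using np.savez_compressed and a temporary directory (both choices are assumptions for illustration, not the article's setup):

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array_archive.npz")
    np.savez_compressed(path, a=arr, b=arr * 2)  # keyword names become keys

    # Using the archive as a context manager closes the file afterwards
    with np.load(path) as arch:
        keys = sorted(arch.files)   # names of the stored arrays
        b = arch["b"]               # the array is read when it is accessed

print(keys)
print(b)
```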
13. Create a file named array_ex.txt in the current working directory with the following contents:
1,2,3,4
2,3,4,5
4,5,6,7
1,2,3,4
Read it with the np.loadtxt function; values are parsed as floats by default:
arr = np.loadtxt('array_ex.txt',delimiter=',')
arr
Output:
array([[ 1., 2., 3., 4.],
       [ 2., 3., 4., 5.],
       [ 4., 5., 6., 7.],
       [ 1., 2., 3., 4.]])
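np.loadtxt also accepts a dtype argument, and np.savetxt writes an array back out as text. A sketch that uses an in-memory io.StringIO buffer in place of array_ex.txt (an assumption made here so that no file has to exist on disk):

```python
import io

import numpy as np

# Same contents as array_ex.txt, held in memory instead of on disk
text = "1,2,3,4\n2,3,4,5\n4,5,6,7\n1,2,3,4\n"

as_float = np.loadtxt(io.StringIO(text), delimiter=",")            # default float64
as_int = np.loadtxt(io.StringIO(text), delimiter=",", dtype=int)   # parse as integers

# Write the integer array back out as comma-separated text
buffer = io.StringIO()
np.savetxt(buffer, as_int, fmt="%d", delimiter=",")
round_trip = buffer.getvalue()

print(as_float.dtype, as_int.dtype)
print(round_trip)
```

Passing fmt="%d" keeps the written values as plain integers; without it, np.savetxt uses a scientific-notation float format.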