Python Performance Optimization
1. Remove unnecessary explicit for loops and use vectorized computation instead.
import time
import numpy as np


def for_time():
    """Build a 10,000,000-element array and add one in a for loop."""
    start = time.time()
    list_data = np.arange(0, 10000000, 1)
    # Note: the loop only touches the first 1,000,000 elements,
    # while vector_time() below updates all 10,000,000.
    for i in range(1000000):
        list_data[i] += 1
    print('for loop used time: ', time.time() - start)


def vector_time():
    """Build the same array and add one with a single vectorized operation."""
    start = time.time()
    list_data = np.arange(0, 10000000, 1)
    list_data += 1
    print('vector calculation used time: ', time.time() - start)


if __name__ == '__main__':
    for_time()
    vector_time()
for loop used time: 0.359999895096
vector calculation used time: 0.0160000324249
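The timings above come from a single run, and the two functions do not process the same number of elements. As a rough sketch of a fairer comparison, the version below gives both variants the same array length and uses the standard-library timeit module, taking the best of a few repeats. The names add_one_loop, add_one_vectorized and the length N are illustrative assumptions, not part of the original code.

```python
import timeit

import numpy as np

N = 1_000_000  # assumed array length, identical for both variants


def add_one_loop(n=N):
    """Add one to every element with an explicit Python for loop."""
    data = np.arange(n)
    for i in range(n):
        data[i] += 1
    return data


def add_one_vectorized(n=N):
    """Add one to every element with a single vectorized operation."""
    data = np.arange(n)
    data += 1
    return data


if __name__ == "__main__":
    # Both variants must produce the same result before timing them.
    assert np.array_equal(add_one_loop(1000), add_one_vectorized(1000))

    # Best of three single runs for each variant.
    loop_t = min(timeit.repeat(add_one_loop, number=1, repeat=3))
    vec_t = min(timeit.repeat(add_one_vectorized, number=1, repeat=3))
    print(f"for loop:   {loop_t:.4f} s")
    print(f"vectorized: {vec_t:.4f} s")
```

The absolute numbers depend on the machine and NumPy version, so no expected output is shown here.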
2. Use multiple processes to take advantage of multiple CPU cores.
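import multiprocessing


def use_pool(func, args):
    """Map func over args using a pool of two worker processes."""
    pool = multiprocessing.Pool(processes=2)
    res = pool.map(func, args)
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker processes to exit
    return res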
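As a usage sketch, the pool helper can be called as shown below. The worker function square and the processes parameter are hypothetical additions for illustration, and this variant uses the pool as a context manager instead of calling close() and join() explicitly. The worker must be defined at module level so it can be pickled, and the call should sit under the `if __name__ == '__main__':` guard on platforms that spawn new processes.

```python
import multiprocessing


def square(x):
    """Hypothetical CPU-bound worker; must live at module level to be picklable."""
    return x * x


def use_pool(func, args, processes=2):
    """Map func over args in a pool of worker processes and return the results."""
    with multiprocessing.Pool(processes=processes) as pool:
        # map() blocks until every result is ready and preserves input order.
        return pool.map(func, args)


if __name__ == "__main__":
    print(use_pool(square, range(5)))  # [0, 1, 4, 9, 16]
```

For very small tasks like this one, the cost of starting processes and pickling arguments usually outweighs the gain; the approach pays off when each call does substantial work.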
3. Use the joblib extension library (formerly bundled as sklearn.externals.joblib).
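# sklearn.externals.joblib has been removed from newer scikit-learn releases;
# import joblib directly instead.
from joblib import Parallel, delayed


def parallel(func, arg):
    """Call func on each item of arg using all available CPU cores."""
    return Parallel(n_jobs=-1)(delayed(func)(i) for i in arg)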
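A minimal usage sketch for the joblib approach follows, assuming the standalone joblib package is installed. The worker function cube and the n_jobs parameter on the wrapper are illustrative assumptions. Parallel returns the results as a list in the same order as the inputs.

```python
from joblib import Parallel, delayed


def cube(x):
    """Hypothetical worker function used only for illustration."""
    return x ** 3


def parallel(func, args, n_jobs=-1):
    """Run func over args in parallel; n_jobs=-1 means use all CPU cores."""
    return Parallel(n_jobs=n_jobs)(delayed(func)(a) for a in args)


if __name__ == "__main__":
    print(parallel(cube, range(5)))  # [0, 1, 8, 27, 64]
```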
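4. Use the bottleneck library.
This library provides fast NumPy array functions implemented in compiled code (C/Cython) and is focused on high performance.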
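As a small illustration of what bottleneck offers, the sketch below uses two of its functions, nanmean and move_mean, on toy arrays. It assumes the bottleneck package is installed; the sample data is made up for the example.

```python
import bottleneck as bn
import numpy as np

# Toy data with missing values (an assumption for illustration).
data = np.array([1.0, np.nan, 3.0, 4.0, np.nan, 6.0])

# Mean that ignores NaN values: (1 + 3 + 4 + 6) / 4 = 3.5
nan_mean = bn.nanmean(data)

# Moving average with a window of 3; the first two entries are NaN
# because a full window is not yet available.
moving = bn.move_mean(np.arange(6, dtype=float), window=3)

if __name__ == "__main__":
    print(nan_mean)  # 3.5
    print(moving)    # [nan nan  1.  2.  3.  4.]
```

These functions mirror NumPy equivalents such as np.nanmean, so they can often be swapped in where profiling shows a hot spot.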