A Simple Timer for Comparing the Speed of Different Ways to Build Iterables

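Background: if you are not familiar with iterators and generators, you may want to read these two posts first:

  • Python Learn-As-You-Go 20200221: the send(), throw() and close() methods of generators
  • Iterators and generators in Python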

Initial version

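import time

reps = 1000
repslist = range(reps)


def timer(func, *pargs, **kargs):
    # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
    start = time.perf_counter()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = time.perf_counter() - start
    # total time for all reps, plus the return value of the last call
    return (elapsed, ret)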

This is the initial version of the timer, saved as timer.py.

Let's run a first test with it:

from timer import timer
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(abs(x))
    return res

def listComp():
    return [abs(x) for x in repslist]

def mapCall():
    return list(map(abs, repslist))

def genExpr():
    return list(abs(x) for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield abs(x)
    return list(gen())

print(sys.version)

for test in (forloop, listComp, mapCall, genExpr, genFunc):
    elapsed, result = timer(test)
    print('-' * 33)
    print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))

The output is as follows:

C:\Anaconda3\python.exe C:/Users/Brady/PycharmProjects/FAQ/literor.py
3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :11.40492 => [0...99999]
---------------------------------
listComp :7.58494 => [0...99999]
---------------------------------
mapCall :4.28971 => [0...99999]
---------------------------------
genExpr :10.49181 => [0...99999]
---------------------------------
genFunc :10.76498 => [0...99999]

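From these results we can see that:

  • map is faster than the list comprehension, and both are much faster than the for loop.
  • The generator expression and the generator function fall in between, only slightly ahead of the for loop.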
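
The test script prints only the first and last element of each result. As a stricter sanity check before comparing timings, one could assert that all five variants build exactly the same list. The following is a minimal sketch, not part of the original script; the smaller input size and the names results and all_equal are assumptions made to keep the check quick:

```python
data = range(1000)  # assumption: smaller than the article's 100000, for a quick check

def forloop():
    res = []
    for x in data:
        res.append(abs(x))
    return res

def listComp():
    return [abs(x) for x in data]

def mapCall():
    return list(map(abs, data))

def genExpr():
    return list(abs(x) for x in data)

def genFunc():
    def gen():
        for x in data:
            yield abs(x)
    return list(gen())

# Build each list once and compare every result against the first one.
results = [f() for f in (forloop, listComp, mapCall, genExpr, genFunc)]
all_equal = all(r == results[0] for r in results)
print(all_equal)
```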

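If we apply a custom operation (x+10, wrapped in a lambda for map) instead of a built-in function, the results get more interesting: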

from timer import timer
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map(lambda x: x + 10, repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)

for test in (forloop, listComp, mapCall, genExpr, genFunc):
    elapsed, result = timer(test)
    print('-' * 33)
    print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))

We get the following output:

3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :26.69562 => [10...100009]
---------------------------------
listComp :16.46341 => [10...100009]
---------------------------------
mapCall :19.51527 => [10...100009]
---------------------------------
genExpr :10.53358 => [10...100009]
---------------------------------
genFunc :10.85899 => [10...100009]

Process finished with exit code 0

To be honest, this result is a bit hard to explain... it looks like I have been proven wrong...

So I ran it again and got the following:

3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :11.92378 => [10...100009]
---------------------------------
listComp :7.27866 => [10...100009]
---------------------------------
mapCall :12.92113 => [10...100009]
---------------------------------
genExpr :10.50988 => [10...100009]
---------------------------------
genFunc :10.56482 => [10...100009]

Process finished with exit code 0

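This result is closer to what we expected:

  • With a custom function, map is slower than the for loop.
  • The list comprehension is the fastest.
  • The generator expression is slower than the list comprehension, but roughly on par with the generator function.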
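
One plausible contributor to map losing its edge here (an interpretation, not something these measurements prove) is that map has to make a Python-level function call for every element, whereas the list comprehension evaluates x+10 inline. The sketch below only illustrates that difference in call counts; add10 and the counter are hypothetical names introduced for the illustration:

```python
calls = 0

def add10(x):
    """Stand-in for the lambda passed to map; counts its own invocations."""
    global calls
    calls += 1
    return x + 10

data = range(5)

mapped = list(map(add10, data))        # one Python function call per element
calls_after_map = calls

comprehended = [x + 10 for x in data]  # expression evaluated inline, no call
calls_after_comp = calls

print(mapped == comprehended)
print(calls_after_map, calls_after_comp)
```

The counter stops growing once the comprehension runs, which is consistent with the idea that the per-element call is overhead that only the map variant pays.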

Advanced version

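These differences mainly come down to how the Python interpreter is implemented.

They also expose a problem: our timer is not rigorous enough. Two runs of the same script differed by more than a factor of two for forloop (26.70 s versus 11.92 s).

So let's improve the timer:

  • Platform compatibility: on Unix-like systems time.time offers better resolution, while on Windows time.clock was the better choice. (On Python 3.3 and later, time.perf_counter covers both cases; time.clock was removed in Python 3.8.)
  • Random system load can cause fluctuations, so taking the shortest time across several runs is more reliable than taking the total running time.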
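
The "take the shortest time" idea is also how the standard library's timeit module is commonly used. As a minimal sketch of the same principle (an alternative to the hand-written timer, not the approach this post takes), timeit.repeat can be combined with min; best_of, list_comp and the repeat/number values are hypothetical choices for illustration:

```python
import timeit

def list_comp(n=1000):
    return [x + 10 for x in range(n)]

def best_of(func, repeat=5, number=100):
    """Return the fastest per-call time in seconds across `repeat` batches."""
    # timeit.repeat returns a list with one total time per batch of `number` calls.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    # The minimum is the batch least disturbed by other system activity.
    return min(totals) / number

fastest = best_of(list_comp)
print('%.8f seconds per call' % fastest)
```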

Revised timer

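import time
import sys

# time.clock was removed in Python 3.8, so prefer perf_counter when it exists
# and fall back to the original platform switch on older interpreters.
if hasattr(time, 'perf_counter'):
    timefunc = time.perf_counter
elif sys.platform[:3] == 'win':
    timefunc = time.clock
else:
    timefunc = time.time


def trace(*args):
    """
    Used for debugging.
    :param args:
    :return:
    """
    pass

def timer(func, *pargs, **kargs):
    # _reps is consumed here so it is never forwarded to func
    _reps = kargs.pop('_reps', 1000)
    trace(func, pargs, kargs, _reps)
    repslist = range(_reps)
    start = timefunc()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = timefunc() - start
    return (elapsed, ret)


def best(func, *pargs, **kargs):
    _reps = kargs.pop('_reps', 50)
    best_time = 2 ** 32
    for i in range(_reps):
        # time one call at a time and keep the fastest measurement
        (elapsed, ret) = timer(func, *pargs, _reps=1, **kargs)
        if elapsed < best_time:
            best_time = elapsed
    return (best_time, ret)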
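
As a brief usage sketch of the `_reps` keyword: because it is popped from kargs before the call, it never reaches the timed function, while all other positional and keyword arguments are passed through. The compact copies of timer and best below are repeated only so the snippet runs on its own, and they assume Python 3.3+ for time.perf_counter:

```python
import time

def timer(func, *pargs, **kargs):
    _reps = kargs.pop('_reps', 1000)   # removed here, never forwarded to func
    start = time.perf_counter()
    for _ in range(_reps):
        ret = func(*pargs, **kargs)
    return time.perf_counter() - start, ret

def best(func, *pargs, **kargs):
    _reps = kargs.pop('_reps', 50)
    best_time = float('inf')
    for _ in range(_reps):
        elapsed, ret = timer(func, *pargs, _reps=1, **kargs)
        best_time = min(best_time, elapsed)
    return best_time, ret

# Positional arguments are forwarded: pow(2, 10) is timed 100 times in total.
total, value = timer(pow, 2, 10, _reps=100)

# Keyword arguments are forwarded too: sorted([3, 1, 2], reverse=True).
fastest, ordered = best(sorted, [3, 1, 2], reverse=True, _reps=20)

print(value, ordered)
```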

Revised test code

from timer import timer
from timer import best
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map(lambda x: x + 10, repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)

for tester in (timer, best):
    print(f'<{tester.__name__}>')
    for test in (forloop, listComp, mapCall, genExpr, genFunc):
        elapsed, result = tester(test)
        print('-' * 35)
        print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))

Let's look at the results:

3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
<timer>
-----------------------------------
forloop :11.18427 => [10...100009]
-----------------------------------
listComp :7.33068 => [10...100009]
-----------------------------------
mapCall :13.33474 => [10...100009]
-----------------------------------
genExpr :11.25375 => [10...100009]
-----------------------------------
genFunc :11.03975 => [10...100009]
<best>
-----------------------------------
forloop :0.00904 => [10...100009]
-----------------------------------
listComp :0.00525 => [10...100009]
-----------------------------------
mapCall :0.01133 => [10...100009]
-----------------------------------
genExpr :0.00845 => [10...100009]
-----------------------------------
genFunc :0.00785 => [10...100009]

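Judged by the best (shortest) time, the results fully match the conclusions above:

  • The list comprehension is the fastest.
  • map is slower than a plain for loop.
  • The generator expression is faster than the for loop, and not far from the generator function.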

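Conclusion:

I really wrote this post just for fun. Once you have chosen Python, don't obsess over execution speed; after all, Python's only job is to look pretty...

When optimizing Python code, think about readability and simplicity first; only go after performance when you truly have nothing better to do.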