
Advanced Python: OpenCV Core Operations

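A month after the previous post, I am picking up where I left off and continuing with Core Operations. In between I spent a week working out how to do screen recording with Python and OpenCV, which delayed things but also reinforced what the first post covered.
Core Operations in OpenCV mostly rely on the numpy module, so I also looked at how numpy is used beforehand. There are plenty of introductions to numpy available, so I will not go into detail about it here.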

Basic Image Operations

Accessing and Modifying Pixel Values

>>> import cv2
>>> import numpy as np
>>> img = cv2.imread('messi5.jpg')
>>> px = img[100,100]
>>> print(px)
[157 166 200]

# accessing only the blue channel; OpenCV stores color images in BGR channel order
>>> blue = img[100,100,0]
>>> print(blue)
157
>>> green = img[100,100,1]
>>> print(green)
166
>>> red = img[100,100,2]
>>> print(red)
200

# modify the pixel values
>>> img[100,100] = [255,255,255]
>>> print(img[100,100])
[255 255 255]

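NumPy is a library optimized for fast array computation, but reading or writing one pixel at a time through indexing is slow. The indexing shown above is better suited to operating on a whole image region. For individual pixels, array.item() and array.itemset() are recommended. Note that array.itemset() was removed in NumPy 2.0; on recent versions, write single values through indexing instead, for example img[10,10,2] = 100.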

# accessing RED value
>>> img.item(10,10,2)
59
# modifying RED value
>>> img.itemset((10,10,2),100)
>>> img.item(10,10,2)
100
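The following is a minimal sketch of the same pixel access on a small synthetic array, so it runs without messi5.jpg. The 4x4 image and its values are made up for illustration. It avoids itemset() so that it should also work on NumPy 2.x.

```python
import numpy as np

# Synthetic 4x4 BGR image standing in for one loaded with cv2.imread (assumption).
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[1, 2] = [157, 166, 200]        # row 1, column 2: B=157, G=166, R=200

# Channel order is B, G, R, so index 0 is blue and index 2 is red.
blue, green, red = img[1, 2]

# item() returns a plain Python int for a single element.
red_scalar = img.item(1, 2, 2)

# Single-value write through indexing (replacement for itemset on NumPy 2.x).
img[1, 2, 2] = 100

# Region write: set the top-left 2x2 block to white in one statement.
img[0:2, 0:2] = [255, 255, 255]

print(blue, green, red, red_scalar)
print(img[1, 2])
```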

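Accessing Image Properties

An image's properties mainly include its number of rows and columns, number of channels, data type, and number of pixels. The following attributes expose them.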

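# img.shape returns the number of rows, columns, and color channels (for a color image)
# for a grayscale image it returns only the number of rows and columns
>>> print(img.shape)
(342, 548, 3)

# total number of elements in the array (rows * columns * channels)
>>> print(img.size)
562248

# data type of each pixel value
>>> print(img.dtype)
uint8
# img.dtype is very important while debugging because a large number of errors in OpenCV-Python code are caused by an invalid datatype.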
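As a quick check of what these attributes mean, here is a small sketch on synthetic arrays with the same dimensions as the example above. The arrays are placeholders, not real images. It shows that size counts array elements, so the pixel count of a color image is rows * columns.

```python
import numpy as np

# Placeholder arrays with the same dimensions as the example image (assumption).
color = np.zeros((342, 548, 3), dtype=np.uint8)
gray = np.zeros((342, 548), dtype=np.uint8)

rows, cols = color.shape[:2]     # works for both color and grayscale arrays
pixel_count = rows * cols        # number of pixels, independent of channels

print(color.shape, color.size, pixel_count)   # (342, 548, 3) 562248 187416
print(gray.shape, gray.size)                  # (342, 548) 187416
print(color.dtype)                            # uint8
```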

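Image ROI (Region of Interest)

A typical use case: for eye detection, it is best to run face detection first and then search for eyes only within the detected face region. Eyes are always on a face, so detecting the face first greatly narrows the search area and speeds up eye detection.
Region operations on images are also done with numpy indexing.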

# copy one region of the image to another region
>>> roi = img[280:340, 330:390]
>>> img[273:333, 100:160] = roi
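One detail worth knowing when working with regions is that a NumPy slice is a view onto the original array, not a copy. The sketch below uses a small single-channel array with made-up values to show the difference, and then pastes a region elsewhere in the same way as the example above.

```python
import numpy as np

# Small single-channel array with made-up values, standing in for an image (assumption).
img = np.arange(6 * 8, dtype=np.uint8).reshape(6, 8)

roi = img[0:2, 0:3]               # view: shares memory with img
roi_copy = img[0:2, 0:3].copy()   # independent copy

img[0, 0] = 99                    # change the source pixel
print(roi[0, 0])                  # 99, the view sees the change
print(roi_copy[0, 0])             # 0, the copy does not

# Paste the region into a different, non-overlapping location.
img[4:6, 5:8] = roi
```

If the ROI has to survive later edits to the image, taking a .copy() is the safer choice.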

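Splitting and Merging Image Channels

>>> b,g,r = cv2.split(img)
>>> img = cv2.merge((b,g,r))
# slicing
>>> b = img[:,:,0]
# set the red channel of every pixel to zero
>>> img[:,:,2] = 0

cv2.split() is a time-consuming operation, so use it sparingly; NumPy indexing is usually enough.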
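A rough NumPy-only sketch of the same idea follows. Slicing gives a view of one channel without copying, while copying every channel and stacking them back approximates what a split and merge round trip does. np.dstack is used here as a stand-in for cv2.merge, and the 2x2 image is invented for the example.

```python
import numpy as np

# Invented 2x2 BGR image in which every pixel is B=10, G=20, R=30 (assumption).
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[...] = [10, 20, 30]

# Slicing: a view of the blue channel, no data is copied.
b = img[:, :, 0]

# Split-like: three separate arrays, one per channel.
b_copy, g_copy, r_copy = (img[:, :, i].copy() for i in range(3))

# Merge-like: stack the channels back along the last axis.
merged = np.dstack((b_copy, g_copy, r_copy))

# Zero the red channel in place; the copies made earlier are unaffected.
img[:, :, 2] = 0

print(img[0, 0], merged[0, 0])
```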

Making Borders for Images (Padding)

cv2.copyMakeBorder() adds a border around an image. Its parameters are:

  • src - input image
  • top, bottom, left, right - border width in number of pixels in corresponding directions
  • borderType - Flag defining what kind of border to be added. It can be following types:
    • cv2.BORDER_CONSTANT - Adds a constant colored border. The value should be given as next argument.
    • cv2.BORDER_REFLECT - Border will be mirror reflection of the border elements, like this : fedcba|abcdefgh|hgfedcb
    • cv2.BORDER_REFLECT_101 or cv2.BORDER_DEFAULT - Same as above, but with a slight change, like this : gfedcb|abcdefgh|gfedcba
    • cv2.BORDER_REPLICATE - Last element is replicated throughout, like this: aaaaaa|abcdefgh|hhhhhhh
    • cv2.BORDER_WRAP - Can’t explain, it will look like this : cdefgh|abcdefgh|abcdefg
  • value - Color of border if border type is cv2.BORDER_CONSTANT
import cv2
import numpy as np
from matplotlib import pyplot as plt

BLUE = [255,0,0]

img1 = cv2.imread('opencv_logo.png')

replicate = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_REPLICATE)
reflect = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_REFLECT)
reflect101 = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_REFLECT_101)
wrap = cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_WRAP)
constant= cv2.copyMakeBorder(img1,10,10,10,10,cv2.BORDER_CONSTANT,value=BLUE)

plt.subplot(231),plt.imshow(img1,'gray'),plt.title('ORIGINAL')
plt.subplot(232),plt.imshow(replicate,'gray'),plt.title('REPLICATE')
plt.subplot(233),plt.imshow(reflect,'gray'),plt.title('REFLECT')
plt.subplot(234),plt.imshow(reflect101,'gray'),plt.title('REFLECT_101')
plt.subplot(235),plt.imshow(wrap,'gray'),plt.title('WRAP')
plt.subplot(236),plt.imshow(constant,'gray'),plt.title('CONSTANT')

plt.show()

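The borders produced by the code above look like this:
[Figure: the original image and the REPLICATE, REFLECT, REFLECT_101, WRAP, and CONSTANT borders]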
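The letter patterns in the parameter list can be reproduced on a one-dimensional array with np.pad. This is only an analogy: the mapping of np.pad modes to OpenCV border types (edge for REPLICATE, symmetric for REFLECT, reflect for REFLECT_101, wrap for WRAP) is my own pairing based on the patterns, and the show helper is a hypothetical name, not part of either library.

```python
import numpy as np

# The bytes of "abcdefgh" as a 1-D uint8 array, playing the role of one image row.
row = np.frombuffer(b"abcdefgh", dtype=np.uint8)

def show(mode, **kwargs):
    """Pad 6 elements on the left and 7 on the right, return the result as text."""
    padded = np.pad(row, (6, 7), mode=mode, **kwargs)
    return padded.tobytes().decode("ascii")

replicate = show("edge")          # like BORDER_REPLICATE
reflect = show("symmetric")       # like BORDER_REFLECT (edge element repeated)
reflect101 = show("reflect")      # like BORDER_REFLECT_101 (edge element not repeated)
wrap = show("wrap")               # like BORDER_WRAP
constant = show("constant", constant_values=ord("-"))  # like BORDER_CONSTANT

for name, text in [("replicate", replicate), ("reflect", reflect),
                   ("reflect101", reflect101), ("wrap", wrap),
                   ("constant", constant)]:
    print(f"{name:>10}: {text[:6]}|{text[6:14]}|{text[14:]}")
```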

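Arithmetic Operations on Images

This section covers two functions: cv2.add() and cv2.addWeighted().

Image Addition

NumPy addition is a modulo operation: uint8 results wrap around at 256.
OpenCV's add function is a saturated operation: results are clipped to 255.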

>>> x = np.uint8([250])
>>> y = np.uint8([10])

>>> print(cv2.add(x,y)) # 250+10 = 260 => 255
[[255]]

>>> print(x+y)          # 250+10 = 260 % 256 = 4
[4]
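To make the difference concrete without OpenCV, saturation can be emulated in NumPy by widening the type, adding, and clipping back to the uint8 range. This is a sketch of the arithmetic, not of how cv2.add is implemented internally.

```python
import numpy as np

x = np.array([250], dtype=np.uint8)
y = np.array([10], dtype=np.uint8)

# Modulo behaviour: uint8 array addition wraps around, 260 % 256 = 4.
wrapped = x + y

# Saturating behaviour: add in a wider type, then clip to [0, 255].
saturated = np.clip(x.astype(np.int16) + y.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped, saturated)   # [4] [255]
```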

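Image Blending

The blending formula is g(x) = (1 - a)*f0(x) + a*f1(x), where a ranges from 0 to 1.
cv2.addWeighted() takes the two weights and an offset as separate arguments and computes dst = alpha*img1 + beta*img2 + gamma. With alpha = 1 - a, beta = a, and gamma = b, this becomes g(x) = (1 - a)*f0(x) + a*f1(x) + b.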

img1 = cv2.imread('ml.png')
img2 = cv2.imread('opencv_logo.jpg')

dst = cv2.addWeighted(img1,0.7,img2,0.3,0)

cv2.imshow('dst',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()

Example of the blended image:
[Figure: result of blending the two images]
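The formula itself is easy to check by hand in NumPy. The blend function below is a hypothetical helper written for this post, and the rounding and clipping at the end are my assumptions about how to get back to uint8; it is meant to illustrate the formula, not to reproduce cv2.addWeighted exactly.

```python
import numpy as np

def blend(f0, f1, a, b=0.0):
    """Compute g = (1 - a) * f0 + a * f1 + b, rounded and clipped to uint8."""
    g = (1.0 - a) * f0.astype(np.float64) + a * f1.astype(np.float64) + b
    return np.clip(np.rint(g), 0, 255).astype(np.uint8)

# Two flat synthetic images of the same size (assumption: real inputs
# must also have matching shapes).
f0 = np.full((2, 2, 3), 200, dtype=np.uint8)
f1 = np.full((2, 2, 3), 100, dtype=np.uint8)

dst = blend(f0, f1, a=0.3)   # 0.7 * 200 + 0.3 * 100 = 170
print(dst[0, 0])
```

With a = 0 the result is the first image, and with a = 1 it is the second, which matches the intuition of sliding between the two.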

Bitwise Operations

The bitwise operations on images are AND, OR, NOT, and XOR.

# Load two images
img1 = cv2.imread('messi5.jpg')
img2 = cv2.imread('opencv_logo.png')

# I want to put logo on top-left corner, So I create a ROI
rows,cols,channels = img2.shape
roi = img1[0:rows, 0:cols ]

# Now create a mask of logo and create its inverse mask also
img2gray = cv2.cvtColor(img2,cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 10, 255, cv2.THRESH_BINARY)
mask_inv = cv2.bitwise_not(mask)

# Now black-out the area of logo in ROI
img1_bg = cv2.bitwise_and(roi,roi,mask = mask_inv)

# Take only region of logo from logo image.
img2_fg = cv2.bitwise_and(img2,img2,mask = mask)

# Put logo in ROI and modify the main image
dst = cv2.add(img1_bg,img2_fg)
img1[0:rows, 0:cols ] = dst

cv2.imshow('res',img1)
cv2.waitKey(0)
cv2.destroyAllWindows()

Example of the image after the bitwise operations:
[Figure: the logo placed in the top-left corner of the photo]
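To see what each masking step does, it helps to run the same logic on a tiny array where every value can be inspected. The sketch below is NumPy-only and makes two simplifying assumptions: the per-pixel channel maximum stands in for the grayscale conversion, and the ROI and logo are invented 2x2 images. ANDing with a mask value of 255 keeps a pixel unchanged, and ANDing with 0 blacks it out, which is the effect of passing mask= to cv2.bitwise_and.

```python
import numpy as np

# Invented data: a flat background ROI and a logo that is black
# except for one pure red pixel (BGR order).
roi = np.full((2, 2, 3), (50, 60, 70), dtype=np.uint8)
logo = np.zeros((2, 2, 3), dtype=np.uint8)
logo[0, 0] = (0, 0, 255)

# Stand-in for cvtColor + threshold: 255 where the logo has content, else 0.
mask = np.where(logo.max(axis=2) > 10, 255, 0).astype(np.uint8)
mask_inv = np.bitwise_not(mask)            # 255 <-> 0

# Black out the logo area in the ROI (keep background where mask_inv is 255).
bg = roi & mask_inv[:, :, None]

# Keep only the logo pixels (where mask is 255).
fg = logo & mask[:, :, None]

# The two parts never overlap, so a plain sum combines them safely.
dst = bg + fg

print(mask)
print(dst[0, 0], dst[1, 1])   # logo pixel, then background pixel
```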

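Measuring and Optimizing Python OpenCV Code Performance

  • cv2.getTickCount: returns the current number of clock ticks
  • cv2.getTickFrequency: returns the clock frequency, i.e. the number of ticks per second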
img1 = cv2.imread('messi5.jpg')
e1 = cv2.getTickCount()
for i in range(5,49,2):
    img1 = cv2.medianBlur(img1,i)
e2 = cv2.getTickCount()
t = (e2 - e1)/cv2.getTickFrequency()
print(t)
# Result I got is 0.521107655 seconds
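The pattern above (tick difference divided by ticks per second) also works with the standard library, which can be handy for timing code that does not involve OpenCV. In this sketch time.perf_counter_ns plays the role of the tick counter and the fixed value of one billion ticks per second plays the role of the frequency. elapsed_seconds is a hypothetical helper name.

```python
import time

TICKS_PER_SECOND = 1_000_000_000   # perf_counter_ns counts in nanoseconds

def elapsed_seconds(func, *args):
    """Run func(*args) once and return (result, elapsed time in seconds)."""
    start = time.perf_counter_ns()
    result = func(*args)
    end = time.perf_counter_ns()
    return result, (end - start) / TICKS_PER_SECOND

# Time a simple workload; any callable can be measured the same way.
result, seconds = elapsed_seconds(sum, range(100_000))
print(f"sum = {result}, took {seconds:.6f} s")
```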
  • cv2.useOptimized(): checks whether optimization is enabled
  • cv2.setUseOptimized(): enables or disables optimization
# check if optimization is enabled
In [5]: cv2.useOptimized()
Out[5]: True

In [6]: %timeit res = cv2.medianBlur(img,49)
10 loops, best of 3: 34.9 ms per loop

# Disable it
In [7]: cv2.setUseOptimized(False)

In [8]: cv2.useOptimized()
Out[8]: False

In [9]: %timeit res = cv2.medianBlur(img,49)
10 loops, best of 3: 64.1 ms per loop

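The trickiest part of this post was the bitwise operations. I spent a long time analyzing them and still do not fully understand them; to be updated.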