Image Convolutions

1. Convolutions

Convolution is a general technique in signal processing. People who study electrical or electronics engineering will tell you about the near-infinite sleepless nights convolutions have given them. Entire books have been written on the topic, and the list of questions to answer and theorems to prove can feel insurmountable. For computer vision, though, we'll deal with just a few simple ideas.

The Kernel

A convolution lets you do many things: calculate derivatives, detect edges, apply blurs, and more. All of this is done with a "convolution kernel".

The convolution kernel is a small matrix. This matrix has numbers in each cell and has an anchor point:

The convolution kernel

This kernel slides over an image and does its thing. The "anchor" point is used to determine the position of the kernel with respect to the image.

The transformation

The anchor point starts at the top-left corner of the image and moves over each pixel sequentially. At each position, the kernel overlaps a few pixels on the image. Each overlapping pair of numbers is multiplied, and the products are added together. Finally, the value at the current position is set to this sum.

Here's an example:

An example of the transformation

The matrix on the left is the image and the one on the right is the kernel. Suppose the kernel is at the highlighted position. The '9' of the kernel overlaps the '4' of the image, so you calculate their product: 36. Next, the '3' of the kernel overlaps the '3' of the image, so you multiply again: 9. Add it to 36 and you get a running sum of 36+9=45. You do the same for the remaining 7 overlapping values and end up with a total sum. This sum is stored in place of the '2' (in the image).
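The multiply-and-add procedure above can be sketched in a few lines of plain Python. This is only a minimal illustration, not how OpenCV does it: the function name `convolve2d`, the `anchor` and `border_value` parameters, and the tiny test image are all made up for this example. Two assumptions to note: pixels that fall outside the image are treated as a constant (zero by default), and the sums are written into a separate output image so that already-computed values don't leak into later ones. Like the description above, the kernel is not flipped, which strictly speaking makes this a correlation.

```python
def convolve2d(image, kernel, anchor=(1, 1), border_value=0):
    """Slide `kernel` over `image`; out-of-image pixels count as `border_value`."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    a_row, a_col = anchor
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    # Image pixel under kernel cell (i, j) when the anchor sits on (r, c).
                    y, x = r + i - a_row, c + j - a_col
                    inside = 0 <= y < rows and 0 <= x < cols
                    pixel = image[y][x] if inside else border_value
                    total += kernel[i][j] * pixel
            output[r][c] = total
    return output


# Hypothetical 3x3 image and kernels, chosen so the sums are easy to check by hand.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
identity = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
ones = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

unchanged = convolve2d(image, identity)  # the identity kernel leaves the image as-is
summed = convolve2d(image, ones)         # each output pixel is the sum of its 3x3 neighbourhood
```

With the all-ones kernel, the centre pixel becomes 1+2+...+9 = 45, while the top-left corner only sees the four pixels that actually exist (1+2+4+5 = 12) because the rest are counted as zero.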

Speed optimizations

The most direct way to compute a convolution would be to use multiple for loops. But that causes a lot of repeated calculations. And as the size of the image and kernel increases, the time to compute the convolution increases too (quite drastically).

Techniques have been developed to calculate convolutions rapidly. One such technique uses the Discrete Fourier Transform (DFT), which turns the entire convolution operation into a simple element-wise multiplication in the frequency domain. Fortunately, you don't need to know the math to do this in OpenCV. It automatically decides whether to work in the frequency domain (after the DFT) or not.
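To make the "convolution becomes multiplication" idea concrete, here is a small 1D sketch, assuming a circular (wrap-around) convolution and a deliberately naive DFT. The helper names (`dft`, `circular_convolve_direct`, `circular_convolve_dft`) and the sample signal are invented for this illustration. The naive DFT here is itself slow; the real speed-up comes from using a Fast Fourier Transform, which this sketch does not implement.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(
            values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        result.append(total / n if inverse else total)
    return result


def circular_convolve_direct(signal, kernel):
    """Circular convolution computed with plain loops."""
    n = len(signal)
    return [
        sum(signal[m] * kernel[(i - m) % n] for m in range(n))
        for i in range(n)
    ]


def circular_convolve_dft(signal, kernel):
    """Same result via the DFT: transform, multiply element-wise, transform back."""
    spectrum = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [value.real for value in dft(spectrum, inverse=True)]


# Both inputs are zero-padded to the same length so nothing wraps around.
signal = [1, 2, 3, 4, 0, 0, 0, 0]
kernel = [1, 1, 1, 0, 0, 0, 0, 0]

direct = circular_convolve_direct(signal, kernel)
via_dft = circular_convolve_dft(signal, kernel)
```

Both routes give the same numbers, up to floating-point rounding in the DFT version.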

Problematic corners and edges

The kernel is two dimensional. So you have problems when the kernel is near the edges or corners. Here's an example: If the kernel (in the above example) is on the top right position, the '0' of the kernel will be over the '3' in the image. But the '1' will be outside the image. So we have no idea what to do with it. Two things are possible:

  • Ignore the positions where the kernel doesn't fit entirely inside the image -or-
  • Do something about the edges

Usually people choose to do something about it. They create extra pixels near the edges. There are a few ways to create extra pixels:

  • Set a constant value for these pixels
  • Duplicate edge pixels
  • Reflect edges (like a mirror effect)
  • Wrap the image around (copy pixels from the other end)

This usually fixes the problems that might arise.
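The four ways of creating extra pixels are easier to compare on a single row of pixels. The sketch below is only an illustration: `pad_row` and its mode names are hypothetical, and it assumes the padding is no wider than the row itself. The modes roughly correspond to OpenCV's `BORDER_CONSTANT`, `BORDER_REPLICATE`, `BORDER_REFLECT` and `BORDER_WRAP` border types.

```python
def pad_row(row, pad, mode, constant=0):
    """Return `row` with `pad` extra pixels on each side (assumes pad <= len(row))."""
    n = len(row)
    padded = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            padded.append(row[i])                       # a real pixel
        elif mode == "constant":
            padded.append(constant)                     # fixed value
        elif mode == "replicate":
            padded.append(row[min(max(i, 0), n - 1)])   # nearest edge pixel
        elif mode == "reflect":
            # Mirror at the border, repeating the edge pixel: ba|abcd|dc
            padded.append(row[-i - 1] if i < 0 else row[2 * n - i - 1])
        elif mode == "wrap":
            padded.append(row[i % n])                   # copy from the other end
        else:
            raise ValueError("unknown mode: " + mode)
    return padded


row = [1, 2, 3, 4]
constant_pad = pad_row(row, 2, "constant")    # [0, 0, 1, 2, 3, 4, 0, 0]
replicate_pad = pad_row(row, 2, "replicate")  # [1, 1, 1, 2, 3, 4, 4, 4]
reflect_pad = pad_row(row, 2, "reflect")      # [2, 1, 1, 2, 3, 4, 4, 3]
wrap_pad = pad_row(row, 2, "wrap")            # [3, 4, 1, 2, 3, 4, 1, 2]
```

For a 2D image you would apply the same idea to the rows and then to the columns.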

Summary

You learned a powerful technique that can be used for a lot of different purposes. We'll see a few of those next.

2. Image convolution examples

A convolution is very useful for signal processing in general. There is a lot of complex mathematical theory available for convolutions. For digital image processing, you don't have to understand all of that. You can use a simple matrix as an image convolution kernel and do some interesting things!

Simple box blur

Here's the first and simplest example. This convolution kernel has an averaging effect, so you end up with a slight blur. The image convolution kernel is:

The convolution kernel for a simple blur

Note that the sum of all elements of this matrix is 1.0. This is important. If the sum is greater than one, the resulting image will be brighter; if it is less than one, it will be darker.
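Here is a quick way to convince yourself of the "sum must be one" rule. The sketch assumes a 3x3 box kernel where every cell is 1/9 (the usual choice for a simple averaging blur, though the size is an assumption here), and it evaluates the kernel at a single position with a made-up helper called `response`. Exact fractions are used so that rounding doesn't muddy the comparison.

```python
from fractions import Fraction


def response(patch, kernel):
    """Multiply overlapping cells of `patch` and `kernel`, then add everything up."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


def box_kernel(total):
    """3x3 kernel with equal weights that add up to `total`."""
    return [[Fraction(total) / 9] * 3 for _ in range(3)]


# Hypothetical patches: a perfectly flat region and a single bright speck.
flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
speck = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]

same = response(flat, box_kernel(1))                   # 100: brightness preserved
brighter = response(flat, box_kernel(2))               # 200: weights sum to 2
darker = response(flat, box_kernel(Fraction(1, 2)))    # 50: weights sum to 0.5
smoothed = response(speck, box_kernel(1))              # 20: the speck is averaged away
```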

Here's a blur that I got on an image:

A simple blur done with convolutions

Gaussian blur

Gaussian blur has certain mathematical properties that make it important for computer vision. And you can approximate it with an image convolution. The image convolution kernel for a Gaussian blur is:

Here's a result that I got:

Result of gaussian blur with a convolution
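If you want to build a Gaussian kernel yourself, one common approach is to sample the 2D Gaussian function on a small grid and then normalize the weights so they add up to one. The sketch below does exactly that; the function name `gaussian_kernel`, the 5x5 size and `sigma=1.0` are assumptions picked for illustration and are not necessarily the values behind the result above.

```python
import math


def gaussian_kernel(size=5, sigma=1.0):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a size x size grid and normalize."""
    half = size // 2
    weights = [
        [
            math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            for x in range(-half, half + 1)
        ]
        for y in range(-half, half + 1)
    ]
    total = sum(map(sum, weights))
    # Dividing by the total keeps the overall brightness unchanged.
    return [[w / total for w in row] for row in weights]


kernel = gaussian_kernel(5, 1.0)
weight_sum = sum(map(sum, kernel))
center = kernel[2][2]
corner = kernel[0][0]
```

Unlike the box blur, the weights fall off with distance from the centre, so nearby pixels contribute more than far ones.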

Line detection with image convolutions

With image convolutions, you can easily detect lines. Here are four convolution kernels to detect horizontal lines, vertical lines, and lines at 45 degrees:

Convolution kernels for line detection

I looked for horizontal lines on the house image. The result I got for this image convolution was:

Detecting horizontal lines with a convolution
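As a rough sketch of how such kernels behave, here are the four 3x3 line-detection kernels commonly given in image processing texts. They are assumed here and may differ from the exact kernels in the figure. The helper `response` and the test patches are made up for this example; each kernel is evaluated at a single position.

```python
def response(patch, kernel):
    """Multiply overlapping cells of `patch` and `kernel`, then add everything up."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


# Commonly used line-detection kernels (an assumption, see the note above).
horizontal = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
vertical = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
diagonal_down = [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]  # top-left to bottom-right
diagonal_up = [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]]    # bottom-left to top-right

# Hypothetical patches: a one-pixel-wide horizontal line, and a flat region.
line_patch = [[0, 0, 0], [255, 255, 255], [0, 0, 0]]
flat_patch = [[90, 90, 90], [90, 90, 90], [90, 90, 90]]

on_line = {
    "horizontal": response(line_patch, horizontal),
    "vertical": response(line_patch, vertical),
    "diagonal_down": response(line_patch, diagonal_down),
    "diagonal_up": response(line_patch, diagonal_up),
}
on_flat = response(flat_patch, horizontal)
```

Only the horizontal kernel responds strongly to the horizontal line. Every kernel sums to zero, so flat regions give no response at all.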

Edge detection

The above kernels are, in a way, edge detectors. The only catch is that they have separate components for horizontal and vertical lines. One way to "combine" the results is to merge the convolution kernels. The new image convolution kernel looks like this:

The edge detection convolution kernel

Here's the result I got with edge detection:

Edge detection with convolutions
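One way to read "merging" the kernels is to add them element by element. Assuming the four commonly used 3x3 line-detection kernels (an assumption, they may not match the figures exactly), the sum works out to a kernel with 8 in the centre and -1 everywhere else. The helpers `add_kernels` and `response` and the test patches are hypothetical.

```python
def add_kernels(*kernels):
    """Element-wise sum of equally sized kernels."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*kernels)]


def response(patch, kernel):
    """Multiply overlapping cells of `patch` and `kernel`, then add everything up."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


# Assumed line-detection kernels: horizontal, vertical and the two diagonals.
horizontal = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
vertical = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
diagonal_down = [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]
diagonal_up = [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]]

edge_kernel = add_kernels(horizontal, vertical, diagonal_down, diagonal_up)

# Hypothetical patches: a flat region, and a dark-to-bright vertical step.
flat_patch = [[50, 50, 50], [50, 50, 50], [50, 50, 50]]
step_patch = [[0, 255, 255], [0, 255, 255], [0, 255, 255]]

flat_response = response(flat_patch, edge_kernel)
step_response = response(step_patch, edge_kernel)
```

The merged kernel ignores flat regions and responds to the step no matter which way the edge runs. For display, you would normally clamp or rescale the responses into the 0 to 255 range.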

The Sobel Edge Operator

The above operators are very prone to noise. The Sobel edge operators have a smoothing effect, so they're less affected by noise. Again, there's a horizontal component and a vertical component.

The sobel operator's convolution kernel

On applying this image convolution, the result was:

Result of the horizontal sobel operator
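The smoothing effect is easy to see if you split each Sobel kernel into two 1D pieces: a small smoothing filter `[1, 2, 1]` in one direction and a difference filter `[-1, 0, 1]` in the other. The sketch below builds the standard 3x3 Sobel kernels that way. The helper names and the test patch are made up, and the kernels are named by the direction of the derivative to avoid ambiguity about which one counts as "horizontal".

```python
import math


def outer(column, row):
    """Outer product of a column vector and a row vector."""
    return [[c * r for r in row] for c in column]


def response(patch, kernel):
    """Multiply overlapping cells of `patch` and `kernel`, then add everything up."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


smooth = [1, 2, 1]        # weighted average: the noise-reducing part
derivative = [-1, 0, 1]   # central difference: the edge-finding part

sobel_x = outer(smooth, derivative)  # derivative along x, smoothing along y
sobel_y = outer(derivative, smooth)  # derivative along y, smoothing along x

# Hypothetical patch with a vertical edge (dark on the left, bright on the right).
vertical_edge = [[0, 0, 255], [0, 0, 255], [0, 0, 255]]

gx = response(vertical_edge, sobel_x)
gy = response(vertical_edge, sobel_y)
magnitude = math.hypot(gx, gy)  # combine both components into one edge strength
```

Here the x-derivative kernel picks up the vertical edge while the y-derivative kernel sees nothing, and the two components are combined into a single gradient magnitude.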

The laplacian operator

The Laplacian is based on the second derivative of the image (the sum of the second derivatives in x and y). It is extremely sensitive to noise, so it isn't used as much as other operators, unless, of course, you have specific requirements.

The kernel for the laplacian operator

Here's the result with the convolution kernel without diagonals:

The result of convolution with the Laplacian operator
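Here is a small sketch of where the kernel without diagonals comes from and why it is so touchy about noise. It assumes the usual discrete second difference `[1, -2, 1]` and the sign convention with a negative centre (some references flip all the signs). The helpers and patches are made up for this illustration.

```python
def add_kernels(*kernels):
    """Element-wise sum of equally sized kernels."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*kernels)]


def response(patch, kernel):
    """Multiply overlapping cells of `patch` and `kernel`, then add everything up."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


# Second difference along x and along y, each embedded in a 3x3 kernel.
second_diff_x = [[0, 0, 0], [1, -2, 1], [0, 0, 0]]
second_diff_y = [[0, 1, 0], [0, -2, 0], [0, 1, 0]]

laplacian = add_kernels(second_diff_x, second_diff_y)       # without diagonals
laplacian_diag = [[1, 1, 1], [1, -8, 1], [1, 1, 1]]         # with diagonals

# Hypothetical patches: a smooth brightness ramp, and a single noisy pixel.
ramp = [[0, 10, 20], [0, 10, 20], [0, 10, 20]]
speck = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]

ramp_response = response(ramp, laplacian)
speck_response = response(speck, laplacian)
speck_response_diag = response(speck, laplacian_diag)
```

A steady ramp has no second derivative, so the response is zero, but one stray pixel that is only 10 levels off produces a response of -40 (or -80 with diagonals).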

The Laplacian of Gaussian

The laplacian alone has the disadvantage of being extremely sensitive to noise. So, smoothing the image before a laplacian improves the results we get. This is done with a 5x5 image convolution kernel.

The kernel for the Laplacian of Gaussian operation

The result on applying this image convolution was:

The result of applying the laplacian of gaussian operator
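For reference, here is a sketch using a widely quoted 5x5 integer approximation of the Laplacian of Gaussian. It is an assumption that it resembles the kernel in the figure, since sign conventions and scale vary between sources. The helper `response` and the test patches are made up.

```python
def response(patch, kernel):
    """Multiply overlapping cells of `patch` and `kernel`, then add everything up."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


# A commonly quoted 5x5 Laplacian of Gaussian approximation (assumed, see above).
log_kernel = [
    [0, 0, -1, 0, 0],
    [0, -1, -2, -1, 0],
    [-1, -2, 16, -2, -1],
    [0, -1, -2, -1, 0],
    [0, 0, -1, 0, 0],
]

kernel_sum = sum(map(sum, log_kernel))

# Hypothetical patches: a flat region, and a small 3x3 bright blob in the middle.
flat = [[70] * 5 for _ in range(5)]
blob = [
    [0, 0, 0, 0, 0],
    [0, 10, 10, 10, 0],
    [0, 10, 10, 10, 0],
    [0, 10, 10, 10, 0],
    [0, 0, 0, 0, 0],
]

flat_response = response(flat, log_kernel)
blob_response = response(blob, log_kernel)
```

The weights add up to zero, so flat regions still give no response, while the wider footprint takes in a smoothed neighbourhood rather than just the immediate neighbours.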

Summary

You got to know about some important operations that can be approximated using an image convolution. You learned the exact convolution kernels used and also saw an example of how each operator modifies an image. I hope this helped!


from: http://aishack.in/tutorials/convolutions/
