Image Convolutions

阿新 • Published: 2019-01-17

1. Convolutions

Convolution is a general technique in signal processing. Anyone who has studied electrical or electronics engineering can tell you about the near-infinite sleepless nights convolutions have given them. Entire books have been written on the topic, and the questions and theorems that need to be proved can seem insurmountable. But for computer vision, we'll just deal with some simple things.

The Kernel

A convolution lets you do many things: calculate derivatives, detect edges, apply blurs, and more. All of this is done with a "convolution kernel".

The convolution kernel is a small matrix. This matrix has numbers in each cell and has an anchor point:

The convolution kernel

This kernel slides over an image and does its thing. The "anchor" point is used to determine the position of the kernel with respect to the image.

The transformation

The anchor point starts at the top-left corner of the image and moves over each pixel sequentially. At each position, the kernel overlaps a few pixels of the image. Each kernel value is multiplied by the pixel it overlaps, and the products are added together. Finally, the value at the current position is set to this sum.

Here's an example:

An example of the transformation

The matrix on the left is the image and the one on the right is the kernel. Suppose the kernel is at the highlighted position. The '9' of the kernel overlaps the '4' of the image, so you calculate their product: 36. Next, the '3' of the kernel overlaps the '3' of the image, giving 9. Add it to the running sum: 36 + 9 = 45. Do the same for the remaining 7 overlapping pairs to get the total. This total is stored in place of the '2' in the image.
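The multiply-and-add step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not how OpenCV does it: the function name `apply_kernel` and the sample values are made up for the example, the anchor is assumed to be the kernel's centre, and only positions where the kernel fits entirely inside the image are computed. The kernel is applied as drawn, without flipping it.

```python
def apply_kernel(image, kernel):
    """Slide `kernel` over `image` and return the multiply-and-add sums.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is smaller than the input.
    """
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            total = 0
            for j in range(kh):
                for i in range(kw):
                    total += kernel[j][i] * image[y + j][x + i]
            row.append(total)
        out.append(row)
    return out


# Hypothetical sample data (not the values from the figure above).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
identity = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
result = apply_kernel(image, identity)  # copies the inner 2x2 block
```

With the identity kernel, every output value is simply the pixel under the anchor, which makes it a handy sanity check before trying other kernels.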

Speed optimizations

The most direct way to compute a convolution is to use nested for loops. But that causes a lot of repeated calculations, and as the image and the kernel get larger, the time to compute the convolution grows quite drastically.

Techniques have been developed to calculate convolutions rapidly. One such technique uses the Discrete Fourier Transform, which turns the entire convolution operation into a simple element-wise multiplication in the frequency domain. Fortunately, you don't need to know the math to do this in OpenCV: it automatically decides whether to work in the frequency domain (after the DFT) or not.
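To see why the DFT helps, here is a small one-dimensional sketch in plain Python. It only illustrates the underlying identity (multiplying two DFTs element-wise corresponds to a circular convolution) and uses a naive, slow DFT; it is not OpenCV's implementation, and the helper names `dft` and `circular_convolve` are invented for this example. The inputs are zero-padded so that the circular result matches the ordinary convolution.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (for illustration only)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_convolve(a, b):
    """Direct circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[k] * b[(m - k) % n] for k in range(n)) for m in range(n)]


# Zero-padded so that nothing wraps around the end of the sequence.
signal = [1, 2, 3, 4, 0, 0, 0, 0]
kernel = [1, 1, 1, 0, 0, 0, 0, 0]

direct = circular_convolve(signal, kernel)

# Same result via the frequency domain: transform, multiply, transform back.
product = [p * q for p, q in zip(dft(signal), dft(kernel))]
via_dft = [v.real for v in dft(product, inverse=True)]
```

Both routes give the same numbers up to floating-point rounding. The speed advantage only appears when a fast transform (an FFT) replaces the naive `dft` used here.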

Problematic corners and edges

The kernel is two-dimensional, so you run into problems when it is near the edges or corners of the image. Here's an example: if the kernel (in the above example) is at the top-right position, the '0' of the kernel will be over the '3' in the image, but the '1' will be outside the image. So we have no idea what to do with it. Two things are possible:

Ignore those positions -or-
Do something about the edges

Usually people choose to do something about it. They create extra pixels near the edges. There are a few ways to create extra pixels:

Set a constant value for these pixels
Duplicate edge pixels
Reflect edges (like a mirror effect)
Wrap the image around (copy pixels from the other end)

This usually fixes the problems that might arise.
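The four ways of creating extra pixels listed above can be sketched on a single row of pixels. The function `pad_row` and its mode names are hypothetical, chosen for this example; OpenCV offers comparable behaviour through its `BORDER_CONSTANT`, `BORDER_REPLICATE`, `BORDER_REFLECT` and `BORDER_WRAP` border types. The reflect variant shown here repeats the edge pixel itself, and the sketch assumes the padding is no wider than the row.

```python
def pad_row(row, pad, mode, constant=0):
    """Return `row` with `pad` extra pixels on each side.

    Assumes pad <= len(row). Modes: constant, replicate, reflect, wrap.
    """
    n = len(row)
    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(row[i])                       # inside the image
        elif mode == "constant":
            out.append(constant)                     # fixed value
        elif mode == "replicate":
            out.append(row[min(max(i, 0), n - 1)])   # nearest edge pixel
        elif mode == "reflect":
            j = -i - 1 if i < 0 else 2 * n - 1 - i   # mirror at the border
            out.append(row[j])
        elif mode == "wrap":
            out.append(row[i % n])                   # copy from the other end
        else:
            raise ValueError("unknown mode: " + mode)
    return out


row = [1, 2, 3, 4]
padded = {mode: pad_row(row, 2, mode)
          for mode in ("constant", "replicate", "reflect", "wrap")}
# constant  -> [0, 0, 1, 2, 3, 4, 0, 0]
# replicate -> [1, 1, 1, 2, 3, 4, 4, 4]
# reflect   -> [2, 1, 1, 2, 3, 4, 4, 3]
# wrap      -> [3, 4, 1, 2, 3, 4, 1, 2]
```

For a two-dimensional image the same rule is applied along rows and then along columns.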

Summary

You learned a powerful technique that can be used for a lot of different purposes. We'll see a few of those next.

2. Image convolution examples

A convolution is very useful for signal processing in general. There is a lot of complex mathematical theory available for convolutions. For digital image processing, you don't have to understand all of that. You can use a simple matrix as an image convolution kernel and do some interesting things!

Simple box blur

Here's the first and simplest example. This convolution kernel has an averaging effect, so you end up with a slight blur. The image convolution kernel is:

The convolution kernel for a simple blur

Note that the sum of all elements of this matrix is 1.0. This is important: if the sum is greater than one, the resulting image will be brighter, and if it is less than one, darker.
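The effect of the kernel's sum on brightness can be checked with a small sketch. It assumes a 3x3 box kernel with every entry equal to 1/9 (the kernel figure is not reproduced here, so the size is an assumption), uses exact fractions to avoid rounding noise, and relies on a hypothetical helper `apply_kernel` that only evaluates positions where the kernel fits inside the image.

```python
from fractions import Fraction


def apply_kernel(image, kernel):
    """Multiply-and-add at every position where the kernel fully fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


# Assumed 3x3 box blur: nine equal weights that add up to exactly 1.
box = [[Fraction(1, 9)] * 3 for _ in range(3)]
# The same shape, but the weights add up to 2.
double = [[Fraction(2, 9)] * 3 for _ in range(3)]

flat = [[90] * 4 for _ in range(4)]        # uniformly grey test image

blurred = apply_kernel(flat, box)          # brightness unchanged (90)
too_bright = apply_kernel(flat, double)    # brightness doubled (180)
```

On a uniform image the output equals the input times the kernel sum, which is why a sum of one leaves the overall brightness alone.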

Here's a blur that I got on an image:

A simple blur done with convolutions

Gaussian blur

Gaussian blur has certain mathematical properties that make it important for computer vision, and you can approximate it with an image convolution. The image convolution kernel for a Gaussian blur is:

Here's a result that I got:

Result of Gaussian blur with a convolution
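The kernel figure for the Gaussian blur is not reproduced here, so the following is only one way such a kernel could be built: sample the two-dimensional Gaussian function on a small grid and normalise the weights so they add up to one. The function name `gaussian_kernel`, the 5x5 size and the sigma of 1.0 are assumptions for the sake of the example.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample a 2D Gaussian on a size x size grid (size must be odd)
    and normalise it so that all weights sum to 1."""
    half = size // 2
    raw = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


kernel = gaussian_kernel(5, 1.0)
centre = kernel[2][2]   # largest weight, directly under the anchor
corner = kernel[0][0]   # smallest weight, farthest from the anchor
```

Unlike the box blur, the weights fall off with distance from the anchor, so nearby pixels contribute more than distant ones.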

Line detection with image convolutions

With image convolutions, you can easily detect lines. Here are four convolution kernels to detect horizontal lines, vertical lines, and lines at 45 degrees:

Convolution kernels for line detection

I looked for horizontal lines on the house image. The result I got for this image convolution was:

Detecting horizontal lines with a convolution
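As a rough sketch of how line detection behaves, the example below uses the widely used 3x3 line-detection masks; whether these match the kernels in the figure above is an assumption. The helper `apply_kernel`, the dictionary keys and the tiny test image are made up for the illustration.

```python
def apply_kernel(image, kernel):
    """Multiply-and-add at every position where the kernel fully fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


# Commonly used line-detection masks (assumed, not taken from the figure).
kernels = {
    "horizontal": [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]],
    "vertical": [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],
    "diagonal_up": [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]],
    "diagonal_down": [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]],
}

# A 5x5 test image containing a single horizontal line in the middle row.
image = [[1 if y == 2 else 0 for _ in range(5)] for y in range(5)]

responses = {name: apply_kernel(image, k) for name, k in kernels.items()}
```

Only the horizontal kernel responds strongly, and only when its middle row sits on the line; the other three kernels give zero everywhere on this image.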

Edge detection

The above kernels are, in a way, edge detectors. The only catch is that they have separate components for horizontal and vertical lines. One way to "combine" the results is to merge the convolution kernels. The new image convolution kernel looks like this:

The edge detection convolution kernel

Here's the result I got with edge detection:

Edge detection with convolutions

The Sobel Edge Operator

The above operators are very prone to noise. The Sobel edge operators have a smoothing effect, so they're less affected by noise. Again, there's a horizontal component and a vertical component.

The Sobel operator's convolution kernel

On applying this image convolution, the result was:

Result of the horizontal Sobel operator
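Here is a small sketch of the two Sobel components applied to a synthetic step edge. The 3x3 Sobel kernels are standard, but sign and naming conventions vary between sources: `sobel_x` below measures intensity change from left to right, and `sobel_y` from top to bottom. The helper `apply_kernel` and the test image are invented for the example.

```python
def apply_kernel(image, kernel):
    """Multiply-and-add at every position where the kernel fully fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


# The 1-2-1 weighting is what gives the Sobel operator its smoothing effect.
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

# Dark on the left, bright on the right: a vertical step edge.
step = [[0, 0, 10, 10] for _ in range(3)]

change_across = apply_kernel(step, sobel_x)  # strong response at the edge
change_down = apply_kernel(step, sobel_y)    # no change from top to bottom
```

Each component reacts only to intensity change along its own direction, which is why the two are usually computed separately and then combined.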

The Laplacian operator

The Laplacian is the second derivative of the image. It is extremely sensitive to noise, so it isn't used as much as other operators, unless of course you have specific requirements.

The kernel for the Laplacian operator

Here's the result with the convolution kernel without diagonals:

The result of convolution with the Laplacian operator
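The second-derivative behaviour can be checked with a short sketch. It uses the common 3x3 Laplacian kernels, one without and one with diagonals; some sources use the opposite sign, and whether these match the figure above is an assumption. The helper `apply_kernel` and the test images are hypothetical.

```python
def apply_kernel(image, kernel):
    """Multiply-and-add at every position where the kernel fully fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


laplacian = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]          # without diagonals
laplacian_diag = [[1, 1, 1], [1, -8, 1], [1, 1, 1]]     # with diagonals

# Brightness increases steadily from left to right: no second derivative.
ramp = [[0, 1, 2, 3] for _ in range(3)]
# A single bright pixel, the kind of thing noise produces.
spike = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]

on_ramp = apply_kernel(ramp, laplacian)
on_ramp_diag = apply_kernel(ramp, laplacian_diag)
on_spike = apply_kernel(spike, laplacian)
```

A smooth gradient produces no response at all, while one isolated pixel produces a large one, which illustrates why the operator is so sensitive to noise.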

The Laplacian of Gaussian

The Laplacian alone has the disadvantage of being extremely sensitive to noise. So smoothing the image before applying the Laplacian improves the results. This is done with a 5x5 image convolution kernel.

The kernel for the Laplacian of Gaussian operation

The result on applying this image convolution was:

The result of applying the Laplacian of Gaussian operator
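One way to see how a single 5x5 kernel can do both jobs is to combine a 3x3 smoothing kernel with the 3x3 Laplacian. The sketch below does that and checks that applying the combined kernel once gives the same result as smoothing first and then applying the Laplacian. The specific kernels, the helper names and the test image are assumptions for illustration; the 5x5 kernel in the figure above may have different values.

```python
def apply_kernel(image, kernel):
    """Multiply-and-add at every position where the kernel fully fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


def combine_kernels(a, b):
    """Full 2D convolution of two kernels (3x3 with 3x3 gives 5x5)."""
    out = [[0] * (len(a[0]) + len(b[0]) - 1)
           for _ in range(len(a) + len(b) - 1)]
    for i, a_row in enumerate(a):
        for j, a_val in enumerate(a_row):
            for k, b_row in enumerate(b):
                for l, b_val in enumerate(b_row):
                    out[i + k][j + l] += a_val * b_val
    return out


# Assumed smoothing kernel, left unnormalised (divide by 16) to keep integers.
smooth = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
laplacian = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

log_kernel = combine_kernels(smooth, laplacian)   # a single 5x5 kernel

# Arbitrary 6x6 test image.
image = [[(3 * x + 5 * y * y) % 11 for x in range(6)] for y in range(6)]

two_steps = apply_kernel(apply_kernel(image, smooth), laplacian)
one_step = apply_kernel(image, log_kernel)
```

The combined kernel still sums to zero, like the Laplacian itself, so flat regions of the image produce no response.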

Summary

You got to know about some important operations that can be approximated using an image convolution. You learned the exact convolution kernels used and also saw an example of how each operator modifies an image. I hope this helped!

from: http://aishack.in/tutorials/convolutions/
