A tutorial on binary descriptors – part 2 – The BRIEF descriptor

Following the previous posts that provided both an introduction to patch descriptors in general and specifically to binary descriptors, it’s time to talk about the individual binary descriptors in more depth. This post will talk about the BRIEF[1] descriptor and the following post will talk about ORB[2], BRISK[3] and FREAK[4].

As you may recall from the previous post, a binary descriptor is composed of three parts:

  1. A sampling pattern: where to sample points in the region around the keypoint.
  2. Orientation compensation: some mechanism to measure the orientation of the keypoint and rotate it to compensate for rotation changes.
  3. Sampling pairs: which pairs to compare when building the final descriptor.

Recall that to build the binary string representing a region around a keypoint we need to go over all the pairs and for each pair (p1,p2) – if the intensity at point p1 is greater than the intensity at point p2, we write 1 in the binary string and 0 otherwise.
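The comparison rule above can be sketched in a few lines of Python. This is only an illustration: the function name `brief_bits`, the tiny 3x3 patch and the hand-picked pairs are assumptions made for the example, not part of BRIEF itself.

```python
def brief_bits(patch, pairs):
    """Return the descriptor as a list of 0/1 bits.

    patch: 2D list of intensities, indexed as patch[row][col].
    pairs: list of ((r1, c1), (r2, c2)) pixel coordinates inside the patch.
    """
    bits = []
    for (r1, c1), (r2, c2) in pairs:
        # Write 1 if the intensity at p1 is greater than at p2, 0 otherwise.
        bits.append(1 if patch[r1][c1] > patch[r2][c2] else 0)
    return bits


# Hypothetical 3x3 patch and four hand-picked pairs, for illustration only.
patch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
pairs = [
    ((0, 0), (2, 2)),  # 10 > 90 is false
    ((2, 2), (0, 0)),  # 90 > 10 is true
    ((1, 1), (1, 1)),  # 50 > 50 is false (ties give 0)
    ((2, 0), (0, 2)),  # 70 > 30 is true
]
descriptor = brief_bits(patch, pairs)
print(descriptor)  # [0, 1, 0, 1]
```

Note that a tie produces 0, since the rule only writes 1 for a strictly greater intensity.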

Presented in 2010[1], BRIEF was the first binary descriptor published. It has neither an elaborate sampling pattern nor an orientation compensation mechanism, which makes it easy to understand and a good choice for the first descriptor to explain.

As we’ll see next, BRIEF uses only the intensities at single pixel locations to build the descriptor, so to make it less sensitive to noise we first smooth the patch with a Gaussian filter.
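To make the smoothing step concrete, here is a minimal pure-Python sketch of a separable Gaussian blur. The values `sigma=2.0` and `radius=4` (a 9x9 kernel), the border clamping and all function names are illustrative assumptions; in practice you would normally call an image library's Gaussian blur instead.

```python
import math


def gaussian_kernel(sigma, radius):
    """1D Gaussian weights for offsets -radius..radius, normalized to sum to 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for c in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                cc = min(max(c + k - radius, 0), width - 1)
                acc += w * row[cc]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_smooth(image, sigma=2.0, radius=4):
    """Separable blur: one horizontal pass, then one vertical pass."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = smooth_rows(image, kernel)
    return transpose(smooth_rows(transpose(horizontal), kernel))


# A single bright "noise" pixel in an otherwise flat 9x9 patch.
noisy = [[0.0] * 9 for _ in range(9)]
noisy[4][4] = 100.0
smoothed = gaussian_smooth(noisy)
```

After smoothing, the isolated spike is spread over its neighbors, so a single noisy pixel can no longer flip an intensity comparison as easily.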

Now, as we mentioned earlier, BRIEF does not have a fixed sampling pattern, so the points of each pair can be chosen anywhere on the SxS patch. To build a BRIEF descriptor of length n, we need to determine n pairs (Xi,Yi). Denote by X and Y the vectors of points Xi and Yi, respectively.

In [1] the authors consider five methods to determine the vectors X and Y:

  1. X and Y are randomly uniformly sampled.
  2. X and Y are randomly sampled using a Gaussian distribution, meaning that locations that are closer to the center of the patch are preferred.
  3. X and Y are randomly sampled using a Gaussian distribution where first X is sampled with a variance of 0.04*S^2 (a standard deviation of 0.2*S) and then the Yi’s are sampled using a Gaussian distribution: each Yi is sampled with mean Xi and a variance of 0.01*S^2 (a standard deviation of 0.1*S).
  4. X and Y are randomly sampled from discrete locations of a coarse polar grid.
  5. For each i, Xi is (0, 0) and Yi takes all possible values on a coarse polar grid.
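The five strategies can be sketched as follows. This is a rough illustration under several assumptions: coordinates are offsets from the patch center in [-S/2, S/2]; the Gaussian parameters 0.04*S^2 and 0.01*S^2 are read as variances (standard deviations S/5 and S/10); out-of-patch samples are clamped to the patch edge; and the polar grid resolution (`n_radii`, `n_angles`) as well as every function name is made up for the example.

```python
import math
import random


def clamp(value, half):
    """Keep a coordinate inside the patch, i.e. within [-half, half]."""
    return max(-half, min(half, value))


def sample_uniform(n, S, rng):
    """Strategy 1: both points uniform over the patch."""
    half = S / 2
    def point():
        return (rng.uniform(-half, half), rng.uniform(-half, half))
    return [(point(), point()) for _ in range(n)]


def sample_gaussian(n, S, rng):
    """Strategy 2: both points Gaussian around the patch center."""
    half, sigma = S / 2, S / 5
    def point():
        return (clamp(rng.gauss(0, sigma), half), clamp(rng.gauss(0, sigma), half))
    return [(point(), point()) for _ in range(n)]


def sample_gaussian_local(n, S, rng):
    """Strategy 3: Xi Gaussian around the center, Yi Gaussian around Xi."""
    half = S / 2
    pairs = []
    for _ in range(n):
        x = (clamp(rng.gauss(0, S / 5), half), clamp(rng.gauss(0, S / 5), half))
        y = (clamp(rng.gauss(x[0], S / 10), half), clamp(rng.gauss(x[1], S / 10), half))
        pairs.append((x, y))
    return pairs


def polar_grid(S, n_radii=4, n_angles=16):
    """Coarse polar grid; the resolution is an assumption for this sketch."""
    half = S / 2
    points = []
    for i in range(1, n_radii + 1):
        r = half * i / n_radii
        for j in range(n_angles):
            theta = 2 * math.pi * j / n_angles
            points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def sample_polar_random(n, S, rng):
    """Strategy 4: both points drawn at random from the polar grid."""
    grid = polar_grid(S)
    return [(rng.choice(grid), rng.choice(grid)) for _ in range(n)]


def sample_polar_center(S):
    """Strategy 5: Xi is the center, Yi runs over every grid location."""
    return [((0.0, 0.0), y) for y in polar_grid(S)]


rng = random.Random(0)  # fixed seed so the pairs are reproducible
S, n = 31, 256
pairs_by_strategy = {
    1: sample_uniform(n, S, rng),
    2: sample_gaussian(n, S, rng),
    3: sample_gaussian_local(n, S, rng),
    4: sample_polar_random(n, S, rng),
    5: sample_polar_center(S),
}
```

The pairs are drawn once and then reused for every keypoint; two descriptors are only comparable if they were built with the same pairs. A real implementation would also round the offsets to integer pixel positions before comparing intensities.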

I hope the following figure, which illustrates examples of the five sampling strategies, will help clear up the definitions:

Figure: BRIEF – illustration of the five sampling patterns

The following figure presents recognition rates for all five sampling strategies. We can see the recognition rates are about the same, except for the fifth sampling strategy, which performs worse:

Figure: BRIEF descriptor – performance of the five sampling patterns

As with all the binary descriptors, BRIEF’s distance measure is the number of differing bits between two binary strings (the Hamming distance), which can be computed by XORing the two strings and counting the bits that are set in the result.
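As a small sketch of this distance, the bits of each descriptor can be packed into a Python integer, XORed, and the set bits counted. The helper names `pack_bits` and `hamming_distance` and the two 8-bit descriptors are made up for the example; real descriptors are typically 128 to 512 bits long.

```python
def pack_bits(bits):
    """Pack a list of 0/1 bits into a single integer, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def hamming_distance(a, b):
    """Number of bit positions in which the two packed descriptors differ."""
    return bin(a ^ b).count("1")


# Two hypothetical 8-bit descriptors that differ in three positions.
d1 = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
d2 = pack_bits([1, 1, 1, 0, 0, 0, 1, 1])
distance = hamming_distance(d1, d2)
print(distance)  # 3
```

This is why matching binary descriptors is fast: XOR and bit counting are cheap operations, and many CPUs provide a dedicated population count instruction.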

The next post will talk about ORB[2] which extends BRIEF by introducing an orientation compensation mechanism and learns the sampling pairs instead of using a random choice.

Gil.

References:

[1] Calonder, Michael, et al. “BRIEF: Binary robust independent elementary features.” Computer Vision–ECCV 2010. Springer Berlin Heidelberg, 2010. 778-792.

[2] Rublee, Ethan, et al. “ORB: an efficient alternative to SIFT or SURF.” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.

[3] Leutenegger, Stefan, Margarita Chli, and Roland Y. Siegwart. “BRISK: Binary robust invariant scalable keypoints.” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.

[4] Alahi, Alexandre, Raphael Ortiz, and Pierre Vandergheynst. “FREAK: Fast retina keypoint.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.