Image Processing and Computer Vision: Fundamentals, Classics, and Recent Developments (5) Computer Vision
Last update: 2012-6-7
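This chapter covers computer vision, with an emphasis on low-level feature extraction, video analysis, tracking, and object detection and recognition. For areas I know less well, such as camera calibration and stereo vision, I only list the papers with the most citations on Google. A few recently published papers that I particularly like are also included.
Download link for this chapter: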
1. Active Appearance Models
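The basic idea behind active appearance models and active contour models comes from Snake. They are now applied very successfully to 3D face modeling. The three earliest and most classic papers are listed here; anyone interested in this area can start with them.
[1998 ECCV] Active Appearance Models
[2001 PAMI] Active Appearance Models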
2. Active Shape Models
[1995 CVIU] Active Shape Models-Their Training and Application
3. Background Modeling and Subtraction
Background modeling has long been a key technique in video analysis, especially in object detection. New techniques keep appearing, some with impressive demos, such as methods based on dynamic texture. The most classic approach, however, remains the GMM method proposed by Stauffer et al. in 1999 and 2000. Their main contribution is fitting the Gaussians with an iterative online update instead of EM, so many frames of data do not have to be stored, which saves buffer memory. Zivkovic's 2004 ICPR and PAMI papers proposed a way to determine the number of Gaussians dynamically, pushing the Gaussian mixture model to its limit. The method works well, is easy to implement, and is available as a ready-made function in OpenCV. Within the large family of background modeling methods, the non-parametric method (2000 ECCV) and ViBe are also worth attention.
[1997 PAMI] Pfinder: Real-Time Tracking of the Human Body
[1999 CVPR] Adaptive background mixture models for real-time tracking
[1999 ICCV] Wallflower: Principles and Practice of Background Maintenance
[2000 ECCV] Non-parametric Model for Background Subtraction
[2000 PAMI] Learning Patterns of Activity Using Real-Time Tracking
[2002 PIEEE] Background and foreground modeling using nonparametric kernel density estimation for visual surveillance
[2004 ICPR] Improved adaptive Gaussian mixture model for background subtraction
[2004 PAMI] Recursive unsupervised learning of finite mixture models
[2006 PRL] Efficient adaptive density estimation per image pixel for the task of background subtraction
[2011 TIP] ViBe: A Universal Background Subtraction Algorithm for Video Sequences
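The online update that replaces EM in the GMM approach can be illustrated for a single grayscale pixel. The following is only a simplified sketch, not the exact algorithm of Stauffer et al. or the OpenCV implementation: it assumes scalar intensities, one fixed learning rate `alpha` for all parameters, and a 2.5-sigma match test. The class name `PixelGMM` and all parameter defaults are hypothetical choices made for illustration.

```python
class PixelGMM:
    """Online mixture of up to k scalar Gaussians for one pixel (simplified sketch)."""

    def __init__(self, k=3, alpha=0.05, init_var=225.0, min_var=4.0, bg_ratio=0.7):
        self.k = k                # maximum number of Gaussians (assumed value)
        self.alpha = alpha        # learning rate (assumed value)
        self.init_var = init_var  # variance given to a newly created mode
        self.min_var = min_var    # variance floor, keeps the match test usable
        self.bg_ratio = bg_ratio  # share of total weight treated as background
        self.modes = []           # each mode is [weight, mean, variance]

    def update(self, x):
        """Fold intensity x into the model; return True if x looks like foreground."""
        matched = None
        for m in self.modes:
            if (x - m[1]) ** 2 <= 6.25 * m[2]:  # within 2.5 standard deviations
                matched = m
                break
        # Recursive weight update: only the current frame is needed, no buffer.
        for m in self.modes:
            hit = 1.0 if m is matched else 0.0
            m[0] = (1.0 - self.alpha) * m[0] + self.alpha * hit
        if matched is not None:
            d = x - matched[1]
            matched[1] += self.alpha * d
            matched[2] = max(self.min_var, matched[2] + self.alpha * (d * d - matched[2]))
        else:
            if len(self.modes) >= self.k:
                # Replace the mode with the lowest weight.
                weakest = min(range(len(self.modes)), key=lambda i: self.modes[i][0])
                del self.modes[weakest]
            self.modes.append([self.alpha, float(x), self.init_var])
        total = sum(m[0] for m in self.modes)
        for m in self.modes:
            m[0] /= total
        if matched is None:
            return True
        # Background modes: highest weight/sigma first, until bg_ratio is covered.
        ranked = sorted(self.modes, key=lambda m: m[0] / m[2] ** 0.5, reverse=True)
        acc = 0.0
        for m in ranked:
            acc += m[0]
            if m is matched:
                return False
            if acc > self.bg_ratio:
                break
        return True
```

In a full system one such model would run per pixel. In practice it is usually simpler to call the ready-made OpenCV function mentioned above, for example `cv2.createBackgroundSubtractorMOG2` in the Python bindings of recent OpenCV versions.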
4. Bag of Words
Bag of words. I have not studied this area yet. Three highly cited papers are listed here, to be dissected gradually later.
[2003 ICCV] Video Google: A Text Retrieval Approach to Object Matching in Videos
[2004 ECCV] Visual Categorization with Bags of Keypoints
[2006 CVPR] Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories
5. BRIEF
BRIEF stands for Binary Robust Independent Elementary Features, a feature description method that has attracted considerable attention in recent years. ORB is also based on BRIEF.
[2010 ECCV] BRIEF: Binary Robust Independent Elementary Features
[2011 ICCV] ORB: an efficient alternative to SIFT or SURF
[2012 PAMI] BRIEF: Computing a Local Binary Descriptor Very Fast
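To make the idea of a binary descriptor concrete, here is a minimal sketch of a BRIEF-style descriptor: each bit records the outcome of one intensity comparison between two pixels in a patch, and descriptors are compared with the Hamming distance. It is a simplification under stated assumptions: the pixel pairs are sampled uniformly and the patch is not smoothed, whereas the papers study specific sampling patterns and pre-smoothing. The function names and the 128-bit length are illustrative choices.

```python
import random


def make_pairs(n_bits=128, patch=15, seed=0):
    """Sample n_bits pairs of (dy, dx) offsets inside a patch x patch window."""
    rng = random.Random(seed)
    r = patch // 2
    return [((rng.randint(-r, r), rng.randint(-r, r)),
             (rng.randint(-r, r), rng.randint(-r, r)))
            for _ in range(n_bits)]


def brief_descriptor(img, y, x, pairs):
    """Return the descriptor as an int; each bit is one intensity comparison."""
    bits = 0
    for (dy1, dx1), (dy2, dx2) in pairs:
        test = img[y + dy1][x + dx1] < img[y + dy2][x + dx2]
        bits = (bits << 1) | (1 if test else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")
```

Because only the ordering of intensities matters, adding a constant brightness offset to the image leaves the descriptor unchanged, and matching reduces to an XOR followed by a bit count.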
6. Camera Calibration and Stereo Vision
A field I am very unfamiliar with. Only a dozen or so important papers are listed here for later study.
[1979 Marr] A Computational Theory of Human Stereo Vision
[1985] Computational vision and regularization theory
[1987 IEEE] A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses
[1987] Probabilistic Solution of Ill-Posed Problems in Computational Vision
[1988 PIEEE] Ill-Posed Problems in Early Vision
[1989 IJCV] Kalman Filter-based Algorithms for Estimating Depth from Image Sequences
[1990 IJCV] Relative Orientation
[1990 IJCV] Using vanishing points for camera calibration
[1992 ECCV] Camera self-calibration: Theory and experiments
[1992 IJCV] A theory of self-calibration of a moving camera
[1992 PAMI] Camera calibration with distortion models and accuracy evaluation
[1994 IJCV] The Fundamental Matrix: Theory, Algorithms, and Stability Analysis
[1994 PAMI] A stereo matching algorithm with an adaptive window: theory and experiment
[1999 ICCV] Flexible camera calibration by viewing a plane from unknown orientations
[1999 IWAR] Marker tracking and HMD calibration for a video-based augmented reality conferencing system
[2000 PAMI] A flexible new technique for camera calibration
7. Color and Histogram Feature
This material comes mainly from image retrieval. Early image retrieval relied mostly on global features, the most prominent being color features. This part can be read together with the earlier material on Color.
[1995 SPIE] Similarity of color images
[1996 PR] IMAGE RETRIEVAL USING COLOR AND SHAPE
[1996] Comparing images using color coherence vectors
[1997] Image Indexing Using Color Correlograms
[2001 TIP] An Efficient Color Representation for Image Retrieval
[2009 CVIU] Performance evaluation of local colour invariants
8. Deformable Part Model
The hugely popular DPM. OpenCV has a dedicated topic covering DPM and latent SVM.
[2008 CVPR] A Discriminatively Trained, Multiscale, Deformable Part Model
[2010 CVPR] Cascade Object Detection with Deformable Part Models
[2010 PAMI] Object Detection with Discriminatively Trained Part-Based Models
9. Distance Transformations
The distance transform is also implemented in OpenCV. It is very convenient for finding seed points in binary images.
[1986 CVGIP] Distance Transformations in Digital Images
[2008 ACM] 2D Euclidean Distance Transform Algorithms: A Comparative Survey
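As an illustration of how a distance transform yields seed points, the sketch below runs a classic two-pass (forward, then backward) scan over a binary image using the city-block metric, then picks the pixel farthest from the background as a seed. It is a minimal pure-Python sketch, not the OpenCV routine; the function names are hypothetical, and the city-block metric is chosen only because it keeps the two-pass scheme exact and short.

```python
def distance_transform_cityblock(binary):
    """City-block distance from every pixel to the nearest zero (background) pixel.

    binary is a list of rows; nonzero means foreground.
    """
    h, w = len(binary), len(binary[0])
    inf = h + w  # larger than any possible city-block distance in the image
    dist = [[0 if binary[y][x] == 0 else inf for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def seed_point(dist):
    """Return (y, x) of the pixel that lies deepest inside the foreground."""
    coords = [(y, x) for y in range(len(dist)) for x in range(len(dist[0]))]
    return max(coords, key=lambda p: dist[p[0]][p[1]])
```

For a blob, the maximum of the distance map lies near its center, which is why the transform is handy for seeding region growing or watershed segmentation. With OpenCV, `cv2.distanceTransform` provides the same kind of map with Euclidean and other metrics.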
10. Face Detection
The most mature and best-known approach is Haar + AdaBoost.
[1998 PAMI] Neural Network-Based Face Detection
[2002 PAMI] Detecting faces in images: a survey
[2002 PAMI] Face Detection in Color Images
[2004 IJCV] Robust Real-Time Face Detection
11. Face Recognition
Not an area I know well; a simple list follows.
[1991] Face Recognition Using Eigenfaces
[2000 PAMI] Automatic Analysis of Facial Expressions: The State of the Art
[2000] Face Recognition: A Literature Survey
[2006 PR] Face recognition from a single image per person: A survey
[2009 PAMI] Robust Face Recognition via Sparse Representation
12. FAST
Corner extraction using machine learning, claimed to be both fast and good.
[2006 ECCV] Machine learning for high-speed corner detection
[2010 PAMI] Faster and Better: A Machine Learning Approach to Corner Detection
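The criterion that FAST accelerates is the segment test on a circle of 16 pixels around the candidate point. The sketch below implements only this brute-force test; the machine learning part of the papers, which learns a decision tree to decide which pixels to examine first, is not reproduced here. The threshold `t=20` and the run length `n=9` are assumed example values, and the function name is hypothetical.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, y, x, t=20, n=9):
    """Segment test: True if n contiguous circle pixels are all brighter than
    center + t or all darker than center - t."""
    c = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # +1 checks "brighter", -1 checks "darker"
        flags = [sign * (p - c) > t for p in ring]
        run = 0
        for f in flags + flags:  # doubled list handles runs that wrap around
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False
```

At the corner of a bright square most of the circle falls on the dark side, so the test passes; on a straight edge only about half of the circle differs from the center, so it fails.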
13. Feature Extraction
The features here are mainly various invariant features; SIFT, Harris, MSER, and so on also belong to this category. They are listed separately because those methods are somewhat more popular. On invariant features, the book 《影象區域性不變性特徵與描述》 (Local Invariant Image Features and Description) by 王永明 (Wang Yongming) and 王貴錦 (Wang Guijin) is quite well written. Mikolajczyk's 2005 PAMI paper and the 2007 survey are also good study material.
[1989 PAMI] On the detection of dominant points on digital curves
[1997 IJCV] SUSAN - A New Approach to Low Level Image Processing
[2004 IJCV] Matching Widely Separated Views Based on Affine Invariant Regions
[2004 IJCV] Scale & Affine Invariant Interest Point Detectors
[2005 PAMI] A performance evaluation of local descriptors
[2006 IJCV] A Comparison of Affine Region Detectors
[2007 FAT] Local Invariant Feature Detectors - A Survey
[2011 IJCV] Evaluation of Interest Point Detectors and Feature Descriptors
14. Feature Matching
[2012 PAMI] LDAHash: Improved Matching with Smaller Descriptors
15. Harris
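Although many years have passed, Harris corner detection is still widely used, and many variants are built on it. Anyone who studies the method closely can also sense intuitively that it is very robust.
[1988 Harris] A combined corner and edge detector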
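The intuition behind the robustness of Harris can be seen from its response function R = det(M) - k * trace(M)^2, where M sums the products of image gradients over a window. The sketch below evaluates R at one pixel. It is a simplified illustration under assumptions: central-difference gradients, a uniform (not Gaussian) window of radius 1, and k = 0.04 as a commonly used example value. The function name is hypothetical.

```python
def harris_response(img, y, x, k=0.04, radius=1):
    """Harris response R = det(M) - k * trace(M)^2 at pixel (y, x).

    M is the 2x2 matrix of summed gradient products over a square window.
    """
    sxx = syy = sxy = 0.0
    for v in range(y - radius, y + radius + 1):
        for u in range(x - radius, x + radius + 1):
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0  # horizontal gradient
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace
```

R is positive where the gradient varies in two directions (a corner), negative where it varies in only one (an edge), and zero in flat regions, so a single threshold on R separates the three cases.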
16. Histograms of Oriented Gradients
HoG is also implemented in OpenCV: HOGDescriptor.
[2005 CVPR] Histograms of Oriented Gradients for Human Detection
NavneetDalalThesis.pdf
17. Image Distance
[1993 PAMI] Comparing Images Using the Hausdorff Distance
18. Image Stitching
Image stitching; a related term is panoramic. The book Computer Vision: Algorithms and Applications devotes a chapter to this problem. Of the two papers here, one is a survey and the other is a classic in this area.
[2006 Fnd] Image Alignment and Stitching: A Tutorial
[2007 IJCV] Automatic Panoramic Image Stitching using Invariant Features
19. KLT
The KLT tracking algorithm is based on the registration algorithm proposed by Lucas and Kanade. Besides three classic papers, the last entry gives the details of OpenCV's KLT implementation.
[1981] An Iterative Image Registration Technique with an Application to Stereo Vision (full version)
[1994 CVPR] Good Features to Track
[2004 IJCV] Lucas-Kanade 20 Years On: A Unifying Framework
Pyramidal Implementation of the Lucas Kanade Feature Tracker (OpenCV)
20. Local Binary Pattern
LBP. OpenCV's Cascade classifier also supports LBP as a replacement for Haar features.
[2002 PAMI] Multiresolution gray-scale and rotation Invariant Texture Classification with Local Binary Patterns
[2004 ECCV] Face Recognition with Local Binary Patterns
[2006 PAMI] Face Description with Local Binary Patterns
[2011 TIP] Rotation-Invariant Image and Video Description With Local Binary Pattern Features
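For readers new to LBP, the basic operator is small enough to sketch directly: the eight neighbours of a pixel are thresholded at the center value and the results are read as an 8-bit code; a histogram of these codes then describes the texture. This is the plain 3x3 variant only, without the multiresolution or rotation-invariant extensions of the papers above. The neighbour ordering and the function names are arbitrary illustrative choices.

```python
# Neighbour offsets (dy, dx), clockwise starting from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """Basic 8-neighbour LBP code of the pixel at (y, x), an int in 0..255."""
    c = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy][x + dx] >= c:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist
```

Since the code depends only on the sign of intensity differences, any monotonically increasing change of gray levels leaves it unchanged, which is a large part of the operator's appeal.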
21. Low-Level Vision
Two very good papers on low-level vision.
[1998 TIP] A general framework for low level vision
[2000 IJCV] Learning Low-Level Vision
22. Mean Shift
Mean shift is a very popular method in tracking, and Comaniciu made important contributions to it. Of the last three entries, one is a top-download paper from CVIU, one is the latest PAMI paper on mean shift, and one describes the OpenCV implementation.
[1995 PAMI] Mean shift, mode seeking, and clustering
[2002 PAMI] Mean shift: a robust approach toward feature space analysis
[2003 CVPR] Mean-shift blob tracking through scale space
[2009 CVIU] Object tracking using SIFT features and mean shift
[2012 PAMI] Mean Shift Trackers with Cross-Bin Metrics
OpenCV Computer Vision Face Tracking For Use in a Perceptual User Interface
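The core of mean shift is a simple fixed-point iteration: move the current estimate to the mean of the samples inside a window, and repeat until it stops moving. The sketch below shows this mode seeking on 2-D points with a flat kernel. It is a minimal illustration, not a tracker: the trackers cited above apply the same iteration to pixels weighted by how well they match a target model. The function name, `bandwidth`, and the stopping tolerance are assumptions for the example.

```python
def mean_shift(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Seek the nearest density mode of 2-D points, starting from `start`.

    Uses a flat kernel: every point within `bandwidth` counts equally.
    """
    x, y = start
    for _ in range(max_iter):
        near = [(px, py) for px, py in points
                if (px - x) ** 2 + (py - y) ** 2 <= bandwidth ** 2]
        if not near:
            break  # no samples in the window, nothing to move toward
        nx = sum(p[0] for p in near) / len(near)
        ny = sum(p[1] for p in near) / len(near)
        shift = ((nx - x) ** 2 + (ny - y) ** 2) ** 0.5
        x, y = nx, ny
        if shift < tol:
            break
    return x, y
```

Started near one of two clusters, the iteration converges to the center of that cluster and ignores the other, which is the behavior that makes the method useful for following a target from frame to frame.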
23. MSER
This paper appeared at BMVC 2002 and was later accepted directly into IVC in 2004 with essentially the same content. MSER is also mentioned in Sonka's book.
[2002 BMVC] Robust Wide Baseline Stereo from Maximally Stable Extremal Regions
[2003] MSER Author Presentation
[2004 IVC] Robust wide-baseline stereo from maximally stable extremal regions
[2011 PAMI] Are MSER Features Really Interesting
24. Object Detection
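First, a word about the author of the first paper, Kah-Kay Sung. He earned his PhD at MIT and later taught at the National University of Singapore, a teacher of great promise. Sadly, he and his wife both died in the Singapore Airlines crash in 2000.
The last paper is also from Fua's group; the demo provided by the authors is quite impressive.
[1998 PAMI] Example-based learning for view-based human face detection
[2000 CVPR] A Statistical Method for 3D Object Detection Applied to Faces and Cars
[2003 IJCV] Learning the Statistics of People in Images and Video
[2011 PAMI] Learning to Detect a Salient Object
[2012 PAMI] A Real-Time Deformable Detector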
25. Object Tracking
Tracking is another classic problem in computer vision. Particle filtering, Kalman filtering, KLT, mean shift, and optical flow are all related to it. The papers listed here cover tracking in the traditional sense; the 2008 survey and the 2003 paper on kernel-based tracking are especially worth reading.
[2003 PAMI] Kernel-based object tracking
[2007 PAMI] Tracking People by Learning Their Appearance
[2008 ACM] Object Tracking: A Survey
[2008 PAMI] Segmentation and Tracking of Multiple Humans in Crowded Environments
[2011 PAMI] Hough Forests for Object Detection, Tracking, and Action Recognition
[2011 PAMI] Robust Object Tracking with Online Multiple Instance Learning
[2012 IJCV] PWP3D: Real-Time Segmentation and Tracking of 3D Objects
26. OCR
A very mature field that has already been commercialized successfully.
[1992 IEEE] Historical review of OCR research and development
Video OCR: A Survey and Practitioner's Guide
27. Optical Flow
Optical flow, an algorithm that anyone doing video analysis must master.
[1981 AI] Determining Optical Flow
[1994 IJCV] Performance of optical flow techniques
[1995 ACM] The Computation of Optical Flow
[2004 TR] Tutorial: Computing 2D and 3D Optical Flow
[2005 BOOK] Optical Flow Estimation
[2008 ECCV] Learning Optical Flow
[2011 IJCV] A Database and Evaluation Methodology for Optical Flow
28. Particle Filter
Particle filtering. Listed here are mainly surveys, plus the classic 1998 IJCV paper from the early development of particle filters.
[1998 IJCV] CONDENSATION - Conditional Density Propagation for Visual Tracking
[2002 TSP] A tutorial on particle filters for online nonlinear non-Gaussian Bayesian tracking
[2002 TSP] Particle filters for positioning, navigation, and tracking
[2003 SPM] particle filter
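As a companion to the tutorials above, here is a minimal sketch of a bootstrap particle filter for a one-dimensional state: predict by adding process noise, weight each particle by the likelihood of the observation, estimate by the weighted mean, then resample. The random-walk motion model, the Gaussian observation model, and all parameter values are assumptions chosen to keep the example short; they are not taken from any of the listed papers.

```python
import math
import random


def particle_filter(observations, n=500, process_std=0.2, obs_std=0.5,
                    lo=0.0, hi=10.0, seed=0):
    """Bootstrap particle filter for a scalar random-walk state.

    Returns the weighted-mean state estimate after each observation.
    """
    rng = random.Random(seed)
    particles = [rng.uniform(lo, hi) for _ in range(n)]  # uniform prior
    estimates = []
    for z in observations:
        # Predict: diffuse every particle with the process noise.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:  # all likelihoods underflowed; fall back to uniform
            weights = [1.0] * n
            total = float(n)
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n)
    return estimates
```

Unlike a Kalman filter, nothing here requires linear dynamics or Gaussian noise: replacing the two model lines with any other motion and likelihood model leaves the rest of the loop unchanged.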
29. Pedestrian and Human detection
Again mostly surveys, on motion detection and action recognition for pedestrians and the human body.
[1999 CVIU] Visual analysis of human movement: A survey
[2001 CVIU] A Survey of Computer Vision-Based Human Motion Capture
[2005 TIP] Image change detection algorithms: a systematic survey
[2006 CVIU] A survey of advances in vision-based human motion capture
[2007 CVIU] Vision-based human motion analysis: An overview
[2007 IJCV] Pedestrian Detection via Periodic Motion Analysis
[2007 PR] A survey of skin-color modeling and detection methods
[2010 IVC] A survey on vision-based human action recognition
[2012 PAMI] Pedestrian Detection: An Evaluation of the State of the Art
30. Scene Classification
As cameras become more and more point-and-shoot, automatic scene recognition becomes very important. This is where vendors compete on whose Auto mode works best.
[2001 IJCV] Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
[2007 PAMI] A Thousand Words in a Scene
[2010 PAMI] Visual Word Ambiguity
[2010 PAMI] Evaluating Color Descriptors for Object and Scene Recognition
[2011 PAMI] CENTRIST: A Visual Descriptor for Scene Categorization
31. Shadow Detection
[2003 PAMI] Detecting moving shadows: algorithms and evaluation
32. Shape
Shape involves two main aspects: shape representation and shape recognition. Shape representation mainly extracts invariant features from edges or regions for retrieval or recognition. Sonka's book treats this fairly systematically, and the 2008 survey also covers it well. For shape recognition, the strongest method is Shape Context, proposed by J. Malik et al.
[1993 PR] IMPROVED MOMENT INVARIANTS FOR SHAPE DISCRIMINATION
[1993 PR] Pattern Recognition by Affine Moment Invariants
[1996 PR] IMAGE RETRIEVAL USING COLOR AND SHAPE
[2001 SMI] Shape matching: similarity measures and algorithms
[2002 PAMI] Shape matching and object recognition using shape contexts
[2004 PR] Review of shape representation and description techniques
[2006 PAMI] Integral Invariants for Shape Matching
[2008] A Survey of Shape Feature Extraction Techniques
33. SIFT
SIFT hardly needs an introduction; more than ten thousand citations speak for themselves. SURF and PCA-SIFT also belong to this family. Several papers related to SIFT are listed afterwards.
[1999 ICCV] Object recognition from local scale-invariant features
[2000 IJCV] Evaluation of Interest Point Detectors
[2004 CVPR] PCA-SIFT: A More Distinctive Representation for Local Image Descriptors
[2004 IJCV] Distinctive Image Features from Scale-Invariant Keypoints
[2008 CVIU] Speeded-Up Robust Features (SURF)
[2010 IJCV] Improving Bag-of-Features for Large Scale Image Search
[2011 PAMI] SIFT Flow: Dense Correspondence across Scenes and its Applications
34. SLAM
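Simultaneous Localization and Mapping.
The SLAM problem can be described as follows: a robot starts moving from an unknown position in an unknown environment. As it moves, it localizes itself using its position estimates and the map, and at the same time builds an incremental map based on that localization, achieving autonomous localization and navigation.
[2002 PAMI] Simultaneous Localization and Map-Building Using Active Vision
[2007 PAMI] MonoSLAM: Real-Time Single Camera SLAM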
35. Texture Feature
Texture features are another important feature set for object recognition and retrieval.
[1973] Textural features for image classification
[1979] Statistical and structural approaches to texture
[1996 PAMI] Texture features for browsing and retrieval of image data
[2002 PR] Brief review of invariant texture analysis methods
[2012 TIP] Color Local Texture Features for Color Face Recognition
36. TLD
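Kalal created TLD, in which tracking, learning, and detection run simultaneously to achieve robust tracking. His two advisors are also well known: Matas, the inventor of MSER, and Mikolajczyk. He also founded a company, TLD Vision s.r.o. His series of papers is listed here; the last is the newly published PAMI paper.
[2009] Online learning of robust object detectors during unstable tracking
[2010 CVPR] P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints
[2010 ICIP] FACE-TLD: TRACKING-LEARNING-DETECTION APPLIED TO FACES
[2012 PAMI] Tracking-Learning-Detection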
37. Video Surveillance
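The first two are well-known video surveillance systems and contain a wealth of information; for example, the background modeling algorithm in the CMU system is quite simple and effective. The last is a relatively recent survey.
[2000 CMU TR] A System for Video Surveillance and Monitoring
[2000 PAMI] W4: real-time surveillance of people and their activities
[2008 MVA] The evolution of video surveillance: an overview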
38. Viola-Jones
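Haar features and AdaBoost: a combination of weak classifiers that forms a most powerful tool. OpenCV includes an implementation, and LBP can be used in place of Haar features.
[2001 CVPR] Rapid object detection using a boosted cascade of simple features
[2004 IJCV] Robust Real-time Face Detection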
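Much of the speed of the Viola-Jones detector comes from the integral image, which lets any rectangle sum, and therefore any Haar-like feature, be evaluated with a handful of lookups. The sketch below builds an integral image and evaluates one two-rectangle feature. It covers only this building block; the AdaBoost training and the cascade are not shown. The function names and the choice of a left-minus-right feature are illustrative assumptions.

```python
def integral_image(img):
    """Return ii with ii[y][x] = sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, y, x, h, w):
    """Sum of the h x w rectangle with top-left corner (y, x): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect(ii, y, x, h, w):
    """Two-rectangle Haar-like feature: left half minus right half (w even)."""
    half = w // 2
    return rect_sum(ii, y, x, h, half) - rect_sum(ii, y, x + half, h, half)
```

Each feature costs the same number of lookups regardless of its size, which is what makes it feasible to evaluate thousands of weak classifiers per detection window.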