Navneet Dalal's PhD Thesis on HOG Pedestrian Detection


Development language: Others
File size: 19.22 MB
Downloads: 3
Views: 167
Published: 2020-07-24
Category: General programming questions
Uploaded by: robot666
File format: .pdf
Points required: 2


Description

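[Overview]
This is the PhD thesis written by Navneet Dalal, the inventor of the HOG pedestrian detection method. It is far more detailed than the paper he published through IEEE, which makes it the best text for learning the HOG algorithm.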
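
For readers new to the method, the static-image descriptor that the thesis abstract on this page describes (unsmoothed gradients, orientation histograms over a coarse spatial grid, strong normalisation over overlapping blocks) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function name `hog_descriptor`, the 8-pixel cells, 2x2-cell blocks, 9 unsigned orientation bins and the 128x64 window are assumptions chosen for the example, and votes are hard-assigned to a single bin rather than interpolated.

```python
import numpy as np

def hog_descriptor(image, cell=8, block=2, bins=9, eps=1e-5):
    """Toy HOG sketch (hypothetical helper): per-cell orientation histograms,
    L2-normalised over overlapping blocks of cells."""
    img = np.asarray(image, dtype=np.float64)

    # Unsmoothed centred differences; border pixels keep a zero gradient.
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation

    # Hard-assign each pixel to one orientation bin (assumed simplification).
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)

    n_cy, n_cx = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((n_cy, n_cx, bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            ys = slice(cy * cell, (cy + 1) * cell)
            xs = slice(cx * cell, (cx + 1) * cell)
            # Magnitude-weighted vote count for this cell.
            hist[cy, cx] = np.bincount(bin_idx[ys, xs].ravel(),
                                       weights=mag[ys, xs].ravel(),
                                       minlength=bins)

    # Overlapping blocks (stride of one cell), each normalised separately.
    feats = []
    for by in range(n_cy - block + 1):
        for bx in range(n_cx - block + 1):
            v = hist[by:by + block, bx:bx + block].ravel()
            feats.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(feats)
```

Under these assumed parameters, a 128x64 window yields 15x7 blocks of 36 values each. In the framework the abstract describes, such a vector would then be passed to a linear Support Vector Machine that labels the window as object or non-object.
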
Résumé

This thesis proposes a solution for detecting people and object classes in images and videos. The main goal is to develop robust and discriminative representations of visual forms that make it possible to decide whether an object of the class appears in an image region. The decisions are based on high-dimensional vectors of visual descriptors extracted from the regions. To compare the different descriptor sets objectively, we learn a decision rule for each set with a linear support vector machine. Entirely data-driven, our approach relies on low-level appearance and motion descriptors and uses no explicit model of the object to be detected. In most cases we concentrate on detecting people, a difficult, frequent and particularly interesting class in applications such as film and video analysis, pedestrian detection for assisted driving, and surveillance. However, our method makes no strong assumption about the class to be recognised, and it also gives satisfactory results for other classes such as cars, motorcycles, cows and sheep.

We make four main contributions to the field of visual recognition. First, we present visual descriptors for object detection in static images: grids of histograms of image gradient orientations (HOG, Histograms of Oriented Gradients). The histograms are evaluated over a grid of spatial blocks, with strong local normalisation. This structure ensures both a good characterisation of the object's local visual form and robustness to small variations in position, spatial orientation, local illumination and colour.
We show that the combination of barely smoothed gradients, fine quantisation of orientation with relatively coarse quantisation of space, strong intensity normalisation, and an advanced method for retraining on hard cases reduces the false-positive rate by one to two orders of magnitude compared with previous methods.

Second, to detect people in videos, we propose several motion descriptors based on optical flow. These descriptors are incorporated into the preceding approach. Analogous to static HOG, they replace static image gradients with spatial differences of dense optical flow. Using differences minimises the influence of camera and background motion on the detections. We evaluate several variations of this approach, which encode either motion boundaries or the relative motions of pairs of adjacent regions. Incorporating motion reduces the false-positive rate by an order of magnitude compared with the preceding approach.

Third, we propose a general method for combining multiple detections, based on the "mean shift" algorithm for estimating kernel density maxima. The approach takes into account the number, confidence and relative scale of the detections.

Finally, we present work in progress on how to build a person detector from several part detectors, in this case the face, the head, the torso, …

Abstract

This thesis targets the detection of humans and other object classes in images and videos. Our focus is on developing robust feature extraction algorithms that encode image regions as high dimensional feature vectors that support high accuracy object/non-object decisions.
To test our feature sets we adopt a relatively simple learning framework that uses linear Support Vector Machines to classify each possible image region as an object or as a non-object. The approach is data-driven and purely bottom-up using low-level appearance and motion vectors to detect objects. As a test case we focus on person detection as people are one of the most challenging object classes with many applications, for example in film and video analysis, pedestrian detection for smart cars and video surveillance. Nevertheless we do not make any strong class specific assumptions and the resulting object detection framework also gives state-of-the-art performance for many other classes including cars, motorbikes, cows and sheep.

This thesis makes four main contributions. Firstly, we introduce grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images. The HOG descriptors are computed over dense and overlapping grids of spatial blocks, with image gradient orientation features extracted at fixed resolution and gathered into a high dimensional feature vector. They are designed to be robust to small changes in image contour locations and directions, and significant changes in image illumination and colour, while remaining highly discriminative for overall visual form. We show that unsmoothed gradients, fine orientation voting, moderately coarse spatial binning, strong normalisation and overlapping blocks are all needed for good performance. Secondly, to detect moving humans in videos, we propose descriptors based on oriented histograms of differential optical flow. These are similar to static HOG descriptors, but instead of image gradients, they are based on local differentials of dense optical flow. They encode the noisy optical flow estimates into robust feature vectors in a manner that is robust to the overall camera motion.
Several variants are proposed, some capturing motion boundaries while others encode the relative motions of adjacent image regions. Thirdly, we propose a general method based on kernel density estimation for fusing multiple overlapping detections, that takes into account the number of detections, their confidence scores and the scales of the detections. Lastly, we present work in progress on a parts based approach to person detection that first detects local body parts like heads, torso, and legs and then fuses them to create a global overall person detector.

Acknowledgements

First and foremost I want to express my gratitude and thanks to my thesis directors, Bill Triggs and Cordelia Schmid. I feel Bill has been an ideal adviser. He has been a great mentor, a collaborator, a guide, and a friend. This thesis would not have been completed without his commitment and diligent efforts which not only influenced the content of the thesis but also the language in which it has been conveyed. He not only showed how to exhaustively explore different methodologies and analyse results, but also imbibed in me the need for perfection not only when performing research but also when communicating the results. I am also grateful to Cordelia who always encouraged my initiatives to make the PhD research more applicable. I thank her for offering me the opportunity to do a PhD on such an interesting and challenging topic and providing me the platform for this thesis. She has also been very supportive for letting me take time out of my thesis and venture into writing a book. I can never forget the support Bill and Cordelia showed towards the fag end of my thesis when they learned about my job offer and the constraint of finishing the remaining part of research and the dissertation in two months' time.
I would promptly receive detailed reviews, comments and corrections of the chapters despite the amount of work at their end, reflecting their commitment and devotion.

Thanks to my thesis rapporteurs, Martial Hebert and Luc Van Gool, for earnestly reading the thesis and their detailed comments on the manuscript in a relatively short time. I would like to thank Shai Avidan, my thesis examiner, for his interest in my work and James Crowley, the thesis committee president, for making his time available for me.

I am also grateful to Radu Horaud, my MSc thesis adviser. Radu has always been very open and a great support. He encouraged me to go to Siemens Corporate Research for a summer internship and collaborate with Dorin Comaniciu, from whose association I have tremendously benefited, mostly from Dorin's combination of mathematical rigour, his intuitive ideas and his original view.

I would like to thank all my friends and colleagues from LEAR and MOVI for making INRIA a fun and interesting place to work. Much respect to my office mates Ankur, Matthijs and Guillaume for putting up with me during these past three years. I shared many interesting discussions and work time fun with them. The help received from them both personally and professionally is also immense, and Matthijs' aid has not stopped pouring even after my shift from Grenoble. I also want to thank Srikumar, Gyuri, Eric, Joost, Jakob, Vittorio, Marcin, Diane, Stan, Cristi, Joao, David, Magda, Marta, Edmond, Remi, Federic, Dana, Jianguo, and Pau for the lively working atmosphere and interesting lunchtime discussions. I would also like to thank Anne for bearing with me for all the administrative goof ups.

I can never forget my good friends Peter, Irina, Trixi and Markus for making my stay at Grenoble so much more pleasant. They introduced me to the French culture and language, though none of them are French themselves! They were like a second family to me and were always ready to help, guide, and "feed" me.
I still cannot help missing the delicious Tiramisu that Irina used to bake.

I cannot end without thanking my parents and my brother, for their absolute confidence in me. My final words go to Priyanka, my wife. In my line of work spouses suffer the most. Thank you for putting up with my late hours, many spoiled weekends and vacations, and above all for staying by my side.

A conclusion is simply the place where someone got tired of thinking. (Arthur Bloch)
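
The third contribution named in the abstract, fusing multiple overlapping detections by kernel density estimation while accounting for their number, confidence and scale, can be illustrated with a small weighted mean shift. This is only a sketch under stated assumptions, not the procedure from the thesis: the function name `fuse_detections`, the (x, y, log-scale) representation of each detection, the fixed per-axis Gaussian bandwidth and the requirement that confidence weights be positive are all choices made for the example.

```python
import numpy as np

def fuse_detections(points, weights, bandwidth, iters=50, tol=1e-4):
    """Weighted Gaussian mean shift (hypothetical helper).

    points:    (n, d) array, e.g. rows of (x, y, log scale)
    weights:   (n,) positive confidence scores
    bandwidth: (d,) kernel width per axis (assumed fixed)
    Returns one row per distinct density mode.
    """
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    bw = np.asarray(bandwidth, dtype=float)

    modes = pts.copy()  # start one mode search from every detection
    for _ in range(iters):
        # Bandwidth-scaled offsets between each mode estimate and each detection.
        diff = (modes[:, None, :] - pts[None, :, :]) / bw
        k = w * np.exp(-0.5 * np.sum(diff ** 2, axis=2))
        # Move every estimate to the kernel-weighted mean of the detections.
        new = (k[:, :, None] * pts[None, :, :]).sum(axis=1)
        new /= k.sum(axis=1, keepdims=True)
        shift = np.abs(new - modes).max()
        modes = new
        if shift < tol:
            break

    # Merge searches that ended within one bandwidth of an existing mode.
    fused = []
    for m in modes:
        if not any(np.all(np.abs(m - f) < bw) for f in fused):
            fused.append(m)
    return np.array(fused)
```

Each surviving row stands for one fused detection. Many nearby detections with high confidence pull a mode more strongly than a single weak one, which is the sense in which number and confidence both enter the estimate.
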