Machine learning; Pattern recognition; Artificial vision; Object recognition; Object detection; Scene interpretation; Complex prior knowledge
Dubout Charles, Fleuret Francois (2014), Adaptive Sampling for Large Scale Boosting, in Journal of Machine Learning Research, 15, 1431-1453.
Dubout Charles, Fleuret Francois (2013), Accelerated Training of Linear Object Detectors, in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition Workshops, Portland, Oregon, IEEE, New York.
Dubout Charles, Fleuret Francois (2013), Deformable Part Models with Individual Part Scaling, in Proceedings of the British Machine Vision Conference, BMVA, England.
Dubout Charles, Fleuret Francois (2012), Exact Acceleration of Linear Object Detectors, in Proceedings of the European Conference on Computer Vision, Firenze, Springer, Berlin Heidelberg.
Dubout C., Fleuret F. (2011), Boosting with Maximum Adaptive Sampling, in Proceedings of the International Conference on Neural Information Processing Systems (NIPS), Granada, Spain.
Fleuret F., Li T., Dubout C., Wampler E. K., Yantis S., Geman D. (2011), Comparing machines and humans on a visual categorization test, in Proceedings of the National Academy of Sciences (PNAS), 108(43), 17621-1762.
Dubout C., Fleuret F. (2011), Tasting Families of Features for Image Classification, in Proceedings of the IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain.
Fleuret F., Abbet P., Dubout C., Lefakis L., The MASH project, in Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge.
So far, the goal of research in machine learning has been to avoid the need for detailed hand-designed prior knowledge. This proposal takes the opposite stance and aims precisely at combining a very large number of heuristics in a statistical framework. It is an attempt to bootstrap academic interest in the development of structurally complex priors for applied machine learning.

We define a heuristic to be any feature extractor, that is, an algorithm that processes the available raw signal to produce values relevant to the problem at hand. This purposely general definition encompasses techniques ranging from simple rules to symbolic modeling or unsupervised, locally trained predictors. We assume that high performance can only be achieved by combining thousands of such hand-designed modules, and propose to develop these heuristics in an open, web-based collaborative framework similar to the successful development process of open-source software and collaborative encyclopedias. As many contributors will be involved, the resulting system will benefit from the completeness and redundancy of as many idiosyncratic viewpoints.

Although of potential interest to many applications, we study this approach for the problem of scene interpretation. Given an image of everyday objects and furniture, or of an outdoor landscape, the goal is to detect and identify as many objects as possible. Solving this task requires designing a classifier able to recognize the identity of an isolated object. We propose to combine a multitude of feature extractors capturing different aspects of the image to create this multi-class predictor.

The research to be tackled in this project can be structured along two main axes:

1. The development of a machine learning technique to aggregate the heuristic responses for object classification. We will have to handle the classical difficulties that arise in very high dimension, with the additional difficulty of dealing with a highly heterogeneous feature space. Our initial choice will be forests of decision trees, which have many statistical and algorithmic properties desirable for this project.

2. The study of feedback tools to help and motivate the design of heuristics. Such tools will have to produce a ranking of the heuristics that gives contributors a good estimate of their own progress, a good perception of their overall ranking in the community of contributors, and assistance in identifying the main weaknesses of the current set of heuristics.

Performance will be measured on real images, such as the Caltech 256 or the VOC dataset, to allow for comparison with state-of-the-art techniques, and on computer-generated images, which can be produced together with an exhaustive labeling using very limited resources.

The approach described here aims at creating a novel research field of "heuristic mining". Instead of exploiting a fixed set of homogeneous features and trying to cope with its incompleteness, research will focus on developing strategies to motivate the development of very large feature sets.
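
As a purely illustrative sketch of the two axes described above, the Python fragment below treats each heuristic as a function from an image to a list of values and ranks heuristics by how well a single-threshold decision stump separates two classes using their output. Everything in it is an assumption made for illustration: the names (`Heuristic`, `rank_heuristics`, the three toy heuristics), the tiny binary images, and above all the stump score, which is a much simpler stand-in for the forests of decision trees named in the proposal, not the project's actual method.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]                   # grayscale image as rows of intensities
Heuristic = Callable[[Image], List[float]]  # hypothetical interface: image -> feature values


def mean_intensity(img: Image) -> List[float]:
    """Toy heuristic: average pixel intensity."""
    pixels = [p for row in img for p in row]
    return [sum(pixels) / len(pixels)]


def horizontal_edges(img: Image) -> List[float]:
    """Toy heuristic: mean absolute difference between horizontal neighbours."""
    diffs = [abs(a - b) for row in img for a, b in zip(row, row[1:])]
    return [sum(diffs) / len(diffs)]


def constant(img: Image) -> List[float]:
    """Deliberately uninformative heuristic."""
    return [1.0]


def stump_accuracy(values: List[float], labels: List[int]) -> float:
    """Best training accuracy of a one-threshold classifier on one feature."""
    n = len(labels)
    best = 0.0
    for t in sorted(set(values)):
        hits = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        best = max(best, hits / n, (n - hits) / n)  # either polarity is allowed
    return best


def rank_heuristics(heuristics: Dict[str, Heuristic],
                    images: List[Image],
                    labels: List[int]) -> List[Tuple[str, float]]:
    """Score each heuristic by its best output dimension, highest score first."""
    scores = []
    for name, heuristic in heuristics.items():
        responses = [heuristic(img) for img in images]
        per_dim = [stump_accuracy([r[d] for r in responses], labels)
                   for d in range(len(responses[0]))]
        scores.append((name, max(per_dim)))
    return sorted(scores, key=lambda item: -item[1])


# Assumed toy data: label 1 = striped image, label 0 = flat image.
IMAGES: List[Image] = [
    [[0, 0, 0, 0], [0, 0, 0, 0]],
    [[1, 1, 1, 1], [1, 1, 1, 1]],
    [[0, 1, 0, 1], [0, 1, 0, 1]],
    [[1, 0, 1, 0], [1, 0, 1, 0]],
]
LABELS = [0, 0, 1, 1]
HEURISTICS: Dict[str, Heuristic] = {
    "constant": constant,
    "mean_intensity": mean_intensity,
    "horizontal_edges": horizontal_edges,
}

for name, score in rank_heuristics(HEURISTICS, IMAGES, LABELS):
    print(f"{name}: {score:.2f}")
```

A ranking of this kind is the sort of feedback a contributor could receive; a real system would presumably score heuristics jointly rather than one at a time, since redundant or complementary heuristics cannot be judged in isolation.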