Bagautdinov T, Wu C, Saragih J, Fua P, Sheikh Y (2018), Modeling Facial Geometry using Compositional VAEs, in Conference on Computer Vision and Pattern Recognition
Bagautdinov T., Alahi A., Fleuret F., Fua P., Savarese S. (2017), Social Scene Understand- ing: End-To-End Multi-Person Action Localization and Collective Activity Recognition, in Conference on Computer Vision and Pattern Recognition
Bagautdinov T., Fleuret F., Fua P. (2015), Probability Occupancy Maps for Occluded Depth Images, in Conference on Computer Vision and Pattern Recognition
Tracking people and recovering their 3D motion is one of the most difficult and challenging problems in Computer Vision. Today, there is great interest in capturing complex motions solely by analyzing video sequences, both because cameras are becoming ever cheaper and more prevalent and because there are so many potential applications. These include athletic training, surveillance, entertainment, and electronic publishing.During the first year of the ongoing project we have focused on using a single Kinect depth camera. We have shown that we could successfully extend our earlier multi-camera approach to computing probabilities that people are present in the scene at any given time using input from a single depth-camera and in such a way that occlusions are correctly handled. During the remainder of the ongoing project, we will work on linking these detections across time and inferring temporally consistent 3D poses. In the project continuation we are now requesting, we will return to using an ordinary video camera while applying the lessons learned to achieve the same result, but without requiring the additional depth information. This will allow us to operate in settings where depth-cameras do not work well, such as outdoors or at depth ranges beyond their capabilities.The approach we propose to investigate relies on the idea that, instead of using multiple cameras to resolve ambiguities as we did in earlier work, we can use images taken by a single camera over time to achieve the same result by enforcing temporal continuity.