Variational methods have revolutionized the field of computer vision and image analysis. In particular, these methods have proven useful in performing segmentation in images and image sequences, as well as in the video-based tracking of objects. Our objective in this project is to advance the state of the art of variational methods in this context. Our specific goal is to develop and implement a variational framework which deals simultaneously with the issues of image sequence segmentation and object behavior classification, thus fusing the domains of image analysis and image understanding. The motivation for the proposed fusion lies in the hypothesis that the prior knowledge relevant to each of these two tasks can help resolve the other. Thus, segmentation and behavior classification will be performed cooperatively for a given image sequence. More concretely, segmentation will supply its results to classification, ensuring that they are consistent with prior knowledge, while classification will furnish dynamic probabilistic priors to guide segmentation. These priors will be learned from training data and will adapt dynamically, based on the segmentations of earlier images in the sequence.
For image segmentation, we will use a variational approach. The variational segmentation framework is well suited to the inclusion of a priori information, as has already been demonstrated in the literature by the use of shape priors, which allow accurate segmentation in the presence of noise and occlusion. In our case, the prior information will be supplied by the classification. For object behavior classification, which is in fact a task of dynamical sequence classification, we will exploit techniques from the field of machine learning. A popular approach to this task, which we will adopt in this project, is the use of Hidden Markov Models (HMMs), in which the attributes of the objects resulting from image segmentation constitute the observations, while the unknown classes of object behavior correspond to the hidden states. Classification amounts to determining the sequence of hidden states corresponding to an observation sequence and can be computed using the Viterbi algorithm.
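As an illustration of the classification step, the following is a minimal sketch of the Viterbi algorithm in log space. It is not the project's implementation: the function name `viterbi` and the argument names `log_pi`, `log_A` and `log_B` are assumptions made for this example, and the observation likelihoods are assumed to have been evaluated beforehand for each frame and each behavior class.

```python
import math


def viterbi(log_pi, log_A, log_B):
    """Return the most likely hidden-state path and its log probability.

    log_pi[s]   : log initial probability of state s
    log_A[i][j] : log probability of moving from state i to state j
    log_B[t][s] : log likelihood of the observation at time t under state s
    (All names are illustrative assumptions, not taken from the project.)
    """
    n_states = len(log_pi)
    # Best log score of any path ending in each state at time 0.
    delta = [log_pi[s] + log_B[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_B)):
        new_delta, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j at time t.
            best_i = max(range(n_states), key=lambda i: delta[i] + log_A[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] + log_A[best_i][j] + log_B[t][j])
        delta = new_delta
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]
```

Working in log space avoids numerical underflow on long sequences, which matters when every frame of a video contributes one observation.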
The key idea in our framework is to interweave the two processes, classification and segmentation, while iterating through the given image sequence. For each new image, we will perform segmentation using the dynamic priors offered by classification. Since we have a different prior for each behavior class, the priors will be introduced into the segmentation energy so that they compete with one another. In this way, segmentation will converge towards the best-matching object and class, within image-based constraints. Consequently, the attributes of this object will be used in the next step of the classification, and so on, until the whole sequence is segmented and classified. As a natural extension of this framework, we will investigate an alternative solution to the problem of joint classification and segmentation, which has the potential to find a more global solution. The difference is that, instead of segmenting the image sequence sequentially, using priors resulting from partial classification results, we will segment all images in the sequence in parallel, guided by a global prior aimed at classifying object behavior throughout the whole image sequence.
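The sequential scheme described above can be sketched as a toy loop. This is a deliberately simplified illustration under strong assumptions, not the proposed variational method: segmentation is replaced by an exhaustive search over a small set of candidates, the dynamic priors depend only on the previous frame's result rather than on an HMM, and every name (`joint_segment_and_classify`, `data_energy`, `class_penalties`, `weight`) is hypothetical.

```python
def joint_segment_and_classify(frames, candidates, data_energy,
                               class_penalties, weight=1.0):
    """Toy sequential loop with competing class priors.

    frames          : sequence of per-frame measurements
    candidates      : candidate segmentation results (stand-ins for contours)
    data_energy     : data_energy(candidate, frame) -> float, image-based term
    class_penalties : one function per behavior class,
                      penalty(candidate, previous_result) -> float,
                      playing the role of a negative log prior
    Returns a list of (chosen candidate, winning class index) per frame.
    """
    history = []
    for frame in frames:
        previous = history[-1][0] if history else None
        best = None
        for cand in candidates:
            for k, penalty in enumerate(class_penalties):
                # No prior is available for the first frame.
                prior_term = 0.0 if previous is None else penalty(cand, previous)
                energy = data_energy(cand, frame) + weight * prior_term
                # The class priors compete: the lowest total energy wins.
                if best is None or energy < best[0]:
                    best = (energy, cand, k)
        # The winning result feeds the priors used for the next frame.
        history.append((best[1], best[2]))
    return history


# Usage: a 1-D "object position" with two hypothetical behavior classes,
# class 0 = stationary and class 1 = moving right by one unit per frame.
frames = [2.0, 2.1, 3.2, 4.0]
result = joint_segment_and_classify(
    frames,
    candidates=range(8),
    data_energy=lambda c, f: (c - f) ** 2,
    class_penalties=[
        lambda c, prev: (c - prev) ** 2,      # stationary
        lambda c, prev: (c - prev - 1) ** 2,  # moving right
    ],
)
```

In this toy run the winning class switches from stationary to moving once the measurements start to drift, which is the kind of coupling between segmentation and classification that the framework aims for.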
We expect this project to have both a theoretical impact, in the field of image analysis with variational methods, due to the novelty of fusing segmentation and classification, and a practical impact, through a wide range of applications in the area of motion or event recognition, such as gesture recognition or surveillance through activity recognition. In particular, we will demonstrate this in the case of sign language recognition.