3D Reconstruction; Surface Modeling; Deformable Models; Computer Vision
Bednarik Jan, Fua Pascal, Salzmann Mathieu (2018), Learning to Reconstruct Texture-Less Deformable Surfaces from a Single View, in
2018 International Conference on 3D Vision (3DV), Verona, IEEE, Conference Proceedings.
Texture-less Deformable Surfaces Dataset
The dataset features deformable objects with uniform albedo, captured under varying lighting conditions with a Microsoft Kinect for Xbox 360 depth camera. The main challenge is to recover the 3D shape of a deforming surface observed in a single RGB image. Each sample therefore contains an RGB image and the corresponding ground truth (GT) normal map and depth map. A small subset of the dataset also contains GT triangulated meshes. Please refer to our publication for a detailed description of the dataset.
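Because each sample pairs a depth map with a normal map, it can be useful to see how the two quantities relate. The following is a minimal sketch, not the procedure used to build the dataset: it back-projects a depth map with a pinhole camera model and estimates normals from finite differences. The function name `normals_from_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the choice to orient normals towards the camera are all assumptions made for illustration.

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Estimate a per-pixel unit normal map (H x W x 3) from a depth map (H x W).

    Assumes a pinhole camera with hypothetical intrinsics fx, fy, cx, cy and
    strictly positive depth everywhere; invalid (zero) depth pixels would
    have to be masked out beforehand.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel to a 3D point in the camera frame.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1)
    # Finite-difference surface tangents along image columns and rows.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)
    # The normal is perpendicular to both tangents.
    n = np.cross(du, dv)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    # Assumed convention: normals point towards the camera (negative z).
    n[n[..., 2] > 0] *= -1
    return n

# Usage: a fronto-parallel plane should yield normals along the optical axis.
flat = np.full((8, 10), 1.5)
normal_map = normals_from_depth(flat, fx=500.0, fy=500.0, cx=5.0, cy=4.0)
```

The sign and axis conventions of the GT normal maps may differ from the ones assumed here, so they should be checked against the publication before comparing values.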
Being able to recover the 3D shape of deformable surfaces from ordinary images will make it possible to field reconstruction systems that only require a single video camera, such as those that now equip most mobile devices. It will also allow 3D shape recovery in more specialized contexts, such as when performing endoscopic surgery or using a fast camera to capture the deformations of a rapidly moving object. However, because many different 3D shapes can have virtually the same projection, such monocular shape recovery is inherently ambiguous. Arguably, these ambiguities could be resolved by using a depth camera. However, even though depth cameras are now readily available, they are more difficult and expensive to fit into a cell phone or an endoscope, and they have limited range.

The video-based solutions that have been proposed over the years mainly fall into two classes: those that involve physics-inspired models and those that rely on a non-rigid structure-from-motion approach. The former often entail designing complex objective functions and require hard-to-obtain knowledge about the precise material properties of the target surfaces. The latter depend on points being reliably tracked in image sequences and are only effective for relatively small deformations.

To overcome these limitations, we have developed approaches that rely on establishing correspondences between an input image in which the 3D shape is to be recovered and a reference image in which the 3D shape is known. We showed that 3D shape recovery under those conditions could be formulated as an under-constrained linear problem.
Furthermore, introducing inextensibility constraints and using control points to reduce the dimensionality of the problem turns it into a well-posed problem, which can be solved quickly using convex optimization.

These approaches have proved effective when the surface is well-textured, but we have shown in the final stretch of the ongoing project that we can obtain even better results by replacing sparse correspondences with dense ones. Not only are the results more accurate, but we can also handle surfaces that are only partially textured in the presence of substantial occlusions and illumination changes. However, this improvement comes at a price: instead of being able to model from individual frames, we have to track from frame to frame because the objective function is no longer convex, which makes our algorithm prone to tracking failures.

In the continuation of this project, we will leverage the power of Deep Learning approaches to remove this limitation and preserve the accuracy of dense matching without requiring a priori knowledge of an initial pose. This will involve two main research directions:

- Learn a mapping from image to rough 3D shape. We will train Convolutional Neural Networks to propose multiple 3D shape hypotheses for image patches and recover the global 3D shape by enforcing spatial consistency over neighboring patches.
- Refine the initial estimate in spite of potential occlusions. Since the space of all possible deformations is immense, we do not expect the resulting shapes to be particularly accurate. We will therefore further develop our approach to refining them by better accounting for occlusions and for differences in how informative different parts of the surface are.

This will result in robust and accurate 3D surface reconstruction algorithms that can truly be deployed in real-world applications.
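The earlier remark that correspondence-based recovery amounts to an under-constrained linear problem can be made concrete with a small sketch. It assumes, for illustration only, a triangulated mesh, a pinhole camera with known intrinsics `K`, and correspondences given as barycentric coordinates inside mesh facets; none of these details are spelled out in the text above, and the helper `correspondence_matrix` is hypothetical. Each correspondence contributes two equations that are linear in the stacked vertex coordinates `x`, giving a homogeneous system `M @ x = 0`.

```python
import numpy as np

def correspondence_matrix(K, facets, bary, pixels, n_vertices):
    """Stack two linear reprojection equations per correspondence into M,
    so that the stacked vertex coordinates x satisfy M @ x = 0."""
    M = np.zeros((2 * len(pixels), 3 * n_vertices))
    for r, (facet, b, (u, v)) in enumerate(zip(facets, bary, pixels)):
        # Combine rows of K so that the unknown depth is eliminated.
        a = np.stack([K[0] - u * K[2], K[1] - v * K[2]])  # shape (2, 3)
        for vid, weight in zip(facet, b):
            M[2 * r:2 * r + 2, 3 * vid:3 * vid + 3] += weight * a
    return M

# Hypothetical intrinsics and a single-triangle "mesh" (illustrative values).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
vertices = np.array([[-0.1, -0.1, 1.0],
                     [0.1, -0.1, 1.2],
                     [0.0, 0.1, 0.9]])
facets = [(0, 1, 2)] * 5
bary = np.array([[0.6, 0.2, 0.2],
                 [0.2, 0.6, 0.2],
                 [0.2, 0.2, 0.6],
                 [1 / 3, 1 / 3, 1 / 3],
                 [0.5, 0.4, 0.1]])

# Synthesize noise-free correspondences by projecting points on the triangle.
proj = (bary @ vertices) @ K.T
pixels = proj[:, :2] / proj[:, 2:3]

M = correspondence_matrix(K, facets, bary, pixels, n_vertices=3)
x_true = vertices.ravel()

# Both the true shape and a scaled copy satisfy the constraints.
print(np.linalg.norm(M @ x_true), np.linalg.norm(M @ (2.0 * x_true)))
```

In this toy case the true shape and any scaled copy of it both satisfy the system, so the correspondences alone cannot fix the solution. This is the kind of ambiguity that additional constraints, such as the inextensibility constraints mentioned above, are meant to remove; the sketch does not implement those constraints.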