Performance capture techniques have enabled a breadth of applications, ranging from novel viewpoint synthesis and motion editing in cinematography, through real-time game and VR interfaces, to performance analysis in sports and medicine. The most recent approaches leverage deep learning and large datasets with millions of annotated images. We have started to explore self-supervised learning approaches [3,4], which learn properties of human pose and shape from multiple images without requiring annotations during training.
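The core idea behind the multi-view self-supervision in [3,4] can be illustrated with a toy sketch: encode one camera view into a 3D latent code, rotate that code by the known relative camera rotation, render it, and compare against the image seen from the second camera. No pose labels are needed, only camera geometry. The sketch below is a simplified illustration with hypothetical data, not the actual method: the latent is a set of 3D points, "rendering" is orthographic projection, and no networks are learned.

```python
import numpy as np

def rotation_z(theta):
    # Rotation about the z-axis by angle theta (radians).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def render(points3d):
    # Toy orthographic "renderer": drop the z coordinate.
    return points3d[:, :2]

def multiview_consistency_loss(latent3d, rel_rot, observed_view2d):
    # Rotate the 3D latent into the second camera's frame, render it,
    # and penalize the difference to what that camera actually observes.
    pred = render(latent3d @ rel_rot.T)
    return float(np.mean((pred - observed_view2d) ** 2))

# Hypothetical stick-figure-like 3D "body" points standing in for the latent.
latent = np.array([[ 0.0, 0.0,  0.0],
                   [ 0.0, 1.0,  0.0],
                   [ 0.5, 1.5,  0.2],
                   [-0.5, 1.5, -0.2]])

R = rotation_z(np.pi / 4)        # known relative rotation between cameras
view2 = render(latent @ R.T)     # what the second camera would observe

# A geometrically consistent latent yields zero loss, while an
# inconsistent one does not -- supervision comes from geometry alone.
print(multiview_consistency_loss(latent, R, view2))                  # → 0.0
print(multiview_consistency_loss(latent, rotation_z(0.0), view2) > 0)  # → True
```

In the actual approach, the latent code and renderer are replaced by learned encoder and decoder networks, and the same consistency objective drives representation learning.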
We have already shown promising results in [3,4] and the video above, but much remains to explore. How can we extract body proportions and texture from the unsupervised representations, given that we currently only obtain the articulated skeleton pose? Can the approach be extended to multiple persons? The current training set is rather small; how will the method scale to more training and test subjects?
To answer these questions, we want to develop new deep learning techniques and extend existing ones. Depending on your background and preferences, you will explore one of the directions mentioned above.
[1] Alldieck et al., "Video Based Reconstruction of 3D People Models", project page
[2] Rhodin et al., "EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras", project page
[3] Rhodin et al., "Learning Monocular 3D Human Pose Estimation from Multi-view Images", project page
[4] Rhodin et al., "Unsupervised Geometry-Aware Representation Learning for 3D Human Pose Estimation", project page
The candidate should have programming experience, ideally in Python. Previous experience with machine learning and computer vision is a plus.
30% Theory, 30% Implementation, 40% Research and Experiments