Can we infer how a person will move if we can predict which action the person is about to perform? If so, can we predict limb movements and poses more accurately when the person's current pose (skeleton) is also known?
Estimating the pose from a single image is close to a solved problem: although an image captures only the 2D projection of a 3D body and provides no additional knowledge (e.g., depth information), modern techniques can infer the body shape easily. Predicting how the pose and limbs will move, given the kind of action the person is going to perform, is a more challenging problem. There are dependencies between the current pose, the person's previous movement, and future movements. How can these dependencies be modeled? In addition, the model has to deal with whole sequences instead of a single pose. More challenging still is modeling the uncertainty in a sequence: everybody can dance, yet each dance is unique. Can we model the uncertain, diverse dynamics of the motion?
We are confident that this task is feasible, but with what certainty can the movement be predicted?
To answer these questions, we will apply data analysis and machine learning. Building on our expertise [1,2], we will use deep learning techniques such as RNNs to model the pose/skeleton sequences, and GANs to generate the corresponding images of the person. The project is structured into three tasks. First, a suitable dataset has to be built by extracting poses from videos of known actions (e.g., tai chi, boxing, walking, running). Second, the core algorithm (e.g., reinforcement learning) that exploits the diversity of the sequences needs to be developed. Third, the estimation performance (e.g., accuracy) is to be validated.
[1] Chan, Caroline, et al. "Everybody dance now." arXiv preprint arXiv:1808.07371 (2018).
[2] Wang, Wei, et al. "Every smile is unique: Landmark-guided diverse smile generation." CVPR. 2018.
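As a rough illustration of the third task (validating estimation performance), the sketch below computes an average per-joint distance between a predicted and a ground-truth pose sequence, and applies it to a trivial constant-velocity extrapolation. The error measure, the baseline, the 2D pose representation, and the names mean_joint_error and constant_velocity_baseline are all assumptions made for this example; the project itself may define accuracy and its data format differently.

```python
import math

# Assumption: a pose is a list of (x, y) joint coordinates and a sequence is
# a list of poses. This is an illustrative format, not the project's own.

def mean_joint_error(predicted, target):
    """Average Euclidean distance between matching joints over all frames."""
    total, count = 0.0, 0
    for pred_pose, true_pose in zip(predicted, target):
        for (px, py), (tx, ty) in zip(pred_pose, true_pose):
            total += math.hypot(px - tx, py - ty)
            count += 1
    return total / count


def constant_velocity_baseline(history, horizon):
    """Extrapolate each joint linearly from the last two observed poses."""
    prev, last = history[-2], history[-1]
    velocity = [(lx - px, ly - py) for (px, py), (lx, ly) in zip(prev, last)]
    future = []
    for step in range(1, horizon + 1):
        future.append([(lx + step * vx, ly + step * vy)
                       for (lx, ly), (vx, vy) in zip(last, velocity)])
    return future


# Two observed frames of a two-joint skeleton moving one unit to the right.
history = [[(0.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (2.0, 0.0)]]
# Hypothetical ground truth: the first joint rises by one unit in frame two.
target = [[(2.0, 0.0), (3.0, 0.0)], [(3.0, 1.0), (4.0, 0.0)]]

predicted = constant_velocity_baseline(history, horizon=2)
print(mean_joint_error(predicted, target))  # 0.25
```

A learned sequence model would replace constant_velocity_baseline here. A simple baseline of this kind gives a reference point for judging whether conditioning on the action actually improves the prediction.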
The candidate should have programming experience, ideally in Python. Previous experience with machine learning, computer vision, or deep learning platforms (e.g., PyTorch) is a plus.
20% Theory, 30% Implementation, 50% Research and Experiments