Towards versatile 3D human motion prediction in the wild
Xu, Sirui
Permalink
https://hdl.handle.net/2142/120448
Description
Title
Towards versatile 3D human motion prediction in the wild
Author(s)
Xu, Sirui
Issue Date
2023-05-02
Director of Research (if dissertation) or Advisor (if thesis)
Wang, Yuxiong
Gui, Liangyan
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Human Motion Prediction
Abstract
Being able to “look into the future” is a remarkable cognitive hallmark of humans. For example, humans can naturally anticipate how people will move or act in the near future based on their past movements, even in complex real-world scenarios in the wild, an ability that remains critically challenging for machines to replicate. In contrast, state-of-the-art human motion forecasting methods often focus on simplified settings, e.g., predicting the future motion of a single person in a deterministic way.
This thesis endeavors to develop novel techniques that enable machines to anticipate human motion while accounting for real-world complexities. Our work is grounded in two fundamental insights. First, human motion prediction inherently involves uncertainty and multi-modality, especially in long-term forecasting. Second, such uncertainty does not imply that human movements are completely random; rather, they depend strongly on the environment and its changes. To tackle these challenges, we integrate diverse motion generation and environment-aware prediction across a range of scenarios.
We begin by investigating diverse single-person motion prediction. Our key insight is that future human motions are not completely random or independent; rather, they exhibit deterministic properties consistent with physical laws and constraints. Based on this observation, we propose anchor-based representations that encode human motion in the latent space with deterministic, learnable components. These anchors are trained to specialize in, and diversify across, different modes of future motion, enabling more diverse and accurate predictions with only a few additional parameters.
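For illustration only, the following minimal PyTorch sketch shows one way such learnable anchors could be realized: a shared decoder is conditioned on K deterministic anchor vectors added to a sampled latent code, so each anchor can steer its sample toward a different mode of future motion. All module names, dimensions, and the MLP decoder are assumptions made for this sketch, not the thesis implementation.

# Illustrative sketch only: K learnable "anchors" perturb a sampled latent code
# so that each anchor specializes in a different mode of future motion.
# Dimensions, the MLP decoder, and all names are assumptions, not the thesis model.
import torch
import torch.nn as nn


class AnchorBasedPredictor(nn.Module):
    def __init__(self, history_dim=48, latent_dim=64, future_dim=48 * 25, num_anchors=10):
        super().__init__()
        # Deterministic, learnable anchors: only num_anchors * latent_dim extra parameters.
        self.anchors = nn.Parameter(torch.randn(num_anchors, latent_dim) * 0.1)
        self.encoder = nn.Sequential(nn.Linear(history_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(2 * latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, future_dim))

    def forward(self, history):
        """history: (B, history_dim) flattened past poses -> (B, K, future_dim) futures."""
        b, k = history.size(0), self.anchors.size(0)
        context = self.encoder(history)                       # (B, latent_dim)
        noise = torch.randn(b, k, context.size(-1))           # stochastic component
        z = noise + self.anchors.unsqueeze(0)                 # each anchor steers its sample to a mode
        context = context.unsqueeze(1).expand(-1, k, -1)      # condition every sample on the past
        return self.decoder(torch.cat([context, z], dim=-1))  # (B, K, future_dim)


# Usage: predict K candidate futures for a batch of observed motion histories.
model = AnchorBasedPredictor()
futures = model(torch.randn(4, 48))   # -> torch.Size([4, 10, 1200])
print(futures.shape)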
We then introduce a task that accounts for the impact of social interactions on the diversity of future human poses, jointly addressing the social aspects of multi-person interaction and the realism and diversity of the predicted motion. The compounded difficulties of this task motivate a divide-and-conquer strategy, which we instantiate as a dual-level generative modeling framework. We demonstrate that various multi-person predictors and generative models can operationalize this general framework, yielding consistent improvements in both accuracy and diversity.
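As a rough illustration of the divide-and-conquer idea, the sketch below factorizes generation into a global level (a latent code shared by all people, capturing the interaction) and a local level (per-person latent codes capturing individual diversity). The specific predictors and generative models plugged into the framework in the thesis are not shown here; every name and shape is an assumption for this sketch.

# Illustrative sketch of a dual-level generative factorization for multi-person
# motion: a shared interaction-level latent plus per-person latents.
# All shapes, names, and the decoder are assumptions, not the thesis framework.
import torch
import torch.nn as nn


class DualLevelSampler(nn.Module):
    def __init__(self, pose_dim=48, local_dim=32, global_dim=32, horizon=25):
        super().__init__()
        self.local_dim, self.global_dim = local_dim, global_dim
        self.decoder = nn.Sequential(
            nn.Linear(pose_dim + local_dim + global_dim, 256), nn.ReLU(),
            nn.Linear(256, horizon * pose_dim))

    def forward(self, last_poses, num_samples=5):
        """last_poses: (P, pose_dim) last observed pose of each of P people."""
        p, _ = last_poses.shape
        futures = []
        for _ in range(num_samples):
            z_global = torch.randn(1, self.global_dim).expand(p, -1)  # shared: couples the group
            z_local = torch.randn(p, self.local_dim)                  # per-person diversity
            futures.append(self.decoder(torch.cat([last_poses, z_local, z_global], dim=-1)))
        return torch.stack(futures)   # (num_samples, P, horizon * pose_dim)


# Usage: sample 5 joint futures for a scene with 3 people.
sampler = DualLevelSampler()
samples = sampler(torch.randn(3, 48))   # -> torch.Size([5, 3, 1200])
print(samples.shape)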
Humans interact not only with other humans but also with the surrounding environment. With this in mind, we develop a novel task of anticipating 3D human-object interactions (HOIs) that takes into account the dynamics of general objects. Our proposed framework injects the prior that the interaction follows a simple pattern at reference contact points. We show that the framework can model objects of various shapes and ensure physically valid interactions.