This item is available for download only to members of the University of Illinois community. Students, faculty, and staff at the U of I may log in with their NetID and password to view the item. If you are trying to access an Illinois-restricted dissertation or thesis, you can request a copy through your library's Inter-Library Loan office or purchase a copy directly from ProQuest.
Permalink
https://hdl.handle.net/2142/124589
Description
Title
Human motion synthesis and compression
Author(s)
Li, Zhengyuan
Issue Date
2024-04-30
Director of Research (if dissertation) or Advisor (if thesis)
Gui, Liangyan
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
human motion
Abstract
The synthesis of human motion plays a pivotal role in applications ranging from character animation to autonomous driving. Recent advances in human motion synthesis are driven by powerful denoising diffusion models and transformer architectures. This thesis explores two fundamental challenges in human motion synthesis: designing effective architectural frameworks and developing motion compression components with strong reconstruction capabilities and well-conditioned latent spaces. Current transformer architectures are predominantly temporal-focused, yet spatial structure is inherent in the human body. We introduce Positional Mask-Guided Spatial-Temporal Fusion, a novel approach that models human motion along both the spatial and temporal dimensions, thus enabling a more nuanced generation of human behavior. Specifically, we design a spatial-temporal transformer architecture with homogeneous and symmetric dual branches for learning representations from human motion sequences. To facilitate a refined interplay between spatial and temporal features, we propose positional masks to guide the fusion process. Extensive experiments demonstrate the state-of-the-art performance of the proposed method across tasks and datasets. Efficiently compressing human motion sequences allows for a significant reduction in computational overhead and facilitates more complex analysis and synthesis in constrained environments. To build an effective two-person motion compression model, researchers should identify the crucial loss terms, adopt an adequate network architecture, and control the variance in the latent space. Through exhaustive experiments, the thesis offers insights into the design of motion compression systems for future applications. Together, these two lines of work pave the way for future research in human motion synthesis.