Poster
Dual Reciprocal Learning of Language-based Human Motion Understanding and Generation
Chen Liang · Zhicheng Shi · Wenguan Wang · Yi Yang
Language-based human motion understanding aims to describe human motions in natural language, while human motion generation produces human motions from textual inputs. Despite significant progress in both fields, further advancement is hindered by two primary challenges: (i) both tasks rely heavily on large amounts of paired motion-language data for training, yet human labeling is costly and becomes increasingly unsustainable as models scale up; (ii) existing models typically learn the two tasks in parallel, leaving the strong reciprocity between them largely unexplored. In response, this work proposes Dual Reciprocal Learning (DRL) for language-based human motion understanding and generation. DRL establishes a symmetric learning framework in which the two tasks co-evolve in a closed-loop, bootstrapping manner, effectively exploiting their reciprocity. In DRL, the tasks serve as evaluators for each other, producing informative feedback signals even from easily acquired unidirectional motion or language data. Furthermore, to mitigate dataset-specific bias in existing evaluations, we propose a generalized protocol that extends evaluation to a general-domain cross-modal feature space. Experimental results on standard benchmarks demonstrate that DRL achieves remarkable performance gains over state-of-the-art models on both tasks. Our code will be made publicly available.
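The closed-loop idea can be illustrated with a minimal sketch, assuming continuous stand-ins for motion sequences and text embeddings (real captions are discrete token sequences, and the paper's actual feedback mechanism may differ). Each task evaluates the other through reconstruction on unpaired data: a motion is captioned and then regenerated, a caption is turned into motion and then re-described, and the reconstruction errors serve as the feedback signals. The module and function names below (MotionCaptioner, MotionGenerator, dual_step) are hypothetical, not from the paper.

```python
# Hypothetical sketch of one dual reciprocal (closed-loop) training step.
# Continuous vectors stand in for motion sequences and text embeddings.
import torch
import torch.nn as nn

MOTION_DIM, TEXT_DIM = 64, 32

class MotionCaptioner(nn.Module):
    """Motion -> language (understanding direction)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(MOTION_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, TEXT_DIM))
    def forward(self, motion):
        return self.net(motion)

class MotionGenerator(nn.Module):
    """Language -> motion (generation direction)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(TEXT_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, MOTION_DIM))
    def forward(self, text):
        return self.net(text)

captioner, generator = MotionCaptioner(), MotionGenerator()
opt = torch.optim.Adam(list(captioner.parameters()) + list(generator.parameters()), lr=1e-4)

def dual_step(unpaired_motion, unpaired_text):
    """One closed-loop update using only unidirectional (unpaired) data.

    Each task scores the other: the generator evaluates the captioner's output by
    how well the original motion is reconstructed from it, and vice versa.
    """
    # Motion -> pseudo caption -> reconstructed motion
    pseudo_text = captioner(unpaired_motion)
    motion_recon_loss = nn.functional.mse_loss(generator(pseudo_text), unpaired_motion)

    # Text -> pseudo motion -> reconstructed caption
    pseudo_motion = generator(unpaired_text)
    text_recon_loss = nn.functional.mse_loss(captioner(pseudo_motion), unpaired_text)

    loss = motion_recon_loss + text_recon_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage with random stand-in batches of unpaired motion and text
print(dual_step(torch.randn(8, MOTION_DIM), torch.randn(8, TEXT_DIM)))
```

In this simplified view, neither loss term requires paired motion-language annotations: each direction bootstraps from pseudo-labels produced by its counterpart, which is the reciprocity the abstract describes.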