
Human2Humanoid: Physics-Aware Cross-Morphology Motion Retargeting for Humanoid Robots

Unsupervised human-to-humanoid motion transfer with skeleton-aware learning, morphology-invariant end-effector consistency, and deployability-oriented physical constraints.

Tianchen Huang, Feiyang Yuan, Junchi Gu, Shurui Fang, Xiaohu Zhang, Yu Wang, Wei Gao, Shiwu Zhang

Institute of Humanoid Robots, Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China


Abstract

Retargeting human motion to humanoid robots is critical for teleoperation, imitation learning and human-robot interaction. However, it remains challenging because of substantial morphological discrepancies between humans and robots, including differences in skeletal topology, limb proportions and degrees of freedom, as well as the scarcity of paired motion data. This paper presents Human2Humanoid, an unsupervised motion retargeting framework that transfers human motions to humanoid robot behaviors with high fidelity.

To bridge the domain gap under unpaired data, we adopt a CycleGAN-based architecture equipped with a skeleton-aware graph convolutional network to capture topology-dependent motion features. To address cross-domain scale mismatches, we introduce a morphology-invariant end-effector consistency loss that aligns normalized end-effector trajectories to preserve motion semantics across embodiments. To improve physical plausibility and reduce contact artifacts, we impose explicit physics-aware feasibility constraints to encourage reproduction of the contact patterns in the source motion. Experimental results show that the proposed method successfully retargets human motion to the Unitree G1 humanoid robot without paired data, and outperforms existing methods in both downstream controllability and physical feasibility.

Method

Human2Humanoid learns bidirectional mappings between unpaired human and robot motion domains while keeping the generated robot motion semantically faithful and physically trackable.

Figure: Overview of the Human2Humanoid framework, which combines skeleton-aware motion translation with morphology-invariant and physics-aware objectives.

Skeleton-aware generators

Graph convolutional layers operate on each embodiment's native kinematic topology instead of flattening poses into joint vectors.
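As an illustration of operating on the kinematic topology, the following is a minimal sketch of one graph-convolution step over a skeleton. The five-joint toy skeleton, the feature sizes, and the symmetric adjacency normalization are assumptions made for this example; the layer design used in the paper is not specified on this page.

```python
import numpy as np

def normalized_adjacency(parents):
    """Build D^-1/2 (A + I) D^-1/2 from a parent-index list (-1 marks the root)."""
    n = len(parents)
    adj = np.eye(n)  # self-loops keep each joint's own features
    for child, parent in enumerate(parents):
        if parent >= 0:
            adj[child, parent] = adj[parent, child] = 1.0
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def graph_conv(features, adj_norm, weight):
    """Aggregate features over skeleton neighbours, then apply a shared linear map and ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)

# Hypothetical toy skeleton: pelvis -> spine -> head, pelvis -> left hip, pelvis -> right hip
parents = [-1, 0, 1, 0, 0]
rng = np.random.default_rng(0)
joint_features = rng.normal(size=(5, 6))  # assumed 6 features per joint
weight = rng.normal(size=(6, 8))          # assumed 8 output channels
adj_norm = normalized_adjacency(parents)
out = graph_conv(joint_features, adj_norm, weight)
```

Because the adjacency matrix is built from each embodiment's own parent list, the same layer code can serve a human skeleton and a robot skeleton with different joint counts.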

Unpaired domain transfer

A CycleGAN-style objective learns human-to-robot and robot-to-human mappings without frame-wise motion correspondence.
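The cycle-consistency part of such an objective can be sketched as follows. The generators here are toy scalings standing in for learned networks, the L1 penalty is an assumed choice, and the adversarial terms are omitted; none of these details are taken from the paper.

```python
import numpy as np

def cycle_consistency_loss(human_batch, robot_batch, g_h2r, g_r2h):
    """L1 cycle loss over human -> robot -> human and robot -> human -> robot."""
    human_cycle = g_r2h(g_h2r(human_batch))
    robot_cycle = g_h2r(g_r2h(robot_batch))
    human_term = np.abs(human_cycle - human_batch).mean()
    robot_term = np.abs(robot_cycle - robot_batch).mean()
    return float(human_term + robot_term)

rng = np.random.default_rng(0)
# Unpaired batches: different clip counts and no frame-wise correspondence.
human_batch = rng.normal(size=(3, 10, 6))  # (clips, frames, features), assumed sizes
robot_batch = rng.normal(size=(5, 10, 6))

# Toy stand-ins for the two generators (hypothetical, not learned models).
g_h2r = lambda x: 2.0 * x
g_r2h_inverse = lambda y: 0.5 * y  # exact inverse of g_h2r
g_r2h_wrong = lambda y: 0.4 * y    # not an inverse

loss_consistent = cycle_consistency_loss(human_batch, robot_batch, g_h2r, g_r2h_inverse)
loss_inconsistent = cycle_consistency_loss(human_batch, robot_batch, g_h2r, g_r2h_wrong)
```

The loss vanishes only when the two mappings invert each other, which is what lets training proceed without paired human and robot frames.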

Morphology-invariant semantics

End-effector trajectories are normalized by an embodiment-specific scale, preserving hand and foot motion semantics across different body proportions.
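A minimal sketch of this idea, assuming root-relative end-effector positions divided by a single per-embodiment length (for example, leg length) and compared with a mean squared error. The function names, the choice of scale, and the error metric are illustrative assumptions rather than the paper's exact loss.

```python
import numpy as np

def normalized_end_effectors(ee_pos, root_pos, scale):
    """Root-relative end-effector positions divided by an embodiment-specific length."""
    return (ee_pos - root_pos[:, None, :]) / scale

def end_effector_consistency(human_ee, human_root, human_scale,
                             robot_ee, robot_root, robot_scale):
    """Mean squared distance between normalized trajectories, frame by frame."""
    h = normalized_end_effectors(human_ee, human_root, human_scale)
    r = normalized_end_effectors(robot_ee, robot_root, robot_scale)
    return float(np.mean(np.sum((h - r) ** 2, axis=-1)))

rng = np.random.default_rng(0)
T, E = 20, 4  # frames; two hands and two feet
human_scale, robot_scale = 0.9, 0.6  # hypothetical leg lengths in metres

human_root = rng.normal(size=(T, 3))
human_ee = human_root[:, None, :] + rng.normal(size=(T, E, 3))

# Toy robot motion: the same gesture shrunk to the robot's proportions.
robot_root = rng.normal(size=(T, 3))
robot_ee = robot_root[:, None, :] + (human_ee - human_root[:, None, :]) * (robot_scale / human_scale)

loss_normalized = end_effector_consistency(
    human_ee, human_root, human_scale, robot_ee, robot_root, robot_scale)
loss_raw = end_effector_consistency(
    human_ee, human_root, 1.0, robot_ee, robot_root, 1.0)
```

In this toy case the normalized loss is essentially zero even though the robot is smaller, while the unnormalized comparison penalizes the size difference itself.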

Physics-aware feasibility

Contact, foot-height, and joint-limit constraints reduce foot skating, foot floating, ground penetration, and unsafe robot configurations.
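The sketch below shows one plausible way to turn these constraints into penalty terms. It assumes a flat ground plane at z = 0, boolean contact labels taken from the source motion, and simple hinge penalties; the function name, array layouts, and exact formulas are hypothetical and may differ from the paper's formulation.

```python
import numpy as np

def physics_penalties(foot_pos, contact, joint_angles, joint_lo, joint_hi, dt):
    """Compute illustrative feasibility penalties for one clip.

    foot_pos:     (T, F, 3) foot positions, z up, ground at z = 0
    contact:      (T, F) boolean contact labels from the source motion
    joint_angles: (T, J) robot joint angles in radians
    """
    # Foot skating: horizontal speed of feet that should stay planted.
    horiz_vel = np.linalg.norm(np.diff(foot_pos[..., :2], axis=0), axis=-1) / dt
    planted = contact[1:] & contact[:-1]
    skating = float((horiz_vel * planted).sum() / max(planted.sum(), 1))
    # Floating: height above ground of feet labelled as in contact.
    height = foot_pos[..., 2]
    floating = float((np.maximum(height, 0.0) * contact).sum() / max(contact.sum(), 1))
    # Ground penetration: depth of any foot below the ground plane.
    penetration = float(np.maximum(-height, 0.0).mean())
    # Joint limits: hinge penalty outside [joint_lo, joint_hi].
    over = np.maximum(joint_angles - joint_hi, 0.0)
    under = np.maximum(joint_lo - joint_angles, 0.0)
    joint_limit = float((over + under).mean())
    return {"skating": skating, "floating": floating,
            "penetration": penetration, "joint_limit": joint_limit}

# Toy clip: the left foot stays planted; the right foot slides and dips once.
T, dt = 4, 0.1
foot_pos = np.zeros((T, 2, 3))
foot_pos[:, 1, 0] = 0.01 * np.arange(T)  # right foot slides 1 cm per frame
foot_pos[2, 1, 2] = -0.02                # right foot sinks 2 cm in one frame
contact = np.ones((T, 2), dtype=bool)
joint_angles = np.array([[0.0], [0.5], [1.2], [-0.1]])  # one joint, limits [0, 1] rad
penalties = physics_penalties(foot_pos, contact, joint_angles,
                              joint_lo=0.0, joint_hi=1.0, dt=dt)
```

In a training setup, terms like these would typically be weighted and added to the translation objective; the weights are a tuning choice not covered here.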

Results

Evaluation is conducted on human motions from Motion-X and Unitree G1 motions from PHUMA, using a fixed downstream tracking policy for fair comparison.

Human2Humanoid averages: 88.5% success rate, 0.12 tracking error, 4.7% foot skating, 0.05 cm ground penetration.

Method	Success Rate, SR (%) ↑	Tracking Error, TE ↓	Foot Skating, FS (%) ↓	Ground Penetration, GP (cm) ↓
GMR	86.9	0.14	6.8	0.12
PHC	32.7	0.22	1.4	0.11
Unitree Retarget	71.2	0.19	11.1	0.35
Human2Humanoid	88.5	0.12	4.7	0.05

Figure: Qualitative comparison between Human2Humanoid and optimization-based retargeting baselines.

BibTeX

@misc{huang2026human2humanoid,
  title  = {Human2Humanoid: Physics-Aware Cross-Morphology Motion Retargeting for Humanoid Robots},
  author = {Tianchen Huang and Feiyang Yuan and Junchi Gu and Shurui Fang and Xiaohu Zhang and Yu Wang and Wei Gao and Shiwu Zhang},
  year   = {2026},
  eprint = {2606.03476},
  archivePrefix = {arXiv},
  primaryClass = {cs.RO},
  url    = {https://arxiv.org/abs/2606.03476}
}