Guidance Parameters

For calibration math, see Calibration Math.

Overview

A Levenberg-Marquardt (LM) optimizer refines the pose sequence predicted by the diffusion model so that it is physically plausible and consistent with sensor readings. The optimizer runs in two stages: between denoising steps (inner, 5 iterations) and after sampling is complete (post, 20 iterations).
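The two-stage schedule can be sketched as a loop. This is a minimal illustration, not the project's code: `guided_sampling`, `denoise_step`, and `lm_refine` are hypothetical names, and the sketch assumes the inner pass runs in every gap between consecutive denoising steps, which the text above does not spell out.

```python
from typing import Callable, List

Pose = List[float]  # stand-in for a pose sequence


def guided_sampling(
    x: Pose,
    num_denoise_steps: int,
    denoise_step: Callable[[Pose, int], Pose],  # hypothetical: one diffusion step
    lm_refine: Callable[[Pose, int], Pose],     # hypothetical: LM run with N iterations
    inner_iters: int = 5,
    post_iters: int = 20,
) -> Pose:
    for step in range(num_denoise_steps):
        x = denoise_step(x, step)
        # Inner guidance: a short LM run between consecutive denoising steps.
        if step < num_denoise_steps - 1:
            x = lm_refine(x, inner_iters)
    # Post guidance: a longer LM run once sampling has finished.
    return lm_refine(x, post_iters)
```

Passing stub callables that record the iteration count is enough to check the order in which the two stages fire.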

Shared Constraints (All Modes)

These constraints are applied in every guidance mode, including pure egoallo:

Constraint	Weight	What it does
Pose prior	1.0	Keeps joint rotations close to the diffusion output
Torso prior	5.0	Keeps torso joint positions close to the diffusion output
Delta smoothness	10.0	Prevents large frame-to-frame changes in the optimizer correction
Velocity smoothness	5.0	Penalizes acceleration in joint rotations (reduces jitter)
Foot skating	30.0	Prevents foot joints from sliding when contact is predicted
Lower-body joint limits	50.0	Soft penalty for biomechanically implausible knee/ankle rotations (see Lower-Body Joint Angle Limits)

IMU Constraints (RoSHI Modes)

These are added in roshi and roshi_ariahand modes:

Constraint	Weight	What it does
Local joint matching	5.0	Each IMU-equipped joint rotation should match its IMU reading
Pelvis rotation matching	5.0	Frame-to-frame pelvis rotation change should match the pelvis IMU
Body prior	0.1	Extra prior to keep IMU-guided poses near the diffusion output
Body smoothness	10.0	Extra temporal smoothing for body joints
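
As a rough illustration of how the two tables combine, the sketch below collects the weights into dictionaries and selects the active set per guidance mode. The dictionary keys, `active_weights`, and `total_cost` are hypothetical names. The cost form (weight times summed squared residual) is an assumption modeled on the joint-limit loss on this page, and any mode-specific terms not listed in the tables are not represented.

```python
SHARED_WEIGHTS = {
    "pose_prior": 1.0,
    "torso_prior": 5.0,
    "delta_smoothness": 10.0,
    "velocity_smoothness": 5.0,
    "foot_skating": 30.0,
    "lower_body_joint_limits": 50.0,
}

IMU_WEIGHTS = {
    "local_joint_matching": 5.0,
    "pelvis_rotation_matching": 5.0,
    "body_prior": 0.1,
    "body_smoothness": 10.0,
}

IMU_MODES = {"roshi", "roshi_ariahand"}


def active_weights(mode: str) -> dict:
    """Shared weights always apply; IMU weights are added in RoSHI modes."""
    weights = dict(SHARED_WEIGHTS)
    if mode in IMU_MODES:
        weights.update(IMU_WEIGHTS)
    return weights


def total_cost(mode: str, squared_residuals: dict) -> float:
    """Weighted sum of per-term squared residuals (assumed cost form).

    Terms that are inactive in `mode` are ignored even if provided.
    """
    weights = active_weights(mode)
    return sum(w * squared_residuals.get(name, 0.0) for name, w in weights.items())
```

For example, `total_cost("egoallo", {"foot_skating": 2.0, "local_joint_matching": 100.0})` counts only the foot-skating term, because the IMU terms are inactive in that mode.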

Local joint matching minimizes the geodesic distance between predicted and IMU-derived parent-relative rotations:

\[\mathcal{L}_{\text{local}} = \sum_{j \in \mathcal{J}_{\text{IMU}}} \left\lVert \log\!\left( \hat{R}_{j}^{\top} \cdot R_{j}^{\text{IMU}} \right) \right\rVert^{2}\]
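
The local-matching term can be evaluated without an explicit log map, because the norm of the log of a rotation is its rotation angle. The following dependency-free sketch assumes rotations are stored as 3x3 nested lists; the function names and the joint keys `"j0"` and `"j1"` are hypothetical, and a real optimizer would work with vector residuals rather than this scalar sum.

```python
import math
from typing import Dict, List

Matrix = List[List[float]]  # 3x3 rotation matrix as nested lists


def rot_x(angle: float) -> Matrix:
    """Rotation by `angle` radians about the X axis (used for the demo only)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def geodesic_sq(r_pred: Matrix, r_imu: Matrix) -> float:
    """Squared geodesic distance ||log(r_pred^T r_imu)||^2 in rad^2."""
    # trace(A^T B) equals the elementwise (Frobenius) inner product of A and B.
    trace = sum(r_pred[i][j] * r_imu[i][j] for i in range(3) for j in range(3))
    # For a rotation by angle theta, trace = 1 + 2 cos(theta); clamp for round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_theta) ** 2


def local_matching_loss(pred: Dict[str, Matrix], imu: Dict[str, Matrix]) -> float:
    """Sum the squared geodesic distance over the joints that carry an IMU."""
    return sum(geodesic_sq(pred[j], r_imu) for j, r_imu in imu.items())


# Two hypothetical IMU-equipped joints, off by 0.2 rad and 0.1 rad respectively.
pred = {"j0": rot_x(0.3), "j1": rot_x(0.0)}
imu = {"j0": rot_x(0.5), "j1": rot_x(0.1)}
loss = local_matching_loss(pred, imu)  # 0.2**2 + 0.1**2 rad^2
```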

Pelvis rotation matching:

\[\mathcal{L}_{\text{pelvis}} = \sum_{t} \left\lVert \log\!\left( \Delta \hat{R}_{\text{pelvis}}^{\top}(t) \cdot \Delta R_{\text{pelvis}}^{\text{IMU}}(t) \right) \right\rVert^{2}\]

Body smoothness penalizes large frame-to-frame rotation changes:

\[\mathcal{L}_{\text{smooth}} = \sum_{t} \sum_{j} \left\lVert \log\!\left( \hat{R}_{j}(t)^{\top} \cdot \hat{R}_{j}(t{+}1) \right) \right\rVert^{2}\]
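
The two temporal terms above can be sketched the same way. The page does not state the convention for the frame-to-frame change, so this sketch assumes \(\Delta R(t) = R(t)^{\top} R(t{+}1)\). All function names are hypothetical, and rotations are plain 3x3 nested lists.

```python
import math
from typing import List

Matrix = List[List[float]]  # 3x3 rotation matrix as nested lists


def rot_z(angle: float) -> Matrix:
    """Rotation by `angle` radians about the Z axis (used for the demo only)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transpose_mul(a: Matrix, b: Matrix) -> Matrix:
    """Return a^T b."""
    return [[sum(a[k][i] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def angle_sq(r: Matrix) -> float:
    """||log(r)||^2, i.e. the squared rotation angle of r in rad^2."""
    trace = r[0][0] + r[1][1] + r[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_theta) ** 2


def frame_deltas(seq: List[Matrix]) -> List[Matrix]:
    """Assumed convention: delta(t) = R(t)^T R(t+1)."""
    return [transpose_mul(seq[t], seq[t + 1]) for t in range(len(seq) - 1)]


def pelvis_delta_loss(pred_seq: List[Matrix], imu_seq: List[Matrix]) -> float:
    """Compare predicted and IMU frame-to-frame pelvis rotation changes."""
    pairs = zip(frame_deltas(pred_seq), frame_deltas(imu_seq))
    return sum(angle_sq(transpose_mul(dp, di)) for dp, di in pairs)


def smoothness_loss(joint_seqs: List[List[Matrix]]) -> float:
    """Sum squared frame-to-frame rotation angles over all joints and frames."""
    return sum(angle_sq(d) for seq in joint_seqs for d in frame_deltas(seq))


# The IMU sequence has a constant 1 rad heading offset; only the deltas are compared.
pred_pelvis = [rot_z(a) for a in (0.0, 0.1, 0.3)]
imu_pelvis = [rot_z(a) for a in (1.0, 1.1, 1.2)]
pelvis_loss = pelvis_delta_loss(pred_pelvis, imu_pelvis)
smooth_loss = smoothness_loss([pred_pelvis])
```

Because only rotation changes are compared, a constant offset between the predicted and IMU pelvis orientation does not contribute to the pelvis term in this sketch.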

Lower-Body Joint Angle Limits

The diffusion prior occasionally produces biomechanically impossible knee or ankle poses (hyperextended knees, severely twisted ankles). Two complementary mechanisms keep the lower body realistic:

A soft penalty inside the LM optimizer (used in every guidance mode), which pushes the solution away from invalid regions while still trading off against the IMU and prior terms.
A hard clamp applied once at inference time after the final pose has been recovered (see Step 4 in How to Postprocess the Data), which guarantees the saved body_quats never exceed the limits.

Joints and limits

Limits are applied per axis on the log (axis-angle) of each joint’s parent-relative rotation. Indices follow the SMPL-H 21-joint body convention (root excluded):

Joint	Index	X (flex/ext)	Y (abd/add)	Z (int/ext rot)
Left / Right knee	3, 4	no hyperextension (X ≥ 0)	±5°	±10°
Left / Right ankle	6, 7	—	±25° (inv/eversion)	±20°
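
The hard clamp can be illustrated directly on axis-angle components using the limits in the table. This is a sketch under assumptions rather than the pipeline's implementation: `LIMITS` and `clamp_joint` are hypothetical names, and the conversion between the saved body_quats and axis-angle vectors is omitted.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

# (lower, upper) bound in radians for one axis; None means the axis is unconstrained.
Bound = Optional[Tuple[float, float]]

deg = math.radians

KNEE: Tuple[Bound, Bound, Bound] = (
    (0.0, math.inf),        # X: no hyperextension (X >= 0)
    (-deg(5), deg(5)),      # Y: abduction/adduction
    (-deg(10), deg(10)),    # Z: internal/external rotation
)
ANKLE: Tuple[Bound, Bound, Bound] = (
    None,                   # X: no limit listed in the table
    (-deg(25), deg(25)),    # Y: inversion/eversion
    (-deg(20), deg(20)),    # Z: internal/external rotation
)

# Indices follow the table above (SMPL-H body joints, root excluded).
LIMITS: Dict[int, Tuple[Bound, Bound, Bound]] = {3: KNEE, 4: KNEE, 6: ANKLE, 7: ANKLE}


def clamp_joint(index: int, omega: Sequence[float]) -> List[float]:
    """Clamp one joint's axis-angle vector (radians) to its per-axis limits."""
    bounds = LIMITS.get(index)
    if bounds is None:
        return list(omega)  # joint has no limits: return unchanged
    clamped = []
    for value, bound in zip(omega, bounds):
        if bound is None:
            clamped.append(value)
        else:
            lower, upper = bound
            clamped.append(min(max(value, lower), upper))
    return clamped
```

For example, `clamp_joint(3, [-0.2, 0.0, 0.5])` removes the knee hyperextension on X and caps Z at 10°, while joints without an entry in `LIMITS` pass through untouched.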

Soft penalty (optimizer)

For each constrained joint \(j\) and axis \(a\) with limit \(\theta^{\max}_{j,a}\), the optimizer adds a one-sided hinge residual on the log-map components \(\boldsymbol{\omega}_{j} = \log \hat{R}_{j}\):

\[\mathcal{L}_{\text{limit}} = w_{\text{limit}} \sum_{t} \Big[ \sum_{j \in \{\text{knee}\}} \max(-\omega_{j,x}(t),\, 0)^{2} + \sum_{j,\,a \in \{y,z\}} \max\!\big(|\omega_{j,a}(t)| - \theta^{\max}_{j,a},\, 0\big)^{2} \Big]\]

with \(w_{\text{limit}} = 50.0\). The knee X term penalizes only negative values (hyperextension); the abduction and rotation axes are penalized symmetrically once the limit is exceeded, so motions that stay within the box incur zero cost.
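
A small numeric sketch of this hinge loss is given below, operating on log-map components \(\boldsymbol{\omega}\) in radians. The function names are hypothetical, the limits are taken from the joint table, and the inputs are made-up values chosen for illustration only.

```python
import math
from typing import Iterable, Sequence

W_LIMIT = 50.0
KNEE_YZ = (math.radians(5), math.radians(10))    # max |Y|, max |Z| for knees
ANKLE_YZ = (math.radians(25), math.radians(20))  # max |Y|, max |Z| for ankles


def hinge_sq(x: float) -> float:
    """max(x, 0)^2: zero inside the allowed region, quadratic outside."""
    return max(x, 0.0) ** 2


def knee_penalty(omega: Sequence[float]) -> float:
    x, y, z = omega
    # One-sided on X (only hyperextension, x < 0), symmetric on Y and Z.
    return (hinge_sq(-x)
            + hinge_sq(abs(y) - KNEE_YZ[0])
            + hinge_sq(abs(z) - KNEE_YZ[1]))


def ankle_penalty(omega: Sequence[float]) -> float:
    _, y, z = omega  # ankle X is unconstrained
    return hinge_sq(abs(y) - ANKLE_YZ[0]) + hinge_sq(abs(z) - ANKLE_YZ[1])


def limit_loss(knee_frames: Iterable[Sequence[float]],
               ankle_frames: Iterable[Sequence[float]]) -> float:
    """Weighted sum of hinge penalties over all frames of knee and ankle joints."""
    total = sum(knee_penalty(w) for w in knee_frames)
    total += sum(ankle_penalty(w) for w in ankle_frames)
    return W_LIMIT * total


# A knee hyperextended by 0.1 rad and an ankle that stays inside its limits.
example = limit_loss([(-0.1, 0.0, 0.0)], [(0.5, 0.1, 0.1)])  # 50 * 0.1**2
```

Only the hyperextended knee contributes here; the ankle sits inside its box and adds zero cost, matching the description above.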