Calibration Math

This page summarizes the rotation conventions used by the RoSHI calibration pipeline and the equations used to solve both sensor-to-bone calibration offsets and per-IMU world alignment.

Coordinate Frames

| Frame | Description |
|---|---|
| \(C\) | Camera frame using the OpenCV convention, with \(x\) right, \(y\) down, and \(z\) forward |
| \(B\) | Bone frame from the SMPL-X joint |
| \(T\) | AprilTag frame rigidly attached to the IMU |
| \(S\) | IMU sensor frame from the fused quaternion output |
| \(W_i\) | Per-IMU world frame, gravity aligned with arbitrary heading |
| \(W_p\) | Shared world frame using the pelvis IMU as reference |

Offset Calibration

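The AprilTag is rigidly attached to the body segment, so the tag-to-bone rotation \({}^{B}R_{T}\) is constant over time. At time \(t\), the camera observations of the tag and the bone are related by: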

\[{}^{C}R_{T}(t) = {}^{C}R_{B}(t) \cdot {}^{B}R_{T}\]

Rearranging gives a frame-wise estimate of the constant offset:

\[{}^{B}R_{T}(t) = \left({}^{C}R_{B}(t)\right)^{\top} \cdot {}^{C}R_{T}(t)\]
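The per-frame estimate can be sketched in a few lines of NumPy. This is a minimal illustration rather than the RoSHI implementation: the function name `per_frame_offsets`, the helpers `rot_x` and `rot_z`, and the stacked `(N, 3, 3)` array layout are assumptions made for the example.

```python
import numpy as np

def rot_x(angle):
    """Rotation about the x axis (helper for the synthetic check)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(angle):
    """Rotation about the z axis (helper for the synthetic check)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def per_frame_offsets(R_CB, R_CT):
    """Return B_R_T(t) = C_R_B(t)^T @ C_R_T(t) for stacked (N, 3, 3) inputs."""
    return np.swapaxes(R_CB, 1, 2) @ R_CT

# Synthetic check: a fixed offset should be recovered in every frame.
R_BT_true = rot_x(0.3)                                  # hypothetical true offset
R_CB = np.stack([rot_z(a) for a in (0.0, 0.5, 1.0)])    # bone seen from the camera
R_CT = R_CB @ R_BT_true                                 # tag seen from the camera
offsets = per_frame_offsets(R_CB, R_CT)                 # shape (3, 3, 3)
```

With noise-free inputs every slice of `offsets` equals `R_BT_true`; with real detections the slices scatter around the true offset, which is why an averaging step is needed.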

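RoSHI then computes the final calibration by minimizing a loss \(\rho\) of the geodesic distance to the per-frame estimates, summed over all \(N\) valid frames, where \(\rho\) depends on the selected optimization method: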

\[\widehat{{}^{B}R_{T}} = \arg\min_{R \in SO(3)} \sum_{t=1}^{N} \rho\left(d_g\left(R, {}^{B}R_{T}(t)\right)\right)\]

where the geodesic distance is:

\[d_g(R_1, R_2) = \left\lVert \log\left(R_1^{\top}R_2\right) \right\rVert\]
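As a rough sketch of the L2 case, the geodesic distance and an iterative Karcher mean can be written with NumPy alone. The names `exp_so3`, `log_so3`, `geodesic_distance`, and `karcher_mean` are illustrative and not taken from the RoSHI code. The sketch measures \(\log(R_1^{\top}R_2)\) as a rotation vector, so the distance is the rotation angle in radians, and it assumes all angles stay well below \(\pi\).

```python
import numpy as np

def exp_so3(w):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def log_so3(R):
    """Rotation matrix -> rotation vector (angles near pi are not handled)."""
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    s = np.linalg.norm(v)                      # equals 2 * sin(theta)
    if s < 1e-12:
        return np.zeros(3)
    theta = np.arctan2(s / 2.0, (np.trace(R) - 1.0) / 2.0)
    return theta * v / s

def geodesic_distance(R1, R2):
    """Rotation angle (radians) of R1^T @ R2."""
    return np.linalg.norm(log_so3(R1.T @ R2))

def karcher_mean(Rs, iters=50, tol=1e-10):
    """Minimize the sum of squared geodesic distances by tangent-space averaging."""
    R = Rs[0].copy()
    for _ in range(iters):
        delta = np.mean([log_so3(R.T @ Ri) for Ri in Rs], axis=0)
        R = R @ exp_so3(delta)
        if np.linalg.norm(delta) < tol:
            break
    return R

# Synthetic check: symmetric perturbations around a known rotation.
R_true = exp_so3(np.array([0.2, -0.1, 0.4]))
steps = [np.array([0.05, 0.0, 0.0]), np.array([0.0, 0.03, 0.0]), np.array([0.0, 0.0, 0.04])]
samples = [R_true @ exp_so3(sign * p) for p in steps for sign in (1.0, -1.0)]
R_hat = karcher_mean(samples)
error = geodesic_distance(R_hat, R_true)
```

Because the perturbations cancel in the tangent space at `R_true`, the mean should land on `R_true` up to numerical precision.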

Supported Optimization Methods

| Method | Loss | Typical use |
|---|---|---|
| karcher (default) | Geodesic L2 | Clean data with approximately Gaussian noise |
| huber | L2 near zero, L1 for outliers | Moderate outliers |
| cauchy | Cauchy / Lorentzian | Heavy outliers |
| l1 | Geodesic L1 | Median-like robustness |
| ransac | RANSAC plus Karcher refinement | Severely corrupted samples |

Note

In our tests, the solvers did not differ significantly in performance. karcher tended to work slightly better overall; ransac can potentially be stronger but typically benefited from more tuning.
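To illustrate how a robust loss changes the estimate, the sketch below wraps the Karcher update in iteratively reweighted least squares (IRLS) with Huber weights. This is one common way to realize a Huber loss on rotations and is an assumption here, not a description of how RoSHI implements `huber`. The function name `huber_irls_mean`, the threshold `k` (radians), and the synthetic data are all hypothetical.

```python
import numpy as np

def exp_so3(w):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def log_so3(R):
    """Rotation matrix -> rotation vector (angles near pi are not handled)."""
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    s = np.linalg.norm(v)
    if s < 1e-12:
        return np.zeros(3)
    return np.arctan2(s / 2.0, (np.trace(R) - 1.0) / 2.0) * v / s

def huber_irls_mean(Rs, k=0.1, iters=100, tol=1e-10):
    """Weighted Karcher iterations; k=np.inf reduces to the plain L2 mean."""
    R = Rs[0].copy()
    for _ in range(iters):
        logs = np.array([log_so3(R.T @ Ri) for Ri in Rs])
        d = np.linalg.norm(logs, axis=1)
        # Huber weights: 1 inside the threshold, k / d outside it.
        w = np.array([1.0 if di <= k else k / di for di in d])
        delta = (w[:, None] * logs).sum(axis=0) / w.sum()
        R = R @ exp_so3(delta)
        if np.linalg.norm(delta) < tol:
            break
    return R

def angle_between(R1, R2):
    return np.linalg.norm(log_so3(R1.T @ R2))

# Synthetic data: 8 inliers near the truth, 2 outliers 1.5 rad away.
R_true = exp_so3(np.array([0.2, -0.1, 0.4]))
steps = [np.array([0.02, 0.0, 0.0]), np.array([0.0, 0.02, 0.0])]
inliers = [R_true @ exp_so3(sign * p) for p in steps for sign in (1.0, -1.0)] * 2
outliers = [R_true @ exp_so3(np.array([0.0, 0.0, 1.5]))] * 2
samples = inliers + outliers

err_plain = angle_between(huber_irls_mean(samples, k=np.inf), R_true)
err_huber = angle_between(huber_irls_mean(samples, k=0.1), R_true)
```

On this toy data the plain mean is pulled toward the outliers, while the Huber-weighted estimate stays much closer to `R_true`. Real calibration data may show a far smaller gap, consistent with the note above.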

IMU-Only Pose Reconstruction

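Given the calibrated sensor-to-bone offset \({}^{B}R_{S} = \widehat{{}^{B}R_{T}} \cdot {}^{T}R_{S}\) (the tag-to-bone calibration composed with the fixed tag-to-IMU axis mapping) and a live IMU reading \({}^{W_i}R_{S}(t)\), the corresponding bone orientation is: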

\[{}^{W_i}R_{B}(t) = {}^{W_i}R_{S}(t) \cdot \left({}^{B}R_{S}\right)^{\top}\]
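A minimal sketch of this step, assuming the IMU quaternion has already been converted to a 3x3 rotation matrix. The function name `bone_in_imu_world` and the helper rotations are illustrative only.

```python
import numpy as np

def rot_x(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def bone_in_imu_world(R_WS, R_BS):
    """W_i_R_B(t) = W_i_R_S(t) @ (B_R_S)^T for a single 3x3 IMU reading."""
    return R_WS @ R_BS.T

# Round-trip check with hypothetical values.
R_WB_true = rot_z(0.7)          # bone orientation in the IMU world
R_BS = rot_x(0.3)               # calibrated sensor-to-bone offset
R_WS = R_WB_true @ R_BS         # what the IMU would report
R_WB = bone_in_imu_world(R_WS, R_BS)
```

The round trip recovers `R_WB_true` because the offset cancels: \({}^{W_i}R_{B} \cdot {}^{B}R_{S} \cdot ({}^{B}R_{S})^{\top} = {}^{W_i}R_{B}\).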

World-frame alignment uses AprilTag observations to estimate \({}^{W_p}R_{W_i}\) so the individual IMU headings can be expressed in a shared pelvis-centric frame.

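For each joint IMU \(i\), RoSHI first estimates the camera-to-IMU-world rotation from the observed tag orientation \({}^{C}R_{T_i}(t)\) and the tag orientation in the IMU world, \({}^{W_i}R_{T_i}(t) = {}^{W_i}R_{S}(t) \cdot \left({}^{T}R_{S}\right)^{\top}\):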

\[{}^{C}R_{W_i}(t) = {}^{C}R_{T_i}(t) \cdot \left({}^{W_i}R_{T_i}(t)\right)^{\top}\]

These per-frame estimates are averaged over time, and the pelvis world is used as the shared reference:

\[{}^{W_p}R_{W_i} = \left(\overline{{}^{C}R_{W_p}}\right)^{\top} \cdot \overline{{}^{C}R_{W_i}}\]

The aligned bone orientation in the pelvis world then becomes:

\[{}^{W_p}R_{B}(t) = {}^{W_p}R_{W_i} \cdot {}^{W_i}R_{B}(t)\]
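The alignment steps above can be sketched as follows. The page does not say which rotation average is used for the overline, so this example assumes a chordal mean (arithmetic mean projected back onto \(SO(3)\) via SVD). The function names, the helper rotations, and the synthetic data are hypothetical.

```python
import numpy as np

def rot_x(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def chordal_mean(Rs):
    """Project the arithmetic mean of (N, 3, 3) rotations back onto SO(3)."""
    U, _, Vt = np.linalg.svd(np.mean(Rs, axis=0))
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])   # guard against reflections
    return U @ D @ Vt

def camera_to_imu_world(R_CT, R_WT):
    """Per-frame C_R_Wi(t) = C_R_Ti(t) @ W_i_R_Ti(t)^T for (N, 3, 3) inputs."""
    return R_CT @ np.swapaxes(R_WT, 1, 2)

def pelvis_alignment(R_CWp_frames, R_CWi_frames):
    """W_p_R_Wi = mean(C_R_Wp)^T @ mean(C_R_Wi)."""
    return chordal_mean(R_CWp_frames).T @ chordal_mean(R_CWi_frames)

# Synthetic setup: the two IMU worlds differ by a 0.9 rad heading offset.
R_CWp_true = rot_x(0.4)
R_CWi_true = R_CWp_true @ rot_z(0.9)
angles = (0.1, 0.6, 1.1)
R_WT_p = np.stack([rot_z(a) @ rot_x(a) for a in angles])    # pelvis tag in W_p
R_WT_i = np.stack([rot_x(a) @ rot_z(-a) for a in angles])   # joint tag in W_i
R_CT_p = R_CWp_true @ R_WT_p                                # camera observations
R_CT_i = R_CWi_true @ R_WT_i

R_WpWi = pelvis_alignment(camera_to_imu_world(R_CT_p, R_WT_p),
                          camera_to_imu_world(R_CT_i, R_WT_i))
R_WiB = rot_x(0.2)                       # hypothetical bone orientation in W_i
R_WpB = R_WpWi @ R_WiB                   # bone orientation in the pelvis world
```

In this noise-free setup `R_WpWi` equals the 0.9 rad heading rotation exactly. With real detections the per-frame estimates vary, and the choice of averaging method starts to matter.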

Tag-to-IMU Axis Mapping

The fixed rotation relating the AprilTag frame \(T\) to the IMU sensor frame \(S\) is:

\[\begin{split}{}^{T}R_{S} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}\end{split}\]

Note

This mapping follows from the axis conventions of the BNO085 IMU frame and the AprilTag frame.
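A quick sanity check of the matrix, plus a sketch of how it would convert a tag-based offset into a sensor-based one. The composition \({}^{B}R_{S} = {}^{B}R_{T} \cdot {}^{T}R_{S}\) follows from the frame notation used on this page. The function name `tag_offset_to_sensor_offset` is illustrative, and reading the columns as IMU axes expressed in tag coordinates assumes the same \({}^{A}R_{B}\) convention as the equations above.

```python
import numpy as np

# Tag-to-IMU axis mapping from this page.
T_R_S = np.array([[0.0, -1.0, 0.0],
                  [-1.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0]])

# A valid rotation is orthonormal with determinant +1.
is_orthonormal = np.allclose(T_R_S @ T_R_S.T, np.eye(3))
is_proper = np.isclose(np.linalg.det(T_R_S), 1.0)

# Under the assumed convention, column j is IMU axis j in tag coordinates.
imu_x_in_tag = T_R_S[:, 0]   # IMU x along tag -y
imu_y_in_tag = T_R_S[:, 1]   # IMU y along tag -x
imu_z_in_tag = T_R_S[:, 2]   # IMU z along tag -z

def tag_offset_to_sensor_offset(R_BT):
    """B_R_S = B_R_T @ T_R_S: chain the calibrated tag offset with the axis mapping."""
    return R_BT @ T_R_S

R_BS = tag_offset_to_sensor_offset(np.eye(3))   # identity tag offset as a toy input
```

The matrix is symmetric and orthonormal, so it is its own inverse. It represents a 180 degree rotation, and applying the mapping twice returns the original axes.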