Example 3: Homie — Mixed Motion and Disturbances (Unitree H1)#
Homie is a more “composite” task that mixes velocity tracking, squatting (height control), and upper-body random disturbances.
Two task ids are provided:
Mjlab-Homie-Unitree-H1: the default version.
Mjlab-Homie-Unitree-H1-with_hands: mounts Robotiq 2F85 grippers (and adds policy-free random gripper motion).
The core idea is to restrict the policy's action space to the lower body and to treat the upper body (and optional grippers) as smooth, time-varying disturbances. This trains the policy to keep leg locomotion robust under changing body poses.
Task skeleton: make_homie_env_cfg (base cfg)#
Path: src/mjlab/tasks/homie/homie_env_cfg.py
Homie provides two command generators, both supporting env-group gating:
twist (UniformVelocityCommand): target base linear (x/y) and yaw velocities.
height (RelativeHeightCommand): target pelvis height relative to the feet (squat motion).
# file: src/mjlab/tasks/homie/homie_env_cfg.py
commands = {
    "twist": UniformVelocityCommandCfg(
        ...,
        active_env_group="velocity",
        rel_standing_envs=1.0 / 6.0,
        avoid_consecutive_standing=True,
    ),
    "height": RelativeHeightCommandCfg(
        entity_name="robot",
        active_env_group="squat",
        # Smooth height-command transitions (avoid step changes at resampling).
        interp_rate=0.02,
        foot_site_names=(),  # filled by robot override
        ranges=RelativeHeightCommandCfg.Ranges(height=(0.6, 1.0)),
    ),
}
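The interp_rate setting suggests that the commanded height approaches a newly sampled target gradually instead of jumping to it. The following is a minimal sketch of that idea, assuming a first-order update applied once per step; the actual update rule inside RelativeHeightCommand is not shown in this section, and smooth_height is a hypothetical helper used only for illustration.

```python
def smooth_height(current: float, target: float, interp_rate: float = 0.02) -> float:
    """Move `current` a fixed fraction of the way toward `target` (assumed rule)."""
    return current + interp_rate * (target - current)


# Start at a commanded height of 1.0, then resample a squat target of 0.6.
height = 1.0
trajectory = []
for _ in range(200):
    height = smooth_height(height, target=0.6)
    trajectory.append(height)

# The first step changes the command by only 0.02 * 0.4 = 0.008,
# and the command then approaches the target monotonically.
```

Under this assumption, a smaller interp_rate gives slower, smoother squat transitions, while interp_rate=1.0 would reproduce a step change at resampling.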
Env grouping: train three “subtasks” in one vectorized env#
Path: src/mjlab/tasks/homie/mdp/curriculums.py::assign_homie_env_groups
Homie partitions the vectorized env into three env groups (masks), and uses the group names in commands / rewards / curriculum for gating:
squat: ~20% (set_x < 1/5), focuses on height commands (squatting).
standing: ~13.3% (1/5 <= set_x <= 1/3), focuses on “stand still” stability under disturbances.
velocity: ~66.7% (set_x > 1/3), focuses on velocity tracking (walking/running).
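The thresholds on set_x can be sketched as boolean masks over the env dimension. This is an illustration only: it assumes set_x is a per-env scalar in [0, 1), taken here to be the normalized env index. How assign_homie_env_groups actually derives set_x is not shown in this section, and assign_groups is a hypothetical name.

```python
def assign_groups(num_envs: int) -> dict[str, list[bool]]:
    """Partition envs into squat / standing / velocity masks (illustrative)."""
    # Assumption: set_x is the env index normalized to [0, 1).
    set_x = [i / num_envs for i in range(num_envs)]
    return {
        "squat": [x < 1 / 5 for x in set_x],
        "standing": [1 / 5 <= x <= 1 / 3 for x in set_x],
        "velocity": [x > 1 / 3 for x in set_x],
    }


groups = assign_groups(3000)
counts = {name: sum(mask) for name, mask in groups.items()}
# Every env belongs to exactly one group.
exclusive = all(
    groups["squat"][i] + groups["standing"][i] + groups["velocity"][i] == 1
    for i in range(3000)
)
```

With 3000 envs this yields roughly a 20% / 13.3% / 66.7% split, matching the proportions listed for the three groups.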
Two key takeaways:
Command gating:
twist has active_env_group="velocity": non-velocity envs are forced to twist=0 (standing).
height has active_env_group="squat": non-squat envs are set to inactive_height (filled by robot override), avoiding height commands “confusing” walking envs.
Reward gating: many reward terms specify env_group=... to only activate on some groups (e.g., standing stabilization terms, squat-only geometric constraints).
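Both kinds of gating amount to applying a per-env mask. The sketch below shows the pattern in plain Python. gate_commands and gate_reward are hypothetical helpers, not mjlab functions, and the assumption that an inactive reward term contributes exactly zero is mine.

```python
Twist = tuple[float, float, float]  # (lin_x, lin_y, yaw_rate)


def gate_commands(
    twist: list[Twist],
    height: list[float],
    is_velocity: list[bool],
    is_squat: list[bool],
    inactive_height: float,
) -> tuple[list[Twist], list[float]]:
    """Zero the twist outside velocity envs; pin height outside squat envs."""
    gated_twist = [t if on else (0.0, 0.0, 0.0) for t, on in zip(twist, is_velocity)]
    gated_height = [h if on else inactive_height for h, on in zip(height, is_squat)]
    return gated_twist, gated_height


def gate_reward(raw: list[float], mask: list[bool]) -> list[float]:
    """Assumed behavior: a gated reward term is zero outside its env group."""
    return [r if on else 0.0 for r, on in zip(raw, mask)]


# Three envs, one per group: squat, standing, velocity.
is_squat = [True, False, False]
is_velocity = [False, False, True]
twist = [(0.5, 0.0, 0.1), (0.3, 0.1, 0.0), (1.0, 0.0, -0.2)]
height = [0.5, 0.7, 0.6]

gated_twist, gated_height = gate_commands(
    twist, height, is_velocity, is_squat, inactive_height=0.98
)
squat_only_reward = gate_reward([1.0, 1.0, 1.0], is_squat)
```

Only the velocity env keeps its sampled twist, only the squat env keeps its sampled height, and the squat-only reward term is non-zero only in the squat env.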
H1 override: unitree_h1_homie_env_cfg#
Path: src/mjlab/tasks/homie/config/h1/env_cfgs.py::unitree_h1_homie_env_cfg
Homie still follows the base cfg + robot-specific override pattern. The H1 override mainly:
Switches to plane terrain and disables the terrain curriculum (removes terrain_levels).
Splits actions: the policy controls the legs (hip/knee/ankle); upper-body motion is generated by a policy-free action (next section).
Binds commands to H1 foot geometry: fills foot_site_names for the height command, and sets squat/standing ranges + inactive_height:
height_cmd.ranges.height = (0.4, 0.98)
height_cmd.inactive_height = 0.98 (keep a stable standing height outside squat envs)
Adds sensors and contact penalties: adds self_collision and hip_knee_ground_contact sensors, and wires the hip_knee_contact reward term.
Configures feet “parallel” rewards: fills H1 foot corner sites (*_foot_fi/fo/ri/ro) for feet_ground_parallel / feet_parallel and reorders right-foot sites to match left/right local frames.
Adds disturbances/randomization: step-scheduled external pushes, and a reset-time constant downward hand load (0–5 kg equivalent, hand_load).
Optional with_hands version: when hands=True, mounts 2F85 and adds a policy-free gripper action with interval resampling (see below).
Core feature: UpperBodyPoseAction (policy-free, 0-dim action)#
Path: src/mjlab/tasks/homie/config/h1/env_cfgs.py
Besides the policy-controlled joint_pos action, H1 Homie adds upper_body_pose (policy action dim = 0):
0 policy dims: does not increase the neural network output size.
Smooth interpolation: maintains an internal pose target and moves toward it via torch.lerp each step.
Periodic resampling: an EventTermCfg periodically samples a new goal pose (default: every 2 seconds).
Optional rate limiting: max_speed_rad_s clamps per-step target changes to avoid overly abrupt motion.
# Upper-body action config (policy-free)
cfg.actions["upper_body_pose"] = UpperBodyPoseActionCfg(
    entity_name="robot",
    joint_names=upper_body_joint_expr,
    interp_rate=0.05,
    max_speed_rad_s=1.0,
    target_range=(-0.6, 0.6),
    initial_ratio=0.0,  # training starts at 0; play mode uses 1.0
    use_sampled_ratio=True,
)
# Interval event: resample goals (range is larger but clamped by joint limits + ratio)
cfg.events["upper_body_random_targets"] = EventTermCfg(
    func=_sample_upper_body_targets_with_curriculum,
    mode="interval",
    interval_range_s=(2.0, 2.0),
    params={
        "action_name": "upper_body_pose",
        "target_range": (-3.0, 1.0),
        "start_step": step_threshold,
    },
)
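The interaction between interp_rate and max_speed_rad_s can be illustrated with a small, dependency-free sketch. It assumes the per-step change is first computed as a lerp toward the goal and then clamped to max_speed_rad_s * dt. The control step dt=0.02 s, the order of the two operations, and the helper name step_pose are assumptions rather than details confirmed by the source. Scaling by the curriculum ratio and clamping to joint limits are left out.

```python
def step_pose(
    current: list[float],
    goal: list[float],
    interp_rate: float = 0.05,
    max_speed_rad_s: float = 1.0,
    dt: float = 0.02,  # assumed control step, in seconds
) -> list[float]:
    """One smoothing step per joint: lerp toward the goal, then rate-limit."""
    max_delta = max_speed_rad_s * dt
    new_pose = []
    for q, g in zip(current, goal):
        # Same change that lerp(q, g, interp_rate) would apply to q.
        delta = interp_rate * (g - q)
        # Rate limit: never move more than max_delta in a single step.
        delta = max(-max_delta, min(max_delta, delta))
        new_pose.append(q + delta)
    return new_pose


pose = step_pose(current=[0.0, 0.0], goal=[0.6, -0.1])
# Joint 0: the lerp asks for 0.03 rad, which is clamped to 0.02 rad.
# Joint 1: the lerp asks for -0.005 rad, which is within the limit.
```

In this sketch the rate limit only matters while the pose is far from the goal; close to the goal the lerp term becomes the smaller of the two and the motion eases in.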
Curriculum: gradually increase disturbance strength#
Path: src/mjlab/tasks/homie/mdp/curriculums.py
To avoid overwhelming early training, Homie uses upper_body_action_curriculum:
Performance-coupled: when the average track_linear_velocity reward exceeds a threshold (e.g., 0.8), increase disturbance amplitude.
Linear growth: the ratio increases from 0 to max_ratio (1.0) in fixed steps of increment (0.05).
cfg.curriculum["upper_body_action"] = CurriculumTermCfg(
    func=mdp.upper_body_action_curriculum,
    params={
        "action_name": "upper_body_pose",
        "reward_name": "track_linear_velocity",
        "success_threshold": 0.8,
        "increment": 0.05,
        "max_ratio": 1.0,
        "start_step": step_threshold,
    },
)
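The update rule implied by these parameters can be sketched as a pure function. This is an assumed reading, not the body of upper_body_action_curriculum: it treats start_step as a gate before which the ratio never changes, and it assumes the mean reward is already normalized so that it can be compared directly with success_threshold. update_ratio is a hypothetical name.

```python
def update_ratio(
    ratio: float,
    mean_reward: float,
    step: int,
    *,
    success_threshold: float = 0.8,
    increment: float = 0.05,
    max_ratio: float = 1.0,
    start_step: int = 0,
) -> float:
    """Raise the disturbance ratio by one increment when tracking is good enough."""
    if step < start_step:
        return ratio  # curriculum not started yet (assumed semantics)
    if mean_reward > success_threshold:
        return min(ratio + increment, max_ratio)
    return ratio  # performance too low: hold the current difficulty


ratio = 0.0
for step in range(25):
    # A policy that tracks well at every check saturates the ratio at max_ratio.
    ratio = update_ratio(ratio, mean_reward=0.9, step=step)
```

Because the ratio only grows while the policy keeps tracking velocity well, disturbances stop increasing whenever they start to hurt locomotion.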
Rewards & terminations: balancing mixed objectives#
Homie needs to balance “walk” and “squat” objectives while being robust to upper-body disturbances.
1) Rewards: decouple objectives via env groups#
Env-group gating:
many reward terms use env_group=... so they only apply to some groups
H1 override adds extra standing stabilization (track_*_standing) to reduce residual sway in standing
Regularizers for robustness:
knee_deviation_reward: penalize knee lateral deviation during squat, encouraging reasonable squatting posture
upright: keep the torso upright (critical for resisting upper-body disturbances)
feet_ground_parallel / feet_parallel: constrain feet orientation vs ground / between feet (requires per-robot corner site config)
hip_knee_contact / self_collisions: penalize “bad contacts” via rewards instead of terminating too early
2) Terminations: looser coupling#
Relaxed posture limits: H1 has larger motion ranges and disturbances; fell_over thresholds are typically less strict than for smaller robots.
Self-collision handling: Homie prefers to keep the training signal via reward penalties rather than terminating immediately during large-range motion.
H1 override and H1 constants#
Path: src/mjlab/asset_zoo/robots/unitree_h1/h1_constants.py
Homie uses H1-specific actuator parameters heavily:
Multiple actuator groups: H1 is split into HIP_KNEE, ANKLE_TORSO, and ARM groups with different stiffness/damping.
Automatic action scale: per-joint scaling computed from actuator effort_limit / stiffness.
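# Compute action scale automatically
H1_ACTION_SCALE: dict[str, float] = {}
for a in H1_ARTICULATION.actuators:
    names = a.target_names_expr
    for n in names:
        # A quarter of the position error at which the actuator saturates.
        H1_ACTION_SCALE[n] = 0.25 * a.effort_limit / a.stiffness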
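A worked example makes the formula concrete. For a position actuator with proportional gain equal to stiffness, effort_limit / stiffness is the position error at which the commanded torque reaches the effort limit, so the 0.25 factor maps a unit action to a quarter of that error. The dataclass, joint-name patterns, and numeric values below are made up for illustration and are not the real H1 constants.

```python
from dataclasses import dataclass


@dataclass
class ActuatorCfg:
    """Hypothetical stand-in for the real actuator config class."""

    target_names_expr: tuple[str, ...]
    effort_limit: float  # N*m
    stiffness: float  # N*m/rad


# Made-up values, chosen only to keep the arithmetic easy to follow.
actuators = [
    ActuatorCfg((".*_hip_pitch", ".*_knee"), effort_limit=200.0, stiffness=100.0),
    ActuatorCfg((".*_ankle",), effort_limit=40.0, stiffness=40.0),
]

action_scale: dict[str, float] = {}
for a in actuators:
    for n in a.target_names_expr:
        action_scale[n] = 0.25 * a.effort_limit / a.stiffness

# Hip/knee: 0.25 * 200 / 100 = 0.5 rad per unit action.
# Ankle:    0.25 * 40 / 40   = 0.25 rad per unit action.
```

Stiffer or weaker joints therefore get a smaller position offset per unit action, which keeps the policy's outputs comparable across actuator groups.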
with_hands: gripper variant (policy-free)#
Path: src/mjlab/tasks/homie/config/h1/__init__.py and src/mjlab/tasks/homie/config/h1/env_cfgs.py
If you choose Mjlab-Homie-Unitree-H1-with_hands:
The robot config mounts 2F85 via get_h1_robot_cfg(hands=...) (default mount config: _default_hands_cfg).
The env adds a policy-free gripper action (0-dim) and an interval event that resamples gripper targets periodically (similar in spirit to the upper-body action).
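The interval-resampling pattern can be sketched with a countdown timer. This is a generic illustration, not mjlab's event manager: IntervalResampler is a hypothetical class, and the gripper target range of (0.0, 1.0) is an assumed normalized opening, not a value from the source.

```python
import random


class IntervalResampler:
    """Draws a new target each time a randomly sampled interval elapses."""

    def __init__(self, interval_range_s, target_range, seed=0):
        self._rng = random.Random(seed)
        self.interval_range_s = interval_range_s
        self.target_range = target_range
        self.time_left = self._rng.uniform(*interval_range_s)
        self.target = self._rng.uniform(*target_range)

    def step(self, dt: float) -> bool:
        """Advance time by dt; return True if the target was resampled."""
        self.time_left -= dt
        if self.time_left <= 0.0:
            self.target = self._rng.uniform(*self.target_range)
            self.time_left = self._rng.uniform(*self.interval_range_s)
            return True
        return False


# A fixed 2-second interval, stepped at 0.5 s: resampling happens every 4th step.
gripper = IntervalResampler(interval_range_s=(2.0, 2.0), target_range=(0.0, 1.0))
resampled = [gripper.step(0.5) for _ in range(8)]
```

Because the policy never sees or controls these targets, the gripper motion acts purely as a disturbance on the arms and torso, in the same way as the upper-body pose targets.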
Why Homie is a good reference#
Homie is a great reference if you want to build tasks with:
Mixed objectives: velocity tracking + height control in one setup.
Partial actuation: policy controls only part of the body; the rest follows scripted / random targets.
Curriculum beyond domain params: dynamically changing action behavior (not just friction/mass randomization).