
Gym Tools

create_robot_env, gym_step, gym_reset, gym_observe, gym_close

The gym tools provide a reinforcement learning interface built on the Rapier3D physics engine. You create a simulation environment from a vcad assembly (parts define the rigid bodies, joints define the kinematic constraints), then step the simulation with actions and observe the resulting state. The interface is modeled on the OpenAI Gym convention (reset, step, close) and adds an observe call for reading state without stepping.

create_robot_env

Creates a physics simulation environment from a vcad assembly document. The assembly must contain part instances and joints. Returns an env_id that identifies the environment for all subsequent calls.

document (Document, required)

IR document containing an assembly with part definitions, instances, and joints. The document must have at least one joint and a ground instance.

end_effector_ids (string[], required)

Instance IDs to track as end effectors. The observation includes the 3D pose (position + orientation) of each listed instance at every timestep.

dt (number, optional)

Simulation timestep in seconds. Default: 1/240 s (about 0.00417 s). Smaller timesteps increase accuracy but require more steps per second of simulated time.

substeps (number, optional)

Number of physics substeps per gym_step call. Default 4. Higher values improve stability for stiff joints and fast-moving parts at the cost of computation time.

max_steps (number, optional)

Maximum episode length in steps. After this many steps, the done flag is set to true. Default 1000.
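
As a rough illustration of how these parameters fit together, the sketch below assembles an argument object for create_robot_env and estimates the simulated duration of one episode. The helper names (build_create_env_args, episode_seconds) are hypothetical and not part of vcad. The duration estimate assumes that each gym_step call advances simulated time by exactly dt, which this page implies but does not state outright.

```python
from typing import Any, Dict, List, Optional

DEFAULT_DT = 1.0 / 240.0   # documented default timestep, in seconds
DEFAULT_MAX_STEPS = 1000   # documented default episode length


def build_create_env_args(
    document: Dict[str, Any],
    end_effector_ids: List[str],
    dt: Optional[float] = None,
    substeps: Optional[int] = None,
    max_steps: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble arguments for create_robot_env, omitting unset optionals."""
    args: Dict[str, Any] = {
        "document": document,
        "end_effector_ids": list(end_effector_ids),
    }
    for name, value in (("dt", dt), ("substeps", substeps), ("max_steps", max_steps)):
        if value is None:
            continue  # leave it out so the server applies its own default
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
        args[name] = value
    return args


def episode_seconds(dt: float = DEFAULT_DT, max_steps: int = DEFAULT_MAX_STEPS) -> float:
    """Simulated time covered by one full episode (assumes one dt per gym_step)."""
    return dt * max_steps


# The empty dict is a placeholder, not a valid assembly IR document.
args = build_create_env_args({}, ["gripper"], max_steps=500)
print(sorted(args))                 # ['document', 'end_effector_ids', 'max_steps']
print(round(episode_seconds(), 3))  # 4.167
```

With the defaults, an episode of 1000 steps therefore covers a little over four seconds of simulated time under that assumption.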

Return Value

{
  "env_id": "sim_1",
  "num_joints": 3,
  "action_dim": 3,
  "observation_dim": 18,
  "end_effector_ids": ["gripper"],
  "dt": 0.004166666666666667,
  "substeps": 4,
  "max_steps": 1000
}

Field | Type | Description
env_id | string | Environment identifier for subsequent tool calls
num_joints | number | Number of actuated joints in the assembly
action_dim | number | Number of action values expected per step
observation_dim | number | Total dimension of the observation vector
end_effector_ids | string[] | Echoed back for confirmation
dt | number | Actual timestep used
substeps | number | Actual substep count
max_steps | number | Episode length limit

Physics Setup

The tool converts the vcad assembly into a Rapier3D simulation:

  • Each part instance becomes a rigid body with a convex collision shape derived from the part's mesh
  • Fixed joints create rigid constraints between bodies
  • Revolute joints become hinge joints with optional angle limits
  • Slider joints become prismatic joints with optional position limits
  • Cylindrical joints allow rotation about, and translation along, the same axis
  • Ball joints allow omnidirectional rotation
  • The ground instance is set as a fixed (kinematic) body that does not move
  • Gravity defaults to {x: 0, y: 0, z: -9.81} (Earth gravity in Z-up)

gym_step

Advances the simulation by one timestep with the provided action.

env_id (string, required)

Environment ID from create_robot_env.

action_type (string, required)

How to interpret the action values: "torque" (torque in N·m for revolute joints, force in N for slider joints), "position" (target joint angles in degrees or positions in mm), or "velocity" (target joint velocities in deg/s or mm/s).

values (number[], required)

Action values, one per actuated joint. The array length must equal action_dim from create_robot_env.

Action Types

Torque. Applies a raw torque (Nm for revolute joints) or force (N for slider joints) directly to each joint. This is the most physically accurate mode but requires the agent to handle PD control or similar stabilization.

Position. Sets target positions for each joint. The physics engine uses an internal PD controller to drive joints toward the targets. Units are degrees for revolute joints and millimeters for slider joints. This is the easiest mode for agents that want to command joint angles directly.

Velocity. Sets target velocities for each joint. The engine applies forces to achieve the requested speed. Units are deg/s for revolute joints and mm/s for slider joints.
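
Because torque mode leaves stabilization to the agent, a simple PD law is one way to get started. The sketch below is only illustrative: pd_torques and torque_step_args are hypothetical helpers, the gains are arbitrary placeholders that would need tuning for a real assembly, and the code assumes every joint is revolute so that observations arrive in degrees and deg/s.

```python
from typing import Any, Dict, List, Sequence


def pd_torques(
    targets_deg: Sequence[float],
    positions_deg: Sequence[float],
    velocities_deg_s: Sequence[float],
    kp: float = 0.5,    # hypothetical gain, N·m per degree of error
    kd: float = 0.05,   # hypothetical gain, N·m per (deg/s)
) -> List[float]:
    """Per-joint PD law: tau = kp * (target - q) - kd * q_dot."""
    if not (len(targets_deg) == len(positions_deg) == len(velocities_deg_s)):
        raise ValueError("targets, positions and velocities must have equal length")
    return [
        kp * (target - q) - kd * q_dot
        for target, q, q_dot in zip(targets_deg, positions_deg, velocities_deg_s)
    ]


def torque_step_args(env_id: str, torques: Sequence[float]) -> Dict[str, Any]:
    """Arguments for a gym_step call in torque mode."""
    return {"env_id": env_id, "action_type": "torque", "values": list(torques)}


# Joint state taken from the sample gym_step observation on this page.
torques = pd_torques([50.0, -10.0, 0.0], [45.2, -12.8, 0.0], [1.2, -0.5, 0.0])
step_args = torque_step_args("sim_1", torques)
print([round(t, 3) for t in step_args["values"]])  # [2.34, 1.425, 0.0]
```

In position mode the engine's internal PD controller does this work, so a loop like this is only needed when the agent wants direct control over joint torques.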

Return Value

{
  "observation": {
    "joint_positions": [45.2, -12.8, 0.0],
    "joint_velocities": [1.2, -0.5, 0.0],
    "end_effector_poses": [
      {
        "instance_id": "gripper",
        "position": { "x": 120.5, "y": 0.0, "z": 85.3 },
        "orientation": { "x": 0.0, "y": 0.707, "z": 0.0, "w": 0.707 }
      }
    ],
    "timestep": 42
  },
  "reward": 0.0,
  "done": false
}

Field | Type | Description
observation | object | Current state of the simulation (see below)
reward | number | Reward signal (currently always 0; custom reward computation should be done agent-side)
done | boolean | Whether the episode has ended (max_steps reached or simulation diverged)
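
Since the returned reward is always 0, the agent has to derive its own signal from the observation. One common choice for reaching tasks is the negative distance between an end effector and a target point. The sketch below assumes that reward shape, and distance_reward is a hypothetical helper rather than anything vcad provides.

```python
import math
from typing import Any, Dict, Sequence


def distance_reward(
    observation: Dict[str, Any],
    instance_id: str,
    target_mm: Sequence[float],
) -> float:
    """Negative Euclidean distance (mm) from a tracked end effector to a target."""
    for pose in observation["end_effector_poses"]:
        if pose["instance_id"] == instance_id:
            p = pose["position"]
            return -math.dist((p["x"], p["y"], p["z"]), tuple(target_mm))
    raise KeyError(f"end effector {instance_id!r} not found in observation")


# Position copied from the sample return value; the target is made up.
sample = {
    "end_effector_poses": [
        {"instance_id": "gripper", "position": {"x": 120.5, "y": 0.0, "z": 85.3}}
    ]
}
reward = distance_reward(sample, "gripper", (120.5, 0.0, 80.3))
print(round(reward, 3))  # -5.0
```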

Observation Format

Field | Type | Description
joint_positions | number[] | Current joint angles (deg) or positions (mm), one per actuated joint
joint_velocities | number[] | Current joint angular velocities (deg/s) or linear velocities (mm/s)
end_effector_poses | array | Position and orientation of each tracked end effector
timestep | number | Current simulation step count since last reset

End effector positions are in world-space millimeters. Orientations are unit quaternions {x, y, z, w}.
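
Most learning code wants a flat numeric vector rather than a nested object. The sketch below shows one possible flattening: joint positions, then joint velocities, then seven numbers per end effector (position followed by quaternion). This ordering is an assumption made for illustration; the page does not specify how the server lays out its own observation vector, so the resulting length need not match observation_dim. Both function names are hypothetical.

```python
import math
from typing import Any, Dict, List


def flatten_observation(obs: Dict[str, Any]) -> List[float]:
    """Flatten an observation as [positions, velocities, (xyz + quaternion) per effector]."""
    vec = [float(v) for v in obs["joint_positions"]]
    vec += [float(v) for v in obs["joint_velocities"]]
    for pose in obs["end_effector_poses"]:
        p, q = pose["position"], pose["orientation"]
        vec += [p["x"], p["y"], p["z"], q["x"], q["y"], q["z"], q["w"]]
    return vec


def quaternion_norm(q: Dict[str, float]) -> float:
    """Length of a quaternion given as {x, y, z, w}; close to 1 for a unit quaternion."""
    return math.sqrt(q["x"] ** 2 + q["y"] ** 2 + q["z"] ** 2 + q["w"] ** 2)


# Observation copied from the sample gym_step return value.
obs = {
    "joint_positions": [45.2, -12.8, 0.0],
    "joint_velocities": [1.2, -0.5, 0.0],
    "end_effector_poses": [
        {
            "instance_id": "gripper",
            "position": {"x": 120.5, "y": 0.0, "z": 85.3},
            "orientation": {"x": 0.0, "y": 0.707, "z": 0.0, "w": 0.707},
        }
    ],
    "timestep": 42,
}
vector = flatten_observation(obs)
print(len(vector))  # 13
print(abs(quaternion_norm(obs["end_effector_poses"][0]["orientation"]) - 1.0) < 1e-3)  # True
```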

gym_reset

Resets the simulation to its initial state. All joint positions, velocities, and body poses return to the values defined in the assembly document. The timestep counter resets to 0.

env_id (string, required)

Environment ID from create_robot_env.

Return Value

Returns the initial observation (same format as gym_step's observation field).

gym_observe

Returns the current observation without advancing the simulation. Useful for reading state between steps, after a reset, or when you need to inspect the environment without consuming a timestep.

env_id (string, required)

Environment ID from create_robot_env.

Return Value

Returns the current observation (same format as gym_step's observation field).

gym_close

Destroys a simulation environment and frees its resources. After closing, the env_id is no longer valid and any calls using it will return an error.

env_id (string, required)

Environment ID from create_robot_env.

Return Value

{ "success": true }
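
To make sure an environment is always released, even when the agent code raises, the create and close calls can be wrapped in a context manager. This is a minimal sketch: call_tool stands for whatever function your MCP client uses to invoke a tool by name with a dict of arguments, and both it and robot_env are hypothetical names. The fake client below exists only to make the example runnable.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed JSON result.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


@contextmanager
def robot_env(call_tool: ToolCall, create_args: Dict[str, Any]) -> Iterator[str]:
    """Create an environment, yield its env_id, and always close it afterwards."""
    env_id = call_tool("create_robot_env", create_args)["env_id"]
    try:
        yield env_id
    finally:
        call_tool("gym_close", {"env_id": env_id})


calls: List[str] = []


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP client; records which tools were called."""
    calls.append(name)
    return {"env_id": "sim_1"} if name == "create_robot_env" else {"success": True}


with robot_env(fake_call_tool, {}) as env_id:
    fake_call_tool("gym_observe", {"env_id": env_id})

print(calls)  # ['create_robot_env', 'gym_observe', 'gym_close']
```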

Batch Operations

For parallel reinforcement learning, three batch tools create and control multiple environments simultaneously.

batch_create_envs

Creates N copies of the same environment for parallel data collection.

document (Document, required)

Assembly IR document (same as create_robot_env).

n_envs (number, required)

Number of parallel environments to create.

end_effector_ids (string[], required)

Instance IDs to track as end effectors.

Returns a batch_id string plus action/observation dimensions.

batch_step

Steps all environments in a batch with per-environment actions.

batch_id (string, required)

Batch ID from batch_create_envs.

action_type (string, required)

Action type: "torque", "position", or "velocity".

actions (number[][], required)

Per-environment actions. The outer array has one entry per environment, and each inner array has one value per joint. Length of the outer array must equal n_envs.

Returns observations, rewards, and done flags for every environment.
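
The shape rules for actions are easy to get wrong, so it can help to check them before sending the call. The sketch below validates an n_envs by action_dim array on the client side. build_batch_step_args is a hypothetical helper, and the batch ID used in the example is made up.

```python
from typing import Any, Dict, List, Sequence

ACTION_TYPES = ("torque", "position", "velocity")


def build_batch_step_args(
    batch_id: str,
    action_type: str,
    actions: Sequence[Sequence[float]],
    n_envs: int,
    action_dim: int,
) -> Dict[str, Any]:
    """Check the actions array shape, then assemble batch_step arguments."""
    if action_type not in ACTION_TYPES:
        raise ValueError(f"unknown action_type {action_type!r}")
    if len(actions) != n_envs:
        raise ValueError(f"expected {n_envs} action rows, got {len(actions)}")
    for i, row in enumerate(actions):
        if len(row) != action_dim:
            raise ValueError(f"env {i}: expected {action_dim} values, got {len(row)}")
    rows: List[List[float]] = [[float(v) for v in row] for row in actions]
    return {"batch_id": batch_id, "action_type": action_type, "actions": rows}


# Two environments with three joints each; "batch_1" is a made-up ID.
batch_args = build_batch_step_args(
    "batch_1", "position", [[0, 10, 20], [5, 15, 25]], n_envs=2, action_dim=3
)
print(batch_args["actions"])  # [[0.0, 10.0, 20.0], [5.0, 15.0, 25.0]]
```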

batch_reset

Resets all environments in the batch to their initial state.

batch_id (string, required)

Batch ID from batch_create_envs.

Returns initial observations for all environments.

Example Workflow

1. create_cad_document with assembly → define robot parts, instances, joints
2. create_robot_env → initialize physics simulation
3. gym_reset → get initial observation
4. Loop:
   a. gym_step with action → get observation, reward, done
   b. If done: gym_reset
5. gym_close → clean up
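
Steps 2 to 5 of the workflow above can be sketched as a small driver loop. This is a sketch under stated assumptions, not a definitive client: call_tool is a hypothetical function that invokes an MCP tool by name and returns its parsed result, run_steps and make_stub are illustrative names, and the stub merely imitates the documented return shapes so the loop can run without a server.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed JSON result.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]
Policy = Callable[[Dict[str, Any], int], List[float]]


def run_steps(call_tool: ToolCall, create_args: Dict[str, Any],
              policy: Policy, total_steps: int) -> int:
    """Run total_steps steps, resetting whenever done is true; return finished episodes."""
    info = call_tool("create_robot_env", create_args)
    env_id = info["env_id"]
    episodes = 0
    try:
        obs = call_tool("gym_reset", {"env_id": env_id})
        for _ in range(total_steps):
            values = policy(obs, info["action_dim"])
            result = call_tool("gym_step", {
                "env_id": env_id, "action_type": "position", "values": values,
            })
            obs = result["observation"]
            if result["done"]:
                episodes += 1
                obs = call_tool("gym_reset", {"env_id": env_id})
    finally:
        call_tool("gym_close", {"env_id": env_id})  # always clean up
    return episodes


def make_stub(max_steps: int) -> ToolCall:
    """Stand-in server with one joint; done becomes true after max_steps steps."""
    state = {"t": 0}

    def observation() -> Dict[str, Any]:
        return {"joint_positions": [0.0], "joint_velocities": [0.0],
                "end_effector_poses": [], "timestep": state["t"]}

    def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        if name == "create_robot_env":
            return {"env_id": "sim_1", "action_dim": 1}
        if name == "gym_reset":
            state["t"] = 0
            return observation()
        if name == "gym_step":
            state["t"] += 1
            return {"observation": observation(), "reward": 0.0,
                    "done": state["t"] >= max_steps}
        if name == "gym_close":
            return {"success": True}
        raise ValueError(f"unknown tool {name!r}")

    return call_tool


# A policy that always commands zero for every joint.
finished = run_steps(make_stub(max_steps=3), {}, lambda obs, n: [0.0] * n, total_steps=7)
print(finished)  # 2
```

Swapping make_stub for a real client function is the only change needed to point the same loop at a live environment, provided the client returns the tool results as parsed objects.
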

Physics availability

The gym tools require the WASM kernel compiled with physics support. If physics is unavailable, create_robot_env returns an error message indicating that the WASM module was not built with the --features physics flag.

For a tutorial on training agents with the gym interface, see Gym Training. For creating assemblies that work with the gym, see the Assembly & Joints format reference.