upkie 9.0.0
Open-source wheeled biped robots
Upkie environment where actions command servomotors directly.
Public Member Functions
None __init__(self, Backend backend, Optional[float] frequency=200.0, bool frequency_checks=True, Optional[RobotState] init_state=None, bool regulate_frequency=True, float max_gain_scale=5.0)
    Initialize servos environment.
dict get_env_observation(self, dict spine_observation)
    Extract environment observation from spine observation dictionary.
dict get_neutral_action(self)
    Get the neutral action where servos don't move.
dict get_spine_action(self, dict env_action)
    Convert environment action to a spine action dictionary.
Public Attributes
action_space
    Action space of the environment.
observation_space
    Observation space of the environment.
Upkie environment where actions command servomotors directly.
Actions and observations correspond to the moteus servo API.
The action space is a dictionary with one key for each servo:
left_hip: left hip joint (qdd100)
left_knee: left knee joint (qdd100)
left_wheel: left wheel joint (mj5208)
right_hip: right hip joint (qdd100)
right_knee: right knee joint (qdd100)
right_wheel: right wheel joint (mj5208)
The value for each servo dictionary is itself a dictionary with the following keys:
position: commanded joint angle \(\theta^*\) in radians (NaN to disable) (required).
velocity: commanded joint velocity \(\dot{\theta}^*\) in rad/s (required).
feedforward_torque: feedforward joint torque \(\tau_{\mathit{ff}}\) in N⋅m.
kp_scale: scaling factor \(k_{p}^{\mathit{scale}}\) applied to the position feedback gain, between zero and one.
kd_scale: scaling factor \(k_{d}^{\mathit{scale}}\) applied to the velocity feedback gain, between zero and one.
maximum_torque: maximum joint torque \(\tau_{\mathit{max}}\) (feedforward + feedback) enforced during the whole actuation step, in N⋅m.
The resulting torque applied by the servo is then:
\[ \begin{align*} \tau & = \underset{ [-\tau_{\mathit{max}}, +\tau_{\mathit{max}}]}{ \mathrm{clamp} } \left( \tau_{\mathit{ff}} + k_{p} k_{p}^{\mathit{scale}} (\theta^* - \theta) + k_{d} k_{d}^{\mathit{scale}} (\dot{\theta}^* - \dot{\theta}) \right) \end{align*} \]
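As a rough illustration, the clamped sum above can be evaluated in plain Python. This is only a sketch of the formula, not the moteus firmware or any Upkie function: the name servo_torque, its argument names and its default values are hypothetical, and treating a NaN position command as "no position feedback term" is an assumption about what "NaN to disable" means.

```python
import math


def servo_torque(
    theta,  # measured joint angle, in rad
    theta_dot,  # measured joint velocity, in rad/s
    position,  # commanded joint angle, in rad (NaN to disable)
    velocity,  # commanded joint velocity, in rad/s
    kp,  # position gain configured in the controller
    kd,  # velocity gain configured in the controller
    feedforward_torque=0.0,  # in N.m (illustrative default)
    kp_scale=1.0,  # illustrative default
    kd_scale=1.0,  # illustrative default
    maximum_torque=1.0,  # in N.m (illustrative default)
):
    """Evaluate the feedforward + feedback torque, clamped to +/- maximum_torque."""
    # Assumption: a NaN position command removes the position feedback term
    if math.isnan(position):
        position_term = 0.0
    else:
        position_term = kp * kp_scale * (position - theta)
    velocity_term = kd * kd_scale * (velocity - theta_dot)
    tau = feedforward_torque + position_term + velocity_term
    # Clamp the total torque to [-maximum_torque, +maximum_torque]
    return max(-maximum_torque, min(maximum_torque, tau))
```

For instance, with kp=10.0, a position error of 1 rad and maximum_torque=5.0, the unclamped feedback would be 10 N⋅m and the function returns the 5 N⋅m limit instead.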
Position and velocity gains \(k_{p}\) and \(k_{d}\) are configured in each moteus controller directly and don't change during execution. Instead, the overall feedback gains can be modulated via the normalized parameters \(k_{p}^{\mathit{scale}} \in [0, 1]\) and \(k_{d}^{\mathit{scale}} \in [0, 1]\). Note that the servo regulates the torque above at its own frequency, which is higher (typically 40 kHz) than the agent and spine frequencies. See the moteus reference for more details.
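To make the action layout concrete, here is a minimal sketch that builds such a nested dictionary with plain Python. The helper make_action and the chosen values are hypothetical: leg servos are asked to hold a zero angle, while wheel servos get a NaN position and a velocity command. Only the joint names and per-servo keys come from the description above.

```python
import math

# Servo names listed in the action space description
JOINTS = (
    "left_hip",
    "left_knee",
    "left_wheel",
    "right_hip",
    "right_knee",
    "right_wheel",
)


def make_action(wheel_velocity, kp_scale=1.0, kd_scale=1.0):
    """Build an action dictionary with one sub-dictionary per servo.

    Hypothetical helper: the commanded values are illustrative choices.
    """
    action = {}
    for joint in JOINTS:
        if joint.endswith("_wheel"):
            # NaN disables the position command, leaving a velocity command
            servo_action = {"position": math.nan, "velocity": wheel_velocity}
        else:
            # Hip and knee servos are asked to hold a zero joint angle
            servo_action = {"position": 0.0, "velocity": 0.0}
        servo_action["kp_scale"] = kp_scale
        servo_action["kd_scale"] = kd_scale
        action[joint] = servo_action
    return action


action = make_action(wheel_velocity=1.0, kd_scale=0.5)
```

The optional keys feedforward_torque and maximum_torque are left out here; only position and velocity are marked as required in the list above.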
The observation space is a dictionary with one key for each servo. The value for each key is a dictionary with keys:
position: joint angle in rad.
velocity: joint velocity in rad/s.
torque: joint torque in N⋅m.
temperature: servo temperature in degrees Celsius.
voltage: power bus voltage of the servo, in V.
Full observations from the backend (detailed in Observations) are also available in the info dictionary returned by the reset and step functions.
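The nested observation layout can be read with ordinary dictionary lookups. The sketch below uses a made-up observation (the numbers are not real measurements) and two hypothetical helpers, joint_values and hottest_servo, to show how per-servo fields might be collected. It assumes observation values can be handled as plain floats.

```python
JOINTS = (
    "left_hip",
    "left_knee",
    "left_wheel",
    "right_hip",
    "right_knee",
    "right_wheel",
)


def joint_values(observation, key):
    """Collect one field (e.g. "position") for all servos, in JOINTS order."""
    return [observation[joint][key] for joint in JOINTS]


def hottest_servo(observation):
    """Return the name of the servo reporting the highest temperature."""
    return max(JOINTS, key=lambda joint: observation[joint]["temperature"])


# Made-up observation for illustration only
example_observation = {
    joint: {
        "position": 0.0,  # rad
        "velocity": 0.0,  # rad/s
        "torque": 0.0,  # N.m
        "temperature": 30.0,  # degrees Celsius
        "voltage": 18.0,  # V
    }
    for joint in JOINTS
}
example_observation["left_knee"]["temperature"] = 42.5

voltages = joint_values(example_observation, "voltage")
warmest = hottest_servo(example_observation)
```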