upkie 7.0.0
Open-source wheeled biped robots
Base Upkie environment where actions command servomotors directly.
Public Member Functions
None | __init__ (self, Optional[float] frequency=200.0, bool frequency_checks=True, Optional[RobotState] init_state=None, bool regulate_frequency=True, str shm_name="/upkie", Optional[dict] spine_config=None)
Initialize environment.
def | __del__ (self)
Stop the spine when deleting the environment instance.
None | close (self)
Stop the spine properly.
Optional[float] | dt (self)
Regulated period of the control loop in seconds, or None if there is no loop frequency regulation.
Optional[float] | frequency (self)
Regulated frequency of the control loop in Hz, or None if there is no loop frequency regulation.
dict | get_neutral_action (self)
Get the neutral action where servos don't move.
None | update_init_rand (self, **kwargs)
Update initial-state randomization.
Tuple[np.ndarray, dict] | reset (self, *, Optional[int] seed=None, Optional[dict] options=None)
Reset the spine and get an initial observation.
Tuple[np.ndarray, float, bool, bool, dict] | step (self, np.ndarray action)
Run one timestep of the environment's dynamics.
None | log (self, str name, Any entry)
Log a new entry to the "log" key of the action dictionary.
dict | get_bullet_action (self)
Get the Bullet action that will be applied at next step.
None | set_bullet_action (self, dict bullet_action)
Prepare for the next step an extra action for the Bullet spine.
Static Public Attributes
int | version = 5
Environment version number.
Base Upkie environment where actions command servomotors directly.
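Actions and observations correspond to the moteus servo API. Under the hood, the environment also handles communication with the spine, optional regulation of the control loop frequency, and initial-state randomization (see the constructor parameters and update_init_rand).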
Note that Upkie environments are made to run on a single CPU thread. The downside for reinforcement learning is that computations are not massively parallel. The upside is that it simplifies deployment to the real robot, as it relies on the same spine interface that runs on real robots.
The action space is a dictionary with one key for each servo:
- left_hip: left hip joint (qdd100)
- left_knee: left knee joint (qdd100)
- left_wheel: left wheel joint (mj5208)
- right_hip: right hip joint (qdd100)
- right_knee: right knee joint (qdd100)
- right_wheel: right wheel joint (mj5208)
The value for each servo dictionary is itself a dictionary with the following keys:
- position: commanded joint angle \(\theta^*\) in [rad] (NaN to disable) (required).
- velocity: commanded joint velocity \(\dot{\theta}^*\) in [rad] / [s] (required).
- feedforward_torque: feedforward joint torque \(\tau_{\mathit{ff}}\) in [N m].
- kp_scale: scaling factor \(k_{p}^{\mathit{scale}}\) applied to the position feedback gain, between zero and one.
- kd_scale: scaling factor \(k_{d}^{\mathit{scale}}\) applied to the velocity feedback gain, between zero and one.
- maximum_torque: maximum joint torque \(\tau_{\mathit{max}}\) (feedforward + feedback) enforced during the whole actuation step, in [N m].
The resulting torque applied by the servo is then:
\[ \begin{align*} \tau & = \underset{[-\tau_{\mathit{max}}, +\tau_{\mathit{max}}]}{\mathrm{clamp}} \left( \tau_{\mathit{ff}} + k_{p} k_{p}^{\mathit{scale}} (\theta^* - \theta) + k_{d} k_{d}^{\mathit{scale}} (\dot{\theta}^* - \dot{\theta}) \right) \end{align*} \]
The position and velocity gains \(k_{p}\) and \(k_{d}\) are configured directly in each moteus controller and do not change during execution. Instead, the overall feedback gains can be modulated via the normalized parameters \(k_{p}^{\mathit{scale}} \in [0, 1]\) and \(k_{d}^{\mathit{scale}} \in [0, 1]\). Note that the servo regulates the torque above at its own frequency (typically 40 kHz), which is higher than the agent and spine frequencies. See the moteus reference for more details.
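As an illustration of the action format and torque law above, here is a minimal Python sketch. The helper names make_servo_command and servo_torque, the default values for optional keys, and the gains kp and kd passed as arguments are assumptions made for this example; they are not part of the Upkie API, and the real gains live in the moteus controllers. Treating a NaN position as "no position feedback" is also an assumption about what "NaN to disable" means.

```python
import math

SERVO_NAMES = (
    "left_hip", "left_knee", "left_wheel",
    "right_hip", "right_knee", "right_wheel",
)


def make_servo_command(position, velocity, feedforward_torque=0.0,
                       kp_scale=1.0, kd_scale=1.0, maximum_torque=1.0):
    """Build one servo dictionary (hypothetical helper, default values are made up)."""
    return {
        "position": position,  # [rad], NaN to disable
        "velocity": velocity,  # [rad] / [s]
        "feedforward_torque": feedforward_torque,  # [N m]
        "kp_scale": kp_scale,  # in [0, 1]
        "kd_scale": kd_scale,  # in [0, 1]
        "maximum_torque": maximum_torque,  # [N m]
    }


def servo_torque(command, theta, theta_dot, kp, kd):
    """Evaluate the clamped torque law for one servo command."""
    position_error = command["position"] - theta
    if math.isnan(position_error):
        # Assumption: a NaN position command removes the position term
        position_error = 0.0
    velocity_error = command["velocity"] - theta_dot
    tau = (
        command["feedforward_torque"]
        + kp * command["kp_scale"] * position_error
        + kd * command["kd_scale"] * velocity_error
    )
    tau_max = command["maximum_torque"]
    return max(-tau_max, min(tau_max, tau))


# Action dictionary with one key per servo
action = {
    name: make_servo_command(position=0.0, velocity=0.0)
    for name in SERVO_NAMES
}
# Velocity-only command on a wheel: position disabled with NaN
action["left_wheel"] = make_servo_command(
    position=math.nan, velocity=1.0, kp_scale=0.0
)
```

Evaluating servo_torque on a command with made-up gains shows how the feedforward, position and velocity terms add up before the result is clamped to maximum_torque.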
The observation space is a dictionary with one key for each servo. The value for each key is a dictionary with keys:
- position: Joint angle in [rad].
- velocity: Joint velocity in [rad] / [s].
- torque: Joint torque in [N m].
- temperature: Servo temperature in degrees Celsius.
- voltage: Power bus voltage of the servo, in [V].
As with all Upkie environments, full observations from the spine (detailed in Observations) are also available in the info dictionary returned by the reset and step functions.
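The nested observation layout described above can be consumed with plain dictionary lookups. The following sketch uses a made-up observation (all numeric values are invented for illustration) and two hypothetical helpers, joint_positions and servos_above_temperature, which are not part of the Upkie API.

```python
SERVO_NAMES = (
    "left_hip", "left_knee", "left_wheel",
    "right_hip", "right_knee", "right_wheel",
)


def joint_positions(observation):
    """Joint angles in [rad], listed in the order of SERVO_NAMES."""
    return [observation[name]["position"] for name in SERVO_NAMES]


def servos_above_temperature(observation, limit_celsius):
    """Names of servos whose temperature exceeds the given limit."""
    return [
        name for name in SERVO_NAMES
        if observation[name]["temperature"] > limit_celsius
    ]


# Made-up observation with the keys listed above (values are invented)
observation = {
    name: {
        "position": 0.0,
        "velocity": 0.0,
        "torque": 0.0,
        "temperature": 30.0,
        "voltage": 18.0,
    }
    for name in SERVO_NAMES
}
observation["left_knee"]["temperature"] = 62.5

hot_servos = servos_above_temperature(observation, limit_celsius=50.0)
```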
None upkie.envs.upkie_servos.UpkieServos.__init__(self, Optional[float] frequency=200.0, bool frequency_checks=True, Optional[RobotState] init_state=None, bool regulate_frequency=True, str shm_name="/upkie", Optional[dict] spine_config=None)
Initialize environment.
Parameters
frequency | Regulated frequency of the control loop, in Hz. Can be prescribed even when regulate_frequency is unset, in which case self.dt will be defined but the loop frequency will not be regulated.
frequency_checks | If regulate_frequency is set and this parameter is true (default), a warning is issued every time the control loop runs slower than the desired frequency. Set this parameter to false to disable these warnings.
init_state | Initial state of the robot, only used in simulation.
regulate_frequency | If set (default), the environment will regulate the control loop frequency to the value prescribed in frequency.
shm_name | Name of shared-memory file to exchange with the spine.
spine_config | Additional spine configuration overriding the default upkie.config.SPINE_CONFIG. The combined configuration dictionary is sent to the spine at every reset.
Exceptions
SpineError | If the spine did not respond after the prescribed number of trials.
Reimplemented in upkie.envs.upkie_servo_positions.UpkieServoPositions, and upkie.envs.upkie_servo_torques.UpkieServoTorques.
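To make the spine_config parameter more concrete, here is one way an override dictionary could be combined with a default configuration. This is only a sketch: the merge semantics actually used by the environment are not specified on this page, the recursive merge below is an assumption, and the configuration keys and the function override_config are hypothetical.

```python
import copy


def override_config(default, overrides):
    """Return a new dict where values from `overrides` take precedence.

    Assumption: nested dictionaries are merged recursively, other values
    are replaced. The environment may combine configurations differently.
    """
    combined = copy.deepcopy(default)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(combined.get(key), dict):
            combined[key] = override_config(combined[key], value)
        else:
            combined[key] = copy.deepcopy(value)
    return combined


# Hypothetical default configuration (keys invented for this example)
default_config = {"simulator": {"gui": True, "timestep": 0.001}, "verbose": False}

combined_config = override_config(default_config, {"simulator": {"gui": False}})
```

In this sketch the default dictionary is left untouched, so the same defaults can be reused across several environment instances.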
dict upkie.envs.upkie_servos.UpkieServos.get_bullet_action(self)
Get the Bullet action that will be applied at next step.
dict upkie.envs.upkie_servos.UpkieServos.get_neutral_action(self)
Get the neutral action where servos don't move.
None upkie.envs.upkie_servos.UpkieServos.log(self, str name, Any entry)
Log a new entry to the "log" key of the action dictionary.
Parameters
name | Name of the entry.
entry | Dictionary to log along with the actual action.
Tuple[np.ndarray, dict] upkie.envs.upkie_servos.UpkieServos.reset(self, *, Optional[int] seed=None, Optional[dict] options=None)
Reset the spine and get an initial observation.
Parameters
seed | Number used to initialize the environment's internal random number generator.
options | Currently unused.
Returns
- observation: Initial vectorized observation, i.e. an element of the environment's observation_space.
- info: Dictionary with auxiliary diagnostic information. For Upkie this is the full observation dictionary sent by the spine.
None upkie.envs.upkie_servos.UpkieServos.set_bullet_action(self, dict bullet_action)
Prepare for the next step an extra action for the Bullet spine.
This extra action can be for instance a set of external forces applied to some robot bodies.
Parameters
bullet_action | Action dictionary processed by the Bullet spine.
Tuple[np.ndarray, float, bool, bool, dict] upkie.envs.upkie_servos.UpkieServos.step(self, np.ndarray action)
Run one timestep of the environment's dynamics.
When the end of the episode is reached, you are responsible for calling reset() to reset the environment's state.
Parameters
action | Action from the agent.
Returns
- observation: Observation of the environment, i.e. an element of its observation_space.
- reward: Reward returned after taking the action.
- terminated: Whether the agent reached a terminal state, which may be a good or a bad thing. When true, the user needs to call reset().
- truncated: Whether the episode reached the maximum number of steps. This boolean can signal a premature end of the episode, i.e. before a terminal state is reached. When true, the user needs to call reset().
- info: Dictionary with additional information, reporting in particular the full observation dictionary coming from the spine.
None upkie.envs.upkie_servos.UpkieServos.update_init_rand(self, **kwargs)
Update initial-state randomization.
Keyword arguments are forwarded as is to upkie.utils.robot_state_randomization.RobotStateRandomization.update.
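The reset() and step() contract documented above (call reset() whenever terminated or truncated is true) can be sketched as a control loop. The class StubEnv below is a stand-in written for this example so that the loop runs without a spine; it only mimics the shapes of the return values and is not the Upkie environment. The function run_steps and the trivial policy are likewise hypothetical.

```python
class StubEnv:
    """Stand-in environment that truncates episodes after max_steps steps."""

    def __init__(self, max_steps=3):
        self._max_steps = max_steps
        self._steps = 0

    def reset(self, *, seed=None, options=None):
        self._steps = 0
        observation, info = {}, {}
        return observation, info

    def step(self, action):
        self._steps += 1
        observation, reward, info = {}, 0.0, {}
        terminated = False
        truncated = self._steps >= self._max_steps
        return observation, reward, terminated, truncated, info


def run_steps(env, policy, nb_steps):
    """Step the environment nb_steps times, resetting when an episode ends.

    Returns the number of calls made to reset().
    """
    observation, info = env.reset()
    nb_resets = 1
    for _ in range(nb_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # The caller is responsible for resetting the environment
            observation, info = env.reset()
            nb_resets += 1
    return nb_resets


nb_resets = run_steps(StubEnv(max_steps=3), lambda observation: {}, nb_steps=7)
```

With a real environment instance, the same loop would apply, with the policy returning an element of the environment's action space.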