upkie 7.0.0
Open-source wheeled biped robots
Environment where Upkie is used as a wheeled inverted pendulum.
Public Member Functions | |
def | __init__ (self, UpkieServos env, float fall_pitch=1.0, bool left_wheeled=True, float max_ground_velocity=1.0, float wheel_radius=0.06) |
Initialize environment. | |
Tuple[np.ndarray, Dict] | reset (self, *Optional[int] seed=None, Optional[dict] options=None) |
Reset the environment and get an initial observation. | |
Tuple[np.ndarray, float, bool, bool, dict] | step (self, np.ndarray action) |
Run one timestep of the environment's dynamics. | |
Public Attributes | |
observation_space | |
Observation space. | |
action_space | |
Action space. | |
env | |
Internal upkie.envs.upkie_servos.UpkieServos environment. | |
fall_pitch | |
Fall detection pitch angle, in radians. | |
left_wheeled | |
Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion. | |
wheel_radius | |
Wheel radius in [m]. | |
Static Public Attributes | |
int | version = 4 |
Environment version number. | |
Environment where Upkie is used as a wheeled inverted pendulum.
With this environment, Upkie keeps its legs straight and actions only affect wheel velocities. This way, it behaves like a wheeled inverted pendulum. This ground-velocity environment is used for instance by the MPC balancer and PPO balancer agents.
The action corresponds to the ground velocity resulting from wheel velocities. The action vector is simply:
\[ a =\begin{bmatrix} \dot{p}^* \end{bmatrix} \]
where \(\dot{p}^*\) denotes the commanded ground velocity in m/s, which is internally converted into wheel velocity commands. Note that, while this action is not normalized, [-1, 1] m/s is a reasonable range for ground velocities.
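The conversion from ground velocity to wheel velocities mentioned above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `ground_to_wheel_velocities` is hypothetical, and it assumes rolling without slipping, saturation at `max_ground_velocity`, and mirrored wheels that turn in opposite directions when the robot moves forward.

```python
from typing import Tuple


def ground_to_wheel_velocities(
    ground_velocity: float,
    wheel_radius: float = 0.06,
    left_wheeled: bool = True,
    max_ground_velocity: float = 1.0,
) -> Tuple[float, float]:
    """Convert a ground velocity in m/s to (left, right) wheel velocities in rad/s.

    Hypothetical helper for illustration, not part of the upkie API.
    """
    # Assumption: the command is saturated to [-max, +max] before conversion
    v = max(-max_ground_velocity, min(max_ground_velocity, ground_velocity))
    # Rolling without slipping: v = wheel_radius * omega
    omega = v / wheel_radius
    # Assumption: wheels are mounted mirrored, so their signs are opposite
    left_sign = 1.0 if left_wheeled else -1.0
    return left_sign * omega, -left_sign * omega
```

With the default wheel radius of 0.06 m, a commanded ground velocity of 0.6 m/s maps to wheel velocities of about 10 rad/s in magnitude under these assumptions.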
Vectorized observations have the following structure:
\[ \begin{align*} o &= \begin{bmatrix} \theta \\ p \\ \dot{\theta} \\ \dot{p} \end{bmatrix} \end{align*} \]
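where we denote by:
\(\theta\) the pitch angle of the base, in radians;
\(p\) the ground position, in meters;
\(\dot{\theta}\) the pitch angular velocity, in rad/s;
\(\dot{p}\) the ground velocity, in m/s.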
As with all Upkie environments, full observations from the spine (detailed in Observations) are also available in the info dictionary returned by the reset and step functions.
def upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.__init__ (self, UpkieServos env, float fall_pitch=1.0, bool left_wheeled=True, float max_ground_velocity=1.0, float wheel_radius=0.06)
Initialize environment.
env | UpkieServos environment to command servomotors. |
fall_pitch | Fall detection pitch angle, in radians. |
left_wheeled | Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion. Set to False for a right-wheeled variant. |
max_ground_velocity | Maximum commanded ground velocity in m/s. The default value of 1 m/s is conservative; don't hesitate to increase it once you feel confident in your agent. |
wheel_radius | Wheel radius in [m]. |
Tuple[np.ndarray, Dict] upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.reset (self, *Optional[int] seed=None, Optional[dict] options=None)
Reset the environment and get an initial observation.
seed | Number used to initialize the environment’s internal random number generator. |
options | Currently unused. |
observation: Initial vectorized observation, i.e. an element of the environment's observation_space.
info: Dictionary with auxiliary diagnostic information. For Upkie this is the full observation dictionary sent by the spine.
Tuple[np.ndarray, float, bool, bool, dict] upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.step (self, np.ndarray action)
Run one timestep of the environment's dynamics.
When the end of the episode is reached, you are responsible for calling reset() to reset the environment's state.
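The reset responsibility described above can be sketched with a control loop such as the one below. To keep the example self-contained, it uses `StandInEnv`, a hypothetical toy class that only mimics the `reset` and `step` signatures documented on this page; it is not the Upkie environment, its dynamics are a random walk on the pitch, and it returns plain lists where the real environment returns NumPy arrays. The fall test `abs(pitch) > fall_pitch` is likewise an assumption.

```python
import random
from typing import Optional


class StandInEnv:
    """Hypothetical stand-in with the documented reset/step signatures."""

    def __init__(self, fall_pitch: float = 1.0, max_steps: int = 50):
        self.fall_pitch = fall_pitch
        self.max_steps = max_steps
        self._rng = random.Random()
        self._pitch = 0.0
        self._steps = 0

    def reset(self, *, seed: Optional[int] = None, options: Optional[dict] = None):
        if seed is not None:
            self._rng = random.Random(seed)
        self._pitch = 0.0
        self._steps = 0
        return [self._pitch, 0.0, 0.0, 0.0], {}

    def step(self, action):
        # Toy dynamics: the pitch performs a random walk (not a physical model)
        self._pitch += self._rng.uniform(-0.2, 0.2)
        self._steps += 1
        terminated = abs(self._pitch) > self.fall_pitch  # assumed fall test
        truncated = self._steps >= self.max_steps
        observation = [self._pitch, 0.0, 0.0, action[0]]
        return observation, 1.0, terminated, truncated, {}


def run(env, nb_steps: int, seed: int = 0) -> int:
    """Step the environment, calling reset() whenever an episode ends."""
    nb_resets = 0
    observation, info = env.reset(seed=seed)
    for _ in range(nb_steps):
        action = [0.0]  # placeholder policy: zero commanded ground velocity
        observation, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            observation, info = env.reset()
            nb_resets += 1
    return nb_resets
```

The loop checks both `terminated` and `truncated`, since either one signals that the episode is over and the environment must be reset before the next call to `step`.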
action | Action from the agent. |
observation: Observation of the environment, i.e. an element of its observation_space.
reward: Reward returned after taking the action.
terminated: Whether the agent reached a terminal state, which may be a good or a bad thing. When true, the user needs to call reset().
truncated: Whether the episode has reached its maximum number of steps. This boolean can signal a premature end of the episode, i.e. before a terminal state is reached. When true, the user needs to call reset().
info: Dictionary with additional information, reporting in particular the full observation dictionary coming from the spine.
upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.left_wheeled
Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion.
Set to False for a right-wheeled variant.