upkie 7.0.0
Open-source wheeled biped robots
upkie.envs.upkie_ground_velocity.UpkieGroundVelocity Class Reference

Environment where Upkie is used as a wheeled inverted pendulum.

Public Member Functions

def __init__ (self, UpkieServos env, float fall_pitch=1.0, bool left_wheeled=True, float max_ground_velocity=1.0, float wheel_radius=0.06)
 Initialize environment.
 
Tuple[np.ndarray, Dict] reset (self, *Optional[int] seed=None, Optional[dict] options=None)
 Reset the environment and get an initial observation.
 
Tuple[np.ndarray, float, bool, bool, dict] step (self, np.ndarray action)
 Run one timestep of the environment's dynamics.
 

Public Attributes

 observation_space
 Observation space.
 
 action_space
 Action space.
 
 env
 Internal upkie.envs.upkie_servos.UpkieServos environment.
 
 fall_pitch
 Fall detection pitch angle, in radians.
 
 left_wheeled
 Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion.
 
 wheel_radius
 Wheel radius in [m].
 

Static Public Attributes

int version = 4
 Environment version number.
 

Detailed Description

Environment where Upkie is used as a wheeled inverted pendulum.

With this environment, Upkie keeps its legs straight and actions only affect wheel velocities, so that it behaves like a wheeled inverted pendulum. This ground-velocity environment is used, for instance, by the MPC balancer and PPO balancer agents.

Note
For reinforcement learning with neural-network policies: the observation space and action space are not normalized.
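
Since the action space is not normalized, a policy that outputs values in [-1, 1] needs an explicit rescaling step before its output is passed to the environment. A minimal sketch, assuming a hypothetical helper named rescale_action (not part of upkie) and a velocity limit matching the max_ground_velocity constructor argument:

```python
import numpy as np


def rescale_action(normalized_action, max_ground_velocity=1.0):
    """Map a policy output in [-1, 1] to a ground velocity in m/s.

    Hypothetical helper for illustration, not part of the upkie API.
    """
    # Clip first so that out-of-range policy outputs stay within limits
    clipped = np.clip(np.asarray(normalized_action, dtype=float), -1.0, 1.0)
    return clipped * max_ground_velocity


# Half of the normalized range with a 2 m/s limit
action = rescale_action([0.5], max_ground_velocity=2.0)
```

Observations could be rescaled in a similar way, but suitable scales depend on the task and are not given here.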

Action space

The action corresponds to the ground velocity resulting from wheel velocities. The action vector is simply:

\[ a =\begin{bmatrix} \dot{p}^* \end{bmatrix} \]

where we denote by \(\dot{p}^*\) the commanded ground velocity in m/s, which is internally converted into wheel velocity commands. Note that, while this action is not normalized, [-1, 1] m/s is a reasonable range for ground velocities.
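
The internal conversion is not detailed on this page. As a rough illustration, assuming pure rolling without slipping, a ground velocity \(v\) corresponds to a wheel angular velocity of \(v / r\), where \(r\) is the wheel radius. The function name wheel_velocities and the sign convention below (wheels mounted mirrored, orientation selected by left_wheeled) are assumptions made for this sketch, not the library's implementation:

```python
def wheel_velocities(ground_velocity, wheel_radius=0.06, left_wheeled=True):
    """Return hypothetical (left, right) wheel velocities in rad/s.

    Assumes rolling without slipping: omega = v / r.
    """
    omega = ground_velocity / wheel_radius
    # Assumption: for a left-wheeled robot, a positive left-wheel turn
    # moves the robot forward, and the right wheel is mirrored
    sign = 1.0 if left_wheeled else -1.0
    return sign * omega, -sign * omega


# Commanded ground velocity of 0.6 m/s with the default wheel radius
left, right = wheel_velocities(0.6)
```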

Observation space

Vectorized observations have the following structure:

\[ o = \begin{bmatrix} \theta \\ p \\ \dot{\theta} \\ \dot{p} \end{bmatrix} \]

where we denote by:

  • \(\theta\) the pitch angle of the base with respect to the world vertical, in radians. This angle is positive when the robot leans forward.
  • \(p\) the position of the average wheel contact point, in meters.
  • \(\dot{\theta}\) the body angular velocity of the base frame along its lateral axis, in radians per second.
  • \(\dot{p}\) the velocity of the average wheel contact point, in meters per second.

As with all Upkie environments, full observations from the spine (detailed in Observations) are also available in the info dictionary returned by the reset and step functions.
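
The ordering of the observation vector can be illustrated by unpacking it into named variables. The sketch below uses made-up values and a hypothetical has_fallen helper; it assumes that a fall corresponds to the pitch magnitude exceeding fall_pitch, which this page does not state explicitly:

```python
import numpy as np


def has_fallen(observation, fall_pitch=1.0):
    """Check the pitch angle against the fall threshold, in radians.

    Assumption: a fall is detected when |pitch| > fall_pitch.
    """
    pitch = observation[0]
    return bool(abs(pitch) > fall_pitch)


# Made-up observation: [theta, p, theta_dot, p_dot]
observation = np.array([0.1, 0.0, -0.05, 0.2])
pitch, position, pitch_rate, velocity = observation
fallen = has_fallen(observation)
```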

Constructor & Destructor Documentation

◆ __init__()

def upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.__init__ (   self,
UpkieServos  env,
float   fall_pitch = 1.0,
bool   left_wheeled = True,
float   max_ground_velocity = 1.0,
float   wheel_radius = 0.06 
)

Initialize environment.

Parameters
  • env: UpkieServos environment to command servomotors.
  • fall_pitch: Fall detection pitch angle, in radians.
  • left_wheeled: Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion. Set to False for a right-wheeled variant.
  • max_ground_velocity: Maximum commanded ground velocity in m/s. The default value of 1 m/s is conservative; don't hesitate to increase it once you feel confident in your agent.
  • wheel_radius: Wheel radius in [m].

Member Function Documentation

◆ reset()

Tuple[np.ndarray, Dict] upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.reset (   self,
*Optional[int]   seed = None,
Optional[dict]   options = None 
)

Reset the environment and get an initial observation.

Parameters
  • seed: Number used to initialize the environment’s internal random number generator.
  • options: Currently unused.
Returns
  • observation: Initial vectorized observation, i.e. an element of the environment's observation_space.
  • info: Dictionary with auxiliary diagnostic information. For Upkie this is the full observation dictionary sent by the spine.

◆ step()

Tuple[np.ndarray, float, bool, bool, dict] upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.step (   self,
np.ndarray  action 
)

Run one timestep of the environment's dynamics.

When the end of the episode is reached, you are responsible for calling reset() to reset the environment's state.

Parameters
  • action: Action from the agent.
Returns
  • observation: Observation of the environment, i.e. an element of its observation_space.
  • reward: Reward returned after taking the action.
  • terminated: Whether the agent reached a terminal state, which may be a good or a bad thing. When true, the user needs to call reset().
  • truncated: Whether the episode has reached its maximum number of steps. This boolean can signal a premature end of the episode, i.e. before a terminal state is reached. When true, the user needs to call reset().
  • info: Dictionary with additional information, reporting in particular the full observation dictionary coming from the spine.
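
The reset-on-termination contract described above can be sketched as a short control loop. To keep the example runnable without a robot or simulator, it uses StubGroundVelocityEnv, a hypothetical stand-in that only mimics the reset() and step() signatures documented here; it is not the real environment and its dynamics are placeholders:

```python
import numpy as np


class StubGroundVelocityEnv:
    """Hypothetical stand-in mimicking the documented reset/step signatures."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.nb_steps = 0

    def reset(self, *, seed=None, options=None):
        self.nb_steps = 0
        return np.zeros(4), {}

    def step(self, action):
        self.nb_steps += 1
        observation = np.zeros(4)  # placeholder [theta, p, theta_dot, p_dot]
        reward = 1.0
        terminated = False
        truncated = self.nb_steps >= self.max_steps
        return observation, reward, terminated, truncated, {}


env = StubGroundVelocityEnv(max_steps=5)
observation, info = env.reset(seed=0)
nb_resets = 0
for _ in range(12):
    action = np.array([0.0])  # commanded ground velocity in m/s
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        # The caller is responsible for resetting once an episode ends
        observation, info = env.reset()
        nb_resets += 1
```

With the real environment, the same loop structure would apply, with the action computed by the agent from the latest observation.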

Member Data Documentation

◆ left_wheeled

upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.left_wheeled

Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion.

Set to False for a right-wheeled variant.


The documentation for this class was generated from the following file: