Wrapper to make Upkie act as a wheeled inverted pendulum with yaw control. More...

Public Member Functions
def	__init__ (self, UpkieServos servos_env, float fall_pitch=1.0, float leg_gain_scale=1.0, float max_ground_velocity=3.0, float max_yaw_velocity=1.0)
	Initialize environment. More...

float	leg_gain_scale (self)
	Get the current kp/kd scale upplied to hip and knee joints. More...

None	set_leg_gain_scale (self, float leg_gain_scale)
	Set a new kp/kd scale upplied to hip and knee joints. More...

Tuple[np.ndarray, Dict]	reset (self, *Optional[int] seed=None, Optional[dict] options=None)
	Resets the environment and get an initial observation. More...

Tuple[np.ndarray, float, bool, bool, dict]	step (self, np.ndarray action)
	Run one timestep of the environment's dynamics. More...

Public Attributes
	observation_space
	Observation space.

	action_space
	Action space.

	fall_pitch
	Fall detection pitch angle, in radians.

Detailed Description

Wrapper to make Upkie act as a wheeled inverted pendulum with yaw control.

When this wrapper is applied, Upkie keeps its legs straight and actions affect wheel velocity commands. Overall, the robot behaves like a wheeled inverted pendulum in the sagittal plane, and turns by differential drive.

Note: Note if you are doing reinforcement learning with neural-network policies that the observation space and action space are not normalized.

Action space

The action corresponds to the ground sagittal velocity and yaw velocity:

\[ a = \begin{bmatrix} \dot{p}^* \\ \dot{\psi}^* \end{bmatrix} \]

where \(\dot{p}^*\) is the commanded ground velocity in m/s and \(\dot{\psi}^*\) is the commanded yaw velocity in rad/s. Both are internally converted into wheel velocity commands.

Observation space

Vectorized observations have the following structure:

\[ o = \begin{bmatrix} p \\ \theta \\ \psi \\ \dot{p} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} \]

where we denote by:

\(p \in \mathbb{R}\) the position of the average wheel contact point, in meters. This is only a 1D position: it is an x-coordinate as long as the robot keeps the same sagittal plane, once it turns \(p\) should be considered a curvilinear abscissa.
\(\theta\) the pitch angle of the base with respect to the world vertical, in radians. This angle is positive when the robot leans forward.
\(\psi\) the yaw angle, in radians, integrated from the commanded yaw velocity. It resets to zero at each episode.
\(\dot{p}\) the velocity of the average wheel contact point, in meters per seconds.
\(\dot{\theta}\) the body angular velocity of the base frame along its lateral axis, in radians per seconds.
\(\dot{\psi}\) the commanded yaw velocity, in radians per second.

Note: The pitch velocity \(\dot{\theta}\) is the body-frame angular velocity from the IMU, not the exact time derivative of the pitch Euler angle. Similarly, the yaw velocity \(\dot{\psi}\) is the commanded value. A future improvement could compute both from the full orientation and angular velocity.

As with all Upkie environments, full observations from the spine (detailed in Observations) are also available in the info dictionary returned by the reset and step functions.

The documentation for this class was generated from the following file:

upkie/envs/upkie_gyropod.py

Public Member Functions

Public Attributes

Detailed Description

Action space

Observation space