Environment where Upkie is used as a wheeled inverted pendulum.

Public Member Functions
def	__init__ (self, UpkieServos env, float fall_pitch=1.0, bool left_wheeled=True, float max_ground_velocity=1.0, float wheel_radius=0.06)
	Initialize environment.

Tuple[np.ndarray, Dict]	reset (self, *, Optional[int] seed=None, Optional[dict] options=None)
	Resets the environment and gets an initial observation.

Tuple[np.ndarray, float, bool, bool, dict]	step (self, np.ndarray action)
	Run one timestep of the environment's dynamics.

Public Attributes
	observation_space
	Observation space.

	action_space
	Action space.

	env
	Internal upkie.envs.upkie_servos.UpkieServos environment.

	fall_pitch
	Fall detection pitch angle, in radians.

	left_wheeled
	Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion.

	wheel_radius
	Wheel radius in [m].

Static Public Attributes
int	version = 4
	Environment version number.

Detailed Description

Environment where Upkie is used as a wheeled inverted pendulum.

With this environment, Upkie keeps its legs straight and actions only affect wheel velocities. This way, it behaves like a wheeled inverted pendulum. This ground-velocity environment is used for instance by the MPC balancer and PPO balancer agents.

Note: The observation space and action space are not normalized, which matters for reinforcement learning with neural-network policies.

Action space

The action corresponds to the ground velocity resulting from wheel velocities. The action vector is simply:

\[ a =\begin{bmatrix} \dot{p}^* \end{bmatrix} \]

where we denote by \(\dot{p}^*\) the commanded ground velocity in m/s, which is internally converted into wheel velocity commands. Note that, while this action is not normalized, [-1, 1] m/s is a reasonable range for ground velocities.
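The conversion from a ground velocity to wheel velocity commands can be sketched as follows. This is a minimal illustration, not the environment's actual code: the function name is hypothetical, and it assumes rolling without slipping (wheel speed equals ground velocity divided by wheel radius), clamping to max_ground_velocity, and wheels mounted in mirror so that the right wheel turns opposite to the left one.

```python
from typing import Tuple


def ground_to_wheel_velocities(
    ground_velocity: float,
    wheel_radius: float = 0.06,
    left_wheeled: bool = True,
    max_ground_velocity: float = 1.0,
) -> Tuple[float, float]:
    """Hypothetical helper: map a ground velocity in m/s to wheel speeds in rad/s."""
    # Assumption: the commanded velocity is clamped to the configured maximum
    clamped = max(-max_ground_velocity, min(max_ground_velocity, ground_velocity))
    # Assumption: rolling without slipping, so omega = v / r
    wheel_speed = clamped / wheel_radius
    # For a left-wheeled robot, a positive left-wheel turn moves the robot forward
    left_sign = 1.0 if left_wheeled else -1.0
    # Assumption: the right wheel is mirrored and turns the opposite way
    return left_sign * wheel_speed, -left_sign * wheel_speed
```

With the default 0.06 m wheel radius, a commanded 0.6 m/s corresponds to wheel speeds of about 10 rad/s in magnitude.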

Observation space

Vectorized observations have the following structure:

\[ o = \begin{bmatrix} \theta \\ p \\ \dot{\theta} \\ \dot{p} \end{bmatrix} \]

where we denote by:

\(\theta\) the pitch angle of the base with respect to the world vertical, in radians. This angle is positive when the robot leans forward.
\(p\) the position of the average wheel contact point, in meters.
\(\dot{\theta}\) the body angular velocity of the base frame along its lateral axis, in radians per second.
\(\dot{p}\) the velocity of the average wheel contact point, in meters per second.

As with all Upkie environments, full observations from the spine (detailed in Observations) are also available in the info dictionary returned by the reset and step functions.
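For reference, the observation vector can be unpacked by index in the order given above. The helper below is a hypothetical convenience, not part of the documented API; only the ordering of the four components comes from this page.

```python
import numpy as np


def unpack_observation(observation: np.ndarray) -> dict:
    """Hypothetical helper: name the four components of an observation."""
    # Order documented above: [theta, p, theta_dot, p_dot]
    pitch, position, pitch_velocity, ground_velocity = observation
    return {
        "pitch": float(pitch),  # rad, positive when leaning forward
        "position": float(position),  # m, average wheel contact point
        "pitch_velocity": float(pitch_velocity),  # rad/s
        "ground_velocity": float(ground_velocity),  # m/s
    }
```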

Constructor & Destructor Documentation

◆ __init__()

def upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.__init__	(		self,
		UpkieServos	env,
		float	fall_pitch = `1.0`,
		bool	left_wheeled = `True`,
		float	max_ground_velocity = `1.0`,
		float	wheel_radius = `0.06`
	)

Initialize environment.

Parameters

env	UpkieServos environment to command servomotors.
fall_pitch	Fall detection pitch angle, in radians.
left_wheeled	Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion. Set to False for a right-wheeled variant.
max_ground_velocity	Maximum commanded ground velocity in m/s. The default value of 1 m/s is conservative; don't hesitate to increase it once you feel confident in your agent.
wheel_radius	Wheel radius in [m].
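
To illustrate the role of fall_pitch, here is a minimal sketch of a fall check. It assumes that a fall is detected when the absolute pitch angle exceeds the threshold; this page only states that fall_pitch is the fall detection pitch angle, so the exact test used by the environment may differ, and the function name is hypothetical.

```python
def has_fallen(pitch: float, fall_pitch: float = 1.0) -> bool:
    """Hypothetical check: report a fall once the pitch exceeds the threshold."""
    # Assumption: the check is symmetric, so falls forward and backward both count
    return abs(pitch) > fall_pitch
```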

Member Function Documentation

◆ reset()

Tuple[np.ndarray, Dict] upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.reset	(		self,
		*,
		Optional[int]	seed = `None`,
		Optional[dict]	options = `None`
	)

Resets the environment and gets an initial observation.

Parameters

seed	Number used to initialize the environment’s internal random number generator.
options	Currently unused.

Returns

observation: Initial vectorized observation, i.e. an element of the environment's observation_space.
info: Dictionary with auxiliary diagnostic information. For Upkie this is the full observation dictionary sent by the spine.

◆ step()

Tuple[np.ndarray, float, bool, bool, dict] upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.step	(		self,
		np.ndarray	action
	)

Run one timestep of the environment's dynamics.

When the end of the episode is reached, you are responsible for calling reset() to reset the environment's state.

Parameters

action	Action from the agent.

Returns

observation: Observation of the environment, i.e. an element of its observation_space.
reward: Reward returned after taking the action.
terminated: Whether the agent reached a terminal state, which may be a good or a bad thing. When true, the user needs to call reset().
truncated: Whether the episode has reached its maximum number of steps. This boolean can signal a premature end of the episode, i.e. before a terminal state is reached. When true, the user needs to call reset().
info: Dictionary with additional information, reporting in particular the full observation dictionary coming from the spine.
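
The reset/step contract described above can be exercised with a short episode loop. The sketch below uses a hypothetical stand-in class (StubEnv) that only mimics the documented signatures and return tuples; it does not model Upkie's dynamics or rewards, and run_episode is likewise an illustrative helper rather than part of the library.

```python
import numpy as np


class StubEnv:
    """Hypothetical stand-in with the same reset/step signatures as above."""

    def __init__(self, max_steps: int = 5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self, *, seed=None, options=None):
        self.steps = 0
        return np.zeros(4), {}

    def step(self, action):
        self.steps += 1
        # Placeholder observation: echo the commanded ground velocity
        observation = np.array([0.0, 0.0, 0.0, float(action[0])])
        truncated = self.steps >= self.max_steps
        return observation, 1.0, False, truncated, {}


def run_episode(env, policy) -> float:
    """Run a single episode and return the sum of rewards."""
    observation, info = env.reset(seed=0)
    total_reward = 0.0
    while True:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            # The caller is responsible for resetting at the end of an episode
            observation, info = env.reset()
            break
    return total_reward
```

The same loop should apply to any environment that follows this interface, with the policy returning a one-element array holding the commanded ground velocity.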

Member Data Documentation

◆ left_wheeled

upkie.envs.upkie_ground_velocity.UpkieGroundVelocity.left_wheeled

Set to True (default) if the robot is left wheeled, that is, a positive turn of the left wheel results in forward motion.

Set to False for a right-wheeled variant.

The documentation for this class was generated from the following file:

upkie/envs/upkie_ground_velocity.py
