upkie 9.0.0
Open-source wheeled biped robots
|
Tuple[np.ndarray, float, bool, bool, dict] step | ( | self, | |
np.ndarray | action | ||
) |
Run one timestep of the environment's dynamics.
When the end of the episode is reached, you are responsible for calling reset()
to reset the environment's state.
action | Action from the agent. |
observation
: Observation of the environment, i.e. an element of its observation_space
.reward
: Reward returned after taking the action.terminated
: Whether the agent reached a terminal state, which may be a good or a bad thing. When true, the user needs to call reset()
.truncated
: Whether the episode is reaching max number of steps. This boolean can signal a premature end of the episode, i.e. before a terminal state is reached. When true, the user needs to call reset()
.info
: Dictionary with additional information, reporting in particular the full observation dictionary coming from the spine.