|
upkie 10.0.0
Open-source wheeled biped robots
|
| Tuple[np.ndarray, float, bool, bool, dict] step | ( | self, | |
| np.ndarray | action | ||
| ) |
Run one timestep of the environment's dynamics.
When the end of the episode is reached, you are responsible for calling reset() to reset the environment's state.
| action | Action from the agent. |
observation: Observation of the environment, i.e. an element of its observation_space.reward: Reward returned after taking the action.terminated: Whether the agent reached a terminal state, which may be a good or a bad thing. When true, the user needs to call reset().truncated: Whether the episode is reaching max number of steps. This boolean can signal a premature end of the episode, i.e. before a terminal state is reached. When true, the user needs to call reset().info: Dictionary with additional information, reporting in particular the full observation dictionary coming from the spine.