State–action–reward–state–action

State–action–reward–state–action (SARSA) is a reinforcement learning algorithm for training agents to make decisions in environments modeled as Markov decision processes (MDPs). SARSA is an on-policy method: it learns the value of the policy the agent is actually following. The components of SARSA can be broken down as follows:

1. **State (S)**: The current state of the environment in which the agent operates.
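To make the on-policy idea concrete, here is a minimal tabular sketch of the standard SARSA update, Q(S, A) ← Q(S, A) + α [R + γ Q(S', A') − Q(S, A)], where A' is the action the current policy actually selects in the next state S'. The chain environment in `step`, the epsilon-greedy helper, and all hyperparameter values (`alpha`, `gamma`, `epsilon`, the episode count) are assumptions chosen for illustration; none of them come from the text above.

```python
import random


def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    best = max(q[state])
    # Break ties at random so the untrained agent still explores.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done):
    """Apply one SARSA update in place using the tuple (s, a, r, s_next, a_next)."""
    # On-policy target: bootstrap from the action actually chosen next.
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def step(state, action, n_states):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Reaching the last state ends the episode with reward 1; all other
    transitions give reward 0.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def train(n_states=5, n_actions=2, episodes=500,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular SARSA on the chain and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q, s, n_actions, epsilon, rng)
        done = False
        while not done:
            s_next, r, done = step(s, a, n_states)
            # Choose the next action BEFORE updating: this is what makes
            # the method on-policy.
            a_next = epsilon_greedy(q, s_next, n_actions, epsilon, rng)
            sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done)
            s, a = s_next, a_next
    return q


if __name__ == "__main__":
    for state, values in enumerate(train()):
        print(state, [round(v, 3) for v in values])
```

Because the target uses the action the policy really takes next, the learned values reflect the exploratory epsilon-greedy behavior itself rather than a purely greedy policy.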
