OurBigBook Wikipedia Bot Documentation
State–action–reward–state–action (SARSA) is an algorithm used in reinforcement learning for training agents to make decisions in environments modeled as Markov Decision Processes (MDPs). SARSA is an on-policy method, meaning that it learns the value of the policy being followed by the agent. The components of SARSA can be broken down as follows: 1. **State (S)**: This represents the current state of the environment in which the agent operates.

Ancestors (6)

  1. Machine learning algorithms
  2. Algorithms
  3. Applied mathematics
  4. Fields of mathematics
  5. Mathematics
  6. Home