SARSA


Révision datée du 17 décembre 2020 à 12:50 par Pitpitt (discussion | contributions) (Page créée avec « ==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == ''' State–action–reward–state–action''' State–action–reward... »)
(diff) ← Version précédente | Voir la version actuelle (diff) | Version suivante → (diff)

en construction

Définition

XXXXXXXXX

Français

XXXXXXXXX

Anglais

State–action–reward–state–action

State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning. It was proposed by Rummery and Niranjan in a technical note[1] with the name "Modified Connectionist Q-Learning" (MCQ-L). The alternative name SARSA, proposed by Rich Sutton, was only mentioned as a footnote.

Source : Wikipedia Machine Learning



Contributeurs: Imane Meziani, wiki