"ThinkAct": difference between revisions
Revision as of 24 July 2025, 08:05
Under construction
Definition
XXXXXXXXX
French
ThinkAct
English
ThinkAct
A dual-system framework that uses reinforced visual latent planning to enable high-level reasoning and robust action execution in vision-language-action tasks. It lets robots "think before acting" by combining high-level reasoning with low-level action execution. The approach addresses a key limitation of current vision-language-action models, which map inputs directly to actions without explicit planning and therefore struggle with complex, multi-step tasks. ThinkAct uses reinforcement learning to train multimodal language models to generate reasoning plans that guide downstream action execution.
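The "think before acting" split can be illustrated with a toy sketch. This is not the ThinkAct implementation: it involves no reinforcement learning, no multimodal language model and no visual latent plan. The names `Plan`, `think`, `act` and `run_episode` are hypothetical, and a one-dimensional world with hand-written rules stands in for the learned components. It only shows the control flow in which a slow planner produces a plan once and a fast controller then executes it step by step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    """Hypothetical stand-in for a reasoning plan: an ordered list of waypoints."""
    waypoints: List[int]


def think(position: int, goal: int, stride: int = 3) -> Plan:
    """High-level step (assumed hand-coded here): split the route into waypoints."""
    step = stride if goal >= position else -stride
    waypoints = list(range(position + step, goal, step))
    waypoints.append(goal)  # the goal is always the final waypoint
    return Plan(waypoints)


def act(position: int, waypoint: int) -> int:
    """Low-level step: return a unit move (+1, -1 or 0) toward the current waypoint."""
    if position < waypoint:
        return 1
    if position > waypoint:
        return -1
    return 0


def run_episode(start: int, goal: int) -> List[int]:
    """Plan once, then let the low-level controller follow the plan."""
    plan = think(start, goal)
    position = start
    trajectory = [position]
    for waypoint in plan.waypoints:
        while position != waypoint:
            position += act(position, waypoint)
            trajectory.append(position)
    return trajectory
```

Calling `run_episode(0, 4)` plans the waypoints 3 and 4, then moves one unit at a time until the goal is reached. In a real system both functions would be learned models rather than fixed rules.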
Source
Contributors: Arianne Arel, wiki





