Multi-head self-attention
Under construction
Definition
XXXXXXXXX
French
autoattention multitêtes
English
Multi-Head Attention
Multi-head attention is a module for attention mechanisms that runs an attention mechanism several times in parallel. The independent attention outputs are then concatenated and linearly transformed into the expected dimension. Intuitively, multiple attention heads allow the model to attend to different parts of the sequence in different ways (e.g., longer-term versus shorter-term dependencies).
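To make the description above concrete, here is a minimal sketch of multi-head attention in PyTorch. The framework, class name, and dimensions (d_model=512, num_heads=8) are illustrative assumptions, not taken from this entry: the input is projected to queries, keys, and values, split across heads, attended to in parallel, and the head outputs are concatenated and passed through a final linear projection.

```python
# Minimal multi-head attention sketch (assumed PyTorch; names and sizes are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model=512, num_heads=8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One projection per role (query, key, value) plus the final output projection.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x):
        batch, seq_len, d_model = x.shape

        # Project, then split the model dimension into independent heads:
        # (batch, seq, d_model) -> (batch, heads, seq, d_head).
        def split(t):
            return t.view(batch, seq_len, self.num_heads, self.d_head).transpose(1, 2)

        q, k, v = split(self.w_q(x)), split(self.w_k(x)), split(self.w_v(x))

        # Scaled dot-product attention, computed in parallel over all heads.
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        weights = F.softmax(scores, dim=-1)
        heads = weights @ v                       # (batch, heads, seq, d_head)

        # Concatenate the head outputs and project back to the expected dimension.
        concat = heads.transpose(1, 2).reshape(batch, seq_len, d_model)
        return self.w_o(concat)

x = torch.randn(2, 10, 512)           # (batch, sequence length, model dimension)
print(MultiHeadAttention()(x).shape)  # torch.Size([2, 10, 512])
```

Each head attends over the same sequence with its own learned projections, which is what lets different heads specialize, for example in shorter-term versus longer-term dependencies.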
Source
Contributors: Claude Coulombe, Patrick Drouin, wiki