« Transformateur Switch »
Version of 16 June 2021 at 15:28
under construction
Definition
XXXXXXXXX
French
Transformateur Switch
English
Switch Transformer

A neural network architecture whose goal is to make it easier to build larger models without increasing computational cost.

The feature that distinguishes this model from its predecessors is a simplification of the Mixture of Experts (MoE) algorithm. In a Mixture of Experts, the tokens (elementary units of the input) entering the model are routed to different parts of the neural network (the experts) for processing. To process a given token, only a subsection of the model is therefore active: the model is sparse. This reduces computational cost, allowing such models to reach the trillion-parameter mark.
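The sparse routing described above can be sketched in a few lines. This is a minimal illustrative toy, not the paper's implementation: the layer sizes, the random router weights, and the single-matrix "experts" are arbitrary assumptions chosen to keep the example short. The key point it shows is that each token is sent to exactly one expert (top-1, or "switch", routing), so the compute per token stays constant no matter how many experts the model has.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, n_tokens = 8, 4, 5  # toy sizes (assumption)

# Router: a single linear layer producing one logit per expert for each token.
W_router = rng.normal(size=(d_model, n_experts))
# Each "expert" is reduced to one weight matrix here for brevity.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def switch_layer(tokens):
    """Route each token to exactly one expert (top-1 'switch' routing)."""
    logits = tokens @ W_router                                   # (n_tokens, n_experts)
    probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
    chosen = probs.argmax(-1)                                    # one expert per token
    out = np.empty_like(tokens)
    for i, e in enumerate(chosen):
        # Only the selected expert's weights touch this token, so the
        # model is sparse: adding experts grows capacity, not per-token cost.
        out[i] = probs[i, e] * (tokens[i] @ experts[e])
    return out, chosen

tokens = rng.normal(size=(n_tokens, d_model))
out, chosen = switch_layer(tokens)
```

In the full model each expert would be a feed-forward sub-network and the routing would also handle load balancing across experts; this sketch only captures the top-1 dispatch that makes the model sparse.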
[https://towardsdatascience.com/top-5-gpt-3-successors-you-should-know-in-2021-42ffe94cbbf Source: towardsdatascience]
Contributors: Imane Meziani, wiki