Transformateur Switch
under construction
Definition
A sparse neural network architecture that simplifies Mixture of Experts routing so that much larger models can be built without increasing computational costs.
French
Transformateur Switch
English
Switch Transformer: a neural network architecture whose goal is to make it possible to build larger models without increasing computational costs.
The feature that distinguishes this model from earlier ones is a simplification of the Mixture of Experts algorithm. Mixture of Experts (MoE) is a scheme in which the tokens (the elementary units of the input) entering the model are routed to different parts of the neural network (the experts) for processing. Thus, to process a given token, only a subsection of the model is active: the model is sparse. This reduces computational costs, allowing such models to reach the trillion-parameter mark (see the sketch below).
[https://towardsdatascience.com/top-5-gpt-3-successors-you-should-know-in-2021-42ffe94cbbf Source: towardsdatascience]
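The routing step described above can be illustrated in code. The sketch below, in Python with NumPy, shows top-1 ("switch") routing: a learned router scores each token against every expert, and only the single best-scoring expert processes that token. The shapes, variable names, and random weights are illustrative assumptions, not the published implementation.

```python
# Minimal sketch of top-1 ("switch") routing, the simplified Mixture of Experts
# used by the Switch Transformer. Shapes, names, and weights are illustrative.
import numpy as np

rng = np.random.default_rng(0)

num_experts = 4   # number of expert feed-forward blocks
d_model = 8       # token embedding size
num_tokens = 6    # tokens in the batch

# Each "expert" is reduced here to a single weight matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]

# The router is a learned linear layer scoring each token against each expert.
router_weights = rng.standard_normal((d_model, num_experts))

tokens = rng.standard_normal((num_tokens, d_model))

# Router logits -> probabilities -> a single (top-1) expert per token.
logits = tokens @ router_weights
logits -= logits.max(axis=-1, keepdims=True)          # numerical stability
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
chosen_expert = probs.argmax(axis=-1)                  # one expert per token

outputs = np.empty_like(tokens)
for i, token in enumerate(tokens):
    e = chosen_expert[i]
    # Only the chosen expert runs; scaling by its router probability keeps the
    # routing decision differentiable in a trainable implementation.
    outputs[i] = probs[i, e] * (token @ experts[e])

print("expert chosen per token:", chosen_expert)
```

Because only one expert is active per token, adding experts increases the number of parameters (letting the model grow toward the trillion-parameter mark) without increasing the computation spent on each token.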
Contributors: Imane Meziani, wiki