"MatMul": difference between revisions
Revision as of 5 July 2024, 17:09
Under construction
Definition
XXXXXXXXX
French
XXXXXXXXX
English
Matrix multiplication
Scalable MatMul-free Language Modeling
Matrix multiplication (MatMul) typically dominates the overall computational cost of large language models (LLMs). This cost only grows as LLMs scale to larger embedding dimensions and context lengths. In this work, we show that MatMul operations can be completely eliminated from LLMs while maintaining strong performance at billion-parameter scales.
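To make the cost claim above concrete, here is a minimal sketch of a naive matrix product that also counts its scalar multiply-adds. It is an illustration only, not the method of the quoted work: the function names `matmul` and `projection_cost` are hypothetical, and the square `d_model` by `d_model` weight shape is an assumption chosen to show how the count scales with embedding dimension and context length.

```python
def matmul(a, b):
    """Naive product of a (n x k) and b (k x m), given as nested lists.

    Returns the product and the number of scalar multiply-adds performed.
    """
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    out = [[0] * m for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
                ops += 1  # one multiply-add per innermost step
    return out, ops


def projection_cost(seq_len, d_model):
    """Multiply-adds for one (seq_len x d_model) @ (d_model x d_model) product.

    Assumed shape: a square weight matrix applied to every token position.
    """
    return seq_len * d_model * d_model


product, ops = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, ops)
print(projection_cost(4, 8), projection_cost(4, 16), projection_cost(8, 8))
```

Under these assumptions, doubling the embedding dimension quadruples the multiply-add count of a single projection, while doubling the context length doubles it.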