Mistral 7B



under construction

Definition

Mistral 7B is a large language model with 7 billion parameters, released by Mistral AI in 2023. Despite its compact size, it outperforms larger open models such as the 13B-parameter Llama 2 on a wide range of benchmarks.

French

Mistral 7B

English

Mistral 7B

The Mistral 7B paper introduces a compact yet powerful language model that, despite its relatively modest size of 7 billion parameters, outperforms larger counterparts such as the 13B Llama 2 model on various benchmarks. (Alongside the two-times-larger Qwen 14B, Mistral 7B was also the base model used in the winning solutions of the 2023 NeurIPS LLM Finetuning & Efficiency challenge.)

Source: arXiv
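
A minimal sketch of how one might load and query Mistral 7B with the Hugging Face transformers library. The model identifier "mistralai/Mistral-7B-v0.1" and the prompt below are illustrative assumptions; the entry itself does not prescribe any particular tooling.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Model identifier on the Hugging Face Hub (assumed here for illustration).
model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a short prompt and generate a continuation.
inputs = tokenizer("Mistral 7B is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=25)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))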



Contributors: Patrick Drouin, wiki