« Mistral 7B » : différence entre les versions
(Page créée avec « ==en construction== == Définition == XXXXXXXXX == Français == ''' Mistral 7B''' == Anglais == ''' Mistral 7B''' The Mistral 7B paper introduces a compact yet powerful language model that, despite its relatively modest size of 7 billion tokens, outperforms its larger counterparts, such as the 13B Llama 2 model, in various benchmarks. (Next to the two-times larger Qwen 14B, Mistral 7B was also the base model used in the winning solutions of this year's Ne... ») |
m (Remplacement de texte : « ↵<small> » par « ==Sources== ») |
||
Ligne 12 : | Ligne 12 : | ||
The Mistral 7B paper introduces a compact yet powerful language model that, despite its relatively modest size of 7 billion tokens, outperforms its larger counterparts, such as the 13B Llama 2 model, in various benchmarks. (Next to the two-times larger Qwen 14B, Mistral 7B was also the base model used in the winning solutions of this year's NeurIPS LLM Finetuning & Efficiency challenge.) | The Mistral 7B paper introduces a compact yet powerful language model that, despite its relatively modest size of 7 billion tokens, outperforms its larger counterparts, such as the 13B Llama 2 model, in various benchmarks. (Next to the two-times larger Qwen 14B, Mistral 7B was also the base model used in the winning solutions of this year's NeurIPS LLM Finetuning & Efficiency challenge.) | ||
==Sources== | |||
[https://arxiv.org/abs/2310.06825 Source : arxiv] | [https://arxiv.org/abs/2310.06825 Source : arxiv] |
Version du 28 janvier 2024 à 10:01
en construction
Définition
XXXXXXXXX
Français
Mistral 7B
Anglais
Mistral 7B
The Mistral 7B paper introduces a compact yet powerful language model that, despite its relatively modest size of 7 billion tokens, outperforms its larger counterparts, such as the 13B Llama 2 model, in various benchmarks. (Next to the two-times larger Qwen 14B, Mistral 7B was also the base model used in the winning solutions of this year's NeurIPS LLM Finetuning & Efficiency challenge.)
Sources
Contributeurs: Patrick Drouin, wiki