"SAIL-VL2": difference between revisions
(Page created with "== Definition == XXXXXXXXX == French == ''' SAIL-VL2''' == English == '''SAIL-VL2''' An open-source vision-language foundation model designed for comprehensive multimodal understanding and reasoning. SAIL-VL2 represents a comprehensive advancement in efficient vision-language modeling through innovations in architecture, training strategies, and data curation. The model successfully demonstrates that smaller, well-designed models can achieve competitive...")
No edit summary
(One intermediate revision by another user not shown)
Line 9: | Line 9:
'''SAIL-VL2''' | '''SAIL-VL2'''
<!--Vision-language foundation model for comprehensive multimodal understanding and reasoning. It achieves state-of-the-art performance across diverse benchmarks through data curation, progressive training, and sparse MoE architecture.-->
== | == Sources ==
[https://arxiv.org/abs/2509.14033 Source : arxiv]
[https://github.com/BytedanceDouyinContent/SAIL-VL2 Source : GitHub]
[https://huggingface.co/papers/2509.14033 Source : huggingface] | [https://huggingface.co/papers/2509.14033 Source : huggingface]
Latest revision as of 14:16, 23 February 2026
Definition
XXXXXXXXX
French
SAIL-VL2
English
SAIL-VL2
Sources
Contributors: Arianne Arel, wiki





