UniVideo - Revision history

Pitpitt on 1 May 2026 at 00:19

2026-05-01T00:19:49Z

← Previous revision		Revision as of 30 April 2026 at 20:19
Line 22:		Line 22:
	[https://congwei1230.github.io/UniVideo/ Source : UniVideo, GitHub.io]		[https://congwei1230.github.io/UniVideo/ Source : UniVideo, GitHub.io]

	~~[[Catégorie:vocabulary]]~~		[[Catégorie:GRAND LEXIQUE FRANÇAIS]]

	~~[[Catégorie:Publication]]~~

Claude COULOMBE on 28 April 2026 at 19:05

2026-04-28T19:05:23Z

← Previous revision		Revision as of 28 April 2026 at 15:05
Line 1:		Line 1:
	~~== EN CONSTRUCTION ==~~

	== Définition ==		== Définition ==
	Nom propre d'un outil permettant de combiner une requête textuelle grâce à un '''[[grand modèle de langues (GML)]]''' et des images sources afin de '''[[génération automatique d'image\|générer un montage vidéo]]''' qui combine ces images selon la requête.		Nom propre d'un outil permettant de combiner une requête textuelle grâce à un '''[[grand modèle de langues (GML)]]''' et des images sources afin de '''[[génération automatique d'image\|générer un montage vidéo]]''' qui combine ces images selon la requête.
Line 25:		Line 23:

	[[Catégorie:vocabulary]]		[[Catégorie:vocabulary]]

			[[Catégorie:Publication]]

Claude COULOMBE on 28 April 2026 at 19:04

2026-04-28T19:04:39Z

← Previous revision		Revision as of 28 April 2026 at 15:04
Line 2:		Line 2:

	== Définition ==		== Définition ==
	Nom propre d'un outil permettant de combiner une requête textuelle grâce à un '''[[grand modèle de langues (GML)]]''' et des images sources afin de ''[[génération automatique d'image\|générer un montage vidéo]]''' qui combine ces images selon la requête.		Nom propre d'un outil permettant de combiner une requête textuelle grâce à un '''[[grand modèle de langues (GML)]]''' et des images sources afin de '''[[génération automatique d'image\|générer un montage vidéo]]''' qui combine ces images selon la requête.

	== Compléments ==		== Compléments ==

Claude COULOMBE on 28 April 2026 at 19:03

2026-04-28T19:03:40Z

← Previous revision		Revision as of 28 April 2026 at 15:03
Line 2:		Line 2:

	== Définition ==		== Définition ==
	~~Cadre~~ permettant de ~~comprendre la~~ '''[[génération automatique d'image]]''' ~~et le~~ montage ~~du domaine~~ de la vidéo ~~grâce à~~ une architecture à double flux, ~~combinant un '''[[grand modèle de langues (GML)]]''' pour la compréhension des instructions~~ et un modèle '''DiT multimodal (MMDiT)''' ~~pour la~~ génération d'image.		Nom propre d'un outil permettant de combiner une requête textuelle grâce à un '''[[grand modèle de langues (GML)]]''' et des images sources afin de ''[[génération automatique d'image\|générer un montage vidéo]]''' qui combine ces images selon la requête.

			== Compléments ==
			Le montage de la vidéo utilise une architecture à double flux, et un modèle '''DiT multimodal (MMDiT)''' de génération d'image.

	== Français ==		== Français ==
Line 10:		Line 13:
	'''UniVideo '''		'''UniVideo '''

	<!--Framework for ~~unederstanding~~ generation and editing in the video domain with a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation.		<!--Framework for understanding generation and editing in the video domain with a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation.

	Multimodal DiT?-->		Multimodal DiT?-->
Line 20:		Line 23:

	[https://congwei1230.github.io/UniVideo/ Source : UniVideo, GitHub.io]		[https://congwei1230.github.io/UniVideo/ Source : UniVideo, GitHub.io]


	[[Catégorie:vocabulary]]		[[Catégorie:vocabulary]]

Arianne on 13 March 2026 at 15:15

2026-03-13T15:15:14Z

← Previous revision		Revision as of 13 March 2026 at 11:15
Line 2:		Line 2:

	== Définition ==		== Définition ==
	~~xxxxx~~		Cadre permettant de comprendre la '''[[génération automatique d'image]]''' et le montage du domaine de la vidéo grâce à une architecture à double flux, combinant un '''[[grand modèle de langues (GML)]]''' pour la compréhension des instructions et un modèle '''DiT multimodal (MMDiT)''' pour la génération d'image.

	== Français ==		== Français ==
Line 10:		Line 10:
	'''UniVideo '''		'''UniVideo '''

	<!--Framework for unederstanding generation and editing in the video domain with a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation.-->		<!--Framework for unederstanding generation and editing in the video domain with a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation.

			Multimodal DiT?-->

	==Sources==		==Sources==

Arianne on 24 February 2026 at 15:58

2026-02-24T15:58:24Z

← Previous revision		Revision as of 24 February 2026 at 11:58
Line 8:		Line 8:

	== Anglais ==		== Anglais ==
	'''~~xxxUniVideoxx~~ '''		'''UniVideo '''

	~~A unified framework that combines video understanding,~~ generation, and editing ~~capabilities within a single model. Unlike existing approaches that handle these tasks separately, UniVideo can interpret complex multimodal instructions and perform diverse~~ video ~~operations through~~ a dual-stream ~~architecture. The system demonstrates strong performance across multiple video tasks while enabling novel capabilities like visual prompt understanding and task composition.~~		<!--Framework for unederstanding generation and editing in the video domain with a dual-stream design, combining a Multimodal Large Language Model (MLLM) for instruction understanding with a Multimodal DiT (MMDiT) for video generation.-->
	~~UniVideo~~, ~~a dual-stream framework~~ combining a Multimodal Large Language Model ~~and~~ a Multimodal DiT~~, extends unified modeling to~~ video generation ~~and editing, achieving state-of~~-~~the~~-~~art performance and supporting task composition and generalization.~~

	==Sources==		==Sources==
	[https://huggingface.co/papers/2510.08377 ~~Sources~~ : huggingface]		[https://arxiv.org/abs/2510.08377 Source : arxiv]

			[https://huggingface.co/papers/2510.08377 Source : huggingface]

			[https://congwei1230.github.io/UniVideo/ Source : UniVideo, GitHub.io]


	[[Catégorie:vocabulary]]		[[Catégorie:vocabulary]]

Pitpitt on 28 October 2025 at 00:59

2025-10-28T00:59:18Z

← Previous revision		Revision as of 27 October 2025 at 20:59
Line 11:		Line 11:

	A unified framework that combines video understanding, generation, and editing capabilities within a single model. Unlike existing approaches that handle these tasks separately, UniVideo can interpret complex multimodal instructions and perform diverse video operations through a dual-stream architecture. The system demonstrates strong performance across multiple video tasks while enabling novel capabilities like visual prompt understanding and task composition.		A unified framework that combines video understanding, generation, and editing capabilities within a single model. Unlike existing approaches that handle these tasks separately, UniVideo can interpret complex multimodal instructions and perform diverse video operations through a dual-stream architecture. The system demonstrates strong performance across multiple video tasks while enabling novel capabilities like visual prompt understanding and task composition.
	UniVideo, a dual-stream framework combining a Multimodal Large Language Model and a Multimodal DiT, extends unified modeling to video generation and editing, achieving state-of-the-art performance and supporting task composition and generalization.		UniVideo, a dual-stream framework combining a Multimodal Large Language Model and a Multimodal DiT, extends unified modeling to video generation and editing, achieving state-of-the-art performance and supporting task composition and generalization.

	==Sources==		==Sources==

Pitpitt: Page created with « == EN CONSTRUCTION == == Définition == xxxxx == Français == '''UniVideo''' == Anglais == '''xxxUniVideoxx ''' A unified framework that combines video understanding, generation, and editing capabilities within a single model. Unlike existing approaches that handle these tasks separately, UniVideo can interpret complex multimodal instructions and perform diverse video operations through a dual-stream architecture. The system demonstrates strong performance a... »

2025-10-28T00:58:51Z


New page

== EN CONSTRUCTION ==

== Définition ==
xxxxx

== Français ==
'''UniVideo'''

== Anglais ==
'''xxxUniVideoxx '''

A unified framework that combines video understanding, generation, and editing capabilities within a single model. Unlike existing approaches that handle these tasks separately, UniVideo can interpret complex multimodal instructions and perform diverse video operations through a dual-stream architecture. The system demonstrates strong performance across multiple video tasks while enabling novel capabilities like visual prompt understanding and task composition.
UniVideo, a dual-stream framework combining a Multimodal Large Language Model and a Multimodal DiT, extends unified modeling to video generation and editing, achieving state-of-the-art performance and supporting task composition and generalization.

==Sources==
[https://huggingface.co/papers/2510.08377 Sources : huggingface]

[[Catégorie:vocabulary]]