QLoRA
en construction
Définition
XXXXXXXXX
Français
QLoRA
Anglais
QLoRA
QLoRA stands for quantized LoRA (low-rank adaptation). The standard LoRA method modifies a pretrained LLM by adding low-rank matrices to the weights of the model's layers. These matrices are smaller and, therefore, require fewer resources to update during finetuning. In QLoRA, these low-rank matrices are quantized, meaning their numerical precision is reduced. This is done by mapping the continuous range of values in these matrices to a limited set of discrete levels. This process reduces the model's memory footprint and computational demands, as operations on lower-precision numbers are less memory-intensive.
Contributeurs: Patrick Drouin, wiki