« Quantification »
[https://towardsdatascience.com/quantisation-and-co-reducing-inference-times-on-llms-by-80-671db9349bdb Source : towardsdatascience]

[https://www.mathworks.com/discovery/quantization.html Source : mathworks]

[[Catégorie:vocabulary]]
Version du 30 octobre 2023 à 09:12
en construction
Définition
XXXXXXXXX
Français
quantification
Anglais
Quantisation
allows us to reduce the size of a neural network by converting its weights and biases from their original floating-point format (e.g. 32-bit) to a lower-precision format (e.g. 8-bit). The original floating-point format can vary depending on factors such as the model's architecture and training process. The purpose of quantisation is to shrink the model, thereby reducing the memory and computational requirements for both inference and training. Quantisation can quickly become fiddly if you attempt to quantise a model yourself.
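To make the idea concrete, here is a minimal sketch of affine (asymmetric) quantisation with NumPy, mapping 32-bit floating-point weights to 8-bit unsigned integers. The function names and parameters are illustrative, not taken from any specific library:

```python
import numpy as np

def quantise(weights, num_bits=8):
    """Affine quantisation: map the float range [min, max] onto integers [0, 2^bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / (qmax - qmin)          # float step between integer levels
    zero_point = int(round(qmin - w_min / scale))     # integer that represents 0.0
    q = np.clip(np.round(weights / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantise(q, scale, zero_point):
    """Recover approximate float weights from the quantised representation."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)  # stand-in for a layer's weights
q, s, z = quantise(w)
w_hat = dequantise(q, s, z)

# float32 stores 4 bytes per weight, uint8 stores 1: a 4x size reduction,
# at the cost of a rounding error bounded by one quantisation step.
print(w.nbytes, q.nbytes, float(np.max(np.abs(w - w_hat))))
```

Each weight is reconstructed to within one quantisation step (`scale`), which is why 8-bit quantisation usually preserves model accuracy while cutting memory use fourfold.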
Contributeurs: Claude Coulombe, Marie Alfaro, wiki