"Gaussian Naive Bayes": difference between revisions


==under construction==
== Definition ==
SEE '''[[classification naïve bayésienne]]''' (naive Bayes classification)


== French ==
'''classification naïve bayésienne'''


== English ==
'''Gaussian naïve Bayes'''
When dealing with continuous data, a typical assumption is that the continuous values associated with each class are distributed according to a normal (or Gaussian) distribution. For example, suppose the training data contains a continuous attribute, <math>x</math>. We first segment the data by class, and then compute the mean and variance of <math>x</math> in each class. Let <math>\mu_k</math> be the mean of the values of <math>x</math> associated with class <math>C_k</math>, and let <math>\sigma_k^2</math> be the Bessel-corrected variance of the values of <math>x</math> associated with class <math>C_k</math>. Suppose we have collected some observation value <math>v</math>. Then the probability density of <math>v</math> given a class <math>C_k</math>, <math>p(x=v\mid C_k)</math>, can be computed by plugging <math>v</math> into the equation for a normal distribution parameterized by <math>\mu_k</math> and <math>\sigma_k^2</math>. That is,
 
<math>p(x=v\mid C_{k})={\frac {1}{\sqrt {2\pi \sigma _{k}^{2}}}}\,e^{-{\frac {(v-\mu _{k})^{2}}{2\sigma _{k}^{2}}}}</math>
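As a minimal sketch of the computation described above, the following Python snippet estimates the per-class mean and Bessel-corrected variance and then evaluates the normal density at an observation. The function names, the class labels "A" and "B", and the training values are hypothetical and chosen only for illustration; only the Python standard library is used.

```python
import math
from statistics import mean, variance  # variance() is the Bessel-corrected sample variance


def fit_gaussian(values):
    """Return (mean, Bessel-corrected variance) of the values of x seen in one class."""
    return mean(values), variance(values)


def gaussian_density(v, mu, var):
    """Evaluate the normal density with mean mu and variance var at the point v."""
    return math.exp(-((v - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


# Hypothetical training values of x, already segmented by class.
x_by_class = {
    "A": [5.0, 6.0, 7.0],
    "B": [1.0, 2.0, 3.0],
}

# One (mean, variance) pair per class.
params = {label: fit_gaussian(values) for label, values in x_by_class.items()}

# Class-conditional density of a new observation v under each class.
v = 6.0
likelihoods = {
    label: gaussian_density(v, mu, var) for label, (mu, var) in params.items()
}
```

In a full classifier these densities would be multiplied by the class priors (and by the densities of any other features) before comparing classes; that step is omitted here.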
Another common technique for handling continuous values is to discretize the feature values by binning, which yields a new set of Bernoulli-distributed features. Some literature suggests that this is necessary in order to apply naive Bayes, but it is not, and the discretization may throw away discriminative information.[5]
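The binning approach can be sketched as follows, assuming equal-width bins over a fixed range. The helper names, the range and the bin count are hypothetical; other binning schemes (for example quantile-based bins) would work just as well.

```python
def bin_index(v, lo, hi, n_bins):
    """Map v to the index of its equal-width bin on [lo, hi]; out-of-range values are clamped."""
    if v <= lo:
        return 0
    if v >= hi:
        return n_bins - 1
    return int((v - lo) / (hi - lo) * n_bins)


def to_indicators(v, lo, hi, n_bins):
    """One 0/1 feature per bin: 1 for the bin that contains v, 0 for all others."""
    idx = bin_index(v, lo, hi, n_bins)
    return [1 if i == idx else 0 for i in range(n_bins)]


# Hypothetical range and bin count, chosen only for illustration.
features = to_indicators(6.2, lo=0.0, hi=10.0, n_bins=5)
```

With these settings, any two values that fall in the same bin (such as 6.2 and 6.9) produce identical indicator features, which illustrates how discretization can discard information.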
 
Sometimes the class-conditional marginal densities are far from normal. In these cases, kernel density estimation can be used to obtain a more realistic estimate of the marginal densities of each class. This method, which was introduced by John and Langley,[12] can boost the accuracy of the classifier considerably.[13][14]
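A kernel density estimate of a class-conditional density can be sketched as below. This is an illustration that assumes a Gaussian kernel and a hand-picked fixed bandwidth; it is not necessarily the exact procedure of the cited work, and the sample values and function name are hypothetical.

```python
import math


def kde_density(v, samples, bandwidth):
    """Gaussian-kernel density estimate at v from the samples of one class."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((v - s) / bandwidth) ** 2) for s in samples) / norm


# Hypothetical bimodal values of x for a single class.
class_samples = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]

# The bandwidth is an assumption; in practice it would be tuned or set by a rule of thumb.
near_mode = kde_density(1.0, class_samples, bandwidth=0.3)
between_modes = kde_density(3.0, class_samples, bandwidth=0.3)
```

The sample mean here is 3.0, so a single fitted normal density would peak between the two clusters, whereas the kernel estimate assigns far more density near the observed values than in the gap between them.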


<small>

Revision as of 25 November 2021, 12:59

Definition

SEE classification naïve bayésienne (naive Bayes classification)

French

classification naïve bayésienne

English

Gaussian naïve Bayes

Source: Wikipedia Machine Learning

Contributors: Claire Gorjux, wiki