|
|
(4 intermediate revisions by 2 users not shown) |
Line 1: |
Line 1: |
| ==Under construction==
| | #REDIRECTION[[Classification naïve bayésienne]] |
|
| |
|
| == Definition ==
| | [[Catégorie:ENGLISH]] |
| XXXXXXXXX
| |
| | |
| == French ==
| |
| ''' XXXXXXXXX '''
| |
| | |
| == English ==
| |
| ''' Gaussian naïve Bayes'''
| |
| When dealing with continuous data, a typical assumption is that the continuous values associated with each class are distributed according to a normal (or Gaussian) distribution. For example, suppose the training data contains a continuous attribute, <math>x</math>. We first segment the data by class, and then compute the mean and variance of <math>x</math> in each class. Let <math>\mu_k</math> be the mean of the values of <math>x</math> associated with class <math>C_k</math>, and let <math>\sigma_k^2</math> be the Bessel-corrected variance of the values of <math>x</math> associated with class <math>C_k</math>. Suppose we have collected some observation value <math>v</math>. Then the probability density of <math>v</math> given a class <math>C_k</math>, <math>p(x=v\mid C_k)</math>, can be computed by plugging <math>v</math> into the equation for a normal distribution parameterized by <math>\mu_k</math> and <math>\sigma_k^2</math>. That is,
| |
| | |
| <math>p(x=v\mid C_k)=\frac{1}{\sqrt{2\pi\sigma_k^2}}\,e^{-\frac{(v-\mu_k)^2}{2\sigma_k^2}}</math>
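The estimation step described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `fit_gaussian_params` and `gaussian_likelihood` and the toy data are assumptions introduced here, and only a single continuous attribute is handled.

```python
import math
from collections import defaultdict
from statistics import mean, variance  # variance() is the Bessel-corrected sample variance


def fit_gaussian_params(values, labels):
    """Return {class: (mu_k, sigma2_k)} for one continuous attribute x."""
    by_class = defaultdict(list)
    for v, c in zip(values, labels):
        by_class[c].append(v)  # segment the data by class
    return {c: (mean(vs), variance(vs)) for c, vs in by_class.items()}


def gaussian_likelihood(v, mu, sigma2):
    """Normal density p(x = v | C_k) with mean mu and variance sigma2."""
    return math.exp(-((v - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


# Made-up toy data (hypothetical): one attribute, two classes "a" and "b".
values = [6.0, 5.9, 5.6, 5.0, 5.5, 5.4]
labels = ["a", "a", "a", "b", "b", "b"]

params = fit_gaussian_params(values, labels)
likelihoods = {c: gaussian_likelihood(5.8, mu, s2) for c, (mu, s2) in params.items()}
```

In a full classifier these per-class densities would be multiplied by the class priors (and by the densities of the other attributes) before comparing classes; that step is left out to keep the sketch focused on the formula.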
| |
| Another common technique for handling continuous values is to discretize the feature values by binning, which yields a new set of Bernoulli-distributed features. Some literature suggests that this is necessary in order to apply naïve Bayes, but it is not, and the discretization may throw away discriminative information.[5]
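As a rough illustration of binning, the sketch below maps a continuous value to an equal-width bin and then to indicator (0/1) features. Equal-width bins, the range `[0, 10]`, and the helper names `equal_width_bin` and `one_hot` are assumptions made for this example; the text does not prescribe a particular binning scheme.

```python
def equal_width_bin(v, lo, hi, n_bins):
    """Index of the equal-width bin over [lo, hi] containing v (clamped to the range)."""
    if v <= lo:
        return 0
    if v >= hi:
        return n_bins - 1
    return int((v - lo) / (hi - lo) * n_bins)


def one_hot(index, n_bins):
    """Turn a bin index into n_bins indicator features, exactly one of which is 1."""
    return [1 if i == index else 0 for i in range(n_bins)]


# Two different values that fall into the same bin become indistinguishable.
features_a = one_hot(equal_width_bin(2.6, 0.0, 10.0, 4), 4)
features_b = one_hot(equal_width_bin(4.9, 0.0, 10.0, 4), 4)
```

Here `features_a` and `features_b` are identical even though the underlying values differ, which is one way the discretization can discard discriminative information.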
| |
| | |
| Sometimes the distribution of class-conditional marginal densities is far from normal. In these cases, kernel density estimation can be used for a more realistic estimate of the marginal densities of each class. This method, which was introduced by John and Langley,[12] can boost the accuracy of the classifier considerably.[13][14]
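To make the idea concrete, here is a minimal sketch of a one-dimensional kernel density estimate with Gaussian kernels, used in place of the single normal density for one class. The function name `gaussian_kde`, the fixed bandwidth of 0.3, and the bimodal sample are assumptions for illustration; this is not necessarily the exact estimator or bandwidth rule used in the cited work.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(v):
        # Average of normal densities centred on each training value.
        return norm * sum(math.exp(-0.5 * ((v - s) / bandwidth) ** 2) for s in samples)

    return density


# Made-up bimodal values of one attribute within a single class (hypothetical data).
class_samples = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]
p_given_class = gaussian_kde(class_samples, bandwidth=0.3)
```

Evaluating `p_given_class` near 1 or 5 gives a much higher density than near 3, a shape that a single normal distribution fitted to the same values could not represent.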
| |
| | |
| <small>
| |
| | |
| [https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Gaussian_na%C3%AFve_Bayes Source: Wikipedia Machine Learning]</small>
| |
| | |
| | |
| [[Catégorie:vocabulary]]
| |
| [[Catégorie:Wikipedia-IA]] | |