"Feature selection": difference between revisions


#REDIRECTION[[Sélection de caractéristiques]]


== Under construction ==
[[Catégorie:ENGLISH]]
<small>Enter domains and categories here...</small>
[[Category:Vocabulary]]
== Definition ==
 
 
 
== French ==
 
== English ==
 
''' Feature selection '''
 
 
In machine learning and statistics, feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection techniques are used for four reasons:
 
* simplification of models, making them easier for researchers and users to interpret,[1]
* shorter training times,
* avoidance of the curse of dimensionality,
* enhanced generalization through reduced overfitting[2] (formally, reduction of variance[1]).
 
The central premise of any feature selection technique is that the data contains many features that are either redundant or irrelevant, and can thus be removed without incurring much loss of information.[2] Redundancy and irrelevance are distinct notions, since a relevant feature may be redundant in the presence of another relevant feature with which it is strongly correlated.[3]
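
The distinction between redundancy and irrelevance can be illustrated with a small sketch. It is not a standard algorithm from any particular library: the helper names <code>pearson</code> and <code>drop_redundant</code>, the threshold of 0.95, and the synthetic data are all assumptions chosen for illustration. Here <code>x1</code> is relevant to the target but redundant given <code>x0</code>, while <code>x3</code> is irrelevant but not redundant.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def drop_redundant(columns, threshold=0.95):
    """Greedy filter (illustrative): keep a feature only if its absolute
    correlation with every feature kept so far is below the threshold."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic data (assumed for the example only).
rng = random.Random(0)
n = 500
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [2 * v + 0.01 * rng.gauss(0, 1) for v in x0]  # near-copy of x0
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [rng.gauss(0, 1) for _ in range(n)]           # unrelated to the target
y = [a + b for a, b in zip(x0, x2)]                # target depends on x0 and x2

columns = [x0, x1, x2, x3]
relevance = [abs(pearson(c, y)) for c in columns]  # correlation with target
kept = drop_redundant(columns)                     # indices surviving the filter
```

In this sketch the redundancy filter discards <code>x1</code> even though it correlates with the target, and keeps <code>x3</code> even though it does not. Removing irrelevant features would need a separate criterion, such as a minimum value of <code>relevance</code>.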
 
Feature selection techniques should be distinguished from feature extraction. Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the features. Feature selection techniques are often used in domains where there are many features and comparatively few samples (or data points). Archetypal cases for the application of feature selection include the analysis of written texts and DNA microarray data, where there are many thousands of features, and a few tens to hundreds of samples.
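
A minimal sketch of the difference between the two, under stated assumptions: the function names <code>select_by_variance</code> and <code>extract_features</code>, the variance criterion, and the toy data are hypothetical and serve only to contrast the two ideas. Selection returns some of the original columns unchanged, whereas extraction computes new columns from them.

```python
import statistics

def select_by_variance(rows, k):
    """Feature selection (illustrative): keep the k original columns
    with the largest variance, preserving their original order."""
    n_cols = len(rows[0])
    variances = [statistics.pvariance([row[j] for row in rows])
                 for j in range(n_cols)]
    top = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)[:k]
    indices = sorted(top)
    return indices, [[row[j] for j in indices] for row in rows]

def extract_features(rows):
    """Feature extraction (illustrative): build new features as functions
    of the originals, here the mean and the range of each row."""
    return [[sum(row) / len(row), max(row) - min(row)] for row in rows]

# Toy data: three samples, three features; the middle feature is constant.
rows = [
    [1.0, 10.0, 5.0],
    [2.0, 10.0, 3.0],
    [3.0, 10.0, 1.0],
]

indices, selected = select_by_variance(rows, k=2)  # subset of original columns
extracted = extract_features(rows)                 # newly computed columns
```

Every value in <code>selected</code> appears verbatim in <code>rows</code>, while the values in <code>extracted</code> generally do not appear in the input.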
 

Latest revision as of 25 September 2019, 22:40



Contributors: wiki