« Ensemble de données déséquilibré » (Imbalanced dataset)
Revision as of 3 January 2022, 09:13
Under construction
Definition
XXXXXXXXX
French
XXXXXXXXX
English
Imbalanced Dataset
An imbalanced dataset is one in which the target attribute (the attribute to be predicted) is unevenly distributed across its classes. This is a common situation in data science problems. Detecting fraudulent credit card transactions is a typical example: most transactions are genuine, and only a small fraction are fraudulent.
Imbalanced datasets need special attention because the usual approaches to building models and evaluating performance can be misleading. For instance, a model that labels every transaction as genuine would score a high accuracy while detecting no fraud at all.
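The evaluation problem can be illustrated with a minimal sketch in plain Python. The label list below is a hypothetical toy dataset (990 genuine and 10 fraudulent transactions), and the helper names `class_distribution`, `accuracy`, and `recall` are introduced here for illustration only; they are not taken from the original entry.

```python
from collections import Counter

# Hypothetical toy labels (an assumption for illustration):
# 0 = genuine, 1 = fraudulent; 990 genuine and 10 fraudulent transactions.
y_true = [0] * 990 + [1] * 10


def class_distribution(labels):
    """Return the fraction of samples that belong to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


# A naive baseline that always predicts the majority class ("genuine").
y_pred = [0] * len(y_true)

print(class_distribution(y_true))  # {0: 0.99, 1: 0.01}
print(accuracy(y_true, y_pred))    # 0.99
print(recall(y_true, y_pred))      # 0.0
```

The baseline reaches 99% accuracy yet has a recall of zero on the fraudulent class, which is why a per-class metric such as recall is usually reported alongside accuracy when the classes are imbalanced.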
Contributors: Marie Alfaro, wiki