2001 | OriginalPaper | Book chapter

Improving Identification of Difficult Small Classes by Balancing Class Distribution

Written by: Jorma Laurikkala

Published in: Artificial Intelligence in Medicine

Publisher: Springer Berlin Heidelberg

We studied three data reduction methods that aim to improve identification of difficult small classes by balancing an imbalanced class distribution. The new method, the neighborhood cleaning rule (NCL), outperformed simple random selection and one-sided selection in experiments with ten data sets. All reduction methods improved identification of small classes (20–30%), but the differences were insignificant. However, significant differences in accuracies, true-positive rates and true-negative rates obtained with the 3-nearest neighbor method and C4.5 from the reduced data favored NCL. The results suggest that NCL is a useful method for improving the modeling of difficult small classes, and for building classifiers to identify these classes in real-world data.
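The abstract does not spell out how neighborhood cleaning works, so the following is only a minimal sketch of the general idea, written under stated assumptions rather than taken from the chapter. It assumes a 3-nearest-neighbor vote with Euclidean distance: a majority-class example that disagrees with its neighborhood is dropped as noise, and when a small-class example disagrees with its neighborhood, the majority-class neighbors are dropped instead. The function names `k_nearest` and `neighborhood_clean` and the toy data are hypothetical, and the published method may include further conditions that this sketch omits.

```python
import math
from collections import Counter


def k_nearest(idx, points, k=3):
    """Indices of the k points closest to points[idx] (Euclidean), excluding idx."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def neighborhood_clean(points, labels, minority, k=3):
    """Return the sorted indices kept after one simplified cleaning pass.

    Assumption: every decision uses neighborhoods in the ORIGINAL data;
    removals are collected first and applied at the end.
    """
    to_remove = set()
    for i, label in enumerate(labels):
        neighbors = k_nearest(i, points, k)
        vote = Counter(labels[j] for j in neighbors).most_common(1)[0][0]
        if vote == label:
            continue  # the example agrees with its neighborhood
        if label != minority:
            # Majority-class example contradicted by its neighbors: treat as noise.
            to_remove.add(i)
        else:
            # Small-class example contradicted by its neighbors:
            # keep it and remove the majority-class neighbors instead.
            to_remove.update(j for j in neighbors if labels[j] != minority)
    return [i for i in range(len(points)) if i not in to_remove]


# Toy 1-D data: a majority cluster near 0-3, a small class near 10-12,
# and one majority-class point (index 7) sitting inside the small class.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,), (11.4,)]
labels = ["maj", "maj", "maj", "maj", "min", "min", "min", "maj"]

kept = neighborhood_clean(points, labels, minority="min")
print(kept)
```

In this toy run only the majority-class point inside the small class is removed, which shows the intent of such cleaning: data reduction focuses on majority examples that crowd the small class rather than on random deletions.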

Metadata
Title
Improving Identification of Difficult Small Classes by Balancing Class Distribution
Written by
Jorma Laurikkala
Copyright year
2001
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-48229-6_9