Radial-Based Undersampling for Imbalanced Data Classification

Koziarski, Michał

arXiv:1906.00452 (cs)

[Submitted on 2 Jun 2019 (v1), last revised 17 Apr 2021 (this version, v2)]

Abstract: Data imbalance remains one of the most widespread problems affecting contemporary machine learning. Its negative effect on traditional learning algorithms is most severe in combination with other dataset difficulty factors, such as small disjuncts, the presence of outliers, and an insufficient number of training observations. These difficulty factors can also limit the applicability of some methods for dealing with data imbalance, in particular the neighborhood-based oversampling algorithms derived from SMOTE. Radial-Based Oversampling (RBO) was previously proposed to mitigate some of the limitations of the neighborhood-based methods. In this paper we examine the possibility of using the concept of mutual class potential, which guides the oversampling process in RBO, in an undersampling procedure. A computational complexity analysis indicates a significantly reduced time complexity of the proposed Radial-Based Undersampling algorithm, and the results of an experimental study indicate its usefulness, especially on difficult datasets.
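
To make the idea of potential-guided undersampling concrete, here is a minimal Python sketch. The abstract does not define the mutual class potential or the selection rule, so everything below is an assumption made for illustration: the potential is taken to be a sum of Gaussian radial basis functions (positive contributions from majority observations, negative from minority ones), the majority observations with the highest potential are assumed to be the ones discarded, and the names `mutual_class_potential`, `radial_undersample`, and `gamma` are hypothetical. It should not be read as the paper's exact algorithm.

```python
import numpy as np


def mutual_class_potential(point, majority, minority, gamma=1.0):
    """Assumed form of the potential: Gaussian RBF mass of the majority
    class minus that of the minority class, evaluated at `point`."""
    def rbf_sum(data):
        sq_dist = np.sum((data - point) ** 2, axis=1)
        return np.sum(np.exp(-sq_dist / gamma ** 2))

    return rbf_sum(majority) - rbf_sum(minority)


def radial_undersample(X, y, minority_label=1, gamma=1.0):
    """Drop majority observations with the highest potential until both
    classes have the same size (assumed selection rule)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    is_minority = y == minority_label
    majority, minority = X[~is_minority], X[is_minority]

    # Nothing to do if the data is already balanced.
    if len(majority) <= len(minority):
        return X, y

    potentials = np.array(
        [mutual_class_potential(p, majority, minority, gamma) for p in majority]
    )
    # Keep the majority observations with the lowest potential,
    # preserving their original order.
    keep = np.sort(np.argsort(potentials)[: len(minority)])

    X_out = np.vstack([majority[keep], minority])
    y_out = np.concatenate([y[~is_minority][keep], y[is_minority]])
    return X_out, y_out
```

Under these assumptions, observations deep inside dense majority regions receive the highest potential and are removed first, while majority observations that are isolated or close to the minority class are retained. The spread parameter `gamma` controls how far each observation's influence reaches.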

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1906.00452 [cs.LG] (or arXiv:1906.00452v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1906.00452

Submission history

From: Michał Koziarski
[v1] Sun, 2 Jun 2019 17:06:28 UTC (972 KB)
[v2] Sat, 17 Apr 2021 13:51:23 UTC (983 KB)
