Computer Science > Neural and Evolutionary Computing
[Submitted on 8 Feb 2022]
Title:Feature subset selection for Big Data via Chaotic Binary Differential Evolution under Apache Spark
View PDFAbstract:Feature subset selection (FSS) using a wrapper approach is essentially a combinatorial optimization problem having two objective functions namely cardinality of the selected-feature-subset, which should be minimized and the corresponding area under the ROC curve (AUC) to be maximized. In this research study, we propose a novel multiplicative single objective function involving cardinality and AUC. The randomness involved in the Binary Differential Evolution (BDE) may yield less diverse solutions thereby getting trapped in local minima. Hence, we embed Logistic and Tent chaotic maps into the BDE and named it as Chaotic Binary Differential Evolution (CBDE). Designing a scalable solution to the FSS is critical when dealing with high-dimensional and voluminous datasets. Hence, we propose a scalable island (iS) based parallelization approach where the data is divided into multiple partitions/islands thereby the solution evolves individually and gets combined eventually in a migration strategy. The results empirically show that the proposed parallel Chaotic Binary Differential Evolution (P-CBDE-iS) is able to find the better quality feature subsets than the Parallel Bi-nary Differential Evolution (P-BDE-iS). Logistic Regression (LR) is used as a classifier owing to its simplicity and power. The speedup attained by the proposed parallel approach signifies the importance.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.