Computer Science > Machine Learning
[Submitted on 3 Mar 2021 (v1), last revised 24 Jan 2022 (this version, v2)]
Title:Lazy FSCA for Unsupervised Variable Selection
View PDFAbstract:Various unsupervised greedy selection methods have been proposed as computationally tractable approximations to the NP-hard subset selection problem. These methods rely on sequentially selecting the variables that best improve performance with respect to a selection criterion. Theoretical results exist that provide performance bounds and enable "lazy greedy" efficient implementations for selection criteria that satisfy a diminishing returns property known as submodularity. This has motivated the development of variable selection algorithms based on mutual information and frame potential. Recently, the authors introduced Forward Selection Component Analysis (FSCA) which uses variance explained as its selection criterion. While this criterion is not submodular, FSCA has been shown to be highly effective for applications such as measurement plan optimisation. In this paper a "lazy" implementation of the FSCA algorithm (L-FSCA) is proposed, which, although not equivalent to FSCA due to the absence of submodularity, has the potential to yield comparable performance while being up to an order of magnitude faster to compute. The efficacy of L-FSCA is demonstrated by performing a systematic comparison with FSCA and five other unsupervised variable selection methods from the literature using simulated and real-world case studies. Experimental results confirm that L-FSCA yields almost identical performance to FSCA while reducing computation time by between 22% and 94% for the case studies considered.
Submission history
From: Federico Zocco [view email][v1] Wed, 3 Mar 2021 21:10:26 UTC (553 KB)
[v2] Mon, 24 Jan 2022 12:14:10 UTC (291 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
IArxiv Recommender
(What is IArxiv?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.