[Submitted on 21 May 2018 (v1), last revised 31 Jul 2019 (this version, v3)]
Title: Super learning in the SAS system
Abstract: Background and objective: Stacking is an ensemble machine learning method that averages predictions from multiple other algorithms, such as generalized linear models and regression trees. An implementation of stacking, called super learning, has been developed as a general approach to supervised learning and has seen frequent usage, in part due to the availability of an R package. We develop super learning in the SAS software system using a new macro, and demonstrate its performance relative to the R package.
Methods: Following previous work using the R SuperLearner package, we assess the performance of super learning in a number of domains. We compare the R package with the new SAS macro in a small set of simulations assessing curve fitting in a predictive model, as well as in a set of 14 publicly available datasets to assess cross-validated accuracy.
Results: Across the simulated data and the publicly available data, the SAS macro performed similarly to the R package, despite a different set of potential algorithms available natively in R and SAS.
Conclusions: Our super learner macro performs as well as the R package at a number of tasks. Further, by extending the macro to include the use of R packages, the macro can leverage both the robust, enterprise-oriented procedures in SAS and the nimble, cutting-edge packages in R. In the spirit of ensemble learning, this macro extends the potential library of algorithms beyond a single software system and provides a simple avenue into machine learning in SAS.
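The stacking idea summarized in the abstract can be illustrated with a minimal sketch. This is not the SAS macro or the R SuperLearner package: it is a pure-Python toy, under the assumption of only two candidate learners (an intercept-only model and a simple linear regression) and a grid search for a single convex weight that minimizes cross-validated squared error. All function names (`fit_mean`, `fit_linear`, `cross_val_predictions`, `super_learner`) and the simulated data are hypothetical and chosen for illustration only.

```python
import random

def fit_mean(xs, ys):
    # Candidate learner 1: intercept-only model (predicts the training mean).
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Candidate learner 2: simple least-squares line.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cross_val_predictions(fit, xs, ys, k):
    # Out-of-fold predictions: each point is predicted by a model
    # that was trained without that point's fold.
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if i % k == fold:
                preds[i] = model(xs[i])
    return preds

def super_learner(xs, ys, k=5, grid=101):
    # Step 1: cross-validated predictions from each candidate learner.
    z_mean = cross_val_predictions(fit_mean, xs, ys, k)
    z_lin = cross_val_predictions(fit_linear, xs, ys, k)

    # Step 2: choose the convex weight w on the mean learner (1 - w on the
    # linear learner) that minimizes cross-validated mean squared error.
    def cv_mse(w):
        return sum((y - (w * a + (1 - w) * b)) ** 2
                   for y, a, b in zip(ys, z_mean, z_lin)) / len(ys)

    w = min((i / (grid - 1) for i in range(grid)), key=cv_mse)

    # Step 3: refit each learner on all data and combine with the weights.
    m_mean, m_lin = fit_mean(xs, ys), fit_linear(xs, ys)
    return w, (lambda x: w * m_mean(x) + (1 - w) * m_lin(x))

# Simulated data (an assumption for illustration): a noisy straight line.
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + random.gauss(0, 0.5) for x in xs]

w_mean, predict = super_learner(xs, ys)
print("weight on mean learner:", w_mean)
print("ensemble prediction at x = 5:", predict(5.0))
```

Because the simulated outcome is linear in `x`, nearly all of the weight should go to the linear learner. Practical implementations typically use a larger library of learners and a constrained regression (for example, non-negative least squares) in place of the grid search used here.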
Submission history
From: Alexander Keil
[v1] Mon, 21 May 2018 13:53:08 UTC (79 KB)
[v2] Tue, 22 May 2018 18:02:23 UTC (73 KB)
[v3] Wed, 31 Jul 2019 13:43:32 UTC (71 KB)