Super learning in the SAS system

Keil, Alexander P.; Westreich, Daniel; Edwards, Jessie K; Cole, Stephen R

arXiv:1805.08058 (stat)

[Submitted on 21 May 2018 (v1), last revised 31 Jul 2019 (this version, v3)]

Abstract: Background and objective: Stacking is an ensemble machine learning method that averages predictions from multiple other algorithms, such as generalized linear models and regression trees. An implementation of stacking, called super learning, has been developed as a general approach to supervised learning and has seen frequent usage, in part due to the availability of an R package. We develop super learning in the SAS software system using a new macro, and demonstrate its performance relative to the R package.
Methods: Following previous work using the R SuperLearner package, we assess the performance of super learning in a number of domains. We compare the R package with the new SAS macro in a small set of simulations assessing curve fitting in a predictive model, as well as in a set of 14 publicly available datasets to assess cross-validated accuracy.
Results: Across the simulated data and the publicly available data, the SAS macro performed similarly to the R package, despite a different set of potential algorithms available natively in R and SAS.
Conclusions: Our super learner macro performs as well as the R package at a number of tasks. Further, by extending the macro to include the use of R packages, the macro can leverage both the robust, enterprise-oriented procedures in SAS and the nimble, cutting-edge packages in R. In the spirit of ensemble learning, this macro extends the potential library of algorithms beyond a single software system and provides a simple avenue into machine learning in SAS.
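
Illustrative sketch (not part of the paper): the stacking idea described in the abstract can be shown in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the SAS macro or the R SuperLearner package. It assumes only two candidate learners (a mean predictor and a simple linear regression), squared-error loss, and a convex combination of the two. All function names (`fit_mean`, `fit_line`, `cv_predictions`, `super_learner`) and the simulated data are hypothetical and chosen for illustration.

```python
import random

def fit_mean(xs, ys):
    """Candidate learner 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate learner 2: simple least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def cv_predictions(fit, xs, ys, k):
    """Out-of-fold predictions: each point is predicted by a model
    trained on the other k-1 folds (folds assigned by index mod k)."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if i % k == fold:
                preds[i] = model(xs[i])
    return preds

def super_learner(xs, ys, k=5):
    """Pick the weight a in [0, 1] on the line learner (1 - a on the
    mean learner) that minimizes cross-validated squared error, then
    refit both learners on all data and return the weighted predictor."""
    z_mean = cv_predictions(fit_mean, xs, ys, k)
    z_line = cv_predictions(fit_line, xs, ys, k)
    # One-dimensional least squares for a, clipped to the unit interval.
    num = sum((y - zm) * (zl - zm) for y, zm, zl in zip(ys, z_mean, z_line))
    den = sum((zl - zm) ** 2 for zm, zl in zip(z_mean, z_line))
    a = min(1.0, max(0.0, num / den)) if den > 0 else 0.5
    mean_model, line_model = fit_mean(xs, ys), fit_line(xs, ys)
    return a, (lambda x: a * line_model(x) + (1 - a) * mean_model(x))

# Simulated data (an assumption for this sketch): a noisy straight line.
random.seed(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.1) for x in xs]
weight_line, predict = super_learner(xs, ys)
print(round(weight_line, 3), round(predict(2.0), 3))
```

With a larger library of candidate learners, the weights would typically be found by a constrained optimizer rather than the one-dimensional closed form used here.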

Comments:	7 pages, 1 table, 3 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1805.08058 [stat.ML]
	(or arXiv:1805.08058v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1805.08058

Submission history

From: Alexander Keil
[v1] Mon, 21 May 2018 13:53:08 UTC (79 KB)
[v2] Tue, 22 May 2018 18:02:23 UTC (73 KB)
[v3] Wed, 31 Jul 2019 13:43:32 UTC (71 KB)
