Statistics > Applications
[Submitted on 18 Sep 2018]
Title: Pan-disease clustering analysis of the trend of period prevalence
Abstract: For all diseases, prevalence has been carefully studied. In the "classic" paradigm, the prevalence of different diseases has usually been studied separately. Accumulating evidence has shown that diseases can be "correlated". The joint analysis of the prevalence of multiple diseases can provide important insights beyond individual-disease analysis but has not been well conducted. In this study, we take advantage of the uniquely valuable Taiwan National Health Insurance Research Database (NHIRD) and conduct a pan-disease analysis of period prevalence trends. The goal is to identify clusters within which diseases share similar period prevalence trends. For this purpose, a novel penalization pursuit approach is developed, which has an intuitive formulation and satisfactory properties. In the data analysis, the period prevalence values are computed using records on close to 1 million subjects and 14 years of observation. For 405 diseases, 35 nontrivial clusters (of size larger than one) and 27 trivial clusters (of size one) are identified. The results differ significantly from those of the alternatives. A closer examination suggests that the clustering results have sound interpretations. This study is the first to conduct a pan-disease clustering analysis of prevalence trends using the uniquely valuable NHIRD data and can have important value in multiple aspects.
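To make the quantity being clustered concrete, the following is a minimal sketch of how a period prevalence trend could be computed from claims-style records. It uses the textbook definition (distinct subjects with at least one record of the disease during a period, divided by the population size) and is not the authors' code. The function name `period_prevalence`, the `(subject_id, disease, year)` record layout, the fixed population denominator, and the one-year periods are all assumptions made for illustration; the abstract does not specify these details.

```python
from collections import defaultdict


def period_prevalence(records, population, periods):
    """Return {disease: [prevalence for each period]}.

    records    : iterable of (subject_id, disease, year) tuples (hypothetical layout)
    population : number of subjects under observation (assumed constant)
    periods    : list of (start_year, end_year) pairs, both ends inclusive
    """
    # For each disease, keep one set of distinct subject ids per period,
    # so a subject with several records in a period is counted once.
    cases = defaultdict(lambda: [set() for _ in periods])
    for subject_id, disease, year in records:
        for i, (start, end) in enumerate(periods):
            if start <= year <= end:
                cases[disease][i].add(subject_id)
    return {
        disease: [len(subjects) / population for subjects in per_period]
        for disease, per_period in cases.items()
    }


# Toy data: 4 subjects, 2 diseases, 3 one-year periods.
records = [
    (1, "A", 2000), (1, "A", 2001), (2, "A", 2001),
    (3, "B", 2000), (3, "B", 2002),
]
periods = [(2000, 2000), (2001, 2001), (2002, 2002)]
trend = period_prevalence(records, population=4, periods=periods)
print(trend["A"])  # [0.25, 0.5, 0.0]
print(trend["B"])  # [0.25, 0.0, 0.25]
```

Each disease ends up with one trend vector across periods, and it is vectors of this kind that a clustering method would group by similarity.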