Statistics > Methodology
[Submitted on 31 Mar 2025]
Title: Using directed acyclic graphs to determine whether multiple imputation or subsample multiple imputation estimates of an exposure-outcome association are unbiased
Abstract: Background: Missing data is a pervasive problem in epidemiology, with complete records analysis (CRA) and multiple imputation (MI) the most common methods for dealing with incomplete data. MI is valid when incomplete variables are independent of response indicators, conditional on complete variables; however, this can be hard to assess when multiple variables are incomplete. Previous literature has shown that MI may be valid in subsamples of the data, even if it is not necessarily valid in the full dataset. Current guidance on how to decide whether MI is appropriate is lacking.
Methods: We develop an algorithm that is sufficient to indicate when MI will estimate an exposure-outcome coefficient without bias and show how to implement this using directed acyclic graphs (DAGs). We extend the algorithm to investigate whether MI applied to a subsample of the data, in which some variables are complete and the remainder are imputed, will be unbiased for the same estimand. We demonstrate the algorithm by applying it to several simple examples and a more complex real-life example.
Conclusions: Multiple incomplete variables are common in practice. Assessing whether CRA and MI can each plausibly estimate an exposure-outcome association without bias is crucial when analysing and interpreting results. Our algorithm provides researchers with the tools to decide whether (and how) to use MI in practice. Further work could focus on the likely size and direction of biases, and the impact of different missing data patterns.
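The condition stated in the Background (incomplete variables independent of response indicators, conditional on complete variables) can be read off a DAG as a d-separation statement. The following is a minimal sketch of that single check only, not the algorithm developed in the paper. It assumes a hand-rolled d-separation test based on the standard ancestral moral graph criterion, and the two example DAGs and their node names (`X`, `Y`, `C`, `R_Y`) are hypothetical illustrations, not examples from the paper.

```python
from itertools import combinations


def ancestors(dag, nodes):
    """Return `nodes` plus all their ancestors. `dag` maps child -> set of parents."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, ()))
    return seen


def d_separated(dag, xs, ys, zs):
    """True if every node in xs is d-separated from every node in ys given zs."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    # Step 1: restrict to the ancestral set of all variables involved.
    keep = ancestors(dag, xs | ys | zs)
    # Step 2: moralise (link each child to its parents, and co-parents to each other).
    adj = {node: set() for node in keep}
    for child in keep:
        parents = list(dag.get(child, ()))
        for p in parents:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(parents, 2):
            adj[p].add(q)
            adj[q].add(p)
    # Step 3: delete the conditioning set and look for any remaining path.
    stack = [x for x in xs if x not in zs]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in ys:
            return False
        stack.extend(adj[node] - zs - seen)
    return True


# Hypothetical DAG 1: missingness of outcome Y depends only on complete confounder C.
mar_dag = {"X": {"C"}, "Y": {"X", "C"}, "R_Y": {"C"}}
# Hypothetical DAG 2: missingness of Y depends on Y itself.
mnar_dag = {"X": {"C"}, "Y": {"X", "C"}, "R_Y": {"Y"}}

complete = {"X", "C"}  # assumed fully observed variables
print(d_separated(mar_dag, {"Y"}, {"R_Y"}, complete))
print(d_separated(mnar_dag, {"Y"}, {"R_Y"}, complete))
```

In the first hypothetical DAG the incomplete outcome is d-separated from its response indicator given the complete variables, so the stated condition holds; in the second, the direct arrow from `Y` to `R_Y` means it does not. With several incomplete variables, the same function could be called with larger node sets, although the paper's algorithm may proceed differently.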