Borrowing Information from an Unidentifiable Model: Guaranteed Efficiency Gain with a Dichotomized Outcome in the External Data

Wang, Lu; Ma, Yanyuan; Zhao, Jiwei

Abstract: In the era of big data, the increasing availability of diverse data sources has driven interest in analytical approaches that integrate information across sources to enhance statistical accuracy, efficiency, and scientific insights. Many existing methods assume exchangeability among data sources and often implicitly require that sources measure identical covariates or outcomes, or that the error distribution is correctly specified, assumptions that may not hold in complex real-world scenarios. This paper explores the integration of data from sources with distinct outcome scales, focusing on leveraging external data to improve statistical efficiency. Specifically, we consider a scenario where the primary dataset includes a continuous outcome, and external data provides a dichotomized version of the same outcome. We propose two novel estimators: the first estimator remains asymptotically consistent even when the error distribution is potentially misspecified, while the second estimator guarantees an efficiency gain over weighted least squares estimation that uses the primary study data alone. Theoretical properties of these estimators are rigorously derived, and extensive simulation studies are conducted to highlight their robustness and efficiency gains across various scenarios. Finally, a real-world application using the NHANES dataset demonstrates the practical utility of the proposed methods.
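The data setting described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the authors' estimator: the linear model, the normal errors, the constant error variance, the known cutoff, and the names `simulate_sources`, `beta`, `sigma`, and `cutoff` are all assumptions introduced here. It generates a primary sample with a continuous outcome and an external sample that reports only whether the outcome exceeds the cutoff, then fits the primary-data-only least squares baseline.

```python
import numpy as np


def simulate_sources(n_primary=500, n_external=5000, beta=(1.0, 2.0),
                     sigma=1.0, cutoff=1.0, seed=0):
    """Simulate a primary sample (continuous Y) and an external sample (binary D).

    Hypothetical toy model, assumed for illustration:
        Y = beta[0] + beta[1] * X + sigma * eps,  eps ~ N(0, 1)
        D = 1{Y > cutoff}  (the only outcome the external source releases)
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)

    def draw(n):
        # Design matrix with an intercept column and one standard normal covariate.
        x = np.column_stack([np.ones(n), rng.normal(size=n)])
        y = x @ beta + sigma * rng.normal(size=n)
        return x, y

    x_p, y_p = draw(n_primary)
    x_e, y_e = draw(n_external)
    d_e = (y_e > cutoff).astype(int)  # continuous y_e is discarded
    return (x_p, y_p), (x_e, d_e)


(x_p, y_p), (x_e, d_e) = simulate_sources()

# Baseline using the primary data alone. With constant error variance,
# ordinary least squares coincides with weighted least squares.
beta_ols, *_ = np.linalg.lstsq(x_p, y_p, rcond=None)
print("primary-only estimate:", beta_ols)
print("share of external outcomes above cutoff:", d_e.mean())
```

In this toy model the external indicator depends on the parameters only through `(x @ beta - cutoff) / sigma`, so the binary data alone cannot separate `beta` from `sigma`. That is one plausible reading of "unidentifiable" in the title; the paper's exact model and assumptions may differ.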

Subjects:	Methodology (stat.ME)
MSC classes:	62F35 (Primary)
Cite as:	arXiv:2501.06360 [stat.ME]
	(or arXiv:2501.06360v1 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.2501.06360
