Semantic and Cognitive Tools to Aid Statistical Science: Replace Confidence and Significance by Compatibility and Surprise

Rafi, Zad; Greenland, Sander

doi:10.1186/s12874-020-01105-9

arXiv:1909.08579 (stat)

[Submitted on 18 Sep 2019 (v1), last revised 1 Oct 2020 (this version, v7)]

Title: Semantic and Cognitive Tools to Aid Statistical Science: Replace Confidence and Significance by Compatibility and Surprise

Authors: Zad Rafi, Sander Greenland

Abstract: Researchers often misinterpret and misrepresent statistical outputs. This abuse has led to a large literature on modification or replacement of testing thresholds and $P$-values with confidence intervals, Bayes factors, and other devices. Because the core problems appear cognitive rather than statistical, we review simple aids to statistical interpretations. These aids emphasize logical and information concepts over probability, and thus may be more robust to common misinterpretations than are traditional descriptions. We use the Shannon transform of the $P$-value $p$, also known as the binary surprisal or $S$-value $s=-\log_{2}(p)$, to measure the information supplied by the testing procedure, and to help calibrate intuitions against simple physical experiments like coin tossing. We also use tables or graphs of test statistics for alternative hypotheses, and interval estimates for different percentile levels, to thwart fallacies arising from arbitrary dichotomies. Finally, we reinterpret $P$-values and interval estimates in unconditional terms, which describe compatibility of data with the entire set of analysis assumptions. We illustrate these methods with a reanalysis of data from an existing record-based cohort study. In line with other recent recommendations, we advise that teaching materials and research reports discuss $P$-values as measures of compatibility rather than significance, compute $P$-values for alternative hypotheses whenever they are computed for null hypotheses, and interpret interval estimates as showing values of high compatibility with data, rather than regions of confidence. Our recommendations emphasize cognitive devices for displaying the compatibility of the observed data with various hypotheses of interest, rather than focusing on single hypothesis tests or interval estimates. We believe these simple reforms are well worth the minor effort they require.
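The $S$-value transform named in the abstract, $s=-\log_{2}(p)$, can be illustrated with a minimal Python sketch. The function names `s_value` and `p_value_for` are hypothetical, as are the estimate and standard error used in the usage lines. The $P$-value calculation assumes a simple two-sided normal (Wald) approximation, chosen here only for illustration and not taken from the paper's reanalysis.

```python
import math


def s_value(p: float) -> float:
    """Binary surprisal (S-value) in bits for a P-value p: s = -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


def p_value_for(hypothesis: float, estimate: float, se: float) -> float:
    """Two-sided P-value for a hypothesized parameter value.

    Assumption: the estimate is approximately normal with standard
    error `se` (a Wald-type approximation used only for illustration).
    """
    if se <= 0.0:
        raise ValueError("se must be positive")
    z = (estimate - hypothesis) / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Calibrate intuition: s bits is about as surprising as seeing
    # s heads in a row from a coin assumed to be fair.
    for p in (0.5, 0.25, 0.05, 0.005):
        print(f"p = {p:<6} s = {s_value(p):.2f} bits")

    # Hypothetical estimate and standard error (not from the paper):
    # report P- and S-values for several hypotheses, not only the null.
    estimate, se = 0.5, 0.3
    for h in (0.0, 0.25, 0.5, 1.0):
        p = p_value_for(h, estimate, se)
        print(f"H = {h:<5} p = {p:.3f}  s = {s_value(p):.2f} bits")
```

Reading the output on the bit scale, $p = 0.05$ corresponds to roughly 4.3 bits of information against the tested hypothesis (given the analysis assumptions), about as surprising as four heads in a row from a fair coin. Tabulating $P$- and $S$-values over a range of hypothesized values, as in the second loop, is one way to avoid fixing attention on a single null test.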

Comments:	22 pages; 5 figures; 2 tables; 94 references; Published at BMC Medical Research Methodology
Subjects:	Methodology (stat.ME); Quantitative Methods (q-bio.QM); Applications (stat.AP)
Cite as:	arXiv:1909.08579 [stat.ME]
	(or arXiv:1909.08579v7 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1909.08579
Journal reference:	BMC Med Res Methodol 20, 244 (2020)
Related DOI:	https://doi.org/10.1186/s12874-020-01105-9

Submission history

From: Zad Rafi
[v1] Wed, 18 Sep 2019 17:09:43 UTC (1,498 KB)
[v2] Thu, 19 Sep 2019 02:49:12 UTC (1,499 KB)
[v3] Sun, 22 Sep 2019 02:18:45 UTC (1,499 KB)
[v4] Fri, 19 Jun 2020 00:27:15 UTC (1,234 KB)
[v5] Wed, 8 Jul 2020 01:55:45 UTC (1,252 KB)
[v6] Tue, 29 Sep 2020 18:28:15 UTC (1,251 KB)
[v7] Thu, 1 Oct 2020 01:43:23 UTC (1,251 KB)
