Modeling and visualizing two-way contingency tables using compositional data analysis: A case-study on individual self-prediction of migraine days.

Author information

Vives-Mestres Marina, Casanova Amparo

Affiliations

Department of Computer Science, Applied Mathematics and Statistics, Universitat de Girona, Girona, Spain.

Clinical Statistics, Curelator Inc., Cambridge, Massachusetts, USA.

Publication information

Stat Med. 2021 Jan 30;40(2):213-225. doi: 10.1002/sim.8769. Epub 2020 Oct 28.

DOI:10.1002/sim.8769

PMID:33113589

Abstract

Two-way contingency tables arise in many fields, such as in medical studies, where the relation between two discrete random variables or responses is to be assessed. We propose to analyze and visualize a sample of 2 × 2 tables in the context of single-subject repeated measurements design by means of compositional data (CoDa) methods. First, we propose to visualize the tables in a quaternary diagram. Second, we show how to represent these tables by means of logratios indicating the relationship between the two variables as well as their strength and direction of dependency. Finally, we describe a technique to model those tables with a simplicial regression model. Data from a real-world study of self-prediction of migraine attack onset is used to illustrate this methodology. For each individual, the 2 × 2 table of their migraine expectation vs next day migraine occurrence is computed, generating a sample of tables. Then we visualize and interpret the prediction ability of individuals both in the simplex and in terms of logratios of components. Finally, we model the self-prediction ability with respect to demographic variables, days tracked and disease characteristics. Our application demonstrates that CoDa can be a useful tool for visualizing, modeling, and interpreting the components of 2 × 2 tables.
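As a rough illustration of the idea of treating a 2 × 2 table as a four-part composition and summarizing it with logratios, the following Python sketch closes a table of counts to proportions, computes its centered logratio (clr) coordinates, and computes the log odds ratio as one logratio whose sign and magnitude reflect the direction and strength of dependency. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify which logratio coordinates the paper uses, the function names are hypothetical, the 0.5 pseudo-count for zero cells is an assumed convention, and the example counts are invented for illustration only.

```python
import math


def table_to_composition(n11, n10, n01, n00, pseudo=0.5):
    """Close a 2x2 table of counts to a 4-part composition (parts sum to 1).

    Cell order (assumed for this sketch): expected & occurred, expected & not
    occurred, not expected & occurred, not expected & not occurred.
    `pseudo` is an assumed zero-replacement constant, not taken from the paper.
    """
    cells = [n11 + pseudo, n10 + pseudo, n01 + pseudo, n00 + pseudo]
    total = sum(cells)
    return [c / total for c in cells]


def clr(comp):
    """Centered logratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in comp]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def log_odds_ratio(comp):
    """Logratio contrasting the diagonal cells with the off-diagonal cells.

    Positive values indicate positive association, zero indicates
    independence, negative values indicate negative association.
    """
    p11, p10, p01, p00 = comp
    return math.log((p11 * p00) / (p10 * p01))


# Invented counts for one hypothetical individual (illustration only).
example_comp = table_to_composition(12, 8, 5, 35)
example_clr = clr(example_comp)
example_lor = log_odds_ratio(example_comp)

# A table with equal cells shows no association, so its log odds ratio is 0.
independent_lor = log_odds_ratio(table_to_composition(10, 10, 10, 10))
```

Applying these functions to each individual's table would yield one composition per individual, which is the kind of sample of tables the abstract describes; the regression step is not sketched here because the abstract gives no details of the model.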
