Maximally selected chi-square statistics for ordinal variables.

Author information

Boulesteix Anne-Laure

Affiliation

Department of Statistics, University of Munich, Akademiestrasse 1, D-80799 Munich, Germany.

Publication information

Biom J. 2006 Jun;48(3):451-62. doi: 10.1002/bimj.200510161.

Abstract

The association between a binary variable Y and a variable X measured on an at least ordinal scale can be examined by selecting a cutpoint in the range of X and then performing an association test for the resulting 2 × 2 contingency table using the chi-square statistic. Under the null hypothesis of no association between X and Y, the distribution of the maximally selected chi-square statistic (i.e., the maximal chi-square statistic over all possible cutpoints) differs from the known chi-square distribution. In recent decades, this topic has been studied extensively for continuous X variables, but not for non-continuous variables of at least ordinal measurement scale (which include, e.g., classical ordinal or discretized continuous variables). In this paper, we suggest an exact method to determine the finite-sample distribution of maximally selected chi-square statistics in this context. This novel approach can be seen as a method to measure the association between a binary variable and variables of different types having an at least ordinal scale (ordinal, discretized continuous, etc.). As an illustration, the method is applied to a new data set describing pregnancy and birth for 811 babies.
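
To make the statistic described in the abstract concrete, the following is a minimal sketch in Python. It computes the chi-square statistic for every admissible cutpoint of X and returns the maximum. The function names (`chi_square_2x2`, `max_selected_chi_square`, `permutation_p_value`) are hypothetical and chosen for illustration only. The p-value here comes from a Monte Carlo permutation approximation, which is a swapped-in technique and not the exact finite-sample method proposed in the paper; it is included only to show why the maximum cannot be referred to the ordinary chi-square distribution.

```python
import random
from typing import Sequence


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def max_selected_chi_square(x: Sequence[float], y: Sequence[int]) -> float:
    """Maximum chi-square over all cutpoints, splitting X into X <= cut and X > cut."""
    levels = sorted(set(x))
    best = 0.0
    # The largest level is skipped: it would leave the upper group empty.
    for cut in levels[:-1]:
        a = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 1)
        b = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 0)
        c = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 1)
        d = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 0)
        best = max(best, chi_square_2x2(a, b, c, d))
    return best


def permutation_p_value(
    x: Sequence[float], y: Sequence[int], n_perm: int = 2000, seed: int = 0
) -> float:
    """Monte Carlo permutation p-value (an approximation, NOT the paper's exact method)."""
    rng = random.Random(seed)
    observed = max_selected_chi_square(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # shuffling Y breaks any association with X
        if max_selected_chi_square(x, y_perm) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy ordinal data, invented for illustration (not the pregnancy data set).
    x = [1, 1, 2, 2, 3, 3, 4, 4]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(max_selected_chi_square(x, y))
    print(permutation_p_value(x, y))
```

On the toy data above, the cutpoint at X = 2 separates the two outcome groups perfectly, so the maximally selected statistic equals 8.0. Because the maximum is taken over several correlated tests, comparing it with a chi-square distribution with one degree of freedom would overstate significance, which is the problem the exact distribution in the paper addresses.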
