Department of Pathology and Laboratory Medicine, University of California, Irvine, CA 92697, USA.
Anticancer Agents Med Chem. 2013 Feb;13(2):203-11. doi: 10.2174/1871520611313020004.
In case-control profiling studies, increasing the sample size does not always improve statistical power, because the variance may also increase if the samples are highly heterogeneous. For instance, tumor samples used for gene expression assays are often heterogeneous in tissue composition, in mechanism of progression, or in both; however, such variation is rarely taken into account in expression profile analysis. We use a prostate cancer prognosis study as an example to demonstrate that solely recruiting more patient samples may not increase power for biomarker detection at all. In response to the heterogeneity due to mixed tissue, we developed a sample selection strategy termed Stepwise Enrichment, in which samples are systematically culled based on tumor content and analyzed with a t-test to determine an optimal threshold for tissue percentage. The selected tissue-percentage threshold identified the most significant data by balancing sample size against sample homogeneity; as a result, power was substantially increased for identifying prognostic biomarkers in prostate tumor epithelial cells as well as in prostate stromal cells. This strategy can be applied generally to profiling studies in which the level of sample heterogeneity can be measured or estimated.
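The threshold search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`welch_t`, `stepwise_enrichment`), the choice of Welch's unequal-variance form of the t-test, the rule of picking the threshold with the largest absolute t statistic, the `min_per_group` guard, and the toy data are all assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (None if undefined)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return None if se == 0 else (mean(a) - mean(b)) / se


def stepwise_enrichment(samples, thresholds, min_per_group=3):
    """Scan tumor-content thresholds and score each retained subset.

    samples: iterable of (tumor_pct, is_case, expression) tuples
             (hypothetical layout, one gene at a time).
    Returns (best, table), where each table row is
    (threshold, n_retained, t_statistic) and best is the row with the
    largest |t| (an assumed selection rule), or None if no row qualifies.
    """
    min_per_group = max(min_per_group, 2)  # sample variance needs >= 2 values
    table = []
    for thr in thresholds:
        # Cull samples whose tumor content falls below the threshold.
        kept = [s for s in samples if s[0] >= thr]
        cases = [expr for _, is_case, expr in kept if is_case]
        controls = [expr for _, is_case, expr in kept if not is_case]
        if len(cases) < min_per_group or len(controls) < min_per_group:
            continue  # too few samples left for a meaningful test
        t = welch_t(cases, controls)
        if t is not None:
            table.append((thr, len(kept), t))
    best = max(table, key=lambda row: abs(row[2])) if table else None
    return best, table


# Made-up toy data: high-tumor samples separate cleanly, low-tumor ones are noisy.
toy_samples = [
    (80, True, 8.0), (70, True, 8.2), (90, True, 7.9), (60, True, 8.1),
    (85, False, 6.0), (75, False, 6.1), (65, False, 5.9), (95, False, 6.2),
    (10, True, 6.0), (20, True, 9.5), (40, True, 5.5),
    (15, False, 8.0), (25, False, 5.0), (45, False, 8.5),
]
best, table = stepwise_enrichment(toy_samples, thresholds=[0, 30, 60])
```

In this toy case the strictest threshold retains the fewest samples yet gives the largest absolute t statistic, which mirrors the trade-off between sample size and homogeneity that the abstract describes. A real analysis would repeat the scan across many genes and account for multiple testing, which this sketch does not attempt.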