Lee Joowon, Lee Seungyeoun, Jang Jin-Young, Park Taesung
Department of Statistics, Seoul National University, Seoul, South Korea.
Department of Applied Statistics, Sejong University, Seoul, South Korea.
BMC Med Genomics. 2018 Apr 20;11(Suppl 2):30. doi: 10.1186/s12920-018-0344-z.
Recent statistical methods for next-generation sequencing (NGS) data have been successfully applied to identifying rare genetic variants associated with certain diseases. However, the most commonly used methods (e.g., burden tests and variance-component tests) rely on large sample sizes. Because of its still-high cost, NGS data are generally restricted to small sample sizes, which most existing methods cannot analyze.
In this work, we propose a new exact association test for sequencing data that does not require a large-sample approximation and is applicable to both common and rare variants. Our method, based on the generalized Cochran-Mantel-Haenszel (GCMH) statistic, was applied to NGS datasets from patients with intraductal papillary mucinous neoplasm (IPMN). IPMN is a unique pancreatic cancer subtype that can turn into an invasive and hard-to-treat metastatic disease.
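To make the idea of an exact, stratified test concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary case/control status and binary carrier status at each variant site, treats each site in a gene as one stratum of a Cochran-Mantel-Haenszel-type statistic, and obtains the p-value by enumerating every possible assignment of case labels instead of using a chi-square approximation. The function names, the data layout, and the choice of full enumeration are all assumptions made for illustration; the published GCMH test may differ in its statistic and in how the exact distribution is computed.

```python
from itertools import combinations


def cmh_statistic(genotypes, case_idx):
    """CMH-type statistic with one stratum per variant site (illustrative).

    genotypes: list of rows, one per sample; each row holds 0/1 carrier
               indicators, one per variant site.
    case_idx:  indices of the samples labelled as cases.
    """
    n = len(genotypes)
    n_case = len(case_idx)
    num, var = 0.0, 0.0
    for j in range(len(genotypes[0])):
        carriers = sum(row[j] for row in genotypes)       # column margin
        a = sum(genotypes[i][j] for i in case_idx)        # case carriers
        num += a - n_case * carriers / n                  # observed - expected
        # Hypergeometric variance of the case-carrier cell in this stratum.
        var += (n_case * (n - n_case) * carriers * (n - carriers)
                / (n * n * (n - 1)))
    return num * num / var if var > 0 else 0.0


def exact_p_value(genotypes, is_case):
    """Exact p-value by enumerating all relabellings of case status.

    Feasible only for small samples: the number of relabellings is
    C(n, number of cases).
    """
    n = len(is_case)
    observed_cases = [i for i in range(n) if is_case[i]]
    observed = cmh_statistic(genotypes, observed_cases)
    extreme = total = 0
    for subset in combinations(range(n), len(observed_cases)):
        total += 1
        # Small tolerance so that ties with the observed value are counted.
        if cmh_statistic(genotypes, subset) >= observed - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical toy data: 8 samples (4 cases, 4 controls), 3 variant sites.
toy_genotypes = [
    [1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1],   # cases
    [0, 0, 0], [0, 0, 0], [0, 0, 1], [0, 0, 0],   # controls
]
toy_is_case = [1, 1, 1, 1, 0, 0, 0, 0]
statistic, p_value = exact_p_value(toy_genotypes, toy_is_case)
print(f"statistic = {statistic:.3f}, exact p = {p_value:.4f}")
```

Because the reference distribution is built from the data's own relabellings, the p-value stays valid however small the sample is, which is the property the abstract emphasizes. The cost grows combinatorially with sample size, so a sketch like this is only practical for the small cohorts the abstract has in mind.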
Application of our method to IPMN data successfully identified susceptibility genes associated with progression of IPMN to pancreatic cancer.
Our method is expected to identify disease-associated genetic variants, and the corresponding signaling pathways, more successfully, improving our understanding of the etiology and prognosis of specific diseases.