Division of Biostatistics and Health Data Science, School of Public Health, University of Minnesota, Minneapolis, MN 55414, United States.
Technology and Operations Management, Harvard Business School, Harvard University, Boston, MA 02163, United States.
Bioinformatics. 2024 Jan 2;40(1). doi: 10.1093/bioinformatics/btad777.
Multi-trait analysis has been shown to have greater statistical power than single-trait analysis. However, most existing multi-trait methods work only with a limited number of traits and usually prioritize high statistical power over identifying the relevant traits, leaving trait selection heavily reliant on domain knowledge.
To handle diseases and traits with obscure etiology, we developed TraitScan, a powerful and fast algorithm that identifies potential pleiotropic traits from a moderate or large number of traits (e.g. dozens to thousands) and tests the association between one genetic variant and the selected traits. TraitScan can handle either individual-level or summary-level GWAS data. We evaluated TraitScan using extensive simulations and found that it outperformed existing methods in terms of both testing power and trait selection when sparsity was low or modest. We then applied it to search for traits associated with Ewing sarcoma, a rare bone tumor with peak onset in adolescence, among 754 traits in UK Biobank. Our analysis revealed a few promising traits worthy of further investigation, highlighting the utility of TraitScan for more effective multi-trait analysis as biobanks emerge. We also extended TraitScan to search for and test associations with a polygenic risk score and genetically imputed gene expression.
Our algorithm is implemented in an R package "TraitScan" available at https://github.com/RuiCao34/TraitScan.
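To make the select-then-test idea above concrete, the following is a minimal Python sketch of a generic scan over the summary-level z-scores of one variant across many traits. It is not the TraitScan algorithm and does not use the R package's API. The function names `topk_cumsums` and `scan_select_and_test` are hypothetical, the traits are assumed to be independent (correlated traits would first need to be decorrelated), and the null distribution is approximated by Monte Carlo, with the same null draws reused for standardization and calibration for simplicity.

```python
import numpy as np


def topk_cumsums(z):
    """Cumulative sums of squared z-scores, sorted in decreasing order along the last axis."""
    sq = np.sort(np.square(z), axis=-1)[..., ::-1]
    return np.cumsum(sq, axis=-1)


def scan_select_and_test(z, n_null=2000, seed=0):
    """Toy select-then-test scan (illustrative assumption, not TraitScan itself).

    z : 1-D array of z-scores for one variant, one entry per trait,
        assumed independent N(0, 1) under the null.
    Returns the sorted indices of the selected traits and a Monte Carlo p-value.
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    p = z.size

    # Simulate the null: independent standard normal z-scores for p traits.
    null = topk_cumsums(rng.standard_normal((n_null, p)))
    mu, sd = null.mean(axis=0), null.std(axis=0)

    # Standardize the top-k sums so that different subset sizes k are comparable.
    obs_std = (topk_cumsums(z) - mu) / sd
    null_std = (null - mu) / sd

    # Select the subset size whose standardized statistic is largest.
    k_best = int(np.argmax(obs_std)) + 1
    stat = obs_std[k_best - 1]

    # Calibrate against the null distribution of the maximum over all k.
    null_max = null_std.max(axis=1)
    p_value = (1 + np.sum(null_max >= stat)) / (1 + n_null)

    selected = np.argsort(-np.abs(z))[:k_best]
    return np.sort(selected), float(p_value)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    z = rng.standard_normal(50)          # 50 traits, mostly null
    z[[3, 10, 27]] = [6.0, -6.0, 6.0]    # three simulated associated traits
    selected, p_value = scan_select_and_test(z)
    print("selected traits:", selected, "p-value:", p_value)
```

Taking the maximum over subset sizes is what lets a single statistic both choose the traits and test the association. Calibrating against the null distribution of that maximum accounts for the selection step.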