Rare variants analysis using penalization methods for whole genome sequence data.

Authors

Yazdani Akram, Yazdani Azam, Boerwinkle Eric

Affiliations

Human Genetics Center, University of Texas Health Science Center at Houston, TX, USA.

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.

Publication information

BMC Bioinformatics. 2015 Dec 4;16:405. doi: 10.1186/s12859-015-0825-4.

DOI:10.1186/s12859-015-0825-4
PMID:26637205
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4670502/
Abstract

BACKGROUND

The availability of affordable and accessible whole genome sequencing for biomedical applications poses a number of statistical challenges and opportunities, particularly related to the analysis of rare variants and the sparseness of the data. Although efforts have been devoted to addressing these challenges, the performance of statistical methods for rare variants analysis still needs further consideration.

RESULTS

We introduce a new approach that applies restricted principal component analysis with convex penalization and then selects the best predictors of a phenotype by a concave penalized regression model, while estimating the impact of each genomic region on the phenotype. Using simulated data, we show that the proposed method maintains good power for association testing while keeping the false discovery rate low under a variety of genetic architectures. Illustrative data analyses reveal encouraging results for this method in comparison with other commonly applied methods for rare variants analysis.
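The two-stage idea described above can be illustrated with a minimal sketch. The abstract does not specify the exact penalties or the form of the "restricted" PCA, so everything here is an assumption made for illustration: an L1-penalized (convex) rank-one PCA summarizes each genomic region into one score, and the minimax concave penalty (MCP) stands in for the concave penalized regression. The function names (`sparse_pc_score`, `mcp_regression`, `simulate_and_select`), the tuning values, and the simulated carrier model are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator, the proximal map of the L1 (convex) penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_pc_score(G, lam=0.1, n_iter=50):
    """First L1-penalized principal component score of one region (n x p genotypes).

    Assumption: a generic alternating rank-one update, not the paper's restricted PCA.
    """
    Gc = G - G.mean(axis=0)
    v = np.linalg.svd(Gc, full_matrices=False)[2][0]  # leading right singular vector
    for _ in range(n_iter):
        u = Gc @ v
        if not u.any():
            break
        u /= np.linalg.norm(u)
        v = soft_threshold(Gc.T @ u, lam)  # shrink small variant loadings to zero
        if not v.any():
            return np.zeros(G.shape[0])
        v /= np.linalg.norm(v)
    return Gc @ v

def mcp_regression(X, y, lam=0.25, gamma=3.0, n_iter=200, tol=1e-8):
    """Coordinate descent for least squares with the MCP (concave) penalty.

    Columns are standardized internally; coefficients are on that scale.
    """
    n, p = X.shape
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0
    Xs = (X - X.mean(axis=0)) / sd
    r = y - y.mean()              # current residual
    beta = np.zeros(p)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            z = Xs[:, j] @ r / n + beta[j]
            if abs(z) <= gamma * lam:
                new = soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
            else:
                new = z           # large effects are left unshrunk
            if new != beta[j]:
                r -= Xs[:, j] * (new - beta[j])
                max_change = max(max_change, abs(new - beta[j]))
                beta[j] = new
        if max_change < tol:
            break
    return beta

def simulate_and_select(n=500, n_regions=20, n_variants=10, seed=0):
    """Toy data: variants within a region are in LD through a shared carrier status."""
    rng = np.random.default_rng(seed)
    scores, carriers = [], []
    for _ in range(n_regions):
        carrier = rng.random(n) < 0.1
        G = (carrier[:, None] & (rng.random((n, n_variants)) < 0.8)).astype(float)
        carriers.append(carrier.astype(float))
        scores.append(sparse_pc_score(G))
    X = np.column_stack(scores)   # one summary score per region
    y = 2.0 * carriers[0] - 2.0 * carriers[1] + rng.normal(size=n)  # regions 0, 1 causal
    beta = mcp_regression(X, y)
    return np.flatnonzero(beta), beta

selected, beta = simulate_and_select()
print("selected regions:", selected)
```

In this sketch the nonzero entries of `beta` play the role of the selected regions, and their magnitudes give a rough per-region effect estimate on the standardized score scale. The sign of a principal component score is arbitrary, so only the magnitude of a coefficient is interpretable here.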

CONCLUSION

By taking into account linkage disequilibrium and sparseness of the data, the proposed method improves power and controls the false discovery rate compared to other commonly applied methods for rare variant analyses.
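The two quantities compared in the abstract, power and false discovery rate, can be estimated from simulation replicates in which the causal regions are known. The sketch below is only an illustration of those definitions: the function names and the toy inputs are hypothetical, and the convention that an empty selection has a false discovery proportion of 0 is an assumption.

```python
def power_and_fdp(selected, causal):
    """Power and false discovery proportion for one simulation replicate."""
    selected, causal = set(selected), set(causal)
    true_pos = len(selected & causal)
    power = true_pos / len(causal) if causal else float("nan")
    # Convention (assumption): no selections means no false discoveries.
    fdp = (len(selected) - true_pos) / len(selected) if selected else 0.0
    return power, fdp

def average_power_and_fdr(replicates, causal):
    """Average over replicates; the mean false discovery proportion estimates the FDR."""
    powers, fdps = zip(*(power_and_fdp(sel, causal) for sel in replicates))
    return sum(powers) / len(powers), sum(fdps) / len(fdps)

# Hypothetical selections from three replicates with causal regions {0, 1}.
replicates = [[0, 1], [0, 5], []]
avg_power, est_fdr = average_power_and_fdr(replicates, causal=[0, 1])
print(avg_power, est_fdr)
```

A method "improves power and controls the false discovery rate" in this sense when its average power is higher than a competitor's while its estimated FDR stays at or below the target level.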

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/448c/4670502/57e2958a86bc/12859_2015_825_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/448c/4670502/16947a9fb459/12859_2015_825_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/448c/4670502/42551c225807/12859_2015_825_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/448c/4670502/5e28e838c111/12859_2015_825_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/448c/4670502/ed41267b4097/12859_2015_825_Fig5_HTML.jpg
