A new tool called DISSECT for analysing large genomic data sets using a Big Data approach.

Authors

Canela-Xandri Oriol, Law Andy, Gray Alan, Woolliams John A, Tenesa Albert

Affiliations

The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush Campus, Edinburgh EH25 9RG, UK.

EPCC, The University of Edinburgh, Edinburgh EH9 3FD, UK.

Publication information

Nat Commun. 2015 Dec 11;6:10162. doi: 10.1038/ncomms10162.

DOI: 10.1038/ncomms10162
PMID: 26657010
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4682108/

Abstract

Large-scale genetic and genomic data are increasingly available and the major bottleneck in their analysis is a lack of sufficiently scalable computational tools. To address this problem in the context of complex traits analysis, we present DISSECT. DISSECT is a new and freely available software that is able to exploit the distributed-memory parallel computational architectures of compute clusters, to perform a wide range of genomic and epidemiologic analyses, which currently can only be carried out on reduced sample sizes or under restricted conditions. We demonstrate the usefulness of our new tool by addressing the challenge of predicting phenotypes from genotype data in human populations using mixed-linear model analysis. We analyse simulated traits from 470,000 individuals genotyped for 590,004 SNPs in ∼4 h using the combined computational power of 8,400 processor cores. We find that prediction accuracies in excess of 80% of the theoretical maximum could be achieved with large sample sizes.
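
To give a sense of why an analysis of this size calls for a distributed-memory architecture, the following back-of-envelope sketch estimates the memory footprint of the dense matrices involved. The sample size, SNP count, and core count come from the abstract. The storage layout (dense matrices of 8-byte floats, split evenly across cores) is an assumption made for illustration and is not a description of how DISSECT actually stores its data.

```python
def dense_matrix_bytes(n_rows: int, n_cols: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to hold a dense n_rows x n_cols matrix."""
    return n_rows * n_cols * bytes_per_entry


# Figures quoted in the abstract.
N_INDIVIDUALS = 470_000
N_SNPS = 590_004
N_CORES = 8_400

# Assumption: an individuals-by-individuals relationship matrix,
# stored densely in double precision (8 bytes per entry).
grm_bytes = dense_matrix_bytes(N_INDIVIDUALS, N_INDIVIDUALS)

# Assumption: the individuals-by-SNPs genotype matrix is also held
# as 8-byte floats (a packed genotype encoding would be far smaller).
geno_bytes = dense_matrix_bytes(N_INDIVIDUALS, N_SNPS)

# Assumption: the relationship matrix is split evenly across all cores.
grm_bytes_per_core = grm_bytes / N_CORES

TB = 10**12  # decimal terabyte
MB = 10**6   # decimal megabyte

print(f"Relationship matrix: {grm_bytes / TB:.2f} TB")            # about 1.77 TB
print(f"Genotype matrix:     {geno_bytes / TB:.2f} TB")           # about 2.22 TB
print(f"Per-core share:      {grm_bytes_per_core / MB:.0f} MB")   # about 210 MB
```

Under these assumptions a single relationship matrix is larger than the memory of a typical single compute node, while the per-core share is modest, which is consistent with the abstract's emphasis on compute clusters.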

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc26/4682108/7115309ce653/ncomms10162-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc26/4682108/2122ee084e2c/ncomms10162-f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc26/4682108/70b5e44a488f/ncomms10162-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc26/4682108/95838cc99d04/ncomms10162-f4.jpg
