Rebooting the human mitochondrial phylogeny: an automated and scalable methodology with expert knowledge.

Affiliation

Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza, Zaragoza, Spain.

Publication information

BMC Bioinformatics. 2011 May 19;12:174. doi: 10.1186/1471-2105-12-174.

DOI:10.1186/1471-2105-12-174

PMID:21595926

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3123235/

Abstract

BACKGROUND

Mitochondrial DNA is an ideal source of information for evolutionary and phylogenetic studies owing to its extraordinary properties and abundance. Such studies yield many insights, including but not limited to screening genetic variation to identify potentially deleterious mutations. However, these advances require efficient solutions to very difficult computational problems, and the task is made harder by the very abundance of data that gives the analysis its strength.

RESULTS

We develop a systematic, automated methodology to overcome these difficulties, building from readily available, public sequence databases to high-quality alignments and phylogenetic trees. Within each stage in an autonomous workflow, outputs are carefully evaluated and outlier detection rules defined to integrate expert knowledge and automated curation, hence avoiding the manual bottleneck found in past approaches to the problem. Using these techniques, we have performed exhaustive updates to the human mitochondrial phylogeny, illustrating the power and computational scalability of our approach, and we have conducted some initial analyses on the resulting phylogenies.
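The abstract does not spell out the outlier detection rules themselves. As a minimal sketch of what one automated curation rule could look like, the following flags sequences whose length deviates strongly from the rest of a dataset, using a modified z-score based on the median absolute deviation. The rule, the threshold of 3.5, the function name, and the sequence identifiers are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import median


def flag_length_outliers(lengths, threshold=3.5):
    """Return the IDs whose sequence length is an outlier.

    `lengths` maps a (hypothetical) sequence ID to its length in bases.
    A length is flagged when its modified z-score, computed from the
    median absolute deviation (MAD), exceeds `threshold`.
    """
    values = list(lengths.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the lengths equal the median, so any
        # length that differs from it is treated as an outlier.
        return sorted(k for k, v in lengths.items() if v != med)
    return sorted(
        k for k, v in lengths.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


# Toy data: five near-complete genomes and one truncated record.
toy_lengths = {
    "seq1": 16569, "seq2": 16568, "seq3": 16570,
    "seq4": 16571, "seq5": 15000, "seq6": 16567,
}
flagged = flag_length_outliers(toy_lengths)
```

In a workflow of the kind described, flagged records might be set aside for expert review rather than discarded, so that the rule encodes expert knowledge without blocking the automated stages.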

CONCLUSIONS

The problem at hand demands careful definition of inputs and adequate algorithmic treatment for its solutions to be realistic and useful. Formal rules can address the former requirement by refining inputs both directly and through the outputs obtained by combining them; these outputs also help to ascertain the performance of the chosen algorithms. Rules can exploit known or inferred properties of datasets to simplify inputs through partitioning, thereby cutting computational costs and making it feasible to work on rapidly growing, otherwise intractable datasets. Although expert guidance may be necessary to assist the learning process, low-risk results can be fully automated and have proved convenient and valuable.
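To illustrate why partitioning cuts computational cost, the sketch below groups records by a known property and compares the number of pairwise comparisons needed with and without the split. Partitioning by a haplogroup label, the record layout, and the use of pairwise comparisons as a cost measure are assumptions chosen for the example; the abstract does not state which properties or cost model the authors use.

```python
from collections import defaultdict


def partition(records, key):
    """Group records into lists keyed by `key(record)`."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)
    return dict(groups)


def pairwise_comparisons(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Toy records: (hypothetical sequence ID, assumed haplogroup label).
records = [
    ("s1", "H"), ("s2", "H"),
    ("s3", "L"), ("s4", "L"), ("s5", "L"),
    ("s6", "M"),
]

groups = partition(records, key=lambda rec: rec[1])

# Cost when all sequences are compared with each other.
full_cost = pairwise_comparisons(len(records))

# Cost when comparisons are made only within each partition.
split_cost = sum(pairwise_comparisons(len(g)) for g in groups.values())
```

With these six toy records, the unpartitioned dataset needs 15 pairwise comparisons, while the three partitions together need 4. The gap widens quickly as datasets grow, which is the sense in which partitioning keeps otherwise intractable inputs workable.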

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e999/3123235/f09783e9990f/1471-2105-12-174-1.jpg
