AST: an automated sequence-sampling method for improving the taxonomic diversity of gene phylogenetic trees.

Affiliations

Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia, Athens, Georgia, United States of America.

Department of Biology, East Carolina University, Greenville, North Carolina, United States of America.

Publication information

PLoS One. 2014 Jun 3;9(6):e98844. doi: 10.1371/journal.pone.0098844. eCollection 2014.

DOI:10.1371/journal.pone.0098844
PMID:24892935
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4044049/
Abstract

A challenge in phylogenetic inference of gene trees is how to sample a large pool of homologous sequences so as to derive a good representative subset. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods cannot handle a large pool of sequences because of their high demand for computing resources; (2) applications analyzing a collection of gene trees prefer trees with fewer operational taxonomic units (OTUs), for instance when detecting horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have applied it to four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, making the tool generally useful for phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.
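The general idea of sampling for taxonomic diversity can be illustrated with a small sketch. This is not the AST algorithm, which the abstract does not describe; it is a simple round-robin heuristic that assumes each sequence carries a lineage tuple and that picking one sequence per taxon in turn is an acceptable proxy for diversity. The function name `sample_by_taxon`, the record layout, and the `rank` parameter are all hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def sample_by_taxon(records, k, rank=0):
    """Pick up to k sequence IDs, spreading picks across taxa.

    records: iterable of (seq_id, lineage), where lineage is a tuple of
             taxon names from broad to narrow (hypothetical layout).
    rank:    index into the lineage used to group sequences.

    Illustrative heuristic only; NOT the published AST method.
    """
    if k <= 0:
        return []

    # Group sequence IDs by the taxon found at the chosen rank.
    groups = defaultdict(list)
    for seq_id, lineage in records:
        groups[lineage[rank]].append(seq_id)

    picked = []
    # Each round takes one sequence from every taxon (sorted for
    # determinism), so over-represented taxa cannot crowd out rare ones.
    for one_round in zip_longest(*(groups[t] for t in sorted(groups))):
        for seq_id in one_round:
            if seq_id is None:  # this taxon is already exhausted
                continue
            picked.append(seq_id)
            if len(picked) == k:
                return picked
    return picked


records = [
    ("e1", ("Bacteria", "Proteobacteria")),
    ("e2", ("Bacteria", "Proteobacteria")),
    ("e3", ("Bacteria", "Proteobacteria")),
    ("b1", ("Bacteria", "Firmicutes")),
    ("a1", ("Archaea", "Euryarchaeota")),
]
subset = sample_by_taxon(records, k=3, rank=1)
```

With the toy input above, three of the five sequences come from one phylum, yet a sample of size three draws one sequence from each phylum, which mirrors the bias described in point (3) of the abstract.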
