Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms.

Author information

Speiser Daniel I, Pankey M Sabrina, Zaharoff Alexander K, Battelle Barbara A, Bracken-Grissom Heather D, Breinholt Jesse W, Bybee Seth M, Cronin Thomas W, Garm Anders, Lindgren Annie R, Patel Nipam H, Porter Megan L, Protas Meredith E, Rivera Ajna S, Serb Jeanne M, Zigler Kirk S, Crandall Keith A, Oakley Todd H

Affiliations

Department of Ecology, Evolution, and Marine Biology, University of California Santa Barbara, Santa Barbara, CA, USA.

Department of Biological Sciences, University of South Carolina, Columbia, SC, USA.

Publication information

BMC Bioinformatics. 2014 Nov 19;15(1):350. doi: 10.1186/s12859-014-0350-x.

Abstract

BACKGROUND

Tools for high-throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet it can be difficult to assign identities to gene sequences, especially those from non-model organisms. Phylogenetic analysis is a useful method for assigning identities to these sequences, but it tends to be time-consuming because trees must be re-calculated for every gene of interest and each time a new data set is analyzed. In response, we used existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families.

RESULTS

We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes onto our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository (http://bitbucket.org/osiris_phylogenetics/pia/) and we demonstrate PIA on a publicly-accessible web server (http://galaxy-dev.cnsi.ucsb.edu/pia/).
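
The placement step described above can be illustrated with a minimal sketch that assembles a RAxML command line for the Evolutionary Placement Algorithm (the `-f v` mode). This is not the authors' Galaxy implementation. The file names, the run name, the helper `build_epa_command`, and the substitution model `PROTGAMMAWAG` are all assumptions chosen for illustration; the abstract does not state which model or file layout PIA uses.

```python
import shlex
from pathlib import Path


def build_epa_command(alignment, reference_tree, run_name,
                      model="PROTGAMMAWAG", raxml_binary="raxmlHPC"):
    """Return the argument list for a RAxML EPA placement run.

    alignment      -- alignment holding the reference sequences plus the
                      query sequences to be placed (hypothetical file)
    reference_tree -- pre-calculated Newick gene tree containing only the
                      reference sequences (hypothetical file)
    model          -- substitution model; PROTGAMMAWAG is an assumption,
                      not a value taken from the paper
    """
    return [
        raxml_binary,
        "-f", "v",                    # EPA: place queries on a fixed tree
        "-s", str(Path(alignment)),
        "-t", str(Path(reference_tree)),
        "-m", model,
        "-n", run_name,               # suffix RAxML appends to output files
    ]


# Hypothetical inputs for one gene family; nothing is executed here.
cmd = build_epa_command("opsin_with_queries.phy",
                        "opsin_reference.tre",
                        "opsin_pia")
print(shlex.join(cmd))
```

Because the reference tree is held fixed, only the positions of the query sequences are evaluated, which is what makes this approach cheaper than re-estimating a full gene tree for every new transcriptome. To actually run the placement, the list could be passed to `subprocess.run(cmd, check=True)` on a machine with RAxML installed.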

CONCLUSIONS

Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high-throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e7d9/4255452/6c1d7a2b3bb5/12859_2014_350_Fig1_HTML.jpg
