Family-Specific Gains and Losses of Protein Domains in the Legume and Grass Plant Families.

Authors

Yadav Akshay, Fernández-Baca David, Cannon Steven B

Affiliations

Bioinformatics and Computational Biology Graduate Program, Iowa State University, Ames, IA, USA.

Department of Computer Science, Iowa State University, Ames, IA, USA.

Publication information

Evol Bioinform Online. 2020 Jul 9;16:1176934320939943. doi: 10.1177/1176934320939943. eCollection 2020.

Abstract

Protein domains can be regarded as sections of protein sequences capable of folding independently and performing specific functions. In addition to amino acid-level changes, protein sequences can also evolve through domain shuffling events such as domain insertion, deletion, or duplication. The evolution of protein domains can be studied by tracking domain changes in a selected set of species with known phylogenetic relationships. Here, we conduct such an analysis by defining domains as "features" or "descriptors," and considering the species (target + outgroup) as instances or data points in a data matrix. We then look for features (domains) that differ significantly between the target species and the outgroup species. We study the domain changes in 2 large, distinct groups of plant species: legumes (Fabaceae) and grasses (Poaceae), with respect to selected outgroup species. We evaluate 4 types of domain feature matrices: domain content, domain duplication, domain abundance, and domain versatility. The 4 types of domain feature matrices attempt to capture different aspects of domain changes through which protein sequences may evolve: gain or loss of domains, increase or decrease in the copy number of domains along the sequences, expansion or contraction of domains, or changes in the number of adjacent domain partners. All the feature matrices were analyzed using feature selection techniques and statistical tests to select protein domains that have significantly different feature values in legumes and grasses. We report the biological functions of the top selected domains from the analysis of all the feature matrices. In addition, we perform domain-centric gene ontology (dcGO) enrichment analysis on all selected domains from all 4 feature matrices to study the gene ontology terms associated with the significantly evolving domains in legumes and grasses.
Domain content analysis revealed a striking loss of protein domains from the Fanconi anemia (FA) pathway, the pathway responsible for the repair of interstrand DNA crosslinks. The abundance analysis of domains found in legumes revealed an increase in glutathione synthase, the enzyme that produces glutathione, an antioxidant required for nitrogen fixation, and a decrease in xanthine-oxidizing enzymes, a phenomenon confirmed by previous studies. In grasses, the abundance analysis showed increases in domains related to gene silencing, which could be due to polyploidy or to an enhanced response to viral infection. We provide a Docker container that can be used to perform this analysis workflow on any user-defined sets of species, available at https://cloud.docker.com/u/akshayayadav/repository/docker/akshayayadav/protein-domain-evolution-project.
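
The data-matrix framing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the toy proteomes, the function names, and the exact definition of each feature (abundance as the number of proteins containing a domain, duplication as the mean copy number per containing protein, versatility as the number of distinct adjacent partners) are assumptions made for illustration. The abstract does not name the statistical tests used, so Fisher's exact test on domain presence is swapped in here as one plausible choice.

```python
from collections import defaultdict
from math import comb

# Hypothetical toy input: species -> proteins, each protein being the
# ordered list of domain identifiers found along its sequence.
proteomes = {
    "legume_A": [["D1", "D2"], ["D2", "D2", "D3"]],
    "legume_B": [["D1", "D2"], ["D2", "D3"]],
    "outgroup_A": [["D1", "D4"], ["D4"]],
    "outgroup_B": [["D1"], ["D4", "D1"]],
}
targets = {"legume_A", "legume_B"}  # remaining species form the outgroup


def domain_features(proteins):
    """Compute the four per-domain features for one species (assumed definitions)."""
    content, abundance = {}, defaultdict(int)
    copies, partners = defaultdict(list), defaultdict(set)
    for protein in proteins:
        for dom in set(protein):
            content[dom] = 1                          # presence/absence
            abundance[dom] += 1                       # proteins containing the domain
            copies[dom].append(protein.count(dom))    # copies within this protein
        for left, right in zip(protein, protein[1:]):
            if left != right:                         # distinct adjacent partners only
                partners[left].add(right)
                partners[right].add(left)
    duplication = {d: sum(c) / len(c) for d, c in copies.items()}
    versatility = {d: len(partners[d]) for d in content}
    return {"content": content, "abundance": dict(abundance),
            "duplication": duplication, "versatility": versatility}


features = {sp: domain_features(p) for sp, p in proteomes.items()}
domains = sorted({d for f in features.values() for d in f["content"]})


def matrix(kind):
    """Species x domain matrix for one feature type (rows keyed by species)."""
    return {sp: [features[sp][kind].get(d, 0) for d in domains] for sp in proteomes}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def content_p_value(domain):
    """Test whether domain presence differs between target and outgroup species."""
    outgroup = [sp for sp in proteomes if sp not in targets]
    a = sum(features[sp]["content"].get(domain, 0) for sp in targets)
    c = sum(features[sp]["content"].get(domain, 0) for sp in outgroup)
    return fisher_exact_two_sided(a, len(targets) - a, c, len(outgroup) - c)


if __name__ == "__main__":
    for dom in domains:
        print(dom, round(content_p_value(dom), 3))
```

With only two species per group, as in this toy input, no p-value can fall below 1/3, so a meaningful analysis needs many more species per group. The numeric matrices (abundance, duplication, versatility) would call for a rank-based or similar test rather than Fisher's exact test.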


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14c6/7350399/00e091c7cce9/10.1177_1176934320939943-fig1.jpg
