
WHEN SHOULD WE NOT TRANSFER FUNCTIONAL ANNOTATION BETWEEN SEQUENCE PARALOGS?

Authors

Cao Mengfei, Cowen Lenore J

Affiliation

Department of Computer Science, Tufts University, Medford, MA 02155, USA

Publication information

Pac Symp Biocomput. 2017;22:15-26. doi: 10.1142/9789813207813_0003.

DOI: 10.1142/9789813207813_0003
PMID: 27896958

Abstract

Current automated computational methods for assigning functional labels to unstudied genes often involve transferring annotation from orthologous or paralogous genes; however, such genes can evolve divergent functions, making such transfer inappropriate. We consider the problem of determining when it is correct to make such an assignment between paralogs. We construct a benchmark dataset of two types of similar paralogous pairs of genes in the well-studied model organism S. cerevisiae: one set of pairs where single deletion mutants have very similar phenotypes (implying similar functions), and another set of pairs where single deletion mutants have very divergent phenotypes (implying different functions). State-of-the-art methods for this problem determine the evolutionary history of the paralogs with reference to multiple related species. Here, we ask a first and simpler question: to what extent can any computational method with access only to data from a single species solve this problem? We consider divergence data (at both the amino acid and nucleotide levels) and network data (based on the yeast protein-protein interaction (PPI) network, as captured in BioGRID), and ask whether we can extract features from these data that distinguish between these sets of paralogous gene pairs. We find that the best features come from measures of sequence divergence; however, simple network measures based on degree, centrality, shortest path, diffusion state distance (DSD), or shared neighborhood in the yeast PPI network also contain some signal. One should, in general, not transfer function if sequence divergence is too high. Further improvements in classification will need to come from more computationally expensive but much more powerful evolutionary methods that incorporate ancestral states and measure evolutionary divergence over multiple species based on evolutionary trees.
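
The single-species features named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy interaction edges, and the toy aligned sequences are all hypothetical, the divergence measure shown is a plain p-distance over gap-free alignment columns, and the shared-neighborhood score is assumed here to be a Jaccard index. DSD and centrality are omitted for brevity.

```python
from collections import deque

def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Fraction of differing residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)

def build_graph(edges):
    """Build an undirected adjacency map from (u, v) interaction pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def shortest_path_length(graph, source, target):
    """Breadth-first hop count between two proteins; None if disconnected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def shared_neighborhood(graph, a, b):
    """Jaccard index of the two neighbor sets, excluding the pair itself."""
    neighbors_a = graph.get(a, set()) - {b}
    neighbors_b = graph.get(b, set()) - {a}
    union = neighbors_a | neighbors_b
    return len(neighbors_a & neighbors_b) / len(union) if union else 0.0

def pair_features(graph, a, b, aligned_a, aligned_b):
    """Collect the illustrative features for one paralogous pair."""
    return {
        "p_distance": p_distance(aligned_a, aligned_b),
        "degree_a": len(graph.get(a, ())),
        "degree_b": len(graph.get(b, ())),
        "shortest_path": shortest_path_length(graph, a, b),
        "shared_neighborhood": shared_neighborhood(graph, a, b),
    }

# Toy data (hypothetical): paralogs P1 and P2 with one shared interactor X.
toy_edges = [("P1", "X"), ("P2", "X"), ("P1", "Y"), ("P2", "Z"), ("Y", "Z")]
toy_graph = build_graph(toy_edges)
features = pair_features(toy_graph, "P1", "P2", "MKT-LLVA", "MKSALLIA")
print(features)
```

In a real analysis the edges would come from an interaction database such as BioGRID and the alignments from a proper pairwise aligner; a feature dictionary like this one could then be fed to any standard classifier to separate the two benchmark sets.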

Similar articles

1
WHEN SHOULD WE NOT TRANSFER FUNCTIONAL ANNOTATION BETWEEN SEQUENCE PARALOGS?
Pac Symp Biocomput. 2017;22:15-26. doi: 10.1142/9789813207813_0003.
2
A scale of functional divergence for yeast duplicated genes revealed from analysis of the protein-protein interaction network.
Genome Biol. 2004;5(10):R76. doi: 10.1186/gb-2004-5-10-r76. Epub 2004 Sep 15.
3
Identity and divergence of protein domain architectures after the yeast whole-genome duplication event.
Mol Biosyst. 2010 Nov;6(11):2305-15. doi: 10.1039/c003507f. Epub 2010 Aug 26.
4
Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction.
BMC Bioinformatics. 2013 Sep 26;14:285. doi: 10.1186/1471-2105-14-285.
5
Protein function prediction from protein-protein interaction network using gene ontology based neighborhood analysis and physico-chemical features.
J Bioinform Comput Biol. 2018 Dec;16(6):1850025. doi: 10.1142/S0219720018500257. Epub 2018 Sep 19.
6
A supervised weighted similarity measure for gene expressions using biological knowledge.
Gene. 2016 Dec 31;595(2):150-160. doi: 10.1016/j.gene.2016.09.033. Epub 2016 Sep 26.
7
Network motif-based analysis of regulatory patterns in paralogous gene pairs.
J Bioinform Comput Biol. 2020 Jun;18(3):2040008. doi: 10.1142/S0219720020400089.
8
Gene duplications contribute to the overrepresentation of interactions between proteins of a similar age.
BMC Evol Biol. 2012 Jun 25;12:99. doi: 10.1186/1471-2148-12-99.
9
Integrative network alignment reveals large regions of global network similarity in yeast and human.
Bioinformatics. 2011 May 15;27(10):1390-6. doi: 10.1093/bioinformatics/btr127. Epub 2011 Mar 16.
10
AVID: an integrative framework for discovering functional relationships among proteins.
BMC Bioinformatics. 2005 Jun 1;6:136. doi: 10.1186/1471-2105-6-136.

Cited by

1
The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i219-i226. doi: 10.1093/bioinformatics/btaa468.