Assignment of orthologous genes via genome rearrangement.

Authors

Chen Xin, Zheng Jie, Fu Zheng, Nan Peng, Zhong Yang, Lonardi Stefano, Jiang Tao

Affiliation

Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA.

Publication

IEEE/ACM Trans Comput Biol Bioinform. 2005 Oct-Dec;2(4):302-15. doi: 10.1109/TCBB.2005.48.

DOI: 10.1109/TCBB.2005.48
PMID: 17044168

Abstract

The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same families. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at a genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolving scenario under genome rearrangement. First, the problem is formulated as that of computing the signed reversal distance with duplicates between the two genomes of interest. Then, the problem is decomposed into two new optimization problems, called minimum common partition and maximum cycle decomposition, for which efficient heuristic algorithms are given. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called SOAR, and tested it on both simulated data and real genome sequence data. Compared to a recent ortholog assignment method based entirely on homology search (called INPARANOID), SOAR shows a marginally better performance in terms of sensitivity on the real data set because it is able to identify several correct orthologous pairs that are missed by INPARANOID. The simulation results demonstrate that SOAR, in general, performs better than the iterated exemplar algorithm in terms of computing the reversal distance and assigning correct orthologs.
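To make the minimum common partition step more concrete, here is a minimal sketch of a common partition computed greedily. It is not the heuristic used in SOAR, which the abstract does not describe. It assumes unsigned gene sequences (gene orientation is ignored) in which each gene is replaced by a label for its family, and it repeatedly extracts the longest common run of still-unused genes. The function names `longest_common_unused` and `greedy_common_partition` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def longest_common_unused(a, b, used_a, used_b):
    """Return (length, start_a, start_b) of the longest common substring
    of a and b that avoids positions already assigned to a block."""
    best = (0, -1, -1)
    # prev[j] holds the length of the common unused run ending at a[i-1], b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(len(a)):
        cur = [0] * (len(b) + 1)
        if not used_a[i]:
            for j in range(len(b)):
                if not used_b[j] and a[i] == b[j]:
                    cur[j + 1] = prev[j] + 1
                    if cur[j + 1] > best[0]:
                        run = cur[j + 1]
                        best = (run, i - run + 1, j - run + 1)
        prev = cur
    return best


def greedy_common_partition(a, b):
    """Greedy sketch: split a and b into matching blocks.

    Returns a list of (start_a, start_b, length) triples. The number of
    triples is the size of the partition found; it is not guaranteed
    to be the minimum.
    """
    if Counter(a) != Counter(b):
        raise ValueError("both sequences must contain the same gene families "
                         "with the same multiplicities")
    used_a = [False] * len(a)
    used_b = [False] * len(b)
    blocks = []
    while not all(used_a):
        length, i, j = longest_common_unused(a, b, used_a, used_b)
        blocks.append((i, j, length))
        for k in range(length):
            used_a[i + k] = True
            used_b[j + k] = True
    return blocks


# Toy genomes: each letter stands for a gene family (hypothetical data).
genome_1 = list("abcab")
genome_2 = list("ababc")
print(greedy_common_partition(genome_1, genome_2))
```

Each block pairs a segment of one genome with an identical segment of the other, so genes at matching offsets inside a block become candidate orthologs. A smaller partition means fewer segment boundaries, which is the intuition linking this subproblem to a parsimonious rearrangement scenario.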

Similar articles

1
Assignment of orthologous genes via genome rearrangement.
IEEE/ACM Trans Comput Biol Bioinform. 2005 Oct-Dec;2(4):302-15. doi: 10.1109/TCBB.2005.48.
2
MSOAR: a high-throughput ortholog assignment system based on genome rearrangement.
J Comput Biol. 2007 Nov;14(9):1160-75. doi: 10.1089/cmb.2007.0048.
3
MSOAR 2.0: Incorporating tandem duplications into ortholog assignment based on genome rearrangement.
BMC Bioinformatics. 2010 Jan 6;11:10. doi: 10.1186/1471-2105-11-10.
4
Clustering of main orthologs for multiple genomes.
Comput Syst Bioinformatics Conf. 2007;6:195-201.
5
MultiMSOAR 2.0: an accurate tool to identify ortholog groups among multiple genomes.
PLoS One. 2011;6(6):e20892. doi: 10.1371/journal.pone.0020892. Epub 2011 Jun 21.
6
Clustering of main orthologs for multiple genomes.
J Bioinform Comput Biol. 2008 Jun;6(3):573-84. doi: 10.1142/s0219720008003540.
7
Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.
J Mol Biol. 2001 Dec 14;314(5):1041-52. doi: 10.1006/jmbi.2000.5197.
8
Computing the breakpoint distance between partially ordered genomes.
J Bioinform Comput Biol. 2007 Oct;5(5):1087-101. doi: 10.1142/s0219720007003107.
9
Improving the specificity of high-throughput ortholog prediction.
BMC Bioinformatics. 2006 May 28;7:270. doi: 10.1186/1471-2105-7-270.
10
Perfect sorting by reversals is not always difficult.
IEEE/ACM Trans Comput Biol Bioinform. 2007 Jan-Mar;4(1):4-16. doi: 10.1109/TCBB.2007.1011.

Cited by

1
An Exact and Fast SAT Formulation for the DCJ Distance.
bioRxiv. 2024 Nov 8:2024.11.05.622153. doi: 10.1101/2024.11.05.622153.
2
Approximation algorithm for rearrangement distances considering repeated genes and intergenic regions.
Algorithms Mol Biol. 2021 Oct 13;16(1):21. doi: 10.1186/s13015-021-00200-w.
3
geneHummus: an R package to define gene families and their expression in legumes and beyond.
BMC Genomics. 2019 Jul 18;20(1):591. doi: 10.1186/s12864-019-5952-2.
4
Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers.
BMC Bioinformatics. 2018 May 3;19(1):166. doi: 10.1186/s12859-018-2148-8.
5
Representing and decomposing genomic structural variants as balanced integer flows on sequence graphs.
BMC Bioinformatics. 2016 Sep 29;17(1):400. doi: 10.1186/s12859-016-1258-4.
6
Genome-Wide Identification of Calcium Dependent Protein Kinase Gene Family in Plant Lineage Shows Presence of Novel D-x-D and D-E-L Motifs in EF-Hand Domain.
Front Plant Sci. 2015 Dec 24;6:1146. doi: 10.3389/fpls.2015.01146. eCollection 2015.
7
An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.
Biomed Res Int. 2015;2015:748681. doi: 10.1155/2015/748681. Epub 2015 Oct 29.
8
An Integer Programming Formulation of the Minimum Common String Partition Problem.
PLoS One. 2015 Jul 2;10(7):e0130266. doi: 10.1371/journal.pone.0130266. eCollection 2015.
9
Comparing genomes with rearrangements and segmental duplications.
Bioinformatics. 2015 Jun 15;31(12):i329-38. doi: 10.1093/bioinformatics/btv229.
10
Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome.
Stand Genomic Sci. 2014 Mar 15;9(3):632-45. doi: 10.4056/sigs.4998989. eCollection 2014 Jun 15.