MSOAR 2.0: Incorporating tandem duplications into ortholog assignment based on genome rearrangement.

Affiliation

Department of Computer Science, University of California, Riverside, CA 92521, USA.

Publication information

BMC Bioinformatics. 2010 Jan 6;11:10. doi: 10.1186/1471-2105-11-10.

DOI:10.1186/1471-2105-11-10
PMID:20053291
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2821317/
Abstract

BACKGROUND

Ortholog assignment is a critical and fundamental problem in comparative genomics, since orthologs are considered to be functional counterparts in different species and can be used to infer molecular functions of one species from those of other species. MSOAR is a recently developed high-throughput system for assigning one-to-one orthologs between closely related species on a genome scale. It attempts to reconstruct the evolutionary history of input genomes in terms of genome rearrangement and gene duplication events. It assumes that a gene duplication event inserts a duplicated gene into the genome of interest at a random location (i.e., the random duplication model). However, in practice, biologists believe that genes are often duplicated by tandem duplications, where a duplicated gene is located next to the original copy (i.e., the tandem duplication model).
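The difference between the two duplication models can be illustrated with a toy example. The sketch below is not taken from MSOAR; it assumes a genome is simply an ordered list of gene labels (ignoring strand and chromosome boundaries), and the function names `random_duplication` and `tandem_duplication` are hypothetical.

```python
import random


def random_duplication(genome, index, rng):
    """Random duplication model: copy genome[index] and insert the copy
    at a uniformly random position in the genome."""
    position = rng.randrange(len(genome) + 1)
    return genome[:position] + [genome[index]] + genome[position:]


def tandem_duplication(genome, index):
    """Tandem duplication model: insert the copy immediately after
    the original gene."""
    return genome[:index + 1] + [genome[index]] + genome[index + 1:]


genome = ["a", "b", "c", "d"]

# The tandem copy of "b" always lands next to the original.
tandem = tandem_duplication(genome, 1)

# The random copy of "b" may land anywhere; a seeded generator keeps
# the example reproducible.
randomly_placed = random_duplication(genome, 1, random.Random(0))
```

Under the tandem model the two copies are always adjacent, which is the positional signal that lets a preprocessing step find candidate duplicates by scanning neighboring genes.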

RESULTS

In this paper, we develop MSOAR 2.0, an improved system for one-to-one ortholog assignment. For a pair of input genomes, the system first focuses on the tandemly duplicated genes of each genome and tries to identify among them those that were duplicated after the speciation (i.e., the so-called inparalogs), using a simple phylogenetic tree reconciliation method. For each such set of tandemly duplicated inparalogs, all but one gene will be deleted from the concerned genome (because they cannot possibly appear in any one-to-one ortholog pairs), and MSOAR is invoked. Using both simulated and real data experiments, we show that MSOAR 2.0 is able to achieve better sensitivity and specificity than MSOAR. In comparison with the well-known genome-scale ortholog assignment tool InParanoid, the Ensembl ortholog database, and the orthology information extracted from the well-known whole-genome multiple alignment program MultiZ, MSOAR 2.0 shows the highest sensitivity. Although the specificity of MSOAR 2.0 is slightly worse than that of InParanoid in the real data experiments, it is actually better than that of InParanoid in the simulation tests.
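The preprocessing step described above (collapse each set of tandemly duplicated inparalogs to a single gene, then run MSOAR on the reduced genome) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the genome is an ordered list of gene labels, tandem copies are strictly adjacent, the first gene of each run is the one kept (the abstract does not say which copy survives), and the hypothetical callback `is_inparalog_pair` stands in for the tree reconciliation test, which is not implemented here.

```python
def collapse_tandem_inparalogs(genome, family, is_inparalog_pair):
    """Keep one gene per run of adjacent same-family inparalogs.

    genome            -- gene labels in chromosomal order
    family            -- dict mapping gene label -> gene family id
    is_inparalog_pair -- hypothetical predicate: True if two genes were
                         duplicated after speciation (assumed to come
                         from a tree reconciliation step)
    Returns (kept, removed) as two lists of gene labels.
    """
    kept, removed = [], []
    representative = None  # first gene of the current tandem run
    for gene in genome:
        if (representative is not None
                and family[gene] == family[representative]
                and is_inparalog_pair(representative, gene)):
            # Adjacent copy of the representative: drop it, run continues.
            removed.append(gene)
        else:
            # Different family or not an inparalog: start a new run.
            kept.append(gene)
            representative = gene
    return kept, removed


# Toy data (entirely made up for illustration).
genome = ["g1", "g2a", "g2b", "g2c", "g3", "g2d"]
family = {"g1": "F1", "g2a": "F2", "g2b": "F2", "g2c": "F2",
          "g3": "F3", "g2d": "F2"}
inparalog_pairs = {frozenset(("g2a", "g2b")), frozenset(("g2a", "g2c"))}

kept, removed = collapse_tandem_inparalogs(
    genome, family,
    lambda a, b: frozenset((a, b)) in inparalog_pairs,
)
```

In this toy run, `g2b` and `g2c` are removed as tandem copies of `g2a`, while `g2d` is kept because it is not adjacent to the run. The reduced gene list would then be handed to the ortholog assignment step.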

CONCLUSIONS

Our preliminary experimental results demonstrate that MSOAR 2.0 is a highly accurate tool for one-to-one ortholog assignment between closely related genomes. The software is available to the public for free and included as online supplementary material.
