Screening synteny blocks in pairwise genome comparisons through integer programming.

Affiliation

Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.

Publication information

BMC Bioinformatics. 2011 Apr 18;12:102. doi: 10.1186/1471-2105-12-102.

DOI:10.1186/1471-2105-12-102
PMID:21501495
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3088904/
Abstract

BACKGROUND

It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirement of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or, more generally, "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs and are not of interest. For many downstream analyses, we need to eliminate weak, low-scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total score is maximized while respecting the duplication history of the genomes being compared. We call this "quota-based" screening of synteny blocks, because it fills an appropriate quota of syntenic relationships within one genome or between two genomes that have experienced WGD events.

RESULTS

We have formulated synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history of the respective genomes). Such a procedure is useful for any pairwise synteny alignment, but is most useful in lineages affected by multiple WGDs, such as plant or fish lineages. For example, there should be a 1:2 ploidy relationship between genomes A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. We show through simulations and real examples using plant genomes in the rosid superorder that quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, such as the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons).
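
The screening idea above can be illustrated with a toy sketch. This is not the QUOTA-ALIGN implementation: the `Block` type, its field names, and the functions `max_depth` and `screen_blocks` are hypothetical, and a brute-force enumeration of all 0/1 assignments stands in for the linear programming solver that the abstract mentions. The sketch also assumes one particular reading of the depth constraint: for a 1:2 relationship in which genome B was duplicated, any position on genome A may be covered by at most two retained blocks, and any position on genome B by at most one.

```python
from itertools import product
from typing import NamedTuple, Tuple


class Block(NamedTuple):
    """Hypothetical synteny block: a score plus inclusive gene-order intervals."""
    score: int
    a: Tuple[int, int]  # (start, end) on genome A
    b: Tuple[int, int]  # (start, end) on genome B


def max_depth(intervals):
    """Return the largest number of intervals covering any single position."""
    events = []
    for start, end in intervals:
        events.append((start, 1))      # coverage rises at the start
        events.append((end + 1, -1))   # and falls just past the inclusive end
    depth = best = 0
    for _, delta in sorted(events):    # at equal positions, -1 sorts before +1
        depth += delta
        best = max(best, depth)
    return best


def screen_blocks(blocks, max_depth_a, max_depth_b):
    """Enumerate every binary vector x and keep the best feasible one.

    Objective: maximize sum(score_i * x_i).
    Constraints: coverage depth on each genome axis stays within its limit.
    Exponential in len(blocks), so suitable only for tiny illustrations.
    """
    best_score, best_pick = 0, tuple(0 for _ in blocks)
    for x in product((0, 1), repeat=len(blocks)):
        chosen = [blk for blk, keep in zip(blocks, x) if keep]
        if max_depth([blk.a for blk in chosen]) > max_depth_a:
            continue
        if max_depth([blk.b for blk in chosen]) > max_depth_b:
            continue
        score = sum(blk.score for blk in chosen)
        if score > best_score:
            best_score, best_pick = score, x
    return best_score, best_pick


# Made-up example: two strong blocks from a duplication in B, plus two weak ones.
blocks = [
    Block(100, (1, 50), (1, 50)),
    Block(80, (1, 50), (101, 150)),
    Block(30, (20, 40), (201, 220)),  # would push depth on A to 3
    Block(10, (60, 80), (10, 30)),    # overlaps the first block on B
]

print(screen_blocks(blocks, max_depth_a=2, max_depth_b=1))  # (180, (1, 1, 0, 0))
print(screen_blocks(blocks, max_depth_a=1, max_depth_b=1))  # (100, (1, 0, 0, 0))
```

With the 1:2 limits, the two strong blocks are retained and the weak blocks are screened out; tightening both limits to 1 keeps only the single best block. A real instance would hand the same objective and constraints to an integer programming solver rather than enumerating assignments.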

CONCLUSIONS

The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user-specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. QUOTA-ALIGN makes two major contributions: 1) reducing the block screening task to a BIP problem, which is novel; and 2) providing an efficient software pipeline that runs from all-against-all BLAST to screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component of SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering non-programmers easier access to thousands of genomes.
