IsoFinder: computational prediction of isochores in genome sequences.

Authors

Oliver José L, Carpena Pedro, Hackenberg Michael, Bernaola-Galván Pedro

Affiliation

Departamento de Genética, Instituto de Biotecnología, Facultad de Ciencias, Universidad de Granada, Spain.

Publication

Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W287-92. doi: 10.1093/nar/gkh399.

DOI:10.1093/nar/gkh399

PMID:15215396

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC441537/

Abstract

Isochores are long genome segments homogeneous in G+C. Here, we describe an algorithm (IsoFinder) running on the web (http://bioinfo2.ugr.es/IsoF/isofinder.html) able to predict isochores at the sequence level. We move a sliding pointer from left to right along the DNA sequence. At each position of the pointer, we compute the mean G+C values to the left and to the right of the pointer. We then determine the position of the pointer for which the difference between left and right mean values (as measured by the t-statistic) reaches its maximum. Next, we determine the statistical significance of this potential cutting point, after filtering out short-scale heterogeneities below 3 kb by applying a coarse-graining technique. Finally, the program checks whether this significance exceeds a probability threshold. If so, the sequence is cut at this point into two subsequences; otherwise, the sequence remains undivided. The procedure continues recursively for each of the two resulting subsequences created by each cut. This leads to the decomposition of a chromosome sequence into long homogeneous genome regions (LHGRs) with well-defined mean G+C contents, each significantly different from the G+C contents of the adjacent LHGRs. Most LHGRs can be identified with Bernardi's isochores, given their correlation with biological features such as gene density, SINE and LINE (short, long interspersed repetitive elements) densities, recombination rate or single nucleotide polymorphism variability. The resulting isochore maps are available at our web site (http://bioinfo2.ugr.es/isochores/), and also at the UCSC Genome Browser (http://genome.cse.ucsc.edu/).
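
The recursive procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the IsoFinder implementation: the function names (`window_gc`, `best_cut`, `significance`, `segment`) are hypothetical, coarse-graining is reduced to averaging G+C over fixed non-overlapping windows, and the significance of a candidate cut is estimated with a simple permutation test standing in for the program's own significance calculation. The 0.95 threshold, the minimum segment length and the number of permutations are likewise illustrative assumptions.

```python
import math
import random
from typing import List, Optional, Tuple


def window_gc(seq: str, window: int) -> List[float]:
    """G+C fraction of consecutive non-overlapping windows (assumed coarse-graining)."""
    seq = seq.upper()
    starts = range(0, len(seq) - window + 1, window)
    return [(seq[s:s + window].count("G") + seq[s:s + window].count("C")) / window
            for s in starts]


def best_cut(gc: List[float], min_len: int = 2) -> Tuple[int, float]:
    """Return (index, t) for the split gc[:index] | gc[index:] with the largest
    pooled-variance t-statistic, or (-1, 0.0) if no valid split exists."""
    n = len(gc)
    if n < max(2 * min_len, 3):
        return -1, 0.0
    total = sum(gc)
    total_sq = sum(x * x for x in gc)
    left_sum = left_sq = 0.0
    best_i, best_t = -1, 0.0
    for i in range(1, n):  # the pointer sits between gc[i-1] and gc[i]
        left_sum += gc[i - 1]
        left_sq += gc[i - 1] ** 2
        n1, n2 = i, n - i
        if n1 < min_len or n2 < min_len:
            continue
        m1, m2 = left_sum / n1, (total - left_sum) / n2
        ss1 = left_sq - n1 * m1 * m1           # sum of squared deviations, left
        ss2 = (total_sq - left_sq) - n2 * m2 * m2  # same, right
        pooled_var = max(ss1 + ss2, 0.0) / (n - 2)
        se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
        if se == 0.0:
            continue  # degenerate zero-variance split; skipped in this sketch
        t = abs(m1 - m2) / se
        if t > best_t:
            best_i, best_t = i, t
    return best_i, best_t


def significance(gc: List[float], t_obs: float, min_len: int,
                 n_perm: int, rng: random.Random) -> float:
    """Fraction of shuffled sequences whose maximum t stays below t_obs
    (permutation test; an assumption, not IsoFinder's own calculation)."""
    shuffled = list(gc)
    below = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_cut(shuffled, min_len)[1] < t_obs:
            below += 1
    return below / n_perm


def segment(gc: List[float], min_len: int = 2, threshold: float = 0.95,
            n_perm: int = 200, rng: Optional[random.Random] = None,
            offset: int = 0) -> List[Tuple[int, int]]:
    """Recursively cut gc into homogeneous regions; returns (start, end) window indices."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    i, t = best_cut(gc, min_len)
    if i < 0 or significance(gc, t, min_len, n_perm, rng) < threshold:
        return [(offset, offset + len(gc))]  # sequence remains undivided
    return (segment(gc[:i], min_len, threshold, n_perm, rng, offset)
            + segment(gc[i:], min_len, threshold, n_perm, rng, offset + i))
```

Under these assumptions, `segment(window_gc(sequence, 3000))` would return the window-index boundaries of the predicted regions for a DNA string `sequence`; multiplying each index by the window size converts them back to base-pair coordinates.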

Similar articles

IsoFinder: computational prediction of isochores in genome sequences.

Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W287-92. doi: 10.1093/nar/gkh399.

Isochore chromosome maps of the human genome.

Gene. 2002 Oct 30;300(1-2):117-27. doi: 10.1016/s0378-1119(02)01034-x.

Isochore chromosome maps of eukaryotic genomes.

Gene. 2001 Oct 3;276(1-2):47-56. doi: 10.1016/s0378-1119(01)00641-2.

Discovering isochores by least-squares optimal segmentation.

Gene. 2007 Jun 1;394(1-2):53-60. doi: 10.1016/j.gene.2007.01.028. Epub 2007 Feb 16.

Isochore structures in the chicken genome.

FEBS J. 2006 Apr;273(8):1637-48. doi: 10.1111/j.1742-4658.2006.05178.x.

Isochore structures in the mouse genome.

Genomics. 2004 Mar;83(3):384-94. doi: 10.1016/j.ygeno.2003.09.011.

A computational prediction of isochores based on hidden Markov models.

Gene. 2006 Dec 30;385:41-9. doi: 10.1016/j.gene.2006.04.032. Epub 2006 Aug 17.

GC-Profile: a web-based tool for visualizing and analyzing the variation of GC content in genomic sequences.

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W686-91. doi: 10.1093/nar/gkl040.

The biased distribution of Alus in human isochores might be driven by recombination.

J Mol Evol. 2005 Mar;60(3):365-77. doi: 10.1007/s00239-004-0197-2.

Prediction of replication time zones at single nucleotide resolution in the human genome.

FEBS Lett. 2008 Jul 9;582(16):2441-4. doi: 10.1016/j.febslet.2008.06.008. Epub 2008 Jun 12.

Cited by

Inference of genomic landscapes using ordered Hidden Markov Models with emission densities (oHMMed).

BMC Bioinformatics. 2024 Apr 16;25(1):151. doi: 10.1186/s12859-024-05751-4.

De Novo Genome Assembly at Chromosome-Scale of (Diptera Stratiomyidae) via PacBio and Omni-C Proximity Ligation Technology.

Insects. 2024 Feb 17;15(2):133. doi: 10.3390/insects15020133.

Whole-genome sequencing provides novel insights into the evolutionary history and genetic adaptation of reindeer populations in northern Eurasia.

Sci Rep. 2023 Dec 27;13(1):23019. doi: 10.1038/s41598-023-50253-7.

Compositional Structure of the Genome: A Review.

Biology (Basel). 2023 Jun 13;12(6):849. doi: 10.3390/biology12060849.

Detection and characterization of constitutive replication origins defined by DNA polymerase epsilon.

BMC Biol. 2023 Feb 24;21(1):41. doi: 10.1186/s12915-023-01527-z.

Chromosomal-level assembly of Bactericera cockerelli reveals rampant gene family expansions impacting genome structure, function and insect-microbe-plant-interactions.

Mol Ecol Resour. 2023 Jan;23(1):233-252. doi: 10.1111/1755-0998.13693. Epub 2022 Aug 16.

Driven progressive evolution of genome sequence complexity in Cyanobacteria.

Sci Rep. 2020 Nov 4;10(1):19073. doi: 10.1038/s41598-020-76014-4.

Characterizing the interplay between gene nucleotide composition bias and splicing.

Genome Biol. 2019 Nov 29;20(1):259. doi: 10.1186/s13059-019-1869-y.

Transgenerational Self-Reconstruction of Disrupted Chromatin Organization After Exposure To An Environmental Stressor in Mice.

Sci Rep. 2019 Sep 10;9(1):13057. doi: 10.1038/s41598-019-49440-2.

Implications of CpG islands on chromosomal architectures and modes of global gene regulation.

Nucleic Acids Res. 2018 May 18;46(9):4382-4391. doi: 10.1093/nar/gky147.

References

Isochore structures in the mouse genome.

Genomics. 2004 Mar;83(3):384-94. doi: 10.1016/j.ygeno.2003.09.011.

Identification of isochore boundaries in the human genome using the technique of wavelet multiresolution analysis.

Biochem Biophys Res Commun. 2003 Nov 7;311(1):215-22. doi: 10.1016/j.bbrc.2003.09.198.

Isochores merit the prefix 'iso'.

Comput Biol Chem. 2003 Feb;27(1):5-10. doi: 10.1016/s1476-9271(02)00090-7.

Isochore chromosome maps of the human genome.

Gene. 2002 Oct 30;300(1-2):117-27. doi: 10.1016/s0378-1119(02)01034-x.

A simple and species-independent coding measure.

Gene. 2002 Oct 30;300(1-2):97-104. doi: 10.1016/s0378-1119(02)01041-7.

Segmentation of genomic DNA through entropic divergence: power laws and scaling.

Phys Rev E Stat Nonlin Soft Matter Phys. 2002 May;65(5 Pt 1):051909. doi: 10.1103/PhysRevE.65.051909. Epub 2002 May 8.

Isochores: dream or reality?

Trends Biotechnol. 2002 Jun;20(6):237. doi: 10.1016/s0167-7799(02)01951-0.

Analysis of symbolic sequences using the Jensen-Shannon divergence.

Phys Rev E Stat Nonlin Soft Matter Phys. 2002 Apr;65(4 Pt 1):041905. doi: 10.1103/PhysRevE.65.041905. Epub 2002 Mar 25.

Expected relationship between the silent substitution rate and the GC content: implications for the evolution of isochores.

J Mol Evol. 2002 Jan;54(1):129-33. doi: 10.1007/s00239-001-0011-3.

Scale invariance in the nonstationarity of human heart rate.

Phys Rev Lett. 2001 Oct 15;87(16):168105. doi: 10.1103/PhysRevLett.87.168105. Epub 2001 Oct 2.
