Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure.

Author affiliations

Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Bld. 2, 33 Leninsky Ave., 119071 Moscow, Russia.

Moscow Engineering Physics Institute, National Research Nuclear University MEPhI, 31 Kashirskoye Shosse, 115409 Moscow, Russia.

Publication information

Int J Mol Sci. 2023 Jun 30;24(13):10964. doi: 10.3390/ijms241310964.

DOI:10.3390/ijms241310964
PMID:37446142
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10341722/
Abstract

We have developed a de novo method for the identification of dispersed repeats based on the use of random position-weight matrices (PWMs) and an iterative procedure (IP). The created algorithm (IP method) allows detection of dispersed repeats for which the average number of substitutions between any two repeats per nucleotide (x) is less than or equal to 1.5. We have shown that all previously developed methods and algorithms (RED, RECON, and some others) can only find dispersed repeats for x ≤ 1.0. We applied the IP method to find dispersed repeats in the genomes of E. coli and nine other bacterial species. We identify three families of approximately 1.09 × 10^6, 0.64 × 10^6, and 0.58 × 10^6 DNA bases, respectively, constituting almost 50% of the complete E. coli genome. The length of the repeats is in the range of 400 to 600 bp. Other analyzed bacterial genomes contain one to three families of dispersed repeats with a total number of 10^3 to 6 × 10^3 copies. The existence of such highly divergent repeats could be associated with the presence of a single-type triplet periodicity in various genes or with the packing of bacterial DNA into a nucleoid.
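The abstract names two ingredients, a random position-weight matrix and an iterative procedure, without giving details. The toy sketch below shows how the two can fit together in general: start from a random PWM, pick the best-scoring non-overlapping windows, rebuild the PWM from them, and repeat until the hits stop changing. It is not the authors' IP method. All function names, the uniform background, the pseudocount, and the greedy window selection are assumptions made for illustration, and real data would need alignment with gaps and a statistical significance test.

```python
import math
import random

ALPHABET = "ACGT"

def build_pwm(windows, pseudocount=1.0):
    """Log-odds PWM (against a uniform background) from equal-length windows."""
    pwm = []
    for pos in range(len(windows[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for window in windows:
            counts[window[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in ALPHABET})
    return pwm

def random_pwm(length, rng):
    """Random starting matrix built from randomly drawn base frequencies."""
    pwm = []
    for _ in range(length):
        weights = [rng.random() + 0.1 for _ in ALPHABET]
        total = sum(weights)
        pwm.append({b: math.log2((w / total) / 0.25)
                    for b, w in zip(ALPHABET, weights)})
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds weights for one ungapped window."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_windows(pwm, genome, top_n):
    """Greedily pick up to top_n highest-scoring, non-overlapping windows."""
    length = len(pwm)
    scored = sorted(
        ((score(pwm, genome[i:i + length]), i)
         for i in range(len(genome) - length + 1)),
        reverse=True,
    )
    chosen = []
    for _, start in scored:
        if all(abs(start - other) >= length for other in chosen):
            chosen.append(start)
        if len(chosen) == top_n:
            break
    return sorted(chosen)

def iterate(genome, length, top_n, rounds=10, seed=0):
    """Refine a random PWM until the set of hits stops changing."""
    rng = random.Random(seed)
    pwm = random_pwm(length, rng)
    hits = []
    for _ in range(rounds):
        new_hits = best_windows(pwm, genome, top_n)
        if new_hits == hits:
            break
        hits = new_hits
        pwm = build_pwm([genome[i:i + length] for i in hits])
    return pwm, hits
```

Calling `iterate(genome, length=4, top_n=3)` on a short DNA string returns the refined matrix and the start positions of the selected windows. Because the starting matrix is random, different seeds can converge to different window sets, so in practice a search of this kind would be run from many random starts.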

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b948/10341722/c2df511e34ba/ijms-24-10964-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b948/10341722/cf1f55413cff/ijms-24-10964-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b948/10341722/ea76ac10c1fe/ijms-24-10964-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b948/10341722/3ca915bc7f95/ijms-24-10964-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b948/10341722/dbf537ba0a24/ijms-24-10964-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b948/10341722/ee5c423cbcae/ijms-24-10964-g005.jpg
