Search for SINE repeats in the rice genome using correlation-based position weight matrices.

Affiliation

Research Center of Biotechnology of the Russian Academy of Sciences, 60 let Oktjabrja pr-t, 7, bld. 1, Moscow, Russia.

Publication information

BMC Bioinformatics. 2021 Feb 2;22(1):42. doi: 10.1186/s12859-021-03977-0.

DOI: 10.1186/s12859-021-03977-0
PMID: 33530928
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7852121/
Abstract

BACKGROUND

Transposable elements (TEs) constitute a significant part of eukaryotic genomes. Short interspersed nuclear elements (SINEs) are non-autonomous TEs, which are widely represented in mammalian genomes and also found in plants. After insertion in a new position in the genome, TEs quickly accumulate mutations, which complicate their identification and annotation by modern bioinformatics methods. In this study, we searched for highly divergent SINE copies in the genome of rice (Oryza sativa subsp. japonica) using the Highly Divergent Repeat Search Method (HDRSM).

RESULTS

The HDRSM considers correlations of neighboring symbols to construct a position weight matrix (PWM) for a SINE family, which is then used to search for new copies. To evaluate the accuracy of the method and compare it with the RepeatMasker program, we generated a set of SINE copies containing nucleotide substitutions and indels and inserted them into an artificial chromosome for analysis. The HDRSM performed better both in the number of inserted repeats identified and in the accuracy of their boundaries. A search for copies of 39 SINE families in the rice genome produced 14,030 hits, of which 5,704 were not detected by RepeatMasker.
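
The idea of a PWM that accounts for neighboring symbols can be illustrated with a minimal Python sketch. This is not the HDRSM implementation: it assumes gap-free, equal-length aligned copies, a uniform background, and a simple pseudocount, and it ignores indels, which the abstract says the real method handles. The function names (`build_pair_pwm`, `score_window`, `best_hit`) and the toy sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
PAIRS = ["".join(p) for p in product(ALPHABET, repeat=2)]  # 16 dinucleotides


def build_pair_pwm(aligned, pseudocount=1.0):
    """Build log-odds weights for adjacent-symbol pairs at each position.

    Assumes gap-free sequences of equal length over ACGT and a uniform
    background over the 16 pairs (both are simplifying assumptions).
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    background = 1.0 / len(PAIRS)
    total = len(aligned) + pseudocount * len(PAIRS)
    pwm = []
    for i in range(length - 1):
        counts = Counter(s[i:i + 2] for s in aligned)
        column = {
            pair: math.log2((counts[pair] + pseudocount) / total / background)
            for pair in PAIRS
        }
        pwm.append(column)
    return pwm


def score_window(pwm, window):
    """Sum the pair weights along a window of len(pwm) + 1 symbols."""
    return sum(column[window[i:i + 2]] for i, column in enumerate(pwm))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm) + 1
    return max(
        (score_window(pwm, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    )


# Toy data: four copies of a hypothetical repeat, one with a substitution.
copies = ["ACGTAC", "ACGTAC", "ACGAAC", "ACGTAC"]
pwm = build_pair_pwm(copies)
score, start = best_hit(pwm, "TTTTT" + "ACGTAC" + "GGGGG")
print(f"best window starts at {start} with score {score:.2f}")
```

Scoring pairs rather than single symbols lets the matrix reward positions where two adjacent bases co-occur in the family, which is the intuition behind using neighbor correlations; a production search would also need gap handling and a significance estimate.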

CONCLUSIONS

The HDRSM could find divergent SINE copies, correctly determine their boundaries, and offer a high level of statistical significance. We also found that RepeatMasker is able to find relatively short copies of the SINE families with a higher level of similarity, while HDRSM is able to find more diverged copies. To obtain a comprehensive profile of SINE distribution in the genome, combined application of the HDRSM and RepeatMasker is recommended.
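
A combined application of two tools can be sketched as simple interval bookkeeping. The Python example below is a hedged illustration, not part of the published pipeline: it assumes each tool's hits have already been parsed into `(chromosome, start, end)` tuples with half-open coordinates, and the hit lists and function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def unique_to_first(first, second):
    """Hits in `first` that overlap no hit in `second` (quadratic scan)."""
    return [h for h in first if not any(overlaps(h, o) for o in second)]


def merge_hits(hits):
    """Union of hits: sort, then merge overlapping or touching intervals."""
    merged = []
    for chrom, start, end in sorted(hits):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical hit lists standing in for parsed tool output.
hdrsm_hits = [("chr1", 100, 300), ("chr1", 1000, 1250), ("chr2", 50, 220)]
repeatmasker_hits = [("chr1", 120, 260), ("chr2", 400, 520)]

only_hdrsm = unique_to_first(hdrsm_hits, repeatmasker_hits)
combined = merge_hits(hdrsm_hits + repeatmasker_hits)
print(len(only_hdrsm), "hits found only by the first tool")
print(len(combined), "merged regions in the combined profile")

# Share of reported hits that RepeatMasker missed, from the abstract's counts.
print(round(5704 / 14030 * 100, 1), "percent")
```

The quadratic overlap scan is adequate for a toy example; for genome-scale hit lists a sorted sweep or an interval tree would be the more usual choice.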

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/875c408b79a6/12859_2021_3977_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/86eac74da7b3/12859_2021_3977_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/754bad29464d/12859_2021_3977_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/06f64a6cad9c/12859_2021_3977_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/322c280d145c/12859_2021_3977_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/031c4c902511/12859_2021_3977_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b281/7852121/4d51d6553f61/12859_2021_3977_Fig7_HTML.jpg

Similar articles

1
Search for SINE repeats in the rice genome using correlation-based position weight matrices.
BMC Bioinformatics. 2021 Feb 2;22(1):42. doi: 10.1186/s12859-021-03977-0.
2
Evolutionary modes of emergence of short interspersed nuclear element (SINE) families in grasses.
Plant J. 2017 Nov;92(4):676-695. doi: 10.1111/tpj.13676. Epub 2017 Oct 9.
3
Transposable element annotation of the rice genome.
Bioinformatics. 2004 Jan 22;20(2):155-60. doi: 10.1093/bioinformatics/bth019.
4
Distribution, Diversity, and Long-Term Retention of Grass Short Interspersed Nuclear Elements (SINEs).
Genome Biol Evol. 2017 Aug 1;9(8):2048-2056. doi: 10.1093/gbe/evx145.
5
AnnoSINE: a short interspersed nuclear elements annotation tool for plant genomes.
Plant Physiol. 2022 Feb 4;188(2):955-970. doi: 10.1093/plphys/kiab524.
6
Comparative evolution history of SINEs in Arabidopsis thaliana and Brassica oleracea: evidence for a high rate of SINE loss.
Cytogenet Genome Res. 2005;110(1-4):441-7. doi: 10.1159/000084976.
7
MetaSINEs: Broad Distribution of a Novel SINE Superfamily in Animals.
Genome Biol Evol. 2016 Feb 12;8(3):528-39. doi: 10.1093/gbe/evw029.
8
Isolation and Characterization of Interspersed Repeated Sequences in the Common Lizard, Zootoca vivipara, and Their Conservation in Squamata.
Cytogenet Genome Res. 2019;157(1-2):65-76. doi: 10.1159/000497304. Epub 2019 Mar 6.
9
New SINE families from rice, OsSN, with poly(A) at the 3' ends.
Genes Genet Syst. 2008 Jun;83(3):227-36. doi: 10.1266/ggs.83.227.
10
Two new SINE elements, p-SINE2 and p-SINE3, from rice.
Genes Genet Syst. 2005 Jun;80(3):161-71. doi: 10.1266/ggs.80.161.
Cited by

1
Large tandem repeats of grass frog (Rana temporaria) in silico and in situ.
BMC Genomics. 2025 May 6;26(1):445. doi: 10.1186/s12864-025-11643-5.
2
Study of Dispersed Repeats in the Genome.
Int J Mol Sci. 2024 Apr 18;25(8):4441. doi: 10.3390/ijms25084441.
3
Bioinformatics tools for the sequence complexity estimates.
Biophys Rev. 2023 Sep 15;15(5):1367-1378. doi: 10.1007/s12551-023-01140-y. eCollection 2023 Oct.
4
Search for Dispersed Repeats in Bacterial Genomes Using an Iterative Procedure.
Int J Mol Sci. 2023 Jun 30;24(13):10964. doi: 10.3390/ijms241310964.