Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes.

Affiliations

Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia.

National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Moscow, Russia.

Publication information

DNA Res. 2019 Apr 1;26(2):157-170. doi: 10.1093/dnares/dsy046.

DOI:10.1093/dnares/dsy046
PMID:30726896
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6476729/
Abstract

A new mathematical method for detecting potential reading frameshifts in protein-coding sequences (cds) was developed. The algorithm adjusts to the triplet periodicity of each analysed sequence using dynamic programming and a genetic algorithm, and does not require any preliminary training. Using the developed method, cds from the Arabidopsis thaliana genome were analysed. In total, the algorithm found 9,930 sequences containing one or more potential reading frameshifts, which is ∼21% of all analysed sequences of the genome. The Type I and Type II error rates were estimated as 11% and 30%, respectively. Similar results were obtained for the genomes of Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Rattus norvegicus and Xenopus tropicalis. The algorithm was also tested on 17 bacterial genomes, and the results were compared with previously obtained data on potential reading frameshifts in these genomes. The study discusses the possibility that the reading frameshift is a relatively frequent mutation that could participate in the creation of new genes and proteins.
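The idea of triplet periodicity mentioned in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm, which combines dynamic programming with a genetic algorithm; it is a simplified stand-in that assumes the candidate breakpoint is already known and only asks which cyclic shift of codon positions best aligns the downstream part with the upstream part. The function names (`tp_matrix`, `normalize`, `best_relative_shift`) and the L1 distance used for comparison are assumptions made for this illustration.

```python
BASES = "ACGT"

def tp_matrix(seq, phase=0):
    """Count bases per codon position: 3 rows (positions) x 4 columns (A, C, G, T)."""
    counts = [[0] * 4 for _ in range(3)]
    for i, base in enumerate(seq.upper()):
        col = BASES.find(base)
        if col >= 0:  # skip ambiguous symbols such as N
            counts[(i + phase) % 3][col] += 1
    return counts

def normalize(counts):
    """Turn each row of counts into frequencies (rows with no counts stay all zero)."""
    out = []
    for row in counts:
        total = sum(row) or 1
        out.append([c / total for c in row])
    return out

def best_relative_shift(left, right):
    """Return (shift, distances): the cyclic shift in {0, 1, 2} of the right
    segment whose periodicity matrix is closest (L1 distance) to the left's."""
    ref = normalize(tp_matrix(left))
    distances = []
    for shift in range(3):
        cand = normalize(tp_matrix(right, phase=shift))
        distances.append(sum(abs(ref[r][c] - cand[r][c])
                             for r in range(3) for c in range(4)))
    best = min(range(3), key=distances.__getitem__)
    return best, distances

if __name__ == "__main__":
    upstream = "ACG" * 50
    downstream = "CG" + "ACG" * 49  # one base lost at the junction
    shift, distances = best_relative_shift(upstream, downstream)
    print(shift, distances)
```

A non-zero best shift suggests that the two segments follow the same periodicity in different phases, which is the kind of signal a frameshift search looks for. A real detector would also have to locate the breakpoint and assess statistical significance, which this sketch does not attempt.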

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/f7647e233716/dsy046f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/829418650b8f/dsy046f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/17fb12f0cc28/dsy046f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/15de100a0d52/dsy046f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/74b3bc2ed661/dsy046f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/8b7dedf8ff85/dsy046f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04b0/6476729/cbbe9ab0a128/dsy046f7.jpg
