Use of Average Mutual Information and Derived Measures to Find Coding Regions.

Authors

Newcomb Garin, Sayood Khalid

Affiliation

Department of Electrical and Computer Engineering, University of Nebraska, Lincoln, NE 68588-0511, USA.

Publication information

Entropy (Basel). 2021 Oct 11;23(10):1324. doi: 10.3390/e23101324.

DOI: 10.3390/e23101324
PMID: 34682048
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8534840/
Abstract

An important step in genome annotation is identifying the regions of a genome that code for proteins. Most annotation approaches rely on signals extracted from a genomic region to decide whether that region is protein coding. Motivated by the fact that these regions are information-bearing structures, we propose signals based on measures motivated by the average mutual information for use in this task. We show that these signals can identify coding and noncoding sequences with high accuracy. We also show that they are robust across species, phyla, and kingdoms and can therefore be used in species-agnostic genome annotation algorithms for identifying protein coding regions, which in turn could be used for gene identification.
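
To make the central quantity concrete, here is a minimal Python sketch of the standard average mutual information (AMI) between nucleotides k positions apart, I(k) = sum over x, y of p_k(x, y) * log2(p_k(x, y) / (p(x) * p(y))), estimated from pair counts in a single sequence. The function names `average_mutual_information` and `ami_profile` are hypothetical, and the abstract does not spell out the derived measures the authors build on top of AMI, so this should be read as an illustration of the textbook definition rather than as their method.

```python
from collections import Counter
from math import log2


def average_mutual_information(seq: str, k: int) -> float:
    """Estimate the AMI (in bits) between symbols k positions apart.

    Assumption: probabilities are plain relative frequencies of the
    observed pairs (seq[i], seq[i + k]); no pseudocounts or bias
    correction are applied.
    """
    seq = seq.upper()
    n_pairs = len(seq) - k
    if k < 1 or n_pairs < 1:
        raise ValueError("k must be >= 1 and smaller than the sequence length")

    # Joint counts of (symbol at i, symbol at i + k).
    pair_counts = Counter(zip(seq, seq[k:]))
    # Marginal counts taken from the same pairs, so the estimate is a
    # true mutual information of the empirical joint distribution.
    left_counts = Counter(seq[:n_pairs])
    right_counts = Counter(seq[k:])

    ami = 0.0
    for (x, y), count in pair_counts.items():
        p_xy = count / n_pairs
        p_x = left_counts[x] / n_pairs
        p_y = right_counts[y] / n_pairs
        ami += p_xy * log2(p_xy / (p_x * p_y))
    return ami


def ami_profile(seq: str, max_lag: int) -> list[float]:
    """AMI for lags 1..max_lag; one possible per-region signal vector."""
    return [average_mutual_information(seq, k) for k in range(1, max_lag + 1)]
```

For example, `ami_profile("ACG" * 50, 6)` returns six values, one per lag; for this perfectly periodic toy sequence the value at lag 3 is log2(3), about 1.585 bits, because each base fully determines the base three positions later. How such a profile would be turned into a coding/noncoding decision is not described in the abstract and is left open here.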
