Information theoretic perspective on genome clustering.

Authors

Veluchamy Alaguraj, Mehta Preeti, Srividhya K V, Vikram Hirendra, Govind M K, Gupta Ramneek, Aziz Bin Dukhyil Abdul, Abdullah Alharbi Raed, Abdullah Aloyuni Saleh, Hassan Mohamed M, Krishnaswamy S

Affiliations

Centre of Excellence in Bioinformatics, School of Biotechnology, Madurai Kamaraj University, Madurai 625021, India.

Department of Computational Biology, St. Jude Children's Research Hospital, Danny Thomas Place, Memphis 38105, Tennessee, United States of America.

Publication information

Saudi J Biol Sci. 2021 Mar;28(3):1867-1889. doi: 10.1016/j.sjbs.2020.12.039. Epub 2020 Dec 31.

DOI:10.1016/j.sjbs.2020.12.039
PMID:33732074
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7938122/
Abstract

Shannon's information-theoretic perspective on communication helps one understand the storage and processing of information in one-dimensional sequences. An information-theoretic analysis of 937 available completely sequenced prokaryotic genomes and 238 eukaryotic chromosomes is presented. Information content (Id) values were used to cluster these chromosomes. Chargaff's second parity rule, i.e., compositional self-complementarity, is an empirical fact observed in all the genomes except that of the proteobacterium Hodgkinia cicadicola. High information content, arising out of biased base composition in all the 14 chromosomes of [...], is also found in two prokaryotic genomes, viz. [...] str. Cc ([...]) and Carsonella ruddii PV. Despite size and compositional variations, neither prokaryotic nor eukaryotic genomes deviate significantly from an equiprobable and random situation. Eukaryotic chromosomes of an organism tend to have similar informational restraints, as seen when a simple distance-based method is used to cluster them. In eukaryotes, in certain cases, Id values are also similar for the two arms (p and q) of a chromosome. The results of this study confirm that information content can provide insights into the clustering of genomes and the evolution of their messaging strategies. An efficient and robust standalone Perl CGI tool based on this information theory algorithm was created for the analysis of whole genomes and is available at https://github.com/AlagurajVeluchamy/InformationTheory.
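
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Perl tool, and the abstract does not give the exact definition of Id. The sketch assumes an Id-like measure equal to the divergence of single-base composition from the equiprobable case, log2(4) − H, in bits per base. The function names, the single-strand deviation score for Chargaff's second parity rule, and the absolute-difference distance are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def base_frequencies(seq):
    """Relative frequency of A, C, G, T; any other symbol (e.g. N) is ignored."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: counts[b] / total for b in BASES}

def shannon_entropy(freqs):
    """Shannon entropy in bits per base; zero-frequency bases contribute 0."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

def divergence_from_equiprobable(seq):
    """Assumed Id-like measure: log2(4) - H. It is 0 for a uniform composition."""
    return math.log2(4) - shannon_entropy(base_frequencies(seq))

def pr2_deviation(seq):
    """Single-strand deviation from Chargaff's second parity rule (A~T, G~C).

    Values near 0 mean the strand is compositionally self-complementary.
    """
    f = base_frequencies(seq)
    return abs(f["A"] - f["T"]) + abs(f["G"] - f["C"])

def id_distances(seqs):
    """Pairwise absolute differences of the Id-like measure (assumed distance)."""
    ids = {name: divergence_from_equiprobable(s) for name, s in seqs.items()}
    return {(a, b): abs(ids[a] - ids[b]) for a in ids for b in ids}
```

Under these assumptions, a strongly AT-biased sequence scores well above 0 on the Id-like measure while a balanced one scores near 0. The pairwise distances could then be passed to any standard clustering routine.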


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0353/7938122/cccbfabe5298/gr1a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0353/7938122/97005ea78c87/gr1b.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0353/7938122/c4691908454b/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0353/7938122/ebd5f5fa60e0/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0353/7938122/a3620a6944ea/gr4.jpg
