Novel phylogenetic studies of genomic sequence fragments derived from uncultured microbe mixtures in environmental and clinical samples.

Authors

Abe Takashi, Sugawara Hideaki, Kinouchi Makoto, Kanaya Shigehiko, Ikemura Toshimichi

Affiliation

Center for Information Biology, National Institute of Genetics, The Graduate University for Advanced Studies (Sokendai) Mishima, Shizuoka, Japan.

Publication details

DNA Res. 2005;12(5):281-90. doi: 10.1093/dnares/dsi015. Epub 2006 Jan 10.

DOI: 10.1093/dnares/dsi015
PMID: 16769690

Abstract

A self-organizing map (SOM) was developed as a novel bioinformatics strategy for phylogenetic classification of sequence fragments obtained from pooled genome samples of uncultured microbes in environmental and clinical samples. This phylogenetic classification was possible without either orthologous sequence sets or sequence alignments. We first constructed SOMs for tetranucleotide frequencies in 210,000 5 kb sequence fragments obtained from 1502 prokaryotes for which at least 10 kb of genomic sequence has been deposited in public DNA databases. The sequences could be classified primarily according to phylogenetic groups without information regarding the species. We used the SOM method to classify sequence fragments derived from environmental samples of the Sargasso Sea and of an acidophilic biofilm growing in acid mine drainage. Phylogenetic diversity of the environmental sequences was effectively visualized on a single map. Sequences that were derived from a single genome but cloned independently could be reassociated in silico. G + C% has been used for a long period as a fundamental parameter for phylogenetic classification of microbes, but the G + C% is apparently too simple a parameter to differentiate a wide variety of known species. Oligonucleotide frequency can be used to distinguish the species because oligonucleotide frequencies vary significantly among their genomes.
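The pipeline described in the abstract (cut genomes into fixed-length fragments, describe each fragment by its tetranucleotide frequencies, then cluster the frequency vectors on a SOM) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the two "genomes" are synthetic repeats, the fragment length is 500 bp rather than the 5 kb used in the study, and the SOM is a generic textbook online update with a Gaussian neighbourhood, not necessarily the variant the authors used. The two toy genomes are built to have identical G + C% but disjoint tetranucleotide sets, mirroring the abstract's point that G + C% alone is too coarse.

```python
from itertools import product
import math
import random

BASES = "ACGT"
TETRAMERS = ["".join(p) for p in product(BASES, repeat=4)]  # all 256 tetranucleotides
INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def fragments(seq, size):
    """Split seq into non-overlapping windows of `size`; a short tail is dropped."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def tetra_freq(fragment):
    """Return a 256-dimensional vector of relative tetranucleotide frequencies."""
    counts = [0] * len(TETRAMERS)
    frag = fragment.upper()
    for i in range(len(frag) - 3):
        idx = INDEX.get(frag[i:i + 4])
        if idx is not None:  # skip windows containing ambiguous bases such as N
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def gc_percent(fragment):
    """G + C content of a fragment, in percent."""
    frag = fragment.upper()
    return 100.0 * (frag.count("G") + frag.count("C")) / len(frag)


def best_unit(weights, vec):
    """Index of the SOM unit whose weight vector is closest to vec (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda u: sum((w - v) ** 2 for w, v in zip(weights[u], vec)),
    )


def train_som(vectors, rows, cols, epochs=20, seed=0):
    """Generic online SOM on a rows x cols grid (illustrative, not the authors' variant)."""
    rng = random.Random(seed)
    # Initialise each unit with a copy of a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(rows * cols)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 towards 0
        lr = 0.5 * frac                        # learning rate
        radius = max(rows, cols) / 2 * frac + 0.5  # neighbourhood width
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            b = best_unit(weights, vec)
            br, bc = divmod(b, cols)
            for u, w in enumerate(weights):
                ur, uc = divmod(u, cols)
                d2 = (ur - br) ** 2 + (uc - bc) ** 2
                h = lr * math.exp(-d2 / (2 * radius ** 2))
                for k in range(len(w)):
                    w[k] += h * (vec[k] - w[k])
    return weights


# Two synthetic "genomes": same G + C%, different oligonucleotide composition.
genome_a = "AAGGCCTT" * 250
genome_b = "ACGTTGCA" * 250
frags_a = fragments(genome_a, 500)
frags_b = fragments(genome_b, 500)

vectors = [tetra_freq(f) for f in frags_a + frags_b]
som = train_som(vectors, rows=2, cols=2)

units_a = {best_unit(som, tetra_freq(f)) for f in frags_a}
units_b = {best_unit(som, tetra_freq(f)) for f in frags_b}
print("G+C% of first fragments (A, B):", gc_percent(frags_a[0]), gc_percent(frags_b[0]))
print("SOM units used by A:", units_a, "and by B:", units_b)
```

In this toy setting the fragments of the two sequences are indistinguishable by G + C% yet land on different map units, because the SOM sees the full 256-dimensional composition vector. Real data would need far larger maps and fragment sets, and typically some handling of reverse-complement strands, which this sketch omits.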

Similar articles

1. Novel phylogenetic studies of genomic sequence fragments derived from uncultured microbe mixtures in environmental and clinical samples.
   DNA Res. 2005;12(5):281-90. doi: 10.1093/dnares/dsi015. Epub 2006 Jan 10.
2. Self-Organizing Map (SOM) unveils and visualizes hidden sequence characteristics of a wide range of eukaryote genomes.
   Gene. 2006 Jan 3;365:27-34. doi: 10.1016/j.gene.2005.09.040. Epub 2005 Dec 20.
3. A novel bioinformatic strategy for unveiling hidden genome signatures of eukaryotes: self-organizing map of oligonucleotide frequency.
   Genome Inform. 2002;13:12-20.
4. Hyperbolic SOM-based clustering of DNA fragment features for taxonomic visualization and classification.
   Bioinformatics. 2008 Jul 15;24(14):1568-74. doi: 10.1093/bioinformatics/btn257. Epub 2008 Jun 5.
5. Application of tetranucleotide frequencies for the assignment of genomic fragments.
   Environ Microbiol. 2004 Sep;6(9):938-47. doi: 10.1111/j.1462-2920.2004.00624.x.
6. A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm.
   BMC Bioinformatics. 2008 Oct 7;9:419. doi: 10.1186/1471-2105-9-419.
7. CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes.
   BMC Bioinformatics. 2006 Oct 24;7:472. doi: 10.1186/1471-2105-7-472.
8. Assessment of phylogenomic and orthology approaches for phylogenetic inference.
   Bioinformatics. 2007 Apr 1;23(7):815-24. doi: 10.1093/bioinformatics/btm015. Epub 2007 Jan 19.
9. Phylogenetic signals in DNA composition: limitations and prospects.
   Mol Biol Evol. 2009 May;26(5):1163-9. doi: 10.1093/molbev/msp032. Epub 2009 Feb 20.
10. A novel retrieval system for nearly complete microbial genomic fragments from soil samples.
    J Microbiol Methods. 2008 Feb;72(2):197-205. doi: 10.1016/j.mimet.2007.11.022. Epub 2007 Dec 8.

Cited by

1. Unsupervised AI reveals insect species-specific genome signatures.
   PeerJ. 2024 Mar 6;12:e17025. doi: 10.7717/peerj.17025. eCollection 2024.
2. Enhancing Taxonomic Categorization of DNA Sequences with Deep Learning: A Multi-Label Approach.
   Bioengineering (Basel). 2023 Nov 8;10(11):1293. doi: 10.3390/bioengineering10111293.
3. Baseline metagenome-assembled genome (MAG) data of Sikkim hot springs from Indian Himalayan geothermal belt (IHGB) showcasing its potential CAZymes, and sulfur-nitrogen metabolic activity.
   World J Microbiol Biotechnol. 2023 May 3;39(7):179. doi: 10.1007/s11274-023-03631-2.
4. A convenient correspondence between k-mer-based metagenomic distances and phylogenetically-informed β-diversity measures.
   PLoS Comput Biol. 2023 Jan 6;19(1):e1010821. doi: 10.1371/journal.pcbi.1010821. eCollection 2023 Jan.
5. AI-based search for convergently expanding, advantageous mutations in SARS-CoV-2 by focusing on oligonucleotide frequencies.
   PLoS One. 2022 Aug 31;17(8):e0273860. doi: 10.1371/journal.pone.0273860. eCollection 2022.
6. Comparative genomic analysis of the human genome and six bat genomes using unsupervised machine learning: Mb-level CpG and TFBS islands.
   BMC Genomics. 2022 Jul 8;23(1):497. doi: 10.1186/s12864-022-08664-9.
7. Unsupervised explainable AI for molecular evolutionary study of forty thousand SARS-CoV-2 genomes.
   BMC Microbiol. 2022 Mar 10;22(1):73. doi: 10.1186/s12866-022-02484-3.
8. Human cell-dependent, directional, time-dependent changes in the mono- and oligonucleotide compositions of SARS-CoV-2 genomes.
   BMC Microbiol. 2021 Mar 23;21(1):89. doi: 10.1186/s12866-021-02158-6.
9. Comparative genomics of using unsupervised AI reveals a high CG frequency.
   Life Sci Alliance. 2021 Mar 12;4(5). doi: 10.26508/lsa.202000905. Print 2021 May.
10. Viral population analysis of the taiga tick, Ixodes persulcatus, by using Batch Learning Self-Organizing Maps and BLAST search.
    J Vet Med Sci. 2019 Mar 20;81(3):401-410. doi: 10.1292/jvms.18-0483. Epub 2019 Jan 23.