Self-Organizing Map (SOM) unveils and visualizes hidden sequence characteristics of a wide range of eukaryote genomes.

Authors

Abe Takashi, Sugawara Hideaki, Kanaya Shigehiko, Kinouchi Makoto, Ikemura Toshimichi

Affiliation

Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, and The Graduate University for Advanced Studies (Sokendai), Mishima, Shizuoka 411-8540, Japan.

Publication information

Gene. 2006 Jan 3;365:27-34. doi: 10.1016/j.gene.2005.09.040. Epub 2005 Dec 20.

DOI:10.1016/j.gene.2005.09.040
PMID:16364569
Abstract

Novel tools are needed for comprehensive comparisons of interspecies characteristics of massive amounts of genomic sequences currently available. An unsupervised neural network algorithm, Self-Organizing Map (SOM), is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We modified the conventional SOM, on the basis of batch-learning SOM, for genome informatics making the learning process and resulting map independent of the order of data input. We generated the SOMs for tri- and tetranucleotide frequencies in 10- and 100-kb sequence fragments from 38 eukaryotes for which almost complete genome sequences are available. SOM recognized species-specific characteristics (key combinations of oligonucleotide frequencies) in the genomic sequences, permitting species-specific classification of the sequences without any information regarding the species. We also generated the SOM for tetranucleotide frequencies in 1-kb sequence fragments from the human genome and found sequences for four functional categories (5' and 3' UTRs, CDSs and introns) were classified primarily according to the categories. Because the classification and visualization power is very high, SOM is an efficient and powerful tool for extracting a wide range of genome information.
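The abstract describes two steps without implementation detail: turning fixed-length sequence fragments into oligonucleotide frequency vectors, and training a batch-learning SOM whose result does not depend on the order of data input. The following is a minimal sketch of that idea, not the authors' implementation. The function names (`fragments`, `kmer_frequencies`, `batch_som`), the linear weight initialisation, the Gaussian neighbourhood, and the shrinking-radius schedule are all illustrative assumptions.

```python
import math
from itertools import product

ALPHABET = "ACGT"

def fragments(seq, size):
    """Split seq into non-overlapping fragments of exactly `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]

def kmer_frequencies(seq, k=4):
    """Relative frequency of each of the 4**k k-mers, in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other ambiguity codes
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]

def batch_som(data, rows, cols, epochs=10):
    """Train a small batch SOM; returns one weight vector per grid node.

    Assumed simplifications: weights start on a line between the
    per-dimension minimum and maximum of the data, and the Gaussian
    neighbourhood radius shrinks linearly down to 0.5.
    """
    dim = len(data[0])
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    lo = [min(x[d] for x in data) for d in range(dim)]
    hi = [max(x[d] for x in data) for d in range(dim)]
    last = max(len(nodes) - 1, 1)
    weights = [[lo[d] + (hi[d] - lo[d]) * i / last for d in range(dim)]
               for i in range(len(nodes))]
    for epoch in range(epochs):
        sigma = max(0.5, max(rows, cols) / 2 * (1 - epoch / epochs))
        # Step 1: assign every vector to its best-matching node using
        # the weights as they stood at the start of the epoch.
        bmus = [min(range(len(nodes)),
                    key=lambda i: sum((w - v) ** 2
                                      for w, v in zip(weights[i], x)))
                for x in data]
        # Step 2: update all nodes at once from the complete assignment.
        new_weights = []
        for i, (r, c) in enumerate(nodes):
            h = [math.exp(-((r - nodes[b][0]) ** 2 + (c - nodes[b][1]) ** 2)
                          / (2 * sigma ** 2)) for b in bmus]
            denom = math.fsum(h)
            if denom > 0:
                new_weights.append(
                    [math.fsum(hj * x[d] for hj, x in zip(h, data)) / denom
                     for d in range(dim)])
            else:  # neighbourhood underflowed; keep the previous weights
                new_weights.append(weights[i])
        weights = new_weights
    return weights

# Toy usage: two artificial "genomes" with different composition.
toy = fragments("ACGT" * 50, 40) + fragments("AATTGGCC" * 25, 40)
vectors = [kmer_frequencies(f, k=3) for f in toy]
som = batch_som(vectors, rows=2, cols=2, epochs=5)
```

Because each epoch assigns all vectors first and only then recomputes every node as a neighbourhood-weighted mean, shuffling the input changes only the order of the terms in each sum. Using `math.fsum` for those sums keeps the result stable under reordering. A conventional online SOM, by contrast, updates the weights after every single input, so its final map depends on presentation order.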

Similar articles

1
Self-Organizing Map (SOM) unveils and visualizes hidden sequence characteristics of a wide range of eukaryote genomes.
Gene. 2006 Jan 3;365:27-34. doi: 10.1016/j.gene.2005.09.040. Epub 2005 Dec 20.
2
A novel bioinformatic strategy for unveiling hidden genome signatures of eukaryotes: self-organizing map of oligonucleotide frequency.
Genome Inform. 2002;13:12-20.
3
Informatics for unveiling hidden genome signatures.
Genome Res. 2003 Apr;13(4):693-702. doi: 10.1101/gr.634603.
4
Novel phylogenetic studies of genomic sequence fragments derived from uncultured microbe mixtures in environmental and clinical samples.
DNA Res. 2005;12(5):281-90. doi: 10.1093/dnares/dsi015. Epub 2006 Jan 10.
5
A novel bioinformatics method for efficient knowledge discovery by BLSOM from big genomic sequence data.
Biomed Res Int. 2014;2014:765648. doi: 10.1155/2014/765648. Epub 2014 Apr 3.
6
Understanding and reducing variability of SOM neighbourhood structure.
Neural Netw. 2006 Jul-Aug;19(6-7):838-46. doi: 10.1016/j.neunet.2006.05.017. Epub 2006 Jul 7.
7
Self-organizing maps with asymmetric neighborhood function.
Neural Comput. 2007 Sep;19(9):2515-35. doi: 10.1162/neco.2007.19.9.2515.
8
MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa.
BMC Bioinformatics. 2006 Jan 24;7:36. doi: 10.1186/1471-2105-7-36.
9
Self-organizing neural networks to support the discovery of DNA-binding motifs.
Neural Netw. 2006 Jul-Aug;19(6-7):950-62. doi: 10.1016/j.neunet.2006.05.023. Epub 2006 Jul 12.
10
Large-scale genome clustering across life based on a linguistic approach.
Biosystems. 2005 Sep;81(3):208-22. doi: 10.1016/j.biosystems.2005.04.003.

Cited by

1
Unsupervised AI reveals insect species-specific genome signatures.
PeerJ. 2024 Mar 6;12:e17025. doi: 10.7717/peerj.17025. eCollection 2024.
2
A Deep Clustering-based Novel Approach for Binning of Metagenomics Data.
Curr Genomics. 2022 Nov 18;23(5):353-368. doi: 10.2174/1389202923666220928150100.
3
AI-based search for convergently expanding, advantageous mutations in SARS-CoV-2 by focusing on oligonucleotide frequencies.
PLoS One. 2022 Aug 31;17(8):e0273860. doi: 10.1371/journal.pone.0273860. eCollection 2022.
4
Comparative genomic analysis of the human genome and six bat genomes using unsupervised machine learning: Mb-level CpG and TFBS islands.
BMC Genomics. 2022 Jul 8;23(1):497. doi: 10.1186/s12864-022-08664-9.
5
Unsupervised explainable AI for molecular evolutionary study of forty thousand SARS-CoV-2 genomes.
BMC Microbiol. 2022 Mar 10;22(1):73. doi: 10.1186/s12866-022-02484-3.
6
Comparative genomics of using unsupervised AI reveals a high CG frequency.
Life Sci Alliance. 2021 Mar 12;4(5). doi: 10.26508/lsa.202000905. Print 2021 May.
7
A Novel Bioinformatics Strategy to Analyze Microbial Big Sequence Data for Efficient Knowledge Discovery: Batch-Learning Self-Organizing Map (BLSOM).
Microorganisms. 2013 Nov 20;1(1):137-157. doi: 10.3390/microorganisms1010137.
8
Evolutionary changes in vertebrate genome signatures with special focus on coelacanth.
DNA Res. 2014 Oct;21(5):459-67. doi: 10.1093/dnares/dsu012. Epub 2014 May 6.
9
Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.
Biomed Res Int. 2014;2014:985706. doi: 10.1155/2014/985706. Epub 2014 Mar 11.
10
Alignment-free visualization of metagenomic data by nonlinear dimension reduction.
Sci Rep. 2014 Mar 31;4:4516. doi: 10.1038/srep04516.