A novel bioinformatics method for efficient knowledge discovery by BLSOM from big genomic sequence data.

Authors

Bai Yu, Iwasaki Yuki, Kanaya Shigehiko, Zhao Yue, Ikemura Toshimichi

Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma-shi, Nara 630-0192, Japan.

Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Shiga-ken 526-0829, Japan.

Publication information

Biomed Res Int. 2014;2014:765648. doi: 10.1155/2014/765648. Epub 2014 Apr 3.

DOI: 10.1155/2014/765648
PMID: 24804244
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3996302/

Abstract

With the remarkable increase in genomic sequence data from a wide range of species, novel tools are needed for comprehensive analyses of big sequence data. The Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data, such as oligonucleotide composition, on a single map. By modifying the conventional SOM, we previously developed the Batch-Learning SOM (BLSOM), which classifies sequence fragments by species based solely on oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM as used to characterize vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes, and then the compositions in the human and mouse genomes, to investigate an efficient method for detecting differences between closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, called a "genome signature," as well as specific regions enriched in transcription-factor-binding sequences. Because its classification and visualization power is very high, BLSOM is an efficient and powerful tool for extracting a wide range of information from massive amounts of genomic sequence (i.e., big sequence data).
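
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes oligonucleotide (k-mer) composition vectors for fixed-size windows and feeds them to a generic batch-update SOM. The function names (`kmer_composition`, `windows`, `batch_som`, `best_units`), the random initialization, the Gaussian neighborhood, and the linear shrinking schedule are all assumptions of this sketch; the published BLSOM may differ in initialization, normalization, and parameter choices. The abstract's setting corresponds to `k=5` with 100 kb windows.

```python
from itertools import product

import numpy as np


def kmer_composition(seq, k=5):
    """Return the normalized k-mer frequency vector (length 4**k) of one window."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # None when the k-mer contains N or other symbols
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def windows(seq, size):
    """Yield non-overlapping windows of exactly `size` bases (a short tail is dropped)."""
    for start in range(0, len(seq) - size + 1, size):
        yield seq[start:start + size]


def best_units(data, weights):
    """Index of the closest map unit (squared Euclidean distance) for each row of data."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def batch_som(data, rows, cols, epochs=20, seed=0):
    """Generic batch SOM: every unit is recomputed once per epoch from all samples."""
    rng = np.random.default_rng(seed)
    # Assumption of this sketch: initialize units from randomly chosen data rows.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Neighborhood radius shrinks linearly, with a floor of 0.5 grid units.
        sigma = max(sigma0 * (1 - epoch / epochs), 0.5)
        bmu = best_units(data, weights)
        # h[u, n]: influence of sample n on unit u via the sample's best-matching unit.
        h = np.exp(-grid_d2[:, bmu] / (2 * sigma ** 2))
        denom = h.sum(axis=1, keepdims=True)
        updated = (h @ data) / np.maximum(denom, 1e-12)
        # Units that receive essentially no influence keep their previous weights.
        weights = np.where(denom > 1e-12, updated, weights)
    return weights


if __name__ == "__main__":
    # Toy input: a random sequence stands in for real genomic data.
    rng = np.random.default_rng(1)
    genome = "".join(rng.choice(list("ACGT"), size=2000))
    vectors = np.array([kmer_composition(w, k=3) for w in windows(genome, 500)])
    som = batch_som(vectors, rows=2, cols=2)
    print(best_units(vectors, som))
```

Because the batch update processes all samples before changing any unit, the result does not depend on the order in which sequence windows are presented, which is the general property that distinguishes batch learning from the conventional sample-by-sample SOM update. For real genomes, windows from different species would be labeled after training and the map inspected for species-specific territories.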

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0921/3996302/8d1200e73a9a/BMRI2014-765648.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0921/3996302/7284845131c2/BMRI2014-765648.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0921/3996302/e3d72e7a06b6/BMRI2014-765648.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0921/3996302/64cc7942b6fc/BMRI2014-765648.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0921/3996302/7f87476ba438/BMRI2014-765648.004.jpg
