PIT Bioinformatics Group, Eötvös University, H-1117 Budapest, Hungary.
PIT Bioinformatics Group, Eötvös University, H-1117 Budapest, Hungary; Uratim Ltd., H-1118 Budapest, Hungary.
Genomics. 2019 Jul;111(4):883-885. doi: 10.1016/j.ygeno.2018.05.016. Epub 2018 May 23.
The fast and affordable sequencing of large clinical and environmental metagenomic datasets opens up new horizons in medical and biotechnological applications. It is believed that only about 1% of the microorganisms on Earth have been described to date; metagenomic analysis therefore mostly deals with unknown species in the samples. Microbial communities in extreme environments may contain genes with high biotechnological potential, and disease-related clinical metagenomes may uncover still unknown pathogens and pathological mechanisms in known diseases. While species-level identification and description of the taxa in the samples does not seem possible today, we can search these samples for novel genes with known functions using numerous techniques, including artificial intelligence tools such as hidden Markov models (HMMs). Here we describe MetaHMM, a simple-to-use webserver that automatically builds a homology-based model for the genes to be searched for and finds the closest matches in the metagenome. The webserver relies on well-established building blocks: it performs multiple alignment with Clustal Omega, builds a hidden Markov model with the hmmbuild component of HMMER, and uses hmmsearch to find sequences in the metagenomes that are similar to the specified model. The webserver is publicly available at https://metahmm.pitgroup.org.
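The three steps named above (alignment, model building, search) can be approximated locally as a chain of command-line calls. The following is a minimal sketch, not the webserver's actual implementation: it assumes that the `clustalo`, `hmmbuild` and `hmmsearch` executables are installed and on the PATH, and the function names, file names and the E-value cutoff are hypothetical choices made for illustration only.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(seed_fasta, metagenome_fasta, workdir, evalue=1e-5):
    """Return the command lines for align -> build -> search.

    All file names and the E-value cutoff are illustrative assumptions,
    not settings taken from the MetaHMM webserver.
    """
    workdir = Path(workdir)
    alignment = workdir / "seed.sto"   # multiple alignment, Stockholm format
    profile = workdir / "seed.hmm"     # profile HMM built from the alignment
    hits = workdir / "hits.tbl"        # tabular per-sequence hit list
    return [
        # 1. Align the homologous seed sequences with Clustal Omega.
        ["clustalo", "-i", str(seed_fasta), "-o", str(alignment),
         "--outfmt=st", "--force"],
        # 2. Build a profile HMM from the alignment.
        ["hmmbuild", str(profile), str(alignment)],
        # 3. Search the metagenome sequences with the profile.
        ["hmmsearch", "--tblout", str(hits), "-E", str(evalue),
         str(profile), str(metagenome_fasta)],
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} was not found on the PATH")
        subprocess.run(cmd, check=True)


# Example (requires the tools and input files to exist):
# Path("work").mkdir(exist_ok=True)
# run_pipeline(build_commands("seed.fasta", "metagenome.faa", "work"))
```

Separating command construction from execution keeps the sketch easy to inspect without the tools installed. Note that the profile and the searched sequences must use the same alphabet, so a protein model would be searched against predicted protein sequences from the metagenome.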