BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches.

Publication information

Brief Bioinform. 2019 Jul 19;20(4):1280-1294. doi: 10.1093/bib/bbx165.

Abstract

With the avalanche of biological sequences generated in the post-genomic age, one of the most challenging problems is how to computationally analyze their structures and functions. Machine learning techniques play key roles in this field. Typically, building a predictor based on machine learning techniques involves three main steps: feature extraction, predictor construction and performance evaluation. Although several Web servers and stand-alone tools have been developed to facilitate biological sequence analysis, each focuses on only an individual step. To address this, this study proposes a powerful Web server called BioSeq-Analysis (http://bioinformatics.hitsz.edu.cn/BioSeq-Analysis/) that automatically completes the three main steps for constructing a predictor. The user only needs to upload the benchmark data set; BioSeq-Analysis then generates the optimized predictor based on that data set and reports the performance measures. Furthermore, to maximize users' convenience, a stand-alone program was also released, which can be downloaded from http://bioinformatics.hitsz.edu.cn/BioSeq-Analysis/download/ and run directly on Windows, Linux and UNIX. When applied to three sequence analysis tasks, the predictors generated by BioSeq-Analysis outperformed some state-of-the-art methods in the reported experiments. It is anticipated that BioSeq-Analysis will become a useful tool for biological sequence analysis.
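The three steps named in the abstract (feature extraction, predictor construction, performance evaluation) can be illustrated with a minimal, self-contained sketch. This is not BioSeq-Analysis code and does not use its API: the k-mer frequency features, the nearest-centroid classifier, the leave-one-out evaluation, all function names, and the toy sequences are assumptions chosen only to show how the steps chain together from a labeled benchmark data set.

```python
from itertools import product
from math import dist


def kmer_features(seq, k=2, alphabet="ACGT"):
    """Step 1, feature extraction: normalized k-mer frequency vector."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words with characters outside the alphabet
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero for short input
    return [counts[w] / total for w in kmers]


def train_centroids(vectors, labels):
    """Step 2, predictor construction: one mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        members = [v for v, y in zip(vectors, labels) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, vector):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


def leave_one_out_accuracy(sequences, labels, k=2):
    """Step 3, performance evaluation: leave-one-out cross-validation."""
    vectors = [kmer_features(s, k) for s in sequences]
    correct = 0
    for i in range(len(vectors)):
        train_x = vectors[:i] + vectors[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train_centroids(train_x, train_y)
        correct += predict(model, vectors[i]) == labels[i]
    return correct / len(vectors)


# Toy benchmark data set, invented purely for illustration.
sequences = [
    "ATATATATAT", "TATATTATAA", "AATATATTTA",
    "GCGCGCGGCC", "CCGCGGCGCG", "GGCCGCGCGC",
]
labels = ["AT-rich"] * 3 + ["GC-rich"] * 3

accuracy = leave_one_out_accuracy(sequences, labels)
print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

A platform of the kind described would additionally search over feature types and classifier parameters to produce an optimized predictor; the sketch fixes a single choice for each step to keep the flow visible.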

