GCAC: galaxy workflow system for predictive model building for virtual screening.

Affiliation

School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India.

Publication information

BMC Bioinformatics. 2019 Feb 4;19(Suppl 13):550. doi: 10.1186/s12859-018-2492-8.

Abstract

BACKGROUND

Traditional drug discovery approaches are time-consuming, tedious, and expensive. Identifying a potential drug-like molecule with high confidence using high-throughput screening (HTS) remains a challenging task in drug discovery and cheminformatics. Only a small percentage of molecules that pass the clinical trial phases receive FDA approval. The whole process takes 10-12 years and millions of dollars of investment. Inconsistency in HTS is also an obstacle to reproducible results. Reproducibility in computational research is highly desirable as a means of evaluating scientific claims and published findings. This paper describes the development and availability of a knowledge-based predictive model building system that uses the R Statistical Computing Environment, with reproducibility ensured by the Galaxy workflow system.

RESULTS

We describe a web-enabled data mining analysis pipeline that employs reproducible research approaches to confront the issue of tool availability in high-throughput virtual screening. The pipeline, named "Galaxy for Compound Activity Classification (GCAC)", includes descriptor calculation, feature selection, model building, and screening to extract potent candidates. It leverages the combined capabilities of R statistical packages and literate programming tools within a workflow system environment with automated configuration.
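
The four stages named above (descriptor calculation, feature selection, model building, screening) can be illustrated with a minimal sketch. This is not the GCAC implementation, which relies on R packages inside Galaxy; it is a standard-library Python stand-in. The toy descriptors, the variance filter, the nearest-centroid classifier, and every function name and training example are assumptions made purely for illustration.

```python
from statistics import mean, pvariance

def descriptors(smiles):
    """Toy descriptors from a SMILES string (hypothetical stand-ins for real ones)."""
    return [
        float(len(smiles)),                         # string length, a crude size proxy
        float(sum(smiles.count(c) for c in "NO")),  # uppercase N and O characters
        float(sum(ch.isdigit() for ch in smiles)),  # ring-closure digits
        1.0,                                        # constant column, removed by selection
    ]

def select_features(rows, min_variance=1e-9):
    """Feature selection: keep indices of columns whose variance exceeds a threshold."""
    return [i for i, col in enumerate(zip(*rows)) if pvariance(col) > min_variance]

def project(row, keep):
    """Restrict a descriptor vector to the selected columns."""
    return [row[i] for i in keep]

def build_model(rows, labels):
    """Model building: nearest-centroid classifier, one mean vector per class."""
    model = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        model[label] = [mean(col) for col in zip(*members)]
    return model

def predict(model, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def screen(model, keep, library):
    """Screening: return library compounds predicted to be active."""
    return [s for s in library
            if predict(model, project(descriptors(s), keep)) == "active"]

# Hypothetical training set; the activity labels are invented for the example.
train = {
    "CCO": "inactive",
    "CCC": "inactive",
    "c1ccccc1O": "active",
    "c1ccc2ccccc2c1N": "active",
}
rows = [descriptors(s) for s in train]
keep = select_features(rows)
model = build_model([project(r, keep) for r in rows], list(train.values()))
hits = screen(model, keep, ["CCCC", "c1ccccc1N"])
print(keep, hits)
```

A real workflow would replace each stage with a validated component (chemically meaningful descriptors, a cross-validated learner); the sketch only shows how the stages feed into one another.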

CONCLUSION

GCAC can serve as a standard for screening drug candidates through predictive model building in the Galaxy environment, allowing for easy installation and reproducibility. A demo site of the tool is available at http://ccbb.jnu.ac.in/gcac.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2bba/7394323/358217e3766c/12859_2018_2492_Fig1_HTML.jpg
