REDIAL-2020: A suite of machine learning models to estimate Anti-SARS-CoV-2 activities.

Author information

Govinda K C, Bocci Giovanni, Verma Srijan, Hassan Mahmudulla, Holmes Jayme, Yang Jeremy J, Sirimulla Suman, Oprea Tudor I

Affiliations

Computational Science Program, The University of Texas at El Paso, Texas 79968, USA.

Department of Pharmaceutical Sciences, School of Pharmacy, The University of Texas at El Paso, Texas 79902, USA.

Publication information

ChemRxiv. 2020 Sep 16:12915779. doi: 10.26434/chemrxiv.12915779.v2.

Abstract

Strategies for drug discovery and repositioning are urgently needed for COVID-19. We developed "REDIAL-2020", a suite of machine learning models that estimate small-molecule activity from molecular structure across a range of SARS-CoV-2-related assays. Each classifier is built on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore), with models developed in parallel for each type. The models were trained on high-throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html) using multiple classification algorithms. The "best models" are combined in an ensemble consensus predictor that outperforms single models where external validation is available. The suite is available through the DrugCentral web portal (http://drugcentral.org/Redial). Accepted inputs are a drug name, a PubChem CID, or a SMILES string; the output is an estimate of anti-SARS-CoV-2 activities. The web application reports estimated activity across three areas spanning six independent models, followed by a similarity search that displays the molecules most similar to the query among the experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available on GitHub (https://github.com/sirimullalab/redial-2020) or, for users who prefer a containerized version, on Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020).
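
To make the consensus and similarity-search steps concrete, here is a minimal Python sketch. It is not the REDIAL-2020 implementation: the abstract does not state how the ensemble combines its members or which similarity metric the web application uses, so the majority vote and the Tanimoto coefficient below are assumptions chosen for illustration. The function names and the toy fingerprints (sets of "on" bit indices) are likewise hypothetical.

```python
def consensus_vote(predictions):
    """Majority vote over per-descriptor labels (1 = active, 0 = inactive).

    `predictions` maps a descriptor type to the label from the model
    trained on it. Assumption: simple majority; ties count as inactive.
    """
    votes = list(predictions.values())
    if not votes:
        raise ValueError("need at least one prediction")
    return 1 if sum(votes) * 2 > len(votes) else 0


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as bit-index sets."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def most_similar(query_bits, library, top_n=3):
    """Return the top_n (name, similarity) pairs, most similar first.

    `library` maps a compound name to its fingerprint bit-index set.
    """
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical labels from the three descriptor-specific models.
    labels = {"fingerprint": 1, "physicochemical": 1, "pharmacophore": 0}
    print("consensus:", consensus_vote(labels))

    # Toy fingerprints, not real molecules.
    library = {"cmpd_a": {1, 2, 3}, "cmpd_b": {2, 3, 4}, "cmpd_c": {9}}
    print("neighbours:", most_similar({1, 2, 3}, library, top_n=2))
```

With three voters a strict majority always exists, so the tie rule only matters if a descriptor model is missing. In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.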


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56e5/7668752/67c47b025eb5/nihpp-12915779-f0001.jpg
