Korn Daniel, Pervitsky Vera, Bobrowski Tesia, Alves Vinicius M, Schmitt Charles, Bizon Chris, Baker Nancy, Chirkova Rada, Cherkasov Artem, Muratov Eugene, Tropsha Alexander
Department of Computer Science, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, the University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
ChemRxiv. 2020 Nov 26:13289222. doi: 10.26434/chemrxiv.13289222.v1.
The COVID-19 pandemic has catalyzed a widespread effort to identify drug candidates and biological targets of relevance to SARS-CoV-2 infection, which has resulted in a large number of publications on this subject. We have built the COVID-19 Knowledge Extractor (COKE), a web application to extract, curate, and annotate essential drug-target relationships from the research literature on COVID-19 to assist drug repurposing efforts. SciBiteAI ontological tagging of the COVID Open Research Dataset (CORD-19), a repository of COVID-19 scientific publications, was employed to identify drug-target relationships. Entity identifiers were resolved through lookup routines using UniProt and DrugBank. A custom algorithm was used to identify co-occurrences of protein and drug terms, and confidence scores were calculated for each entity pair. COKE processing of the current CORD-19 database identified about 3,000 drug-protein pairs, including 29 unique proteins and 500 investigational, experimental, and approved drugs. Some of these drugs are presently undergoing clinical trials for COVID-19. The rapidly evolving COVID-19 pandemic has led to a dramatic growth of publications on this subject in a short period. These circumstances call for methods that can condense the literature into the key concepts and relationships necessary for insights into SARS-CoV-2 drug repurposing. The COKE repository and web application deliver key drug-target protein relationships to researchers studying SARS-CoV-2. The COKE portal may provide comprehensive and critical information on studies concerning drug repurposing against COVID-19. COKE is freely available at https://coke.mml.unc.edu/ and the code is available at https://github.com/DnlRKorn/CoKE.
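The abstract does not specify how co-occurrences are counted or how confidence scores are computed. As a rough illustration only, the Python sketch below counts drug-protein co-mentions at the sentence level and scores each pair with a Jaccard-style ratio. The function name `cooccurrence_scores`, the input layout, the entity labels, and the scoring formula are all assumptions made for this example; they are not taken from the COKE code base.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(tagged_sentences):
    """Score drug-protein pairs by sentence-level co-occurrence.

    tagged_sentences: iterable of (drugs, proteins) pairs, where each element
    is a collection of entity identifiers tagged in one sentence.
    Returns a dict mapping (drug, protein) to a score in (0, 1].

    NOTE: the Jaccard-style score is an assumption for illustration;
    the source does not state which confidence measure COKE uses.
    """
    pair_counts = Counter()
    drug_counts = Counter()
    protein_counts = Counter()

    for drugs, proteins in tagged_sentences:
        drugs, proteins = set(drugs), set(proteins)
        drug_counts.update(drugs)
        protein_counts.update(proteins)
        # Every drug mentioned alongside every protein counts as one co-mention.
        pair_counts.update(product(drugs, proteins))

    scores = {}
    for (drug, protein), together in pair_counts.items():
        # Sentences mentioning either entity (inclusion-exclusion).
        either = drug_counts[drug] + protein_counts[protein] - together
        scores[(drug, protein)] = together / either
    return scores


# Hypothetical tagged sentences; identifiers are placeholders, not real IDs.
example = [
    ({"DRUG_A"}, {"PROT_X"}),
    ({"DRUG_A"}, set()),
    ({"DRUG_A", "DRUG_B"}, {"PROT_X"}),
]
scores = cooccurrence_scores(example)
```

A ratio of this kind rewards pairs that are mentioned together often relative to how often either entity appears at all, so a frequently named drug does not automatically rank high against every protein.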