Muramatsu Tomonari, Tanokura Masaru
Research Center for Food Safety and Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.
Bioinform Adv. 2021 Jul 22;1(1):vbab013. doi: 10.1093/bioadv/vbab013. eCollection 2021.
COVID-19 is a serious infectious disease that emerged recently and continues to spread worldwide. It is spreading too quickly to expect that new specific drugs will be developed in time. As an alternative, drugs already developed for other diseases have been tested for the treatment of COVID-19 (drug repositioning). However, selecting candidate drugs from a large number of compounds requires numerous inhibition assays involving viral infection of cultured cells. For efficiency, it would be useful to narrow the list of candidates by logical considerations before performing these assays. We have developed a tool to predict candidate drugs for the treatment of COVID-19 and other diseases. The tool concatenates events and substances into chains; each event or substance is linked to a KEGG (Kyoto Encyclopedia of Genes and Genomes) code, and the relationships between them are obtained by text mining the literature in the PubMed database. By analyzing 21 589 326 PubMed records with abstracts, 98 556 KEGG codes with NAME/DEFINITION fields were connected. Among them, 9799 KEGG drug codes were connected to COVID-19, of which 7492 had no direct connection to COVID-19 and were linked only through intermediate codes. Although this report focuses on COVID-19, the program developed here can be applied to other infectious diseases and used to quickly identify drug candidates when new infectious diseases appear in the future.
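The chaining idea in the abstract can be illustrated with a minimal sketch. This is not the authors' program, which is available only on request. It assumes that two codes are "connected" when their names co-occur in the same abstract, and that names are matched by plain substring search. The codes, term names, and toy abstracts below are hypothetical placeholders, not real KEGG entries or PubMed records.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(abstracts, term_to_code):
    """Link two codes whenever both of their terms appear in one abstract.

    Assumption: co-occurrence by case-insensitive substring match stands in
    for whatever relationship extraction the real tool performs.
    """
    graph = defaultdict(set)
    for text in abstracts:
        lowered = text.lower()
        codes = {code for term, code in term_to_code.items()
                 if term.lower() in lowered}
        for a, b in combinations(sorted(codes), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_chains(graph, start):
    """Breadth-first search: map each reachable code to its shortest chain."""
    chains = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(graph[node]):
            if neighbor not in chains:
                chains[neighbor] = chains[node] + [neighbor]
                queue.append(neighbor)
    return chains


# Hypothetical toy corpus and dictionary (placeholders, not real data).
abstracts = [
    "COVID-19 entry depends on receptor X.",
    "Drug A inhibits receptor X in cultured cells.",
    "Drug B was tested in patients with COVID-19.",
    "Drug C lowers blood pressure.",
]
term_to_code = {
    "COVID-19": "DISEASE:covid19",
    "receptor X": "GENE:receptor_x",
    "Drug A": "DRUG:a",
    "Drug B": "DRUG:b",
    "Drug C": "DRUG:c",
}

graph = build_cooccurrence_graph(abstracts, term_to_code)
chains = shortest_chains(graph, "DISEASE:covid19")

# Drugs sharing an abstract with the disease are "direct"; drugs reached
# only through intermediate codes are "indirect" repositioning candidates.
direct = {c for c, chain in chains.items()
          if c.startswith("DRUG:") and len(chain) == 2}
indirect = {c: chain for c, chain in chains.items()
            if c.startswith("DRUG:") and len(chain) > 2}
```

In this toy corpus, `DRUG:b` is directly connected, `DRUG:a` is reached through `GENE:receptor_x`, and `DRUG:c` is not connected at all. The indirect set corresponds in spirit to the 7492 drug codes the abstract reports as having no direct connection to COVID-19.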
The programs and data underlying this article will be shared on reasonable request to the corresponding authors.
atmuramatsu@g.ecc.u-tokyo.ac.jp, amtanok@mail.ecc.u-tokyo.ac.jp.
Supplementary data are available at Bioinformatics Advances online.