Li Ting, Tong Weida, Roberts Ruth, Liu Zhichao, Thakkar Shraddha
Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, United States.
University of Arkansas at Little Rock and University of Arkansas for Medical Sciences Joint Bioinformatics Program, Little Rock, AR, United States.
Front Artif Intell. 2021 Nov 18;4:757780. doi: 10.3389/frai.2021.757780. eCollection 2021.
Carcinogenicity testing plays an essential role in identifying carcinogens in environmental chemistry and drug development. However, evaluating carcinogenic potency with conventional 2-year rodent studies is time-consuming and labor-intensive. Thus, there is an urgent need for alternative approaches that provide reliable and robust assessments of carcinogenicity. In this study, we propose DeepCarc, a model that predicts the carcinogenicity of small molecules using deep learning-based model-level representations. The DeepCarc model was developed using a data set of 692 compounds and evaluated on a test set of 171 compounds from the National Center for Toxicological Research liver cancer database (NCTRlcdb). The proposed DeepCarc model yielded a Matthews correlation coefficient (MCC) of 0.432 on the test set, outperforming four advanced deep learning (DL)-powered quantitative structure-activity relationship (QSAR) models with an average improvement rate of 37%. Furthermore, the DeepCarc model was used to screen the carcinogenic potential of compounds from both DrugBank and Tox21. Altogether, the proposed DeepCarc model (https://github.com/TingLi2016/DeepCarc) could serve as an early detection tool for carcinogenicity assessment.
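For readers unfamiliar with the reported metric, the following is a minimal sketch of how an MCC is computed from the confusion-matrix counts of a binary classifier. The function name `mcc_from_counts` and the counts are hypothetical and chosen only so that they sum to the 171 test compounds; they are not the results reported for DeepCarc and the code is not taken from the DeepCarc repository.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Illustrative counts only (assumption): 60 + 62 + 25 + 24 = 171 compounds.
# These are NOT the confusion-matrix values from the study.
tp, tn, fp, fn = 60, 62, 25, 24
mcc = mcc_from_counts(tp, tn, fp, fn)
print(f"MCC = {mcc:.3f}")
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when carcinogens and non-carcinogens are unevenly represented in the test set.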