Hidaka Tadashi, Imamura Keiko, Hioki Takeshi, Takagi Terufumi, Giga Yoshikazu, Giga Mi-Ho, Nishimura Yoshiteru, Kawahara Yoshinobu, Hayashi Satoru, Niki Takeshi, Fushimi Makoto, Inoue Haruhisa
Research, Takeda Pharmaceutical Company Limited, Fujisawa, Japan.
Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto, Japan.
Patterns (N Y). 2020 Nov 11;1(9):100140. doi: 10.1016/j.patter.2020.100140. eCollection 2020 Dec 11.
Machine learning is expected to alleviate the low throughput and high assay cost of cell-based phenotypic screening. However, applying machine learning to sufficiently complex phenotypic screens remains a challenge because of imbalanced datasets, the need for non-linear prediction, and the difficulty of predicting new chemotypes. Here, we developed a prediction model based on the heat-diffusion equation (PM-HDE) to address these issues. The algorithm was verified as feasible for virtual compound screening using bioassay data from 946 assay systems registered with PubChem. PM-HDE was then applied to an actual screening campaign. Based on supervised learning on data for about 50,000 compounds from a biological phenotypic screen with motor neurons derived from ALS-patient induced pluripotent stem cells, virtual screening of >1.6 million compounds was carried out. We confirmed that PM-HDE enriched the hit compounds and identified new chemotypes. This prediction model could overcome the inflexibility of machine learning, and our approach could provide a novel platform for drug discovery.
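The abstract does not spell out how PM-HDE is formulated, so the following is only a generic illustration of heat-diffusion-based scoring, not the authors' model. It assumes compounds are nodes of a similarity graph, treats known actives as initial heat, and solves the graph heat equation du/dt = -Lu with the graph Laplacian L. The function name `heat_diffusion_scores`, the toy similarity matrix, and the diffusion time `t` are all hypothetical choices made for this sketch.

```python
import numpy as np

def heat_diffusion_scores(similarity, labels, t=1.0):
    """Diffuse known activity labels over a compound similarity graph.

    similarity: symmetric (n, n) array of non-negative pairwise similarities.
    labels: length-n array, 1.0 for known actives and 0.0 otherwise.
    t: diffusion time (assumed hyperparameter); larger values spread
       activity further from the labeled compounds.
    """
    w = np.asarray(similarity, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # ignore self-similarity
    laplacian = np.diag(w.sum(axis=1)) - w
    # Solution of du/dt = -L u is u(t) = exp(-t L) u(0); the matrix
    # exponential is computed from the eigendecomposition of symmetric L.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    heat_kernel = eigvecs @ np.diag(np.exp(-t * eigvals)) @ eigvecs.T
    return heat_kernel @ np.asarray(labels, dtype=float)

# Toy data (made up): compounds 0 and 1 are similar, 2 and 3 are similar,
# and the two pairs are only weakly linked.
similarity = np.array([
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.0],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
])
labels = np.array([1.0, 0.0, 0.0, 0.0])  # only compound 0 is a known active

scores = heat_diffusion_scores(similarity, labels, t=0.5)
ranking = np.argsort(-scores)  # compounds ordered by diffused activity
```

In this toy setting, untested compounds that sit close to a known active in the similarity graph receive more diffused heat than distant ones, which is the kind of ranking a virtual screen would use to prioritize compounds for testing.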