Gingrich Phillip W, Chitsazi Rezvan, Biswas Ansuman, Jiang Chunjie, Zhao Li, Tym Joseph E, Brammer Kevin M, Li Jun, Shu Zhigang, Maxwell David S, Tacy Jeffrey A, Mica Ioan L, Darkoh Michael, di Micco Patrizio, Russell Kaitlyn P, Workman Paul, Al-Lazikani Bissan
Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Enterprise Development and Integration, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Nucleic Acids Res. 2025 Jan 6;53(D1):D1287-D1294. doi: 10.1093/nar/gkae1050.
canSAR (https://cansar.ai) continues to serve as the largest publicly available platform for cancer-focused drug discovery and translational research. It integrates multidisciplinary data from disparate and otherwise siloed public data sources as well as data curated uniquely for canSAR. In addition, canSAR deploys a suite of curation and standardization tools together with AI algorithms to generate new knowledge from these integrated data to inform hypothesis generation. Here we report the latest updates to canSAR. As well as increasing the available data, we have enhanced our algorithms to improve the offering to the user. Notably, these enhancements include a revised ligandability classifier that leverages positive-unlabeled learning and finds twice as many ligandable opportunities across the pocketome, and a revised chemical standardization pipeline and hierarchy that better enable the aggregation of structurally related molecular records.