Kai Jing, Yang Luyao, AbuElela Ayman F, Abdel-Haleem Alyaa M, AlAmoodi Asma S, Bin Nafisah Abdulghani A, Alshaibani Alfadel, Alzahrani Ali S, Lagani Vincenzo, Gomez-Cabrero David, Gao Xin, Merzaban Jasmeen S
Bioscience Program, King Abdullah University of Science and Technology (KAUST), Biological and Environmental Sciences and Engineering (BESE) Division, Thuwal 23955-6900, Saudi Arabia.
Computer Science Program, King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Centre (CBRC), Thuwal 23955-6900, Saudi Arabia.
Cell Rep Methods. 2025 Aug 18;5(8):101140. doi: 10.1016/j.crmeth.2025.101140. Epub 2025 Aug 11.
We identified a gene panel comprising 71 glycosyltransferases (GTs) that alter glycan patterns on cancer cells as they become more virulent. When these cancer-pattern GTs (CPGTs) were run through an algorithm trained on The Cancer Genome Atlas, they differentiated tumors from healthy tissue with 97% accuracy and clustered 27 cancer types with 94% accuracy in external validation, revealing each type's "biometric glycan ID." Using machine learning, we built four models for cancer classification, including two that detect the molecular subtypes of breast cancer and glioma using even smaller CPGT sets. Our results reveal the power of using glyco-genes for diagnostics: in independent testing, our breast cancer classifier was almost twice as effective as the widely used prediction analysis of microarray 50 (PAM50) subtyping kit at differentiating between luminal A, luminal B, HER2-enriched, and basal-like breast cancers, while using a comparable number of genes. Only four GT genes were needed to build a prognostic model for glioma survival.
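As a rough illustration of panel-based classification as described above, the following is a minimal sketch and not the authors' pipeline. It assumes synthetic expression values, a simple nearest-centroid classifier as a stand-in for the unspecified algorithm, and hypothetical names throughout (`make_samples`, `fit`, `predict`). Only the panel size of 71 genes comes from the abstract.

```python
import random

N_GENES = 71  # panel size from the abstract; all data below is synthetic


def make_samples(n, shift, rng):
    """Draw n synthetic expression profiles; `shift` offsets the first 10 genes."""
    return [
        [rng.gauss(shift if g < 10 else 0.0, 1.0) for g in range(N_GENES)]
        for _ in range(n)
    ]


def centroid(rows):
    """Per-gene mean across a list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(X, y):
    """Nearest-centroid 'training': store one mean profile per class label."""
    return {
        label: centroid([x for x, lab in zip(X, y) if lab == label])
        for label in sorted(set(y))
    }


def predict(model, x):
    """Assign the label whose centroid is closest to profile x."""
    return min(model, key=lambda label: sq_dist(model[label], x))


def accuracy(model, X, y):
    """Fraction of profiles whose predicted label matches the true label."""
    return sum(predict(model, x) == lab for x, lab in zip(X, y)) / len(y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical cohorts: "tumor" profiles differ from "normal" in 10 of 71 genes.
train_X = make_samples(100, 0.0, rng) + make_samples(100, 2.0, rng)
train_y = ["normal"] * 100 + ["tumor"] * 100
test_X = make_samples(50, 0.0, rng) + make_samples(50, 2.0, rng)
test_y = ["normal"] * 50 + ["tumor"] * 50

model = fit(train_X, train_y)
held_out_accuracy = accuracy(model, test_X, test_y)
print(f"held-out accuracy on synthetic data: {held_out_accuracy:.2f}")
```

The accuracy printed here reflects only how separable the synthetic classes were made, and says nothing about the reported 97% and 94% figures. Evaluating on samples that were not used for fitting mirrors the external validation and independent testing mentioned in the abstract.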