Thapa Kisan, Kinali Meric, Pei Shichao, Luna Augustin, Babur Özgün
Computer Science Department, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, MA 02125, USA.
Developmental Therapeutics Branch, Center for Cancer Research, National Cancer Institute, NIH, 9000 Rockville Pike, Bethesda, MD 20892, USA.
Patterns (N Y). 2025 Mar 14;6(3):101203. doi: 10.1016/j.patter.2025.101203.
High-throughput molecular profiling technologies have revolutionized molecular biology research over the past decades. One important use of molecular data is to predict phenotypes and other features of organisms using machine learning algorithms. Deep learning models have become increasingly popular for this task due to their ability to learn complex non-linear patterns. Applying deep learning to molecular profiles, however, is challenging because the data are very high dimensional and sample sizes are relatively small, which causes models to overfit. One solution is to incorporate biological prior knowledge so that the learning algorithm processes functionally related inputs together. This helps regularize the models and improves their generalizability and interpretability. Here, we describe three major strategies that have been proposed for using prior knowledge in deep learning models that make predictions based on molecular profiles. We review the related deep learning architectures, including the major ideas behind the relatively new graph neural networks.
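As an illustration of processing functionally related inputs together, the following minimal sketch masks a layer's weights with pathway membership, so that each hidden node reads only the genes in one pathway. The abstract does not spell out the three strategies, so this is an assumed example of one common approach rather than the authors' method; the gene names, pathway names, and expression values are hypothetical.

```python
import math
import random

# Hypothetical prior knowledge: which genes belong to which pathway.
GENES = ["g1", "g2", "g3", "g4"]
PATHWAYS = {
    "pathway_A": ["g1", "g2"],
    "pathway_B": ["g2", "g3", "g4"],
}


def build_mask(genes, pathways):
    """Return a pathways x genes 0/1 matrix; 1.0 marks a member gene."""
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def masked_forward(x, weights, mask):
    """One sparse layer: each pathway node sees only its member genes."""
    hidden = []
    for w_row, m_row in zip(weights, mask):
        # Multiplying by the mask zeroes out every non-member connection.
        total = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        hidden.append(math.tanh(total))
    return hidden


def count_free_parameters(mask):
    """Number of weights the mask leaves in use."""
    return int(sum(sum(row) for row in mask))


random.seed(0)
mask = build_mask(GENES, PATHWAYS)
weights = [[random.uniform(-1, 1) for _ in GENES] for _ in PATHWAYS]
profile = [0.5, -1.2, 0.3, 2.0]  # made-up expression values for one sample
hidden = masked_forward(profile, weights, mask)
```

In this toy setting the mask keeps 5 of the 8 weights a dense layer would use. With thousands of genes and small pathways the reduction is far larger, which is the regularizing effect the abstract refers to, and each hidden node can be read as a pathway-level activation.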