A machine learning approach identifies cellular senescence on transcriptome data of human cells in vitro.

Authors

Mahmud Shamsed, Zheng Chen, Santiago Fernando E, Zhang Lei, Robbins Paul D, Dong Xiao

Affiliations

Institute on the Biology of Aging and Metabolism, University of Minnesota, Twin Cities, Minneapolis, MN, 55455, USA.

Department of Genetics, Cell Biology and Development, University of Minnesota, Twin Cities, Minneapolis, MN, 55455, USA.

Publication information

Geroscience. 2024 Dec 30. doi: 10.1007/s11357-024-01485-6.

Abstract

Although cellular senescence has been recognized as a hallmark of aging, senescent cells (SnCs) are challenging to detect because they are highly heterogeneous at the molecular level. Machine learning (ML) is likely an ideal approach to this challenge because it can recognize, from high-dimensional data, complex patterns that cannot be characterized by one or a few features. To test this, we evaluated the performance of four ML algorithms, support vector machines (SVM), random forest (RF), decision tree (DT), and Soft Independent Modelling of Class Analogy (SIMCA), in distinguishing SnCs from controls based on bulk RNA sequencing data. The dataset includes 162 in vitro samples, covering three human cell types (fibroblasts, melanocytes, and keratinocytes) and three senescence inducers (irradiation, bleomycin treatment, and replication). Under tenfold and leave-one-out cross-validation, as well as independent dataset validation, all methods provided ~80% or higher accuracy, with SVM reaching over 99%. Similar accuracy was achieved using expert-curated gene lists, e.g., SenMayo and CellAge, instead of our algorithm-prioritized gene list selected by minimum redundancy-maximum relevance (mRMR). However, only a few genes overlapped between the gene sets, suggesting a wide impact of senescence on the transcriptome. Overall, our study provides a proof of concept for identifying senescence using ML.
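The evaluation scheme described in the abstract (several classifiers compared under tenfold and leave-one-out cross-validation) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' pipeline: the expression matrix is synthetic, the linear SVM kernel and all hyperparameters are assumptions, and SIMCA and the mRMR gene-selection step are omitted because scikit-learn does not provide them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a (samples x genes) expression matrix.
# These are NOT the study's data; sizes and effect sizes are arbitrary.
rng = np.random.default_rng(0)
n_per_class, n_genes, n_informative = 30, 200, 20
controls = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
senescent = rng.normal(0.0, 1.0, size=(n_per_class, n_genes))
senescent[:, :n_informative] += 3.0  # shift a subset of "genes" in one class
X = np.vstack([controls, senescent])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = control, 1 = SnC

# Three of the four classifier families named in the abstract.
# Kernel choice and tree counts are assumptions for illustration.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
}

tenfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # Mean accuracy over the ten held-out folds.
    acc_10 = cross_val_score(model, X, y, cv=tenfold, scoring="accuracy").mean()
    # Leave-one-out: each sample is held out once; the mean is overall accuracy.
    acc_loo = cross_val_score(model, X, y, cv=LeaveOneOut(), scoring="accuracy").mean()
    results[name] = (acc_10, acc_loo)
    print(f"{name}: tenfold={acc_10:.3f}, leave-one-out={acc_loo:.3f}")
```

Scaling is wrapped in the same pipeline as the SVM so that it is refit on each training fold, which avoids leaking information from held-out samples. With real data, any gene-selection step would need to be placed inside the cross-validation loop for the same reason.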
