Wang Xinyu, Liu Tianyi, Sheng Yueyang, Zhang Yanzhuo, Qiu Cheng, Li Manyu, Cheng Yuxi, Li Shan, Wang Ying, Wu Chengai
Department of Molecular Orthopaedics, National Center for Orthopaedics, Beijing Research Institute of Traumatology and Orthopaedics, Beijing Jishuitan Hospital, Capital Medical University, Beijing, 100035, China.
Department of Anesthesiology, National Center for Orthopaedics, Beijing Jishuitan Hospital, Capital Medical University, Beijing, 100035, China.
Heliyon. 2024 Jul 23;10(15):e35121. doi: 10.1016/j.heliyon.2024.e35121. eCollection 2024 Aug 15.
Osteoarthritis (OA) is a common chronic joint disease. This study aimed to investigate possible OA diagnostic biomarkers and to verify their significance in clinical samples.
We used three datasets from the Gene Expression Omnibus (GEO) database as the training set. We first identified differentially expressed genes and then screened candidate diagnostic biomarkers with three machine learning algorithms (Random Forest, Least Absolute Shrinkage and Selection Operator logistic regression, and Support Vector Machine-Recursive Feature Elimination). Another GEO dataset served as the validation set. The test set consisted of RNA-sequenced peripheral blood samples collected from patients and healthy donors. Blood samples and chondrocytes were also collected for quantitative real-time PCR to confirm expression levels. Receiver operating characteristic curves were generated for individual and combined biomarkers.
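As an illustration of the receiver operating characteristic step, the following is a minimal sketch of how an AUC can be computed for a single marker and for a combined score. It is not the study's pipeline: the expression values are invented toy numbers, the names `marker_a` and `marker_b` are hypothetical, and the equal-weight sum used as the combined score is an assumption made only to keep the example short (the abstract does not state how the combined model was fitted).

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; ties count as half a pair."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """(false positive rate, true positive rate) pairs obtained by
    sweeping the decision threshold from the highest score downwards."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Toy data (hypothetical): 1 = OA patient, 0 = healthy donor.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
marker_a = [2.1, 1.8, 0.9, 1.5, 1.0, 0.7, 1.6, 0.4]
marker_b = [0.8, 1.9, 1.7, 1.2, 1.1, 0.5, 0.6, 0.9]

# Assumed combination rule: an unweighted sum of the two markers.
combined = [a + b for a, b in zip(marker_a, marker_b)]

auc_a = roc_auc(labels, marker_a)
auc_b = roc_auc(labels, marker_b)
auc_combined = roc_auc(labels, combined)
curve_a = roc_points(labels, marker_a)
```

In this toy data the combined score separates the two groups better than either marker alone, which is the kind of comparison that per-marker and combined ROC curves are meant to expose. In practice the combined score would more likely come from a fitted model such as logistic regression and be evaluated on held-out samples.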
In total, 251 differentially expressed genes (DEGs) were identified, of which , and were selected by all three algorithms. The areas under the curve (AUCs) of the biomarkers were lower in our test set than in the public datasets. exhibited the highest AUC of 0.947 in the training set but only 0.691 in our test set, while the favorable combined model comprising , , and achieved an AUC of 0.986 in the training set, 1.000 in the validation set and 0.836 in our test set.
We identified a combined model for early diagnosis of OA that includes , , and . This finding offers new avenues for further exploration of mechanisms underlying OA.