Novel classification for global gene signature model for predicting severity of systemic sclerosis.

Author affiliations

Department of Health Promotions and Development, University of Pittsburgh School of Nursing, Pittsburgh, PA, United States of America.

Department of Biological & Environmental Sciences, Troy University, Troy, AL, United States of America.

Publication information

PLoS One. 2018 Jun 20;13(6):e0199314. doi: 10.1371/journal.pone.0199314. eCollection 2018.

Abstract

Progression of systemic scleroderma (SSc), a chronic connective tissue disease that causes a fibrotic phenotype, is highly heterogeneous among patients and difficult to diagnose accurately. To meet this clinical need, we developed a novel three-layer classification model, which analyses gene expression profiles from SSc skin biopsies to diagnose SSc severity. Two SSc skin biopsy microarray datasets were obtained from Gene Expression Omnibus. The skin scores reported in the original papers were used to categorize the data into subgroups of low (<18) and high (≥18) severity. Data were pre-processed with normalization, background correction, centering and scaling. A two-layered cross-validation scheme was employed to objectively evaluate the performance of the classification models on unobserved data. Three classification models were used: support vector machine, random forest, and naive Bayes, each in combination with feature selection methods to improve accuracy. For both input datasets, the random forest classifier combined with the correlation-based feature selection (CFS) method, and naive Bayes combined with either CFS or the support vector machine based recursive feature elimination method, yielded the best results. Additionally, we performed a principal component analysis to show that low and high severity groups are readily separable by gene expression signatures. Ultimately, we found that our novel classification prediction model produced global gene signatures that significantly correlated with skin scores. This study represents the first report comparing the performance of various classification prediction models for gene signatures from SSc patients, using current clinical diagnostic factors. In summary, our three-classification model system is a powerful tool for elucidating gene signatures from SSc skin biopsies and can also be used to develop a prognostic gene signature for SSc and other fibrotic disorders.
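To make the severity grouping and the two-layered cross-validation idea concrete, here is a minimal Python sketch. It is not the authors' code. The cutoff of 18 comes from the abstract; the function names (`label_severity`, `k_fold`, `nested_cv`), the fold counts, and the plain (unstratified) shuffling are all assumptions made for illustration, since the abstract does not specify them. The point it shows is structural: feature selection and model tuning would see only the inner splits, which are drawn from the outer training set, so the outer test samples stay unobserved until the final performance estimate.

```python
import random

SEVERITY_CUTOFF = 18  # skin-score threshold stated in the abstract


def label_severity(skin_score, cutoff=SEVERITY_CUTOFF):
    """Return 'high' for scores >= cutoff, otherwise 'low'."""
    return "high" if skin_score >= cutoff else "low"


def k_fold(indices, k, rng):
    """Shuffle a copy of `indices` and yield k (train, test) index pairs."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.

    Hypothetical helper: fold counts are assumptions, not values from the study.
    Feature selection and tuning should use only `inner_splits`, which are
    built from `outer_train`; `outer_test` is held back for evaluation.
    """
    rng = random.Random(seed)
    for outer_train, outer_test in k_fold(range(n_samples), outer_k, rng):
        inner_splits = list(k_fold(outer_train, inner_k, rng))
        yield outer_train, outer_test, inner_splits


if __name__ == "__main__":
    scores = [4, 12, 17, 18, 25, 31]  # made-up skin scores, not study data
    print([label_severity(s) for s in scores])
    for outer_train, outer_test, inner_splits in nested_cv(20):
        print(len(outer_train), len(outer_test), len(inner_splits))
```

In a real analysis the index lists would select rows of the expression matrix, and the splits would likely be stratified by severity group so that each fold contains both classes.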

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bba/6010260/6d3a6011a851/pone.0199314.g001.jpg
