Topological data analysis identifies molecular phenotypes of idiopathic pulmonary fibrosis.

Affiliations

Biological Sciences, University of Southampton, Southampton, Hampshire, UK.

Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, UK.

Publication information

Thorax. 2023 Jul;78(7):682-689. doi: 10.1136/thorax-2022-219731. Epub 2023 Feb 20.

Abstract

BACKGROUND

Idiopathic pulmonary fibrosis (IPF) is a debilitating, progressive disease with a median survival time of 3-5 years. Diagnosis remains challenging and disease progression varies greatly, suggesting the possibility of distinct subphenotypes.

METHODS AND RESULTS

We analysed publicly available peripheral blood mononuclear cell expression datasets for 219 IPF, 411 asthma, 362 tuberculosis, 151 healthy, 92 HIV and 83 other disease samples, totalling 1318 patients. We integrated the datasets and split them into train (n=871) and test (n=477) cohorts to investigate the utility of a machine learning model (support vector machine) for predicting IPF. A panel of 44 genes predicted IPF in a background of healthy, tuberculosis, HIV and asthma with an area under the curve of 0.9464, corresponding to a sensitivity of 0.865 and a specificity of 0.89. We then applied topological data analysis to investigate the possibility of subphenotypes within IPF. We identified five molecular subphenotypes of IPF, one of which corresponded to a phenotype enriched for death/transplant. The subphenotypes were molecularly characterised using bioinformatic and pathway analysis tools identifying distinct subphenotype features including one which suggests an extrapulmonary or systemic fibrotic disease.
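As a reading aid for the performance figures above, the following minimal sketch shows how sensitivity, specificity and area under the ROC curve are computed from a classifier's decision scores. It is not the authors' pipeline: the labels, scores and the 0.5 threshold are made-up toy values, the function names are hypothetical, and the scores merely stand in for the output of a model such as a support vector machine.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = IPF and 0 = non-IPF."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 4 IPF and 6 non-IPF samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.35, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 turns scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds, which is why a single AUC is reported alongside one sensitivity/specificity pair.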

CONCLUSIONS

Integrating multiple datasets from the same tissue enabled the development of a model that accurately predicts IPF using a panel of 44 genes. Furthermore, topological data analysis identified distinct subphenotypes of patients with IPF, which were defined by differences in molecular pathobiology and clinical characteristics.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bba/10314053/95358139ceb8/thorax-2022-219731f01.jpg
