Identification of diagnostic cytosine-phosphate-guanine biomarkers in patients with gestational diabetes mellitus via epigenome-wide association study and machine learning.

Author information

Liu Yan, Wang Zhenglu, Zhao Lin

Affiliations

Department of Obstetrics, Tianjin First Central Hospital, Nankai University, Tianjin, China.

Biobank, Tianjin First Central Hospital, Nankai University, Tianjin, China.

Publication information

Gynecol Endocrinol. 2021 Sep;37(9):857-862. doi: 10.1080/09513590.2021.1937101. Epub 2021 Jul 13.

Abstract

OBJECTIVE

To identify diagnostic markers for gestational diabetes mellitus (GDM) and to establish a predictive model for GDM.

METHODS

We downloaded the DNA methylation data sets GSE70453 and GSE102177 from the Gene Expression Omnibus database. An epigenome-wide association study (EWAS) was performed to analyze the relationship between cytosine-phosphate-guanine (CpG) methylation and GDM. Logistic regression models were then constructed, with the β-values of CpG sites as predictor variables and GDM occurrence as the binary outcome variable. Data from GSE70453 served as the training set and data from GSE102177 served as the validation set.
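The modelling step described above can be illustrated with a minimal sketch. It is not the authors' code: the β-values are simulated, the three predictors stand in for unspecified CpG sites, and the gradient-descent fitting routine, learning rate, and sample sizes are all assumptions chosen only to show the structure (β-values in, binary GDM outcome out, one data set for training and a separate one for validation).

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    """Predicted probability of GDM for each sample."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def simulate(n, rng):
    """Simulated beta-values in [0, 1] for three hypothetical CpG sites.

    Assumption for illustration only: cases (label 1) are shifted upward.
    """
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate controls and cases
        centre = 0.5 + 0.15 * label
        X.append([min(1.0, max(0.0, rng.gauss(centre, 0.1))) for _ in range(3)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = simulate(80, rng)   # stands in for the training set
X_valid, y_valid = simulate(40, rng)   # stands in for the validation set

w, b = fit_logistic(X_train, y_train)
p_valid = predict_proba(X_valid, w, b)
```

In practice an established statistics package would normally replace the hand-written fitting loop. The point of the sketch is only that the model is fitted on one data set and its predicted probabilities are then scored on the other.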

RESULTS

The EWAS and overlap analysis identified nine shared significant CpGs in the two DNA methylation data sets. These nine CpGs were differentially methylated in GDM samples compared with their matched normal specimens, and five fully methylated CpGs among them were ultimately selected. We established a binary logistic regression model based on these five CpGs, in which cg11169102, cg21179618 and cg21620107 were critical. We therefore built a further logistic regression model using these three CpGs, which yielded an area under the curve (AUC) of 0.8209. Validation of the model on the validation set gave an AUC of 0.8519.
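Two of the operations mentioned above, the overlap analysis and the area under the curve, can be sketched as follows. This is an illustration under stated assumptions: the CpG identifiers and p-values are made up, the significance threshold `alpha` is hypothetical (the abstract does not state the criterion used), and the AUC is computed in its rank-based form, as the proportion of case-control pairs in which the case receives the higher score.

```python
def shared_significant(pvals_a, pvals_b, alpha=0.05):
    """Return CpG IDs with p < alpha in both data sets (alpha is an assumption)."""
    sig_a = {cpg for cpg, p in pvals_a.items() if p < alpha}
    sig_b = {cpg for cpg, p in pvals_b.items() if p < alpha}
    return sorted(sig_a & sig_b)

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up p-values keyed by placeholder CpG IDs (not from the study).
pvals_a = {"cg_a": 0.001, "cg_b": 0.20, "cg_c": 0.03, "cg_d": 0.04}
pvals_b = {"cg_a": 0.010, "cg_b": 0.01, "cg_c": 0.04, "cg_d": 0.30}
shared = shared_significant(pvals_a, pvals_b)

# Made-up outcomes (1 = GDM, 0 = control) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
example_auc = auc(labels, scores)
```

Here only `cg_a` and `cg_c` pass the threshold in both dictionaries, and eight of the nine case-control pairs are ranked correctly, so `example_auc` is 8/9.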

CONCLUSIONS

We identified potential CpG biomarkers for the diagnosis of gestational diabetes mellitus by combining EWAS with logistic regression models.
