Information and Computer Engineering College, Northeast Forestry University, Harbin 150040, China.
Department of Gynecology and Obstetrics, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
Comput Math Methods Med. 2021 Feb 23;2021:6691096. doi: 10.1155/2021/6691096. eCollection 2021.
Preeclampsia (PE) is a maternal disease that causes maternal and child death. Current treatment and preventive measures remain inadequate, and PE screening has attracted much attention. The purpose of this study is to screen placental mRNA for the best biomarkers for identifying patients with PE. We used Limma in the R language to select the 48 differentially expressed genes with the largest differences, and then used correlation-based feature selection to reduce the dimensionality and avoid the attribute redundancy that arises when too many mRNA features participate in the classification. After this reduction, the remaining mRNA attributes were ranked from largest to smallest by information gain. We designed a classifier model that identifies whether a sample has PE from mRNA in the placenta. To improve classification accuracy and avoid overfitting, three classifiers (C4.5, AdaBoost, and a multilayer perceptron) were combined by majority voting. For comparison, the voting classifier was trained both on the differentially expressed genes and on the genes filtered by the best subset method. The results show that classification accuracy increased from 79% to 82.2% while the number of mRNA features decreased from 48 to 13. This study provides clues to the main placental mRNA biomarkers of PE and ideas for the treatment and screening of PE.
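The two generic steps named in the abstract, ranking attributes by information gain and combining three classifiers by majority vote, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gene names, the toy labels, and the per-classifier predictions are hypothetical, and the expression values are assumed to be already discretized into "high" and "low" (continuous expression data would need a discretization step first).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one discrete feature with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_by_information_gain(features, labels):
    """Return (name, gain) pairs sorted from largest to smallest gain."""
    gains = {name: information_gain(values, labels)
             for name, values in features.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


def majority_vote(predictions):
    """Combine several per-classifier label lists into one label per sample."""
    return [Counter(sample).most_common(1)[0][0]
            for sample in zip(*predictions)]


# Hypothetical toy data: three discretized features for six samples.
labels = ["PE", "PE", "PE", "control", "control", "control"]
features = {
    "gene_a": ["high", "high", "high", "low", "low", "low"],      # separates the classes
    "gene_b": ["high", "low", "high", "low", "high", "low"],      # weakly informative
    "gene_c": ["high", "high", "high", "high", "high", "high"],   # constant, uninformative
}
ranking = rank_by_information_gain(features, labels)

# Hypothetical predictions from three classifiers for four new samples.
c45_pred = ["PE", "PE", "control", "control"]
ada_pred = ["PE", "control", "control", "control"]
mlp_pred = ["PE", "PE", "PE", "control"]
voted = majority_vote([c45_pred, ada_pred, mlp_pred])
```

With three voters and two classes a tie cannot occur, which is one practical reason to use an odd number of classifiers in a majority-vote ensemble.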