Predicting TCM patterns in PCOS patients: An exploration of feature selection methods and multi-label machine learning models.

Authors

Lim Jiekee, Li Jieyun, Feng Xiao, Feng Lu, Xiao Xinang, Zhou Mi, Yang Hong, Xu Zhaoxia

Affiliations

School of Traditional Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, PR China.

The First Affiliated Hospital, Guangzhou University of Traditional Chinese Medicine, Guangzhou, PR China.

Publication information

Heliyon. 2024 Jul 26;10(15):e35283. doi: 10.1016/j.heliyon.2024.e35283. eCollection 2024 Aug 15.

DOI:10.1016/j.heliyon.2024.e35283

PMID:39166018

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11334618/

Abstract

BACKGROUND

Traditional Chinese Medicine (TCM) offers individualized treatment for Polycystic Ovary Syndrome (PCOS) through pattern differentiation, but the subjectivity of TCM diagnoses can lead to inconsistent outcomes. Integrating machine learning (ML) offers an objective basis to support TCM diagnoses. This study aims to evaluate various feature selection techniques and multi-label ML algorithms to develop an effective predictive model for classifying TCM patterns in PCOS patients, thereby enhancing diagnostic standardization and treatment personalization.

METHODS

The study used a dataset of 432 patients with PCOS, each exhibiting one or more of five TCM patterns. Feature selection began with Variance Thresholding (VT), followed by a comparison of five advanced techniques: Statistical Analysis Test, Recursive Feature Elimination with Cross-Validation (RFECV), Least Absolute Shrinkage and Selection Operator Regression, BorutaShap, and ReliefF. To identify the most effective model for predicting PCOS TCM patterns, four ML algorithms were evaluated on the selected feature set: Support Vector Machine, Logistic Regression, Extreme Gradient Boosting (XGBoost), and Artificial Neural Networks.
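The first step described above, variance thresholding, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the threshold values are all assumptions made for illustration, and the abstract does not state which threshold the study used.

```python
from statistics import pvariance


def variance_threshold(rows, threshold=0.0):
    """Return indices of columns whose population variance exceeds `threshold`."""
    n_cols = len(rows[0])
    keep = []
    for j in range(n_cols):
        column = [row[j] for row in rows]
        # A near-constant feature carries little information for any classifier.
        if pvariance(column) > threshold:
            keep.append(j)
    return keep


def select_columns(rows, keep):
    """Project every row onto the kept column indices."""
    return [[row[j] for j in keep] for row in rows]


# Hypothetical toy data: 4 patients x 4 binary symptom features.
X = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
]

kept = variance_threshold(X, threshold=0.0)  # drops the constant first column
X_reduced = select_columns(X, kept)
```

In this toy example the first feature is identical for every patient, so it is removed. Raising the threshold removes rarely varying features as well, which is the kind of reduction the study reports before the five more advanced selection methods are compared.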

RESULTS

VT reduced the feature count from 224 to 174. RFECV was the most effective feature selection method, identifying 67 key features. With the RFECV-selected features, XGBoost was the top-performing model, achieving the best testing accuracy (0.7870), F1 score (0.9519), and Hamming loss (0.0481; lower is better).
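For readers unfamiliar with multi-label metrics, the sketch below shows one common way to compute the three reported quantities on a binary label matrix (one row per patient, one column per TCM pattern). The abstract does not say how accuracy and F1 were defined, so exact-match (subset) accuracy and micro-averaged F1 are assumptions here, and the toy labels are invented for illustration.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true, y_pred):
    """Fraction of patients whose full label set is predicted exactly."""
    exact = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return exact / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels: 4 patients x 5 TCM patterns (1 = pattern present).
Y_true = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
Y_pred = [
    [1, 0, 0, 1, 0],  # exact match
    [0, 1, 0, 0, 1],  # one extra pattern predicted
    [1, 0, 0, 0, 1],  # one pattern missed
    [0, 0, 1, 0, 0],  # exact match
]

loss = hamming_loss(Y_true, Y_pred)
acc = subset_accuracy(Y_true, Y_pred)
f1 = micro_f1(Y_true, Y_pred)
```

On this toy data two of twenty label slots are wrong, yet only half of the patients have a fully correct label set. This illustrates why an exact-match accuracy can sit well below a label-level score for the same predictions.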

CONCLUSIONS

The RFECV-XGBoost model proved effective for classifying TCM patterns in PCOS. The results underscore the importance of careful feature selection and the potential of ML to advance TCM pattern diagnosis, marking a step toward more precise and personalized healthcare in biomedical studies.
