Fan Gaowei, Zhang Shunli, Wu Qisheng, Song Yan, Jia Anqi, Li Di, Yue Yuhong, Wang Qingtao
Department of Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China.
Division of Pathology & Laboratory Medicine, Lu Daopei Hospital, Beijing, China.
Clin Chim Acta. 2022 Oct 1;535:53-60. doi: 10.1016/j.cca.2022.08.007. Epub 2022 Aug 13.
Low-density lipoprotein cholesterol (LDL-C) is a critical biomarker for cardiovascular disease. However, no consensus exists on the best method for estimating LDL-C in Chinese laboratories. This study aimed to develop a machine learning (ML) method for LDL-C estimation.
A large data set of 111,448 samples was randomized into five equal subsets. ML-based equations were developed from age, sex, and lipid parameters using five-fold cross-validation. The trained ML equations were externally validated in three different data sets. The performance of the ML equations was compared with that of the Friedewald, Martin/Hopkins, and Sampson equations.
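For orientation, two of the comparator equations have published closed forms and can be sketched as below, with all concentrations in mg/dL. This is an illustrative sketch, not the study's code: the function names and the `mean_bias` helper are assumptions introduced here, and the abstract does not state the ML equations themselves. The Martin/Hopkins equation is omitted because it depends on a lookup table of TG:VLDL-C factors that is not reproduced here.

```python
def friedewald_ldl(tc, hdl, tg):
    """Friedewald estimate: LDL-C = TC - HDL-C - TG/5 (all in mg/dL).

    Conventionally not recommended when TG >= 400 mg/dL.
    """
    return tc - hdl - tg / 5.0


def sampson_ldl(tc, hdl, tg):
    """Sampson (NIH) estimate of LDL-C (all in mg/dL)."""
    non_hdl = tc - hdl
    return (
        tc / 0.948
        - hdl / 0.971
        - (tg / 8.56 + tg * non_hdl / 2140.0 - tg ** 2 / 16100.0)
        - 9.44
    )


def mean_bias(estimated, direct):
    """Hypothetical helper: mean of (estimated - direct) over paired samples."""
    if len(estimated) != len(direct) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e - d for e, d in zip(estimated, direct)) / len(estimated)


if __name__ == "__main__":
    # Example lipid panel (made-up values, for illustration only).
    tc, hdl, tg = 200.0, 50.0, 150.0
    print(friedewald_ldl(tc, hdl, tg))
    print(round(sampson_ldl(tc, hdl, tg), 1))
```

Comparing each estimate against a directly measured LDL-C with a helper such as `mean_bias` gives the kind of bias figure the abstract refers to, though the study's exact bias metric is not specified here.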
The selected ML equations showed less bias relative to direct LDL-C than the other LDL-C equations in the Chinese population, including those with triglycerides (TG) ≥ 400 mg/dL and LDL-C < 40 mg/dL. The performance of the ML equations was also less affected by age. External validation supported the generalizability of the ML equations.
This study highlights the potential of integrating sex, age, and lipid parameters into the ML equations to obtain a more robust and reliable LDL-C calculation.