Prediction of myopia development among Chinese school-aged children using refraction data from electronic medical records: A retrospective, multicentre machine learning study.

Affiliations

State Key Laboratory of Ophthalmology, Clinical Research Center for Ocular Disease, Zhongshan Ophthalmic Centre, Sun Yat-sen University, Guangzhou, China.

School of Public Health, Sun Yat-sen University, Guangzhou, China.

Publication information

PLoS Med. 2018 Nov 6;15(11):e1002674. doi: 10.1371/journal.pmed.1002674. eCollection 2018 Nov.

DOI:10.1371/journal.pmed.1002674

PMID:30399150

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6219762/

Abstract

BACKGROUND

Electronic medical records provide large-scale real-world clinical data for use in developing clinical decision systems. However, sophisticated methodology and analytical skills are required to handle the large-scale datasets necessary for the optimisation of prediction accuracy. Myopia is a common cause of vision loss. Current approaches to control myopia progression are effective but have significant side effects. Therefore, identifying those at greatest risk who should undergo targeted therapy is of great clinical importance. The objective of this study was to apply big data and machine learning technology to develop an algorithm that can predict the onset of high myopia, at specific future time points, among Chinese school-aged children.

METHODS AND FINDINGS

Real-world clinical refraction data were derived from electronic medical record systems in 8 ophthalmic centres from January 1, 2005, to December 30, 2015. The variables of age, spherical equivalent (SE), and annual progression rate were used to develop an algorithm to predict SE and onset of high myopia (SE ≤ -6.0 dioptres) up to 10 years in the future. Random forest machine learning was used for algorithm training and validation. Electronic medical records from the Zhongshan Ophthalmic Centre (a major tertiary ophthalmic centre in China) were used as the training set. Ten-fold cross-validation and out-of-bag (OOB) methods were applied for internal validation. The remaining 7 independent datasets were used for external validation. Two population-based datasets, which had no participant overlap with the ophthalmic-centre-based datasets, were used for multi-resource validation testing. The main outcomes and measures were the area under the curve (AUC) values for predicting the onset of high myopia over 10 years and the presence of high myopia at 18 years of age. In total, 687,063 multiple visit records (≥3 records) of 129,242 individuals in the ophthalmic-centre-based electronic medical record databases and 17,113 follow-up records of 3,215 participants in population-based cohorts were included in the analysis. Our algorithm accurately predicted the presence of high myopia in internal validation (the AUC ranged from 0.903 to 0.986 for 3 years, 0.875 to 0.901 for 5 years, and 0.852 to 0.888 for 8 years), external validation (the AUC ranged from 0.874 to 0.976 for 3 years, 0.847 to 0.921 for 5 years, and 0.802 to 0.886 for 8 years), and multi-resource testing (the AUC ranged from 0.752 to 0.869 for 4 years). 
With respect to the prediction of high myopia development by 18 years of age, as a surrogate of high myopia in adulthood, the algorithm provided clinically acceptable accuracy over 3 years (the AUC ranged from 0.940 to 0.985), 5 years (the AUC ranged from 0.856 to 0.901), and even 8 years (the AUC ranged from 0.801 to 0.837). Meanwhile, our algorithm achieved clinically acceptable prediction of the actual refraction values at future time points, which is supported by the regressive performance and calibration curves. Although the algorithm achieved balanced and robust performance, concerns about the compromised quality of real-world clinical data and over-fitting issues should be cautiously considered.
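
The three predictor variables and the outcome definition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Visit` record, the function names, and in particular the definition of annual progression rate as the SE change between first and last visit divided by the elapsed years are assumptions made for illustration, since the abstract does not define the rate. The random forest itself is not reproduced here; `auc` only shows how an AUC value can be computed from any model's risk scores.

```python
from dataclasses import dataclass

HIGH_MYOPIA_SE = -6.0  # dioptres; threshold stated in the abstract (SE <= -6.0 D)


@dataclass
class Visit:
    """Hypothetical refraction record for one clinic visit."""
    age_years: float  # age at the visit
    se: float         # spherical equivalent in dioptres


def annual_progression_rate(visits):
    """SE change per year between first and last visit (assumed definition)."""
    ordered = sorted(visits, key=lambda v: v.age_years)
    span = ordered[-1].age_years - ordered[0].age_years
    if span <= 0:
        raise ValueError("need visits at two or more distinct ages")
    return (ordered[-1].se - ordered[0].se) / span


def features(visits):
    """Return (age, SE, annual progression rate) as of the latest visit."""
    latest = max(visits, key=lambda v: v.age_years)
    return latest.age_years, latest.se, annual_progression_rate(visits)


def is_high_myopia(se):
    """Outcome label: True when SE is at or below -6.0 D."""
    return se <= HIGH_MYOPIA_SE


def auc(labels, scores):
    """Rank-based AUC: chance a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up example child with three visits (the study required >= 3 records).
    child = [Visit(8.0, -1.0), Visit(9.0, -1.75), Visit(10.0, -2.5)]
    print(features(child))
    # Made-up labels and risk scores, only to exercise the AUC function.
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In a fuller pipeline, the feature tuples would be passed to a random forest and the resulting risk scores to `auc`; all values in the example are invented and carry no clinical meaning.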

CONCLUSIONS

To our knowledge, this study, for the first time, used large-scale data collected from electronic health records to demonstrate the contribution of big data and machine learning approaches to improved prediction of myopia prognosis in Chinese school-aged children. This work provides evidence for transforming clinical practice, health policy-making, and precise individualised interventions regarding the practical control of school-aged myopia.
