Department of Rehabilitation Medicine, University of Alberta, Edmonton, Canada.
Department of Community Health Sciences, University of Calgary, Calgary, Canada.
BMC Med Inform Decis Mak. 2023 Oct 5;23(1):202. doi: 10.1186/s12911-023-02298-x.
Menopause is a normal transition in a woman's life. For some women, it is a stage without significant difficulties; for others, menopause symptoms can severely affect their quality of life. This study developed and validated a case definition for problematic menopause using Canadian primary care electronic medical records, which is an essential step in examining the condition and improving quality of care.
We used data from the Canadian Primary Care Sentinel Surveillance Network including billing and diagnostic codes, diagnostic free-text, problem list entries, medications, and referrals. These data formed the basis of an expert-reviewed reference standard data set and contained the features that were used to train a machine learning model based on classification and regression trees. An ad hoc feature importance measure coupled with recursive feature elimination and clustering were applied to reduce our initial 86,000 element feature set to a few tens of the most relevant features in the data, while class balancing was accomplished with random under- and over-sampling. The final case definition was generated from the tree-based machine learning model output combined with a feature importance algorithm. Two independent samples were used: one for training / testing the machine learning algorithm and the other for case definition validation.
We randomly selected 2,776 women aged 45-60 for this analysis and created a case definition, consisting of two occurrences within 24 months of International Classification of Diseases, Ninth Revision, Clinical Modification code 627 (or any sub-codes) OR one occurrence of Anatomical Therapeutic Chemical classification code G03CA (or any sub-codes) within the patient chart, that was highly effective at detecting problematic menopause cases. This definition produced a sensitivity of 81.5% (95% CI: 76.3-85.9%), specificity of 93.5% (91.9-94.8%), positive predictive value of 73.8% (68.3-78.6%), and negative predictive value of 95.7% (94.4-96.8%).
Our case definition for problematic menopause demonstrated high validity metrics and so is expected to be useful for epidemiological study and surveillance. This case definition will enable future studies exploring the management of menopause in primary care settings.
绝经是女性生命中的正常过渡。对一些女性来说,这是一个没有明显困难的阶段;对另一些女性来说,更年期症状可能严重影响她们的生活质量。本研究使用加拿大初级保健电子病历开发并验证了一种有问题的更年期病例定义,这是检查该疾病并改善护理质量的重要步骤。
我们使用了加拿大初级保健监测网络的数据,包括计费和诊断代码、诊断自由文本、问题清单条目、药物和转诊。这些数据构成了专家审查参考标准数据集的基础,并包含了用于基于分类和回归树训练机器学习模型的特征。我们应用了一种特定的特征重要性度量方法,结合递归特征消除和聚类,将我们最初的 86000 个元素特征集减少到数据中几十个最相关的特征,同时通过随机欠采样和过采样来实现类别平衡。最终的病例定义是从基于树的机器学习模型输出与特征重要性算法相结合生成的。我们使用了两个独立的样本:一个用于训练/测试机器学习算法,另一个用于病例定义验证。
我们随机选择了 2776 名 45-60 岁的女性进行这项分析,并创建了一个病例定义,该定义包括在 24 个月内两次出现国际疾病分类,第九修订版,临床修正码 627(或任何子码)或在患者图表中出现一次解剖治疗化学分类码 G03CA(或任何子码),这在检测有问题的更年期病例方面非常有效。该定义的敏感性为 81.5%(95%置信区间:76.3-85.9%),特异性为 93.5%(91.9-94.8%),阳性预测值为 73.8%(68.3-78.6%),阴性预测值为 95.7%(94.4-96.8%)。
我们有问题的更年期病例定义表现出了较高的有效性指标,因此预计将有助于流行病学研究和监测。该病例定义将使未来在初级保健环境中探索更年期管理的研究成为可能。