Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States.
Columbia University School of Nursing, New York, NY 10032, United States.
J Am Med Inform Assoc. 2024 Jan 18;31(2):289-297. doi: 10.1093/jamia/ocad198.
To determine whether different formats for conveying machine learning (ML)-derived postpartum depression risks affect patient classification of recommended actions (primary outcome) and intention to seek care, perceived risk, trust, and preferences (secondary outcomes).
We recruited English-speaking females of childbearing age (18-45 years) using an online survey platform. We created 2 exposure variables (presentation format and risk severity), each with 4 levels, manipulated within-subject. Presentation formats consisted of text only, numeric only, gradient number line, and segmented number line. For each format viewed, participants answered questions regarding each outcome.
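The design described above crosses 4 presentation formats with 4 risk levels within each participant. The abstract does not say how formats were paired with risk levels, so the following is only a minimal sketch of one common approach, a cyclic Latin square, and is not the authors' method. The risk-level labels and the function name `latin_square_pairings` are hypothetical.

```python
# Presentation formats as named in the abstract.
FORMATS = [
    "text only",
    "numeric only",
    "gradient number line",
    "segmented number line",
]

# Hypothetical labels: the abstract states there are 4 risk levels
# but does not name them.
RISK_LEVELS = ["level 1", "level 2", "level 3", "level 4"]


def latin_square_pairings(participant_index):
    """Pair each format with one risk level using a cyclic shift.

    Assumption (not from the source): each participant sees every format
    once and every risk level once, and the pairing rotates by one step
    per participant, so any 4 consecutive participants together cover
    all 16 format-by-risk combinations.
    """
    shift = participant_index % len(RISK_LEVELS)
    return [
        (fmt, RISK_LEVELS[(i + shift) % len(RISK_LEVELS)])
        for i, fmt in enumerate(FORMATS)
    ]


if __name__ == "__main__":
    for participant in range(4):
        print(participant, latin_square_pairings(participant))
```

Under this assumed scheme, format and risk level are balanced across participants, which lets main effects of each factor be estimated without one being confounded with the other.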
Five hundred four participants (mean age 31 years) completed the survey. For the risk classification question, performance was high (93%) with no significant differences between presentation formats. There were main effects of risk level (all P < .001): as the risk level increased, participants perceived higher risk, were more likely to agree to treatment, and were more trusting of their obstetrics team. However, we found inconsistencies in which presentation format corresponded to the highest perceived risk, trust, or behavioral intention. The gradient number line was the most preferred format (43%).
All formats resulted in high accuracy on the classification outcome (primary), but there were nuanced differences in risk perceptions, behavioral intentions, and trust. Investigators should choose health data visualizations based on the primary goal they want lay audiences to accomplish with the ML risk score.