Albert Kendra, Delano Maggie
Cyberlaw Clinic, Harvard Law School, Cambridge, MA 02138, USA.
Engineering Department, Swarthmore College, Swarthmore, PA 19146, USA.
Patterns (N Y). 2022 Aug 12;3(8):100534. doi: 10.1016/j.patter.2022.100534.
False assumptions that sex and gender are binary, static, and concordant are deeply embedded in the medical system. As machine learning researchers use medical data to build tools to solve novel problems, understanding how existing systems represent sex/gender incorrectly is necessary to avoid perpetuating harm. In this perspective, we identify and discuss three factors to consider when working with sex/gender in research: "sex/gender slippage," the frequent substitution of sex and sex-related terms for gender and vice versa; "sex confusion," the fact that any given sex variable holds many different potential meanings; and "sex obsession," the idea that the relevant variable for most inquiries related to sex/gender is sex assigned at birth. We then explore how these phenomena show up in medical machine learning research using electronic health records, with a specific focus on HIV risk prediction. Finally, we offer recommendations about how machine learning researchers can engage more carefully with questions of sex/gender.