Department of Epidemiology, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, USA.
Center for Applied Transgender Studies, Chicago, Illinois, USA.
Pharmacoepidemiol Drug Saf. 2024 Mar;33(3):e5732. doi: 10.1002/pds.5732. Epub 2023 Nov 27.
As research using electronic healthcare data to study transgender (TG) population health trends expands, the validity of the computational phenotype (CP) algorithms used to identify TG patients remains poorly understood. We aimed to review the literature that has used CPs to identify TG people within electronic healthcare data, assess the validity of those CPs, identify potential gaps, and synthesize recommendations for future work based on past studies.
The authors searched the National Library of Medicine's PubMed, Scopus, and the American Psychological Association's PsycInfo databases to identify studies published in the United States that applied CPs to identify TG people within electronic healthcare data.
Twelve studies validated or enhanced the positive predictive value (PPV) of their CP through manual chart review (n = 5), a hierarchy of code mechanisms (n = 4), key text-strings (n = 2), or self-surveys (n = 1). The CPs with the highest PPV for identifying TG patients within their study population combined diagnosis codes with other components such as key text-strings. However, when key text-strings were not available, researchers were still able to find most TG patients within their electronic healthcare databases through diagnosis codes alone.
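As a minimal sketch of how a PPV might be estimated from a manual chart review, the snippet below divides the number of flagged patients confirmed by review by the total number flagged. The function name, patient identifiers, and counts are hypothetical illustrations, not data from the reviewed studies.

```python
def positive_predictive_value(flagged_ids, confirmed_ids):
    """Return PPV: confirmed flagged patients / all flagged patients.

    flagged_ids   -- patients the phenotype algorithm identified
    confirmed_ids -- patients confirmed by the reference standard
                     (here assumed to be manual chart review)
    """
    flagged = set(flagged_ids)
    if not flagged:
        raise ValueError("no patients were flagged; PPV is undefined")
    true_positives = flagged & set(confirmed_ids)
    return len(true_positives) / len(flagged)


# Hypothetical example: 5 patients flagged, 4 of them confirmed on review.
flagged = ["p01", "p02", "p03", "p04", "p05"]
confirmed = ["p01", "p02", "p04", "p05", "p09"]

ppv = positive_predictive_value(flagged, confirmed)
print(f"PPV = {ppv:.2f}")
```

Note that PPV only describes the patients a CP flags; it says nothing about TG patients the CP misses (here, the hypothetical `p09`), which would require a sensitivity estimate.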
The CPs with the highest accuracy for identifying TG patients combined diagnosis codes with components such as procedural codes or key text-strings. CPs with high validity are essential for identifying TG patients when self-reported gender identity is not available. Still, self-reported gender identity should be collected within electronic healthcare data, as it is the gold standard for understanding TG population health patterns.
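One way to picture a CP that layers diagnosis codes with supporting components is the sketch below. It is an assumption-laden illustration, not an algorithm taken from any reviewed study: the diagnosis-code prefix, the procedure-code placeholder, the key text-strings, and the tier names are all hypothetical and would need to be replaced with a validated code list.

```python
from dataclasses import dataclass, field

# Illustrative placeholders only (assumptions, not from the reviewed studies).
DIAGNOSIS_PREFIXES = ("F64",)            # assumed diagnosis-code prefix
PROCEDURE_CODES = {"PROC-EXAMPLE-1"}     # hypothetical procedure code
KEY_STRINGS = ("transgender", "gender dysphoria")  # assumed key text-strings


@dataclass
class PatientRecord:
    patient_id: str
    diagnosis_codes: list = field(default_factory=list)
    procedure_codes: list = field(default_factory=list)
    notes: str = ""


def phenotype_tier(record):
    """Classify a record by how much evidence supports the phenotype."""
    has_dx = any(
        code.upper().startswith(DIAGNOSIS_PREFIXES)
        for code in record.diagnosis_codes
    )
    has_proc = any(code in PROCEDURE_CODES for code in record.procedure_codes)
    text = record.notes.lower()
    has_text = any(term in text for term in KEY_STRINGS)

    if has_dx and (has_proc or has_text):
        return "diagnosis+supporting"   # diagnosis code plus another component
    if has_dx:
        return "diagnosis-only"         # fallback when no other component exists
    return "not-flagged"


records = [
    PatientRecord("p01", ["F64.0"], [], "Patient identifies as transgender."),
    PatientRecord("p02", ["F64.9"], [], ""),
    PatientRecord("p03", ["E11.9"], [], "Routine follow-up."),
]
tiers = {r.patient_id: phenotype_tier(r) for r in records}
print(tiers)
```

Keeping the tiers separate, rather than returning a single yes/no flag, lets each tier be validated on its own against a reference standard such as chart review.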