Sherman Recinda L, Boscoe Francis P, O'Brien David K, George Justin T, Henry Kevin A, Soloway Laura E, Lee David J
J Registry Manag. 2014 Fall;41(3):120-4.
Intrarecord edits on site-sex combinations are a standard tool to identify errors in the coding of sex in cancer registry data. However, the percentage of sex-specific cancers, like cervix, is low (20 percent of total invasive cases). Visual review and follow-back to improve the quality of the sex coding is labor intensive and typically only performed as a special project on subsets of data. The New York State Cancer Registry (NYSCR) created an edit for identifying potential sex misclassification in cancer registry data and has made its components available for use through the North American Association of Central Cancer Registries (NAACCR). The edit uses the most popular male and female first names based on decade of birth to identify potentially miscoded cases. This paper provides a summary of 3 independently conducted assessments of the sex edit at the central cancer registry level and includes a focus on misclassification of sex for breast cancer.
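The name-based check described above might be sketched as follows. This is a minimal illustration, not the NYSCR implementation: the lookup table `POPULAR_NAMES`, its entries, the "M"/"F" codes, and the function names are all hypothetical, and a real table would come from the components distributed through NAACCR.

```python
from typing import Optional

# Hypothetical lookup: (decade of birth, first name) -> sex the name suggests.
# These entries are illustrative placeholders, not the NYSCR name lists.
POPULAR_NAMES = {
    (1940, "MARY"): "F",
    (1940, "JAMES"): "M",
    (1970, "JENNIFER"): "F",
    (1970, "MICHAEL"): "M",
}


def expected_sex(first_name: str, birth_year: int) -> Optional[str]:
    """Return the sex suggested by the name for that birth decade, if any."""
    decade = (birth_year // 10) * 10
    return POPULAR_NAMES.get((decade, first_name.strip().upper()))


def flag_for_review(first_name: str, birth_year: int, coded_sex: str) -> Optional[bool]:
    """True: name disagrees with coded sex (review the case).
    False: name agrees with coded sex.
    None: name is not in the table, so the case cannot be assessed."""
    guess = expected_sex(first_name, birth_year)
    if guess is None:
        return None
    return guess != coded_sex
```

A flagged case is only a candidate for follow-back, and a `None` result corresponds to the cases for which the edit cannot assign a potential sex.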
The sex edit was applied in 3 state cancer registries: Alabama, Alaska, and Florida. Alabama applied the edit to its entire database for 1996-2004 (N = 190,614) and compared the results to external databases available to most cancer registries. Alaska applied the edit to its entire database (N = 46,645) and was able to compare the results to 2 unique, state-based databases (Alaska Permanent Fund Dividend database and State Troopers database). Florida applied the sex edit to a sample of sites (n = 953,074) with particular attention to breast cancer. Results for breast cases were compared to results from an a priori quality control project on Florida male breast cancer cases. Using the Florida data, issues specific to male breast cancer were evaluated.
In Alabama, 45 percent of 977 cases flagged as potentially miscoded sex were determined to be miscodes. In Alaska, 19 percent of 88 cases flagged as potentially miscoded sex were determined to be miscodes, but the percentage of miscoded cases identified by the edit more than doubled in the most recent years of data. For the Florida male breast cancer comparison, the sex edit correctly identified 729 of 903 cases known to be miscoded (81 percent) and was unable to assign a potential sex on the remaining 174 cases, but it did not incorrectly flag any cases as miscodes.
The sex edit is a useful tool for identifying cases that require further review to confirm the reported sex code is correct. However, it only assesses 69 percent to 84 percent of cases based on name and, of those flagged, only 19 percent to 45 percent are true misclassifications. But for breast cancer, a site with a skewed male to female ratio, the verified misclassification rate was 100 percent of the male breast cancer cases flagged as potential females. The proper application of the sex edit can improve the quality of the sex variable and can greatly reduce the impact of miscoded sex on gender-skewed sites like male breast cancer.
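The reported figures can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable and function names are illustrative, and the confirmed-miscode counts for Alabama and Alaska are approximations back-calculated from rounded percentages, not values reported by the authors.

```python
# Florida male breast cancer comparison (figures from the abstract).
known_miscoded = 903       # cases known to be miscoded
correctly_flagged = 729    # cases the sex edit identified

# Cases for which the edit could not assign a potential sex.
unassigned = known_miscoded - correctly_flagged
# Share of known miscodes the edit identified, as a whole percent.
identified_pct = round(100 * correctly_flagged / known_miscoded)


def approx_confirmed(flagged: int, pct_confirmed: int) -> int:
    """Approximate count of confirmed miscodes from a rounded percentage."""
    return round(flagged * pct_confirmed / 100)


# Approximate only: the source reports percentages, not these counts.
alabama_confirmed = approx_confirmed(977, 45)
alaska_confirmed = approx_confirmed(88, 19)

print(unassigned, identified_pct)
print(alabama_confirmed, alaska_confirmed)
```

The first line of output reproduces the 174 unassigned cases and the 81 percent stated in the results.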