Weiss Nancy S, Cooper Sharon P, Socias Christina, Weiss Ronnie A, Chen Vivien W
J Registry Manag. 2015 Fall;42(3):103-10.
Central cancer registries have collected text information on usual industry and occupation, but few have had the resources to code these data, which limits their usefulness for assessing occupational cancer risks.
This project was undertaken to use software available from the National Institute for Occupational Safety and Health (NIOSH) to code industry and occupation information in cancer records reported to the Texas Cancer Registry (TCR) and the Louisiana Tumor Registry (LTR) and to assess the feasibility of its use in ongoing registry operations; to assess the quality of the reported information; and to determine its usefulness in occupational cancer research.
De-identified data files of TCR (n = 103,276) and LTR (n = 26,090) cancer records were obtained for diagnosis years 2010 and 2011, respectively, for cases aged 14 years and older with industry and occupation text. These text fields were coded to the 2000 US Census Bureau classification using the NIOSH Industry and Occupation Computerized Coding System (NIOCCS) software at the high confidence level (90% or greater accuracy); records not coded by NIOCCS were assigned codes manually.
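The two-stage workflow described above (accept an automated code only at high confidence, otherwise route the record to manual coding) can be sketched as follows. This is a minimal illustration and not the NIOCCS interface: the `AutoCodeResult` type, the `route_record` and `auto_coded_share` functions, and the use of 0.90 as a numeric cutoff on a per-record confidence score are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed cutoff mirroring the "90% or greater accuracy" setting in the text.
HIGH_CONFIDENCE = 0.90


@dataclass
class AutoCodeResult:
    """Hypothetical output of an automated coder for one record."""
    code: Optional[str]   # assigned code, or None if nothing was assigned
    confidence: float     # score in [0, 1]


def route_record(result: AutoCodeResult) -> str:
    """Return 'auto' if the automated code is accepted, else 'manual'."""
    if result.code is not None and result.confidence >= HIGH_CONFIDENCE:
        return "auto"
    return "manual"


def auto_coded_share(results: Sequence[AutoCodeResult]) -> float:
    """Fraction of records whose automated code was accepted."""
    if not results:
        return 0.0
    accepted = sum(1 for r in results if route_record(r) == "auto")
    return accepted / len(results)
```

Under these assumptions, the share returned by `auto_coded_share` corresponds to the kind of figure reported for each registry as the percentage of records coded by the software; everything routed to `"manual"` represents the remaining workload for registry staff.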
NIOCCS assigned a code for 37.2% of TCR records and 59.9% of LTR records. Examination of the quality of the coded data found that 44.2% of TCR records and 31.1% of LTR records had missing, unknown, or otherwise insufficient text for assigning a specific industry and occupation code. Additionally, the vague, noninformative category of "retired" was reported for 14.9% of TCR records and 11.2% of LTR records. Records with "homemaker/housewife" or with terms indicating that the person never worked represented 7.2% of TCR cases and 9.7% of LTR cases. Excluding the unknown, never worked, and retired categories, no single industry or occupation major grouping represented more than 5% of cases in either registry.
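The data-quality categories above could be approximated with a simple text screen applied before coding. The sketch below is an illustration only: the keyword sets, the category labels, and the function names `screen_text` and `category_percentages` are hypothetical and are not the rules used in the study.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical keyword sets; a real registry would need a reviewed vocabulary.
INSUFFICIENT = {"", "unknown", "unk", "n/a"}
RETIRED = {"retired"}
NEVER_WORKED = {"homemaker", "housewife", "never worked"}


def screen_text(text: Optional[str]) -> str:
    """Assign a raw industry/occupation string to a coarse quality category."""
    cleaned = (text or "").strip().lower()
    if cleaned in INSUFFICIENT:
        return "insufficient"
    if cleaned in RETIRED:
        return "retired"
    if cleaned in NEVER_WORKED:
        return "never_worked"
    return "codable"


def category_percentages(texts: Iterable[Optional[str]]) -> Dict[str, float]:
    """Percentage of records in each category, rounded to one decimal place."""
    counts = Counter(screen_text(t) for t in texts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}
```

Exact-match screening is deliberately conservative: an entry such as "retired teacher" falls through to `"codable"`, since it still carries occupation text that a coder could use.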
NIOCCS is a helpful tool for coding industry and occupation text and continues to improve, but other registry resources are required to implement it in ongoing operations. Improving the quality of the text information reported in cancer records is essential to maximizing the efficiency of NIOCCS and to increasing the availability of coded, specific industry and occupation information for occupational cancer research.