Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
Department of Emergency Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
Cancer. 2019 Sep 1;125(17):2926-2934. doi: 10.1002/cncr.32118. Epub 2019 May 15.
The rarity and heterogeneity of sarcomas make performing appropriately powered studies challenging and magnify the significance of large databases in sarcoma research. Established large tumor registries and population-based databases have become increasingly relevant for answering clinical questions regarding sarcoma incidence, treatment patterns, and outcomes. However, the validity of large databases has been questioned and scrutinized because of the inaccuracy and wide variability of coding practices and the absence of clinically relevant variables. In addition, the utilization of large databases for the study of rare cancers such as sarcoma may be particularly challenging because of the known limitations of administrative data and poor overall data quality. Currently, there are several large national cancer databases, including the Surveillance, Epidemiology, and End Results database, the National Cancer Data Base of the American College of Surgeons and the American Cancer Society, and the National Program of Cancer Registries of the Centers for Disease Control and Prevention. These databases are often used for sarcoma research, but they are limited by their dependence on administrative or billing data, the lack of agreement between chart abstractors on diagnosis codes, and the use of preexisting documented hospital diagnosis codes for tumor registries, which lead to a significant underestimation of sarcomas in large data sets. Current and future initiatives to improve databases and big data applications for sarcoma research include increasing the utilization of sarcoma-specific registries and encouraging national initiatives to expand on real-world, evidence-based data sets.