Department of Medicine, University of Washington, Seattle, WA, USA.
BMC Womens Health. 2013 Feb 5;13:3. doi: 10.1186/1472-6874-13-3.
Diagnostic test sets are a valuable research tool that contributes substantially to the validity and reliability of studies assessing diagnostic agreement in breast pathology. To fully understand the strengths and weaknesses of any agreement and reliability study, however, its methods should be fully reported. In this paper we provide a step-by-step description of the methods used to create four complex test sets for a study of diagnostic agreement among pathologists interpreting breast biopsy specimens. We use the newly developed Guidelines for Reporting Reliability and Agreement Studies (GRRAS) as the basis for reporting these methods.
Breast tissue biopsies were selected from the National Cancer Institute-funded Breast Cancer Surveillance Consortium sites. We used random sampling stratified by the woman's age (40-49 vs. ≥50 years), parenchymal breast density (low vs. high), and the original pathologist's interpretation. A 3-member panel of expert breast pathologists first independently interpreted each case using five primary diagnostic categories (non-proliferative changes, proliferative changes without atypia, atypical ductal hyperplasia, ductal carcinoma in situ, and invasive carcinoma). When the experts did not unanimously agree on a case diagnosis, a modified Delphi method was used to determine the reference standard consensus diagnosis. The final test cases were stratified and randomly assigned to one of four unique test sets.
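The unanimity check and the stratified random assignment described above can be illustrated with a minimal sketch. This is not the study's actual procedure: the field names (`id`, `age_group`, `density`, `diagnosis`), the choice of strata used for the assignment, the fixed seed, and both function names are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def needs_consensus_review(interpretations):
    """Return True when the independent expert diagnoses are not unanimous.

    Hypothetical helper: flags a case for a consensus step such as a
    modified Delphi process; it does not implement that process itself.
    """
    return len(set(interpretations)) > 1


def assign_to_test_sets(cases, n_sets=4, seed=0):
    """Assign each case to exactly one test set, balancing strata across sets.

    `cases` is a list of dicts with the assumed keys 'id', 'age_group',
    'density', and 'diagnosis'. The strata used here are an assumption.
    """
    rng = random.Random(seed)

    # Group case ids by stratum.
    strata = defaultdict(list)
    for case in cases:
        key = (case["age_group"], case["density"], case["diagnosis"])
        strata[key].append(case["id"])

    test_sets = [[] for _ in range(n_sets)]
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        # Start at a random set so leftover cases do not always land in set 0.
        offset = rng.randrange(n_sets)
        for i, case_id in enumerate(ids):
            test_sets[(offset + i) % n_sets].append(case_id)
    return test_sets
```

Dealing the shuffled cases of each stratum out in round-robin order keeps every test set close to the same mix of age group, density, and diagnosis, while the shuffle and the random starting set keep the assignment of any individual case random.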
We found GRRAS recommendations to be very useful in reporting diagnostic test set development and recommend inclusion of two additional criteria: 1) characterizing the study population and 2) describing the methods for reference diagnosis, when applicable.