Lemmers Simone A M, Le Luyer Mona, Stoll Samantha J, Hoffnagle Alison G, Ferrell Rebecca J, Gamble Julia A, Guatelli-Steinberg Debbie, Gurian Kaita N, McGrath Kate, O'Hara Mackie C, Smith Andrew D A C, Dunn Erin C
Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America.
Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, United States of America.
PLoS One. 2025 Mar 19;20(3):e0318700. doi: 10.1371/journal.pone.0318700. eCollection 2025.
Accentuated Lines (ALs) in tooth enamel can reflect metabolic disruptions from physiological or psychological stresses during development. They can therefore serve as a retrospective biomarker of generalized stress exposure in archaeological and clinical research. However, little consensus exists on when a line should be identified as an AL, and inter-rater reliability is poorly quantified across studies. Here, we sought to address this gap by examining the reliability of accentuated line (AL) markings across raters, in terms of both the presence versus absence of ALs and their intensity (HAL = Highly Accentuated, MAL = Mildly Accentuated, RL = Retzius Line). Ratings were made and compared across individual observers (with different levels of experience) and across pairs of raters (who agreed on AL coding through consensus meetings) (N = 15 teeth, eight observers). Results indicated that more experience in AL assessment does not necessarily produce higher reliability between raters. Most disagreements in intensity ratings occurred in categories other than HAL. Furthermore, when AL assessment was performed by pairs of raters, reliability was significantly higher than for individual assessments (Gwet's AC1 = 0.28 to 0.56 for line presence assessment; Gwet's AC1 = 0.48 to 0.64 for line intensity assessment). Based on these results, we recommend a workflow called IRRISS (Improving Reliability and Reporting In Scoring of Stress-markers) to increase rigor and reproducibility in histological analysis of dental collections. The introduction of IRRISS is well-timed, given the surge in studies of teeth occurring across anthropological, epidemiological, medical, forensic, and climate research fields.
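For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of Gwet's AC1 in its standard two-rater form. It is not the authors' analysis code: the study compared eight observers, which calls for the multi-rater generalization, and the function name `gwet_ac1` and the example ratings are hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories=None):
    """Gwet's AC1 for two raters scoring the same items (illustrative sketch)."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must score the same non-empty set of items.")
    if categories is None:
        # Assumption: the category set is whatever labels were actually used.
        categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two possible categories.")

    n = len(ratings_a)
    # Observed agreement: share of items on which both raters gave the same label.
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # pi_k: mean of the two raters' marginal proportions for category k.
    counts = Counter(ratings_a) + Counter(ratings_b)
    pi = [counts[c] / (2 * n) for c in categories]

    # Chance agreement under Gwet's model.
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)

    return (p_a - p_e) / (1 - p_e)


# Hypothetical presence/absence scores for four candidate lines (1 = AL present).
rater_1 = [1, 1, 0, 0]
rater_2 = [1, 1, 0, 1]
ac1_presence = gwet_ac1(rater_1, rater_2)
print(round(ac1_presence, 4))
```

The same function accepts intensity labels such as "HAL", "MAL", and "RL" in place of 0/1 scores. Passing `categories` explicitly keeps the chance-agreement term comparable when a rater never uses one of the possible labels.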