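Error measurement in craniometrics: The comparative performance of four popular assessment methods using 2000 simulated cranial length datasets (g-op).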

Fancourt Hayley S M, Stephan Carl N

Laboratory for Human Craniofacial and Skeletal Identification (HuCS-ID Lab), School of Biomedical Sciences, The University of Queensland, Brisbane 4072, Australia.

Forensic Sci Int. 2018 Apr;285:162-171. doi: 10.1016/j.forsciint.2018.02.008. Epub 2018 Feb 21.

For measurements to be accurate and precise, measurement errors should be small. In the anthropometry and craniofacial identification literature, four methods are commonly used for assessing measurement error: Pearson's product moment correlation coefficient (r), intra-class correlation coefficients (ICC), statistical significance tests (often reported as P-values) and the technical error of measurement (TEM; also known as Dahlberg's error/ratio). In this paper, the performance of all four of these statistics was evaluated using maximum cranial lengths (g-op) from Howells (n=2524), by duplicating the dataset and mathematically adding known degrees of error to the second set. This was repeated across a broad array of trials (2000 in total), each with a slightly different amount of simulated error, to comprehensively assess the descriptive power and utility of the four error metrics, using the same data for each of the four error assessment methods. Data simulations included the addition of random and systematic errors of different sizes, with absolute differences ranging from 1 to 50 mm (in relative terms, up to 28% of the original measurement). Two sample sizes (n=25 and n=2524 individuals) were explored and all analyses were conducted in R. P-values from Student's t-tests showed significant differences (P<0.05) only for the larger sample size when the error was systematic. Small samples, and/or any with random error, did not yield low or significant P-values (P<0.05). When raw differences were <4 mm for 95% of the sample (n=2524), the ICC and r were high (>0.97) and remained so even after tripling the error, such that 95% of the sample possessed raw differences up to 12 mm (r=0.8). In contrast, the TEM was low initially (<2 mm or r-TEM<1%), and then increased (<4.5 mm and 2.5%, TEM and r-TEM respectively). These data show that P-values, ICC and r values have substantial limitations for describing error, as they do not always flag it well. In contrast, TEM appears to covary with error more saliently and holds the advantage that changes are reported in the units of the original measurement. For these reasons, TEM is recommended in preference to P-values, ICC and r.
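
The contrast between the metrics can be illustrated with a minimal sketch. It is not the authors' code: their analyses were run in R on the Howells data, whereas this uses Python and synthetic lengths drawn from a normal distribution (mean 180 mm, SD 8 mm, both assumed values). The TEM is computed with the standard Dahlberg formula, TEM = sqrt(sum(d^2) / 2n); the helper names and the choice of a paired t statistic are illustrative assumptions, since the abstract specifies only "Student's t-tests".

```python
import math
import random


def tem(first, second):
    """Technical error of measurement: sqrt(sum(d^2) / (2n)), in the data's units."""
    n = len(first)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * n))


def relative_tem(first, second):
    """TEM expressed as a percentage of the grand mean of both measurement sets."""
    grand_mean = (sum(first) + sum(second)) / (2 * len(first))
    return 100 * tem(first, second) / grand_mean


def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(first, second):
    """Paired t statistic for the mean difference (second - first)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))


def add_error(values, rng, random_sd=0.0, bias=0.0):
    """Duplicate a dataset and add random (Gaussian) and/or systematic (bias) error."""
    return [v + bias + rng.gauss(0.0, random_sd) for v in values]


# Synthetic stand-in for cranial lengths in mm (assumed distribution, not Howells' data).
rng = random.Random(0)
original = [rng.gauss(180.0, 8.0) for _ in range(2524)]

# Hypothetical error scenarios; the error sizes are illustrative choices.
scenarios = {
    "small random error": add_error(original, rng, random_sd=2.0),
    "large random error": add_error(original, rng, random_sd=6.0),
    "systematic error": add_error(original, rng, random_sd=2.0, bias=1.0),
}

for name, repeat in scenarios.items():
    print(
        f"{name}: r={pearson_r(original, repeat):.3f}, "
        f"t={paired_t(original, repeat):.2f}, "
        f"TEM={tem(original, repeat):.2f} mm, "
        f"r-TEM={relative_tem(original, repeat):.2f}%"
    )
```

Because the TEM is reported in millimetres, the growth in error between the two random-error scenarios can be read directly from it, whereas r has to be interpreted against the spread of the sample.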


Similar articles

Error measurement in craniometrics: The comparative performance of four popular assessment methods using 2000 simulated cranial length datasets (g-op).

Forensic Sci Int. 2018 Apr;285:162-171. doi: 10.1016/j.forsciint.2018.02.008. Epub 2018 Feb 21.

Error quantification of osteometric data in forensic anthropology.

Forensic Sci Int. 2018 Jun;287:183-189. doi: 10.1016/j.forsciint.2018.04.004. Epub 2018 Apr 10.

Anthropometric measurement error and the assessment of nutritional status.

Br J Nutr. 1999 Sep;82(3):165-77. doi: 10.1017/s0007114599001348.

Accuracy and reliability of virtual femur measurement from CT scan.

J Forensic Leg Med. 2019 Apr;63:11-17. doi: 10.1016/j.jflm.2019.02.010. Epub 2019 Feb 21.

Accuracy and reliability of measurements obtained from computed tomography 3D volume rendered images.

Forensic Sci Int. 2014 May;238:133-40. doi: 10.1016/j.forsciint.2014.03.005. Epub 2014 Mar 15.

Random errors in anthropometry.

J Hum Ergol (Tokyo). 1996 Dec;25(2):155-66.

Agreement and error rates associated with standardized data collection protocols for skeletal and dental data on 3D virtual subadult crania.

Forensic Sci Int. 2022 May;334:111272. doi: 10.1016/j.forsciint.2022.111272. Epub 2022 Mar 15.

Intraobserver error associated with measurements of the hand.

Am J Hum Biol. 2005 May-Jun;17(3):368-71. doi: 10.1002/ajhb.20129.

Variation within physical and digital craniometrics.

Forensic Sci Int. 2020 Jan;306:110092. doi: 10.1016/j.forsciint.2019.110092. Epub 2019 Nov 29.

Use of units of measurement error in anthropometric comparisons.

Anthropol Anz. 2017 Sep 1;74(3):183-192. doi: 10.1127/anthranz/2017/0628. Epub 2017 Aug 1.

Cited by

Assessing the contribution of orthodontic profiles in predicting facial soft tissue thickness for forensic facial approximation.

Int J Legal Med. 2025 Jun 18. doi: 10.1007/s00414-025-03542-x.

Sex estimation using metrics of the innominate: A test of the DSP2 method.

J Forensic Sci. 2025 Jan;70(1):249-257. doi: 10.1111/1556-4029.15645. Epub 2024 Oct 26.

Ancestry estimation in forensic anthropology: accuracy of the AncesTrees software in a Brazilian sample.

Forensic Sci Res. 2023 Nov 10;8(3):202-210. doi: 10.1093/fsr/owad030. eCollection 2023 Sep.

Lip morphology estimation models based on three-dimensional images in a modern adult population from China.

Int J Legal Med. 2021 Sep;135(5):1887-1901. doi: 10.1007/s00414-021-02559-2. Epub 2021 Mar 24.
