Vacek Jaroslav, Vonkova Hana, Gabrhelík Roman
Department of Addictology, First Faculty of Medicine, Charles University, Apolinářská 4, 120 00, Prague, Czech Republic.
Department of Education and Institute for Research and Development of Education, Faculty of Education, Charles University, Prague, Czech Republic.
Prev Sci. 2017 May;18(4):450-458. doi: 10.1007/s11121-017-0772-6.
We conducted a feasibility study of matching children (N = 2571, average age 12 years, 50.4% female) with their parents (N = 1931, average age 41 years, 83.3% female) by means of an anonymous self-generated identification code (SGIC), and assessed the code's methodological properties. We used a nine-character SGIC with the children and a mirrored version of the same code with the parents. The average overall error rate in generating the SGIC was 9.7% (4.0% in the parents and 13.9% in the children). We were able to uniquely link a total of 1765 parents' and children's codes (94.9% of all possible dyads) using any four-character combination together with the "school" variable. When linking with the SGIC only, the overall matching quality was characterized by a precision (positive predictive value) of 0.979, a recall (sensitivity, true positive rate) of 0.934, and an F-measure (harmonic mean of precision and recall) of 0.956. The analysis of the discrepant characters in the dyads identified the paternal grandmother's name and eye color as the elements that varied most often. This study is the first to look at SGIC match rates and at error and omission rates in linking different subjects into dyads in prevention research. We identified a high number of unique child-parent matches while guaranteeing anonymity to the participants. We provide evidence that our SGIC is a suitable tool for between-group linking procedures and achieves a high matching rate while maintaining anonymity in school-based prevention study samples.
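The linking idea and the reported F-measure can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes that child and parent codes have already been aligned position by position, that each record is a hypothetical (school, code) tuple, and that "any four-character combination" can be read as agreement on at least four positions. The function names `f_measure`, `agreement`, and `link_unique` are illustrative only.

```python
from collections import defaultdict


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def agreement(code_a: str, code_b: str) -> int:
    """Count the positions at which two aligned codes carry the same character."""
    return sum(a == b for a, b in zip(code_a, code_b))


def link_unique(children, parents, min_agree=4):
    """Link a child to a parent from the same school when exactly one
    parent code agrees on at least `min_agree` positions.

    `children` and `parents` are lists of (school, code) tuples. This
    layout and the agreement threshold are assumptions for illustration.
    Returns a dict mapping child index to parent index.
    """
    # Group parent codes by school so each child is compared only within its school.
    by_school = defaultdict(list)
    for p_idx, (school, code) in enumerate(parents):
        by_school[school].append((p_idx, code))

    links = {}
    for c_idx, (school, c_code) in enumerate(children):
        candidates = [
            p_idx
            for p_idx, p_code in by_school[school]
            if agreement(c_code, p_code) >= min_agree
        ]
        # Keep only unambiguous matches; a parent may still be linked to
        # more than one child (for example, siblings).
        if len(candidates) == 1:
            links[c_idx] = candidates[0]
    return links


# The abstract's precision and recall reproduce its reported F-measure.
print(round(f_measure(0.979, 0.934), 3))
```

Restricting comparisons to the same school shrinks the candidate pool, which is one plausible reason an auxiliary variable such as "school" helps partial-code matches remain unique.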