Gilbert Sappho Z, Morrison Conor L, Chen Qiuyu J, Punian Jesman, Bernstein Jodi T, Jessri Mahsa
Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, United States.
Food, Nutrition, and Health Program, Faculty of Land and Food Systems, The University of British Columbia, Vancouver, BC, Canada.
Front Nutr. 2023 Feb 16;9:1013516. doi: 10.3389/fnut.2022.1013516. eCollection 2022.
There is increasing recognition of the value of linking food sales databases to national food composition tables for population nutrition research.
Building on automated and manual database mapping approaches in the literature, we aimed to match 1,179 food products in the Canadian data subset of Euromonitor International's Passport Nutrition to their closest respective equivalents in Health Canada's Canadian Nutrient File (CNF).
Matching took place in two major steps. First, an algorithm based on thresholds of maximal nutrient difference (between Euromonitor and CNF foods) and fuzzy matching was executed to offer match options. If a nutritionally appropriate match was available among the algorithm suggestions, it was selected. Second, when the suggested set contained no nutritionally sound match, the Euromonitor product was instead manually matched to a CNF food or deemed unmatchable, with the unique addition of expert validation to maximize rigor in matching. Both steps were independently performed by at least two team members with dietetics expertise.
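The first step can be illustrated with a minimal sketch. The abstract does not specify which nutrients, threshold values, or fuzzy-matching metric were used, so the field names (`energy_kcal`, `sugar_g`, `sodium_mg`), the limits in `MAX_DIFF`, the `MIN_NAME_SIMILARITY` cutoff, and the use of Python's `difflib` are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher

# Hypothetical nutrients and maximal allowed differences (per 100 g);
# the abstract does not report the actual thresholds.
MAX_DIFF = {"energy_kcal": 50.0, "sugar_g": 5.0, "sodium_mg": 100.0}
# Hypothetical minimum fuzzy name similarity, in [0, 1].
MIN_NAME_SIMILARITY = 0.5


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive fuzzy similarity via difflib (an assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def within_thresholds(product: dict, candidate: dict) -> bool:
    """True if every nutrient difference is within its maximal threshold."""
    return all(
        abs(product[nutrient] - candidate[nutrient]) <= limit
        for nutrient, limit in MAX_DIFF.items()
    )


def suggest_matches(product: dict, cnf_foods: list, top_n: int = 3) -> list:
    """Return up to top_n candidate CNF foods, best name match first."""
    # Products with missing or zero calorie data cannot be run.
    if not product.get("energy_kcal"):
        return []
    scored = []
    for food in cnf_foods:
        if not within_thresholds(product, food):
            continue
        score = name_similarity(product["name"], food["name"])
        if score >= MIN_NAME_SIMILARITY:
            scored.append((score, food))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [food for _, food in scored[:top_n]]


# Illustrative, made-up records (not taken from Euromonitor or the CNF).
product = {"name": "Chocolate Chip Cookies", "energy_kcal": 480.0,
           "sugar_g": 35.0, "sodium_mg": 300.0}
cnf_foods = [
    {"name": "Cookies, chocolate chip", "energy_kcal": 490.0,
     "sugar_g": 33.0, "sodium_mg": 320.0},
    {"name": "Cookies, chocolate chip, low energy", "energy_kcal": 300.0,
     "sugar_g": 33.0, "sodium_mg": 320.0},
]
suggestions = suggest_matches(product, cnf_foods)
```

In this sketch an empty suggestion list corresponds to the case where a product would move on to manual matching; the human review of suggested matches described above is not modeled.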
Of 1,111 Euromonitor products run through the algorithm, an accurate CNF match was offered for 65% of them; missing or zero-calorie data precluded 68 products from being run in the algorithm. Products with 2 or more algorithm-suggested CNF matches had higher match accuracy than those with one (71 vs. 50%, respectively). Overall, inter-rater agreement (reliability) rates were robust for matches chosen among algorithm options (51%) and even higher regarding whether manual selection would be required (71%); among manually selected CNF matches, reliability was 33%. Ultimately, 1,152 (98%) Euromonitor products were matched to a CNF equivalent.
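The reported counts can be reconciled with simple arithmetic, sketched below. The abstract does not state which statistic underlies the inter-rater agreement rates, so the `percent_agreement` helper (simple percent agreement between two raters) is a hypothetical illustration and not necessarily the measure the authors used.

```python
TOTAL_PRODUCTS = 1179   # products targeted for matching
EXCLUDED = 68           # missing or zero-calorie data
MATCHED = 1152          # products ultimately matched to a CNF food

# Products that could be run through the algorithm.
run_through_algorithm = TOTAL_PRODUCTS - EXCLUDED

# Share of all products ultimately matched, as a percentage.
matched_pct = 100 * MATCHED / TOTAL_PRODUCTS


def percent_agreement(rater_a: list, rater_b: list) -> float:
    """Simple percent agreement between two raters' selections.

    Assumed statistic for illustration; it ignores chance agreement.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rater lists must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * agreements / len(rater_a)


print(run_through_algorithm)   # 1111
print(round(matched_pct, 1))   # 97.7, reported as 98%
```

Simple percent agreement does not correct for agreement expected by chance, which is one reason a chance-corrected statistic is often reported alongside it.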
Our reported matching process successfully bridged a food sales database's products to their respective CNF matches for use in future nutritional epidemiological studies of branded foods sold in Canada. Our team's novel utilization of dietetics expertise aided in match validation at both steps, ensuring rigor and quality of resulting match selections.