Evaluation of record linkage of two large administrative databases in a middle income country: stillbirths and notifications of dengue during pregnancy in Brazil.

Author information

Paixão Enny S, Harron Katie, Andrade Kleydson, Teixeira Maria Glória, Fiaccone Rosemeire L, Costa Maria da Conceição N, Rodrigues Laura C

Affiliations

London School of Hygiene and Tropical Medicine, Keppel St, Bloomsbury, London, WC1E 7HT, UK.

Instituto de Saúde Coletiva, Rua Basílio da Gama, s/n.Canela, Salvador, Bahia, CEP 40110040, Brazil.

Publication information

BMC Med Inform Decis Mak. 2017 Jul 17;17(1):108. doi: 10.1186/s12911-017-0506-5.

Abstract

BACKGROUND

Due to the increasing availability of individual-level information across different electronic datasets, record linkage has become an efficient and important research tool. High quality linkage is essential for producing robust results. The objective of this study was to describe the process of preparing and linking national Brazilian datasets, and to compare the accuracy of different linkage methods for assessing the risk of stillbirth due to dengue in pregnancy.

METHODS

We linked mothers and stillbirths in two routinely collected datasets from Brazil for 2009-2010: for dengue in pregnancy, notifications of infectious diseases (SINAN); for stillbirths, mortality (SIM). Since there was no unique identifier, we used probabilistic linkage based on maternal name, age and municipality. We compared two probabilistic approaches, each with two thresholds: 1) a bespoke linkage algorithm; 2) a standard linkage software widely used in Brazil (ReclinkIII), and used manual review to identify further links. Sensitivity and positive predictive value (PPV) were estimated using a subset of gold-standard data created through manual review. We examined the characteristics of false-matches and missed-matches to identify any sources of bias.
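The abstract does not describe how either tool scores candidate record pairs. As a rough illustration only, the sketch below uses Fellegi-Sunter style match weights, a common basis for probabilistic linkage, over the three linking fields named above. The field names, the m/u probabilities, and the two cut-offs are hypothetical values chosen for the example, not figures from the study, and exact equality stands in for the approximate name comparison a real linkage would need.

```python
import math

# Hypothetical m/u probabilities per linking field (illustrative, not from the study).
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "maternal_name": (0.95, 0.001),
    "age": (0.90, 0.05),
    "municipality": (0.95, 0.01),
}


def match_weight(rec_a, rec_b):
    """Sum log2 agreement or disagreement weights over the linking fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def classify(weight, lower, upper):
    """Assign a pair to link, manual review, or non-link using two cut-offs."""
    if weight >= upper:
        return "link"
    if weight >= lower:
        return "manual_review"
    return "non_link"


# Made-up records: one notification and three candidate stillbirth records.
notification = {"maternal_name": "MARIA SILVA", "age": 27, "municipality": "Salvador"}
same_mother = dict(notification)
age_differs = dict(notification, age=28)
name_only = dict(notification, age=35, municipality="Recife")

for candidate in (same_mother, age_differs, name_only):
    weight = match_weight(notification, candidate)
    print(round(weight, 2), classify(weight, lower=5.0, upper=15.0))
```

Raising `upper` corresponds to a more conservative threshold: fewer false-matches at the cost of more missed-matches. Pairs that fall between the two cut-offs are the uncertain links that would go to manual review.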

RESULTS

From records of 678,999 dengue cases and 62,373 stillbirths, the gold-standard linkage identified 191 cases. The bespoke linkage algorithm with a conservative threshold produced 131 links, with sensitivity = 64.4% (68 missed-matches) and PPV = 92.5% (8 false-matches). Manual review of uncertain links identified an additional 37 links, increasing sensitivity to 83.7%. The bespoke algorithm with a relaxed threshold identified 132 true matches (sensitivity = 69.1%), but introduced 61 false-matches (PPV = 68.4%). ReclinkIII produced lower sensitivity and PPV than the bespoke linkage algorithm. Linkage error was not associated with any recorded study variables.
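The reported percentages follow from the standard definitions: sensitivity is the number of true matches divided by all gold-standard matches, and PPV is the number of true matches divided by all links produced. A minimal check using only counts stated above (the function and variable names are illustrative):

```python
def sensitivity(true_matches, missed_matches):
    """Share of gold-standard matches that the linkage found."""
    return true_matches / (true_matches + missed_matches)


def ppv(true_matches, false_matches):
    """Share of produced links that are true matches."""
    return true_matches / (true_matches + false_matches)


GOLD_STANDARD = 191  # matches identified by the gold-standard linkage

# Bespoke algorithm, relaxed threshold: 132 true matches, 61 false-matches.
relaxed_true, relaxed_false = 132, 61
relaxed_sens = sensitivity(relaxed_true, GOLD_STANDARD - relaxed_true)
relaxed_ppv = ppv(relaxed_true, relaxed_false)
print(f"{relaxed_sens:.1%}")  # 69.1%
print(f"{relaxed_ppv:.1%}")   # 68.4%

# Bespoke algorithm, conservative threshold: 68 missed-matches.
conservative_missed = 68
conservative_sens = sensitivity(GOLD_STANDARD - conservative_missed, conservative_missed)
print(f"{conservative_sens:.1%}")  # 64.4%
```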

CONCLUSION

Despite a lack of unique identifiers for linking mothers and stillbirths, we demonstrate a high standard of linkage of large routine databases from a middle income country. Probabilistic linkage and manual review were essential for accurately identifying cases for a case-control study, but this approach may not be feasible for larger databases or for linkage of more common outcomes.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3310/5513351/1692035604ab/12911_2017_506_Fig1_HTML.jpg
