Cossin Sebastien, Diouf Serigne, Griffier Romain, Le Barrois d'Orgeval Philippine, Diallo Gayo, Jouhet Vianney
CHU de Bordeaux, Pôle de Santé Publique, Service d'information Médicale, Informatique et Archivistique Médicales (IAM), Bordeaux F-33000, France.
Inserm, Bordeaux Population Health Research Center, Team ERIAS, University of Bordeaux, UMR 1219, Bordeaux F-33000, France.
JAMIA Open. 2021 Mar 1;4(1):ooab005. doi: 10.1093/jamiaopen/ooab005. eCollection 2021 Jan.
Vital status is of central importance to hospital clinical research. However, hospital information systems record only deaths that occur in hospital. Recently, the French government released a publicly available dataset containing death-certificate data for over 25 million individuals. The objective of this study was to link French death certificates to the Bordeaux University Hospital records in order to complete the vital status information.
Our linkage strategy combined a search engine, used to reduce the number of comparisons, with machine-learning algorithms. The overall pipeline was evaluated on a file assembled from 3,565 in-hospital deaths and 15,000 living persons.
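The two-stage idea described above (first narrow the candidates, then score each remaining pair) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the toy data, blocking on date of birth, and the two threshold values are hypothetical, and a hand-written name-similarity score stands in for the trained machine-learning model. The "review" band between the two thresholds is likewise only one possible way to use a pair of thresholds, not something the abstract specifies.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical toy death records; field names are assumptions, not the real schema.
DEATH_RECORDS = [
    {"id": "d1", "last": "MARTIN", "first": "JEAN", "dob": "1940-03-12"},
    {"id": "d2", "last": "MARTIN", "first": "JEANNE", "dob": "1951-07-02"},
    {"id": "d3", "last": "DURAND", "first": "PAUL", "dob": "1940-03-12"},
]


def build_index(records):
    """Candidate-retrieval step: group death records by date of birth.

    This plays the role of the search engine: only records sharing the
    key are compared, instead of every record in the dataset.
    """
    index = defaultdict(list)
    for rec in records:
        index[rec["dob"]].append(rec)
    return index


def similarity(a, b):
    """String similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


def score(patient, candidate):
    """Stand-in for a trained classifier: mean of the name similarities."""
    return 0.5 * similarity(patient["last"], candidate["last"]) + \
        0.5 * similarity(patient["first"], candidate["first"])


def link(patient, index, upper=0.95, lower=0.80):
    """Return (death_record_id, decision) for one hospital record."""
    candidates = index.get(patient["dob"], [])
    if not candidates:
        return None, "no_match"
    best = max(candidates, key=lambda c: score(patient, c))
    best_score = score(patient, best)
    if best_score >= upper:
        return best["id"], "match"
    if best_score >= lower:
        return best["id"], "review"  # assumed handling of the middle band
    return None, "no_match"


INDEX = build_index(DEATH_RECORDS)
print(link({"last": "MARTIN", "first": "JEAN", "dob": "1940-03-12"}, INDEX))
```

In this sketch only the records sharing the patient's date of birth are ever scored, which is the point of the retrieval step: the expensive pairwise comparison runs on a handful of candidates rather than on millions of certificates.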
At the upper threshold, the recall and precision of our linkage strategy were 97.5% and 99.97%, respectively; at the lower threshold, they were 99.4% and 98.9%.
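To make the threshold trade-off concrete, the following Python sketch computes precision and recall at two cut-offs. The scores, labels, and threshold values are synthetic and chosen only for illustration; they are not the study's data and do not reproduce its results.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when pairs scoring >= threshold are linked.

    labels[i] is True when pair i is a genuine match (for example, a
    known in-hospital death paired with its own death certificate).
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Synthetic example values (assumptions, not study data).
SCORES = [0.99, 0.97, 0.90, 0.85, 0.60, 0.30]
LABELS = [True, True, True, False, True, False]

for name, threshold in [("upper", 0.95), ("lower", 0.80)]:
    p, r = precision_recall(SCORES, LABELS, threshold)
    print(f"{name} threshold {threshold}: precision={p:.2f}, recall={r:.2f}")
```

On this toy input the stricter threshold links fewer pairs and makes no false links, while the looser one recovers more true matches at the cost of a false link. That is the same direction of trade-off as in the reported figures, where the upper threshold gives the higher precision and the lower threshold the higher recall.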
In this study, we demonstrated the feasibility of accurately linking hospital records with death certificates using a search engine and machine learning.