University of Pennsylvania, School of Medicine, Center for Clinical Epidemiology and Biostatistics, Philadelphia, PA 19104-6021, United States.
J Biomed Inform. 2011 Dec;44(6):989-96. doi: 10.1016/j.jbi.2011.07.005. Epub 2011 Jul 26.
Medical message boards are online resources where users with a particular condition exchange information, some of which they might not otherwise share with medical providers. Many of these boards hold a large number of posts containing patient opinions and experiences that could be useful to clinicians and researchers. We present an approach that collects a corpus of medical message board posts, de-identifies the corpus, and extracts information on potential adverse drug effects discussed by users. Using a corpus of posts from breast cancer message boards, we identified drug-event pairs using co-occurrence statistics. We then compared the identified drug-event pairs with the adverse effects listed on the package labels of tamoxifen, anastrozole, exemestane, and letrozole. Of the pairs identified by our system, 75-80% were documented on the drug labels. Some of the undocumented pairs may represent previously unidentified adverse drug effects.
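The abstract does not say which co-occurrence statistic was used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes posts are plain strings, that drug and event terms are single lowercase tokens, and that "lift" (observed co-occurrence rate divided by the rate expected under independence) is an acceptable stand-in statistic. The function name `cooccurrence_lift` and the toy posts are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_lift(posts, drugs, events):
    """Return {(drug, event): lift} for pairs that co-occur in at least one post.

    lift = P(drug, event) / (P(drug) * P(event)), with probabilities
    estimated as the fraction of posts containing the term(s).
    Assumption: terms are single lowercase tokens matched on whitespace splits.
    """
    n = len(posts)
    drug_counts = Counter()
    event_counts = Counter()
    pair_counts = Counter()
    for post in posts:
        tokens = set(post.lower().split())
        found_drugs = tokens & drugs
        found_events = tokens & events
        # Count each term at most once per post.
        drug_counts.update(found_drugs)
        event_counts.update(found_events)
        pair_counts.update(product(found_drugs, found_events))
    return {
        (d, e): (c * n) / (drug_counts[d] * event_counts[e])
        for (d, e), c in pair_counts.items()
    }


# Toy, made-up posts for illustration only (not data from the study).
toy_posts = [
    "tamoxifen gave me hot flashes",
    "letrozole and joint pain",
    "tamoxifen flashes again",
    "no drugs today",
]
scores = cooccurrence_lift(toy_posts, {"tamoxifen", "letrozole"}, {"flashes", "pain"})
```

A lift above 1 means the drug and the event appear together more often than independence would predict. A real pipeline would also need de-identification, multi-word term matching, and a threshold or significance test before treating a pair as a candidate adverse effect.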