Segura-Bedmar Isabel, Martínez Paloma, Revert Ricardo, Moreno-Schneider Julián
BMC Med Inform Decis Mak. 2015;15 Suppl 2(Suppl 2):S6. doi: 10.1186/1472-6947-15-S2-S6. Epub 2015 Jun 15.
Adverse drug reactions (ADRs) cause a high number of deaths among hospitalized patients in developed countries. Major drug agencies have shown great interest in the early detection of ADRs because of their high incidence and the rising health care costs they generate. Reporting systems allow both healthcare professionals and patients to flag possible ADRs. However, several studies have shown that these adverse events are underestimated. Our hypothesis is that health social networks could be a significant information source for the early detection of ADRs and of new drug indications.
In this work we present a system for detecting drug effects (both adverse drug reactions and drug indications) in user posts extracted from a Spanish health forum. Texts were processed with MeaningCloud, a multilingual text analysis engine, to identify drugs and effects. In addition, we developed the first Spanish database of drugs and their effects, built automatically from drug package inserts gathered from websites. We then applied a distant-supervision method, using this database, to a collection of 84,000 messages in order to extract the relations between drugs and their effects. To classify the relation instances, we used a kernel method based only on shallow linguistic information from the sentences.
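The labeling step of distant supervision described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy `KNOWN_PAIRS` set stands in for the package-insert database, plain substring matching stands in for the MeaningCloud entity recognition, and the function name `label_candidates` is hypothetical. The kernel-based classifier trained on the resulting instances is not shown.

```python
from itertools import product

# Hypothetical toy knowledge base of (drug, effect) pairs.
# In the paper this role is played by the database built from package inserts.
KNOWN_PAIRS = {
    ("ibuprofeno", "dolor de cabeza"),
    ("paracetamol", "fiebre"),
}


def label_candidates(sentence, drugs, effects, known_pairs=KNOWN_PAIRS):
    """Return one (drug, effect, label) tuple per co-occurring pair.

    A pair is labeled True when the knowledge base lists it and False
    otherwise. Substring matching is an assumption made for this sketch;
    a real system would use a proper entity recognizer.
    """
    text = sentence.lower()
    found_drugs = [d for d in drugs if d in text]
    found_effects = [e for e in effects if e in text]
    return [
        (d, e, (d, e) in known_pairs)
        for d, e in product(found_drugs, found_effects)
    ]


if __name__ == "__main__":
    post = "Tomé ibuprofeno y se me quitó el dolor de cabeza, pero tuve náuseas"
    for instance in label_candidates(
        post,
        drugs=["ibuprofeno", "paracetamol"],
        effects=["dolor de cabeza", "náuseas", "fiebre"],
    ):
        print(instance)
```

Because the labels come from the database rather than from human annotation, they are noisy: a pair missing from the database is treated as negative even when the post does describe a real effect, which is one plausible reason for limited precision and recall in this setting.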
For relation extraction between drugs and their effects, the distant-supervision approach achieved a recall of 0.59 and a precision of 0.48.
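For readers who want a single summary figure, the reported precision and recall can be combined into the standard F1 score (their harmonic mean). The value below is derived here from the two reported numbers and is not quoted from the abstract; the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported for the distant-supervision approach.
    print(round(f1_score(precision=0.48, recall=0.59), 2))  # prints 0.53
```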
Extracting relations between drugs and their effects from social media is a complex challenge because of the characteristics of social media texts. These texts, typically posts or tweets, usually contain many grammatical errors and spelling mistakes. Moreover, patients refer to diseases, symptoms and indications using lay terminology, which lexical resources for languages other than English usually do not cover.