Zeng Daojian, Zhao Chao, Quan Zhe
Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China.
School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, China.
Front Genet. 2021 Feb 10;12:624307. doi: 10.3389/fgene.2021.624307. eCollection 2021.
Automatic extraction of chemical-induced disease (CID) relations from unstructured text is essential for disease treatment and drug development. In this task, some relational facts can only be inferred from the whole document rather than from a single sentence. Recently, researchers have investigated graph-based approaches to extract relations across sentences. These approaches iteratively combine information from neighboring nodes to model the interactions between entity mentions that occur in different sentences. Despite their success, one severe limitation of graph-based approaches is the over-smoothing problem, which reduces the model's discriminative ability. In this paper, we propose CID-GCN, an effective Graph Convolutional Network (GCN) with a gating mechanism, for CID relation extraction. Specifically, we construct a heterogeneous graph that contains mention, sentence, and entity nodes. The graph convolution operation is then employed to aggregate interactive information over the constructed graph. In particular, we combine the gating mechanism with the graph convolution operation to address the over-smoothing problem. The experimental results demonstrate that our approach significantly outperforms the baselines.
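The idea of pairing graph convolution with a gate can be illustrated with a small sketch. The abstract does not give the exact formulation used in CID-GCN, so everything below is an assumption made for illustration: the toy graph, the edge choices (mention to sentence, mention to entity, adjacent sentences), the highway-style gate, and the names `normalize_adjacency` and `gated_gcn_layer` are hypothetical. For brevity all edges are treated as a single type, although the graph described in the abstract is heterogeneous.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_gcn_layer(h, a_norm, w, w_gate, b_gate):
    """One graph convolution followed by an (assumed) highway-style gate.

    The gate decides, per node and per feature, how much of the aggregated
    neighborhood signal replaces the node's previous representation. Keeping
    part of the old state is one common way to limit over-smoothing.
    """
    candidate = np.tanh(a_norm @ h @ w)     # aggregate neighbors, then transform
    gate_in = np.concatenate([candidate, h], axis=1)
    gate = sigmoid(gate_in @ w_gate + b_gate)
    return gate * candidate + (1.0 - gate) * h


# Toy document graph (hypothetical): two mentions, two sentences, two entities.
nodes = ["mention:chemical", "mention:disease",
         "sentence:0", "sentence:1",
         "entity:chemical", "entity:disease"]
edges = [(0, 2), (1, 3),   # mention occurs in sentence
         (0, 4), (1, 5),   # mention refers to entity
         (2, 3)]           # adjacent sentences

adj = np.zeros((len(nodes), len(nodes)))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
a_norm = normalize_adjacency(adj)

dim = 4
rng = np.random.default_rng(0)
h = rng.normal(size=(len(nodes), dim))          # initial node features
w = 0.5 * rng.normal(size=(dim, dim))           # convolution weights
w_gate = 0.5 * rng.normal(size=(2 * dim, dim))  # gate weights
b_gate = np.zeros(dim)

# Stack two gated layers (weights shared here only to keep the sketch short).
h1 = gated_gcn_layer(h, a_norm, w, w_gate, b_gate)
h2 = gated_gcn_layer(h1, a_norm, w, w_gate, b_gate)
print(h2.shape)
```

In this sketch, a gate value near zero leaves a node's representation almost unchanged, while a value near one replaces it with the aggregated neighborhood signal. In a relation extraction setting, the final representations of the two entity nodes would typically be fed to a classifier, which is omitted here.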