Kafkas Şenay, Dunham Ian, McEntyre Johanna
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK.
Open Targets, Wellcome Genome Campus, Hinxton, CB10 1SD, UK.
J Biomed Semantics. 2017 Jun 6;8(1):20. doi: 10.1186/s13326-017-0131-3.
We present the Europe PMC literature component of Open Targets, a target validation platform that integrates multiple types of evidence to aid drug target identification and validation. The component identifies target-disease associations in documents from the Europe PMC literature database and ranks those documents by confidence score, using rules that encode expert-provided heuristic information. The confidence score of a document represents how valuable that document is for validating a given target-disease association; it reflects the credibility of the association as judged from properties of the text. The component has regularly supplied the platform with up-to-date data since December 2015.
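The rule-based ranking described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "properties of the text" are reduced to the document section in which a target and a disease co-occur. The section names, the weights, the cap at 1.0, and the names `SECTION_WEIGHTS`, `Mention`, `score_documents` and `rank_documents` are all hypothetical; they are not the rules or values used by the component.

```python
from dataclasses import dataclass

# Hypothetical weights: illustrative values only, not taken from the component.
SECTION_WEIGHTS = {"title": 1.0, "abstract": 0.8, "results": 0.6, "other": 0.3}


@dataclass
class Mention:
    """One target-disease co-occurrence found in a document (hypothetical type)."""
    doc_id: str
    section: str  # section of the document where the co-occurrence was found


def score_documents(mentions):
    """Sum a per-section weight for each co-occurrence, capped at 1.0 per document."""
    scores = {}
    for m in mentions:
        weight = SECTION_WEIGHTS.get(m.section, SECTION_WEIGHTS["other"])
        scores[m.doc_id] = min(1.0, scores.get(m.doc_id, 0.0) + weight)
    return scores


def rank_documents(mentions):
    """Return (doc_id, score) pairs, highest score first, ties broken by doc_id."""
    scores = score_documents(mentions)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Under these assumptions, a document that mentions the association in its title outranks one that mentions it only in passing elsewhere in the text, which is the kind of ordering a heuristic confidence score is meant to produce.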
Currently, a total of 1,168,365 distinct target-disease associations have been text mined from >26 million PubMed abstracts and >1.2 million Open Access full text articles. Our comparative analysis of the evidence data currently available in the platform revealed that 850,179 of these associations are identified exclusively by literature mining.
This component helps the platform's users by providing the most relevant literature hits for a given target and disease. The text mining evidence, along with the other types of evidence, can be explored visually through https://www.targetvalidation.org and all the evidence data are available for download in JSON format from https://www.targetvalidation.org/downloads/data.
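As a sketch of how downloaded evidence might be filtered for one target-disease pair, the example below assumes the data are stored as one JSON object per line. The field names `target_id`, `disease_id` and `score`, and the function name `top_literature_hits`, are hypothetical placeholders; the actual schema of the download is not described here and should be checked before use.

```python
import json


def top_literature_hits(lines, target_id, disease_id, limit=5):
    """Return up to `limit` records for one target-disease pair, best score first.

    `lines` is any iterable of strings, each holding one JSON object.
    The field names used here are hypothetical, not the platform's schema.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("target_id") == target_id and record.get("disease_id") == disease_id:
            hits.append(record)
    # Records without a score are treated as 0.0 and sort last.
    hits.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return hits[:limit]
```

Because the function accepts any iterable of lines, it can be given an open file handle directly, so a large download does not have to be loaded into memory before filtering.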