Unique insights from ClinicalTrials.gov by mining protein mutations and RSids in addition to applying the Human Phenotype Ontology.

Affiliation

The Harker School, San Jose, CA, United States of America.

Publication information

PLoS One. 2020 May 27;15(5):e0233438. doi: 10.1371/journal.pone.0233438. eCollection 2020.

Abstract

Researchers and clinicians face a significant challenge in keeping up to date with the rapid rate at which new associations between genetic mutations and diseases are reported. To address this problem, this research mined the ClinicalTrials.gov corpus to extract relevant biological insights, produce unique reports that summarize the findings, and make the metadata available via APIs. An automated text-analysis pipeline performed the following steps: parsing the ClinicalTrials.gov files, extracting and analyzing mutations from the corpus, mapping clinical trials to the Human Phenotype Ontology (HPO), and finding associations between clinical trials and HPO nodes. Unique reports were created for each mutation (SNPs and protein mutations) mentioned in the corpus, as well as for each clinical trial that references a mutation. These reports, which have been generated at multiple time points, along with APIs to access the metadata, are freely available at http://snpminertrials.com. Additionally, HPO was used to normalize disease terms and associate clinical trials with relevant genes. The creation of the pipeline and reports, the association of clinical trials with HPO terms, and the insights, public repository, and APIs produced are all novel in this work. The freely available resources present relevant biological information and novel insights into relationships among biomedical entities in a robust and accessible manner, mitigating the challenge of staying informed about new associations between mutations, genes, and diseases.
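
The mutation-extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the regular expressions, the function name `extract_mutations`, and the sample sentence are all assumptions made for illustration. It assumes RSids appear as "rs" followed by digits and that protein mutations appear in one-letter amino-acid form such as V600E.

```python
import re

# Assumption: RSids are written as "rs" followed by digits (dbSNP style).
RSID_RE = re.compile(r"\brs\d+\b", re.IGNORECASE)

# Assumption: protein mutations use one-letter amino-acid codes,
# e.g. V600E (valine at position 600 replaced by glutamate).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PROTEIN_MUTATION_RE = re.compile(
    rf"\b([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])\b"
)


def extract_mutations(text):
    """Return the unique RSids and protein mutations mentioned in text."""
    rsids = sorted({m.group(0).lower() for m in RSID_RE.finditer(text)})
    protein_mutations = sorted(
        {
            m.group(0)
            for m in PROTEIN_MUTATION_RE.finditer(text)
            # Skip matches where the residue does not change (e.g. A12A).
            if m.group(1) != m.group(3)
        }
    )
    return {"rsids": rsids, "protein_mutations": protein_mutations}


# Hypothetical trial description, used only to exercise the function.
sample = (
    "Patients with BRAF V600E mutation or EGFR T790M; "
    "genotyped for rs1801133."
)
result = extract_mutations(sample)
print(result)
```

A simple pattern like this will produce false positives on strings that merely resemble mutations (for example, product codes), so a real pipeline would likely need additional filtering or validation against a reference database.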


