Brown Nathan, Cambruzzi Jean, Cox Peter J, Davies Mark, Dunbar James, Plumbley Dean, Sellwood Matthew A, Sim Aaron, Williams-Jones Bryn I, Zwierzyna Magdalena, Sheppard David W
BenevolentAI, London, United Kingdom.
BenevolentAI, London, United Kingdom; Institute of Cardiovascular Science, University College London, London, United Kingdom.
Prog Med Chem. 2018;57(1):277-356. doi: 10.1016/bs.pmch.2017.12.003. Epub 2018 Feb 24.
Interpreting Big Data in drug discovery should shorten project timelines and reduce clinical attrition through better early decision making. The first challenges are the sheer volume of data and how to ingest it, followed by building an infrastructure that houses the data so it can be used efficiently and productively. The data itself brings many problems, including general reproducibility, but often it is the context surrounding an experiment that is critical to success. Help, in the form of artificial intelligence (AI), is required to understand and translate that context. Building on natural language processing pipelines, AI is also used to prospectively generate new hypotheses by linking data together. We explain Big Data in the context of biology, chemistry and clinical trials, showcasing some of the impressive public domain sources and initiatives now available for interrogation.