An infrastructure for precision medicine through analysis of big data.

Affiliations

Institute for Biomedical Technologies - National Research Council (CNR-ITB), via F.lli Cervi 93, Segrate, 20090, MI, Italy.

Centro Diagnostico Italiano, Via Simone Saint Bon 20, Milan, 20147, Italy.

Publication information

BMC Bioinformatics. 2018 Oct 15;19(Suppl 10):351. doi: 10.1186/s12859-018-2300-5.

Abstract

BACKGROUND

The increasing availability of omics data, driven by advances both in the acquisition of molecular biology results and in systems biology simulation technologies, provides the basis for precision medicine. Success in precision medicine depends on access to healthcare and biomedical data. To this end, the digitization of all clinical exams and medical records is becoming standard in hospitals. Digitization is essential to collect, share, and aggregate large volumes of heterogeneous data, supporting the discovery of hidden patterns with the aim of defining predictive models for biomedical purposes. Sharing patients' data is a critical process: it raises ethical, social, legal, and technological issues that must be properly addressed.

RESULTS

In this work, we present an infrastructure devised to integrate large volumes of heterogeneous biological data. The infrastructure was applied to data collected between 2010 and 2016 in one of the major diagnostic analysis laboratories in Italy. Data from three different platforms were collected (i.e., laboratory exams, pathological anatomy exams, biopsy exams). The infrastructure has been designed to allow the extraction and aggregation of both unstructured and semi-structured data. Data are treated to ensure security and privacy. Specialized algorithms have also been implemented to process the aggregated information, with the aim of obtaining a precise historical analysis of the clinical activities of one or more patients. Moreover, three Bayesian classifiers have been developed to analyze examinations reported as free text. Experimental results show that the classifiers exhibit good accuracy when used to analyze sentences related to sample location, disease presence, and illness status.
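To make the free-text classification step more concrete, the following is a minimal sketch of a multinomial naive Bayes classifier with Laplace smoothing, applied to the "disease presence" task. It is an illustration only: the abstract does not specify which Bayesian model, features, or tokenization the authors used, so the class name `NaiveBayesText`, the whitespace tokenizer, the labels `present`/`absent`, and the toy sentences are all assumptions made here.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Hypothetical multinomial naive Bayes over whitespace tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.class_counts = Counter()            # training sentences per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    @staticmethod
    def tokenize(sentence):
        # Assumption: lowercase + whitespace split stands in for real preprocessing.
        return sentence.lower().split()

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            self.class_counts[label] += 1
            for token in self.tokenize(sentence):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, sentence):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in self.tokenize(sentence):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy sentences; these are not taken from the paper's data.
train_sentences = [
    "no evidence of neoplasia",
    "negative for malignant cells",
    "no signs of carcinoma",
    "carcinoma present in the sample",
    "adenocarcinoma infiltrating the mucosa",
    "malignant cells detected",
]
train_labels = ["absent", "absent", "absent", "present", "present", "present"]

presence_clf = NaiveBayesText().fit(train_sentences, train_labels)
prediction_a = presence_clf.predict("no evidence of carcinoma")
prediction_b = presence_clf.predict("carcinoma infiltrating the mucosa")
```

Under these assumptions, one such classifier could be trained per task (sample location, disease presence, illness status), each with its own label set. A real system would need far more training data and clinical-text preprocessing than this toy example suggests.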

CONCLUSIONS

The infrastructure allows the integration of multiple heterogeneous sources of anonymized data from different clinical platforms. Both unstructured and semi-structured data are processed to obtain a precise historical analysis of the clinical activities of one or more patients. Data aggregation makes it possible to perform the statistical assessments required to answer complex questions in a variety of fields, such as predictive and precision medicine. In particular, studying the clinical history of patients who have developed similar pathologies can help to predict or identify markers that enable early diagnosis of possible illnesses.
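As a rough illustration of how records from several platforms might be merged into per-patient histories without exposing the original identifiers, here is a small sketch. It is not the authors' pipeline: the keyed-hash pseudonymization, the record fields (`patient_id`, `platform`, `date`, `result`), the helper names, and the sample records are all hypothetical, and a keyed hash is only one of several possible ways to protect identifiers.

```python
import hashlib
import hmac
from collections import defaultdict
from datetime import date

# Hypothetical secret; in practice it would be stored apart from the data.
SECRET_KEY = b"replace-with-a-secret-kept-outside-the-dataset"


def pseudonymize(patient_id):
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_histories(records):
    """Group records by pseudonym and order each patient's events by date."""
    histories = defaultdict(list)
    for rec in records:
        pid = pseudonymize(rec["patient_id"])
        histories[pid].append((rec["date"], rec["platform"], rec["result"]))
    for events in histories.values():
        events.sort(key=lambda event: event[0])
    return dict(histories)


# Invented records spanning the three platform types named in the abstract.
records = [
    {"patient_id": "A001", "platform": "biopsy",
     "date": date(2014, 5, 2), "result": "invented biopsy report"},
    {"patient_id": "A001", "platform": "laboratory",
     "date": date(2011, 3, 9), "result": "invented lab value"},
    {"patient_id": "B002", "platform": "pathological anatomy",
     "date": date(2012, 7, 21), "result": "invented pathology report"},
    {"patient_id": "A001", "platform": "pathological anatomy",
     "date": date(2013, 1, 15), "result": "invented pathology report"},
]

histories = build_histories(records)
```

Because the pseudonym is deterministic for a given key, records for the same patient from different platforms land in the same history, while the original identifier never appears in the aggregated output.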

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e467/6191972/1824af5716ab/12859_2018_2300_Fig1_HTML.jpg
