Bonin-Andresen M, Smiljanovic B, Stuhlmüller B, Sörensen T, Grützkau A, Häupl T
Medizinische Klinik mit Schwerpunkt Rheumatologie und Klinische Immunologie, Charité Universitätsmedizin, Charitéplatz 1, 10117, Berlin, Germany.
Deutsches Rheuma-Forschungszentrum (DRFZ) Berlin, Berlin, Germany.
Z Rheumatol. 2018 Apr;77(3):195-202. doi: 10.1007/s00393-018-0436-3.
Big data analysis raises the expectation that computerized algorithms can extract new knowledge from vast data sets that would otherwise be unmanageable. What are the algorithms behind the big data discussion? In principle, high-throughput technologies in molecular research introduced big data, together with the development and application of analysis tools, into rheumatology some 15 years ago. This applies especially to omics technologies such as genomics, transcriptomics and cytomics. Some basic methods of data analysis are provided along with the technology; functional analysis and interpretation, however, require existing software tools to be adapted or new ones to be developed. For these steps, structuring and evaluating the data according to their biological context is extremely important; it is not merely a mathematical problem. This aspect has to be considered much more for molecular big data than for data analyzed in health economics or epidemiology. Molecular data are structured in the first instance by the technology applied and show quantitative characteristics that follow the principles of their biological nature. These biological dependencies have to be integrated into software solutions, which may require networking molecular big data from the same or even different technologies in order to achieve cross-technology confirmation. The increasingly extensive recording of molecular processes in individual patients is also generating personal big data and requires new management strategies in order to develop data-driven concepts for individualized interpretation. With this perspective in mind, translating information derived from molecular big data will also require new specifications for education and professional competence.