Bellazzi R
Riccardo Bellazzi, Biomedical Informatics Labs "Mario Stefanelli", Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Tel: +39 0382 985720, +39 0382 985059, +39 0382 985981, Fax: +39 0382 985373, E-mail:
Yearb Med Inform. 2014 May 22;9(1):8-13. doi: 10.15265/IY-2014-0024.
Big data are receiving increasing attention in biomedicine and healthcare. It is therefore important to understand why big data are assuming a crucial role for the biomedical informatics community. The capability of handling big data is becoming an enabler for unprecedented research studies and for new models of healthcare delivery. It is first necessary to understand in depth the four elements that constitute big data, namely Volume, Variety, Velocity, and Veracity, and their meaning in practice. It is then necessary to understand where big data are present and where they can be beneficially collected. Some research fields, such as translational bioinformatics, need to rely on big data technologies to withstand the shock wave of data generated every day. Other areas, ranging from epidemiology to clinical care, can benefit from exploiting the large amounts of data now available, from personal monitoring to primary care. However, building big data-enabled systems carries relevant implications for the reproducibility of research studies and for the management of privacy and data access; proper actions should be taken to deal with these issues. An interesting consequence of the big data scenario is the availability of new software, methods, and tools, such as map-reduce, cloud computing, and concept drift machine learning algorithms, which will not only contribute to big data research but may also be beneficial in many biomedical informatics applications. The way forward with the big data opportunity will require properly applied engineering principles to design studies and applications, to avoid preconceptions or over-enthusiasm, to fully exploit the available technologies, and to improve data processing and data management regulations.