Department of Internal Medicine and Clinical Epidemiology, Princess Alexandra Hospital, Brisbane, Queensland, Australia.
Intern Med J. 2019 Jan;49(1):126-129. doi: 10.1111/imj.14172.
Big Data are characterised by greater volumes of data from a greater variety of sources, produced and processed at greater velocity. Huge digitised datasets from electronic medical records, registries, administrative datasets and genomic databanks can now be analysed by advanced computer programs to reveal patterns, trends and associations previously indiscernible using conventional analytic methods. These new insights may have important implications for clinical care. But Big Data can be limited by the inaccuracies and biases inherent in observational datasets, which cannot be eliminated simply by using ever-larger datasets or more sophisticated software. The hope and hype of Big Data must not be allowed to override the potential for harm or the need for concurrent development of new research designs, better analytic methods and rigorous evaluation of predictive accuracy and of effects on care and outcomes.