Bui Alex A T, Van Horn John Darrell
BD2K Centers Coordinating Center (BD2K CCC), University of California, Los Angeles, Los Angeles, CA, USA. Electronic address: http://www.bd2kccc.org.
BD2K Training Coordinating Center (BD2K TCC), University of Southern California, Los Angeles, CA, USA. Electronic address: http://www.bigdatau.org.
J Biomed Inform. 2017 May;69:115-117. doi: 10.1016/j.jbi.2017.03.017. Epub 2017 Mar 30.
With the increasing availability of more efficient data collection procedures, biomedical scientists now confront ever larger data sets and often struggle to process and interpret what they have gathered, even as still more data continue to accumulate. This torrent of biomedical information necessitates creative thinking about how the data are generated, how they might best be managed and analyzed, and ultimately how they can be transformed into further scientific understanding for improving patient care. Recognizing this as a major challenge, the National Institutes of Health (NIH) has spearheaded the "Big Data to Knowledge" (BD2K) program, the agency's most ambitious biomedical informatics effort to date. In this commentary, we describe how the NIH has taken on "big data" science head-on, how a consortium of leading research centers is developing the means for handling large-scale data, and how such activities are being marshalled for the training of a new generation of biomedical data scientists. All in all, the NIH BD2K program seeks to position data science at the heart of 21st century biomedical research.