BD2K Training Coordinating Center's ERuDIte: the Educational Resource Discovery Index for Data Science.

Author information

Ambite José Luis, Fierro Lily, Gordon Jonathan, Burns Gully A, Geigl Florian, Lerman Kristina, Van Horn John D

Affiliations

University of Southern California's Information Sciences Institute (ISI), Marina del Rey, CA 90292.

performed as a visiting Ph.D. student at ISI.

Publication information

IEEE Trans Emerg Top Comput. 2021 Jan-Mar;9(1):316-328. doi: 10.1109/tetc.2019.2903466. Epub 2019 Mar 6.

Abstract

Data science is a field that has developed to enable efficient integration and analysis of increasingly large data sets in many domains. In particular, big data in genetics, neuroimaging, mobile health, and other subfields of biomedical science promises new insights, but also poses challenges. To address these challenges, the National Institutes of Health launched the Big Data to Knowledge (BD2K) initiative, including a Training Coordinating Center (TCC) tasked with developing a resource for personalized data science training for biomedical researchers. The BD2K TCC web portal is powered by ERuDIte, the Educational Resource Discovery Index, which collects training resources for data science, including online courses, videos of tutorials and research talks, textbooks, and other web-based materials. While the availability of so many potential learning resources is exciting, they are highly heterogeneous in quality, difficulty, format, and topic, making the field intimidating to enter and difficult to navigate. Moreover, data science is rapidly evolving, so there is a constant influx of new materials and concepts. We leverage data science techniques to build ERuDIte itself, using data extraction, data integration, machine learning, information retrieval, and natural language processing to automatically collect, integrate, describe, and organize existing online resources for learning data science.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08f5/9089329/6441770cf139/nihms-1681422-f0001.jpg
