Data science technology course: The design, assessment and computing environment perspectives.

Author information

Ismail Azlan, Mutalib Sofianita, Haron Haryani

Affiliations

Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA (UiTM), 40450 Shah Alam, Selangor Malaysia.

School of Computing Sciences, College of Computing, Informatics and Media, Universiti Teknologi MARA (UiTM), 40450 Shah Alam, Selangor Malaysia.

Publication information

Educ Inf Technol (Dordr). 2023 Jan 24:1-26. doi: 10.1007/s10639-022-11558-8.

Abstract

This article discusses the key elements of the Data Science Technology course offered to postgraduate students enrolled in the Master of Data Science program. The course complements the existing curriculum by providing the skills to handle the Big Data platform and tools, in addition to data science activities. We frame the discussion around three main requirements: the need to exploit key skills from two dimensions, namely Data Science and Big Data; the need for a cluster-based computing platform; and the need for that platform to be accessible. We address these requirements by presenting the course design and its assessments, the configuration of the computing platform, and the strategy to enable flexible accessibility. In terms of course design, the course introduces several innovative elements and covers multiple key areas of the data science body of knowledge and multiple quadrants of the job and skills matrix. In terms of the computing platform, a stable deployment of a Hadoop cluster has been established, with flexible accessibility prompted by the pandemic situation. Furthermore, our experience with implementing the cluster shows that it can handle computing problems with a larger dataset than the one used in the semesters within the scope of the study. We also provide some reflections and highlight future improvements.
