Divide-and-train: A new approach to improve the predictive tasks of bike-sharing systems.

Authors

Ali Ahmed, Salah Ahmad, Bekhit Mahmoud, Fathalla Ahmed

Affiliations

Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia.

Higher Future Institute for Specialized Technological Studies, Cairo 3044, Egypt.

Publication information

Math Biosci Eng. 2024 Jul 2;21(7):6471-6492. doi: 10.3934/mbe.2024282.

Abstract

Bike-sharing systems (BSSs) have become commonplace in most cities worldwide and are an important part of many smart cities. These systems continuously generate large volumes of data. Their effectiveness depends on making decisions at the proper time, so there is a vital need to build predictive models on BSS data to improve decision-making. The overwhelming majority of BSS users register before using the service, so several BSSs have prior knowledge of user data such as age, gender, and other relevant details. Several machine learning and deep learning models are used to predict, for instance, urban flows, trip duration, and other factors. The standard practice is to train these models on the entire dataset, even though the biking patterns of different users are intuitively distinct; for instance, a user's age influences the duration of a trip. The existence of such distinct user patterns motivated this work. We proposed divide-and-train, a new method for training predictive models on station-based BSS datasets by dividing the original dataset according to the values of a given dataset attribute. The proposed method was then validated on different machine learning and deep learning models. All employed models were trained on both the complete and the split datasets, and the resulting improvements in the evaluation metrics were reported. Results demonstrated that the proposed method outperformed the conventional training approach. Specifically, the root mean squared error (RMSE) and mean absolute error (MAE) improved for both trip duration and distance prediction, with an average accuracy of 85% across the divided sub-datasets for the best-performing model, random forest.
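
The comparison described in the abstract (one model trained on the complete dataset versus one model per value of a chosen attribute, both scored with RMSE and MAE) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the trip data are synthetic, the attribute `age_group` and its per-group paces are invented for the example, and a one-variable least-squares line stands in for the random forest and deep learning models used in the paper. All function names (`fit_line`, `make_trips`, `evaluate`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the paper's models)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b  # (intercept, slope)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def make_trips(n, rng):
    """Synthetic trips as (age_group, distance_km, duration_min); values are made up."""
    pace = {"young": 4.0, "senior": 7.0}  # assumed minutes per km, illustration only
    trips = []
    for _ in range(n):
        group = rng.choice(list(pace))
        dist = rng.uniform(0.5, 10.0)
        trips.append((group, dist, pace[group] * dist + rng.gauss(0.0, 1.0)))
    return trips


def evaluate(train, test):
    """Score conventional training and divide-and-train on the same test trips."""
    y_true = [t[2] for t in test]

    # Conventional approach: a single model fitted on the complete training set.
    full = fit_line([t[1] for t in train], [t[2] for t in train])
    full_pred = [full[0] + full[1] * t[1] for t in test]

    # Divide-and-train: split the training set on the attribute, fit one model per subset.
    by_group = defaultdict(list)
    for t in train:
        by_group[t[0]].append(t)
    models = {
        g: fit_line([t[1] for t in rows], [t[2] for t in rows])
        for g, rows in by_group.items()
    }
    # Each test trip is routed to its group's model (global model if the group is unseen).
    split_pred = []
    for t in test:
        a, b = models.get(t[0], full)
        split_pred.append(a + b * t[1])

    return {
        "full_rmse": rmse(y_true, full_pred),
        "full_mae": mae(y_true, full_pred),
        "split_rmse": rmse(y_true, split_pred),
        "split_mae": mae(y_true, split_pred),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    trips = make_trips(500, rng)
    for name, value in evaluate(trips[:400], trips[400:]).items():
        print(f"{name}: {value:.3f}")
```

Because the synthetic groups are given different paces by construction, the per-group models score lower errors here; that outcome illustrates the mechanism only and says nothing about the size of the improvement reported in the paper.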
