Department of Computer Science, University of Pisa, 56126 Pisa, Italy.
Zerynth, 56124 Pisa, Italy.
Sensors (Basel). 2023 Jul 1;23(13):6078. doi: 10.3390/s23136078.
Small and medium-sized enterprises (SMEs) often encounter practical challenges and limitations when extracting valuable insights from the data of retrofitted or brownfield equipment. The existing literature does not fully reflect the reality and potential of data-driven analysis in current SME environments. In this paper, we provide an anonymized dataset obtained from two medium-sized companies using a non-invasive and scalable data-collection procedure. The dataset consists mainly of machine power-consumption data, collected over periods of 7 months and 1 year at the two companies. Using this dataset, we demonstrate how machine learning (ML) techniques can enable SMEs to extract useful information even in the short term and from only a small variety of data types. We develop several ML models to address tasks such as power consumption forecasting, item classification, next machine state prediction, and item production count forecasting. By providing this anonymized dataset and showcasing its application through these ML use cases, we aim to offer practical insights for SMEs seeking to apply ML techniques to their limited data resources. The findings contribute to a better understanding of how ML can be used to extract actionable insights from limited datasets in practical SME settings.