Structuring a textile knitting dataset for machine learning and data mining applications.

Author information

Ahmed Toufique, Junayed Abu Saleh Muhammad

Affiliations

Dept. of Textile Engineering, Faculty of Engineering, Daffodil International University, Dhaka-1216, Bangladesh.

Knitting Section, Fakhruddin Textile Mills Limited, Gazipur, Bangladesh.

Publication information

Data Brief. 2025 Jul 12;61:111873. doi: 10.1016/j.dib.2025.111873. eCollection 2025 Aug.

Abstract

Knitting is a vital sector of the fabric manufacturing industry. Concurrently, machine learning is emerging as a highly regarded technique for predicting patterns and classifying parameters derived from datasets. This study aims to establish a comprehensive database of knitted fabrics covering parameters related to yarn types, machinery, and fabric characteristics. The raw data were collected from a knitting factory and then pre-processed using domain knowledge and Python. The pre-processing included data cleaning, normalization, and feature engineering, all of which were crucial to ensuring the quality and usability of the dataset. Drawing on expertise in knitting science, several new parameters were formulated, and specific complex parameters were split into two or three distinct components. The finalized dataset has 12,569 rows and 38 columns. This article also discusses potential applications of the dataset, such as identifying a polynomial relationship between grams per square meter (GSM) and yarn count for single jersey fabrics, with an R² score of 0.77. Furthermore, a quadratic relationship between the tightness factor and stitch length was observed, with an R² score of 0.78. Among the machine learning models evaluated for predicting GSM, Random Forest and XGBoost consistently outperformed the others across all metrics (R² score, Mean Absolute Error, and Mean Squared Error).
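
The quadratic fit and R² score mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the helper `fit_polynomial`, the variable names, the yarn count, and the synthetic data are all hypothetical, and the tightness factor is assumed to follow the common textbook definition sqrt(tex) / stitch length in cm, which may differ from the definition used in the dataset.

```python
import numpy as np

def fit_polynomial(x, y, degree=2):
    """Least-squares polynomial fit; returns (coefficients, R^2).

    Hypothetical helper for illustration, not taken from the article.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)          # coefficients, highest power first
    y_hat = np.polyval(coeffs, x)              # fitted values
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    return coeffs, 1.0 - ss_res / ss_tot

# Synthetic illustration only; these are NOT values from the dataset.
rng = np.random.default_rng(0)
stitch_length_mm = np.linspace(2.4, 3.2, 50)
tex = 19.7                                     # assumed yarn linear density
# Assumed textbook definition: sqrt(tex) / stitch length in cm, plus noise.
tightness_factor = np.sqrt(tex) / (stitch_length_mm / 10.0) + rng.normal(0, 0.2, 50)

coeffs, r2 = fit_polynomial(stitch_length_mm, tightness_factor, degree=2)
print("quadratic coefficients:", coeffs)
print("R^2:", r2)
```

On real factory data the fit would be noisier than on this synthetic series, which is consistent with the reported R² values being below 1.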

