Learning to Classify DWDM Optical Channels from Tiny and Imbalanced Data.

Author information

Cichosz Paweł, Kozdrowski Stanisław, Sujecki Sławomir

Affiliations

Computer Science Institute, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland.

Faculty of Electronics, Military University of Technology, S. Kaliskiego 2, 00-908 Warsaw, Poland.

Publication information

Entropy (Basel). 2021 Nov 13;23(11):1504. doi: 10.3390/e23111504.

Abstract

Applying machine learning algorithms to assess transmission quality in optical networks poses substantial challenges. Datasets that could provide training instances tend to be small and heavily imbalanced. This requires applying imbalance compensation techniques when using binary classification algorithms, but it also makes one-class classification, which learns only from instances of the majority class, a noteworthy alternative. This work examines the utility of both approaches using a real dataset from a Dense Wavelength Division Multiplexing (DWDM) network operator, gathered through the network control plane. The dataset is indeed very small and contains very few examples of "bad" paths that do not deliver the required level of transmission quality. Two binary classification algorithms, random forest and extreme gradient boosting, are used in combination with two imbalance handling methods, instance weighting and synthetic minority class instance generation. Their predictive performance is compared with that of four one-class classification algorithms: one-class SVM, one-class naive Bayes classifier, isolation forest, and maximum entropy modeling. The one-class approach turns out to be clearly superior, particularly with respect to classification precision, making it possible to obtain more practically useful models.
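
To make the one-class idea concrete, the following is a minimal sketch of a one-class naive Bayes classifier that learns only from "good" (majority class) paths and flags unlikely instances as "bad". It is not the authors' implementation: the Gaussian per-feature densities, the 5% training quantile used as the decision threshold, the two synthetic features, and all function names are assumptions made purely for illustration.

```python
import math
import random
import statistics


def fit_gaussians(rows):
    """Estimate a (mean, std) pair per feature from 'good' rows only."""
    return [
        (statistics.fmean(col), statistics.pstdev(col) or 1e-9)
        for col in zip(*rows)
    ]


def log_likelihood(params, row):
    """Naive Bayes score: sum of independent per-feature Gaussian log-densities."""
    return sum(
        -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)
        for (mean, std), x in zip(params, row)
    )


def fit_threshold(params, rows, quantile=0.05):
    """Pick the score below which roughly `quantile` of the training rows fall."""
    scores = sorted(log_likelihood(params, row) for row in rows)
    return scores[int(quantile * len(scores))]


def predict_bad(params, threshold, row):
    """Flag a path as 'bad' when it looks unlikely under the 'good' model."""
    return log_likelihood(params, row) < threshold


def precision_recall(predicted, actual):
    """Precision and recall for the 'bad' class (True means 'bad')."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic stand-in data (assumption): many 'good' paths, very few 'bad' ones.
rng = random.Random(0)
good_train = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
good_test = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
bad_test = [(6.0, 6.5), (-7.0, 5.0), (8.0, -6.0)]

# Training uses 'good' instances only; 'bad' ones appear only at test time.
params = fit_gaussians(good_train)
threshold = fit_threshold(params, good_train)

test_rows = good_test + bad_test
actual = [False] * len(good_test) + [True] * len(bad_test)
predicted = [predict_bad(params, threshold, row) for row in test_rows]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the model never sees a "bad" instance during training, the scarcity of the minority class does not affect fitting. The threshold quantile trades precision against recall and would need tuning on real data.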

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e17b/8623617/8c388c474f88/entropy-23-01504-g001.jpg
