Department of Computer Science and IT, University of Azad Jammu and Kashmir, Muzaffarabad 13100, Pakistan.
Raptor Interactive (Pty) Ltd., Eco Boulevard, Witch Hazel Ave, Centurion 0157, South Africa.
Sensors (Basel). 2022 Feb 25;22(5):1836. doi: 10.3390/s22051836.
Braille is used as a mode of communication all over the world, and technological advancements are transforming the way it is read and written. This study developed an English Braille pattern identification system by applying robust machine learning techniques to the English Braille Grade-1 dataset. The dataset was collected using a touchscreen device from visually impaired students of the National Special Education School Muzaffarabad. For better visualization, the dataset was divided into two classes, class 1 (characters 1-13, a-m) and class 2 (characters 14-26, n-z), covering the 26 English Braille characters. A position-free Braille text entry method was used to generate synthetic data. The final dataset included N = 2512 cases. Support Vector Machine (SVM), Decision Tree (DT) and K-Nearest Neighbor (KNN) classifiers were used with Reconstruction Independent Component Analysis (RICA) and Principal Component Analysis (PCA)-based feature extraction methods for Braille to English character recognition. The RICA-based feature extraction method achieved better results than PCA, the Random Forest (RF) algorithm and Sequential methods. The evaluation metrics were the True Positive Rate (TPR), True Negative Rate (TNR), Positive Predictive Value (PPV), Negative Predictive Value (NPV), False Positive Rate (FPR), Total Accuracy, Area Under the Receiver Operating Characteristic Curve (AUC) and F1-Score. A statistical test was also performed to assess the significance of the results.
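As an illustration of the two-class grouping and the count-based evaluation metrics listed above, the following is a minimal Python sketch and not the study's implementation. The function names `braille_class` and `binary_metrics` are hypothetical, and the confusion-matrix counts in the usage lines are made-up values chosen only to show the arithmetic; they are not results from the paper. AUC is omitted because it requires classifier scores rather than counts.

```python
import math


def braille_class(letter: str) -> int:
    """Map a letter to the abstract's grouping: a-m -> class 1, n-z -> class 2."""
    if len(letter) != 1:
        raise ValueError(f"expected a single character, got {letter!r}")
    index = ord(letter.lower()) - ord("a") + 1  # a=1 ... z=26
    if not 1 <= index <= 26:
        raise ValueError(f"not an English letter: {letter!r}")
    return 1 if index <= 13 else 2


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the count-based metrics from a 2x2 confusion matrix."""
    return {
        "TPR": _ratio(tp, tp + fn),            # sensitivity / recall
        "TNR": _ratio(tn, tn + fp),            # specificity
        "PPV": _ratio(tp, tp + fp),            # precision
        "NPV": _ratio(tn, tn + fn),
        "FPR": _ratio(fp, fp + tn),            # equals 1 - TNR
        "Accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "F1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


# Illustrative counts only (assumed, not taken from the study).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Treating class 1 as the positive class is an arbitrary choice here; swapping the roles exchanges TPR with TNR and PPV with NPV.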