Department of Computer Engineering, Kerman Branch, Islamic Azad University, Kerman, Iran.
Department of Advanced Research, Bushehr University of Medical Sciences, Bushehr, Iran.
BMC Med Inform Decis Mak. 2024 Aug 5;24(1):220. doi: 10.1186/s12911-024-02613-0.
Spelling accuracy in Electronic Health Records (EHRs) is critical for efficient clinical care, research, and patient safety. The Persian language, with its rich vocabulary and complex characteristics, poses unique challenges for real-word error correction, in which a misspelling happens to form another valid word. This research aimed to develop an approach for detecting and correcting spelling errors in Persian clinical text.
Our strategy employs a state-of-the-art pre-trained model fine-tuned for spelling correction in the Persian clinical domain. The model is complemented by a new orthographic similarity matching algorithm, PERTO, which ranks correction candidates by the visual similarity of their characters.
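The idea of ranking candidates by visual similarity can be illustrated with a minimal sketch. This is not the PERTO algorithm itself, whose details are not given here; it is an assumed stand-in that uses a weighted edit distance in which substituting one Persian letter for another of the same base shape (for example ب and ت, which differ only in their dots) costs less than an unrelated substitution. The letter groups, the 0.5 cost, and the names `SIMILAR_GROUPS`, `visual_distance`, and `rank_candidates` are all hypothetical.

```python
# Illustrative groups of Persian letters that share a base shape and differ
# mainly in dots or a small stroke. Assumed for this sketch, not PERTO's table.
SIMILAR_GROUPS = ["بپتث", "جچحخ", "دذ", "رزژ", "سش", "صض", "طظ", "عغ", "فق", "کگ"]
GROUP_OF = {ch: i for i, group in enumerate(SIMILAR_GROUPS) for ch in group}


def substitution_cost(a: str, b: str) -> float:
    """Return 0 for identical letters, 0.5 for look-alikes, 1 otherwise."""
    if a == b:
        return 0.0
    group_a, group_b = GROUP_OF.get(a), GROUP_OF.get(b)
    if group_a is not None and group_a == group_b:
        return 0.5  # assumed discount for visually similar letters
    return 1.0


def visual_distance(source: str, target: str) -> float:
    """Levenshtein distance with the visually weighted substitution cost."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [float(i)]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + 1.0,                        # delete s
                curr[j - 1] + 1.0,                    # insert t
                prev[j - 1] + substitution_cost(s, t),  # substitute s with t
            ))
        prev = curr
    return prev[-1]


def rank_candidates(observed: str, candidates: list[str]) -> list[str]:
    """Order candidates from most to least visually similar to the observed token."""
    return sorted(candidates, key=lambda c: visual_distance(observed, c))


if __name__ == "__main__":
    # "بب" is a non-word; "تب" (fever) differs by a look-alike letter,
    # "لب" (lip) by an unrelated one, so "تب" is ranked first.
    print(rank_candidates("بب", ["لب", "تب"]))
```

In a full pipeline, a score like this would presumably be combined with the language model's contextual probability for each candidate; how PERTO does that is not described in this abstract.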
The evaluation showed that our approach is robust and precise in detecting and correcting word errors in Persian clinical text. For non-word error correction, the model achieved an F1-Score of 90.0% when the PERTO algorithm was employed. For real-word error detection, its highest F1-Score was 90.6%. For real-word error correction, the model reached its highest F1-Score, 91.5%, when the PERTO algorithm was employed.
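For readers less familiar with the metric, the F1-Score is the harmonic mean of precision and recall. The sketch below shows the standard computation from raw counts; the function name `f1_score` and the counts in the usage line are made up for illustration and are not taken from the study.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Compute F1 as the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0  # nothing predicted or nothing to find
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented counts: 8 errors corrected properly, 2 wrong corrections, 2 misses.
    print(f1_score(true_positives=8, false_positives=2, false_negatives=2))
```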
Despite certain limitations, our method represents a substantial advance in spelling error detection and correction for Persian clinical text. By addressing the unique challenges posed by the Persian language, it paves the way for more accurate and efficient clinical documentation, contributing to improved patient care and safety. Future research could explore its use in other areas of the Persian medical domain to extend its impact and utility.