Minimizing Bleed-Through Effect in Medieval Manuscripts with Machine Learning and Robust Statistics.

Author information

Ettari Adriano, Brescia Massimo, Conte Stefania, Momtaz Yahya, Russo Guido

Affiliation

Department of Physics E. Pancini, University of Naples Federico II, Via Vicinale Cupa Cinthia, 26, 80126 Napoli, Italy.

Publication information

J Imaging. 2025 Apr 28;11(5):136. doi: 10.3390/jimaging11050136.

Abstract

Over the last decades, many ancient manuscripts have been digitized all over the world, particularly in Europe. The usability of these huge digital archives is often limited by the bleed-through effect, caused by the acidic nature of the inks used, which results in very noisy images. Several authors have recently worked on bleed-through removal using different approaches. With the aim of developing a bleed-through removal tool capable of batch application to a large number of images, on the order of hundreds of thousands, we applied four methods drawn from machine learning and robust statistics to two medieval manuscripts. The methods are (i) non-local means (NLM); (ii) Gaussian mixture models (GMMs); (iii) biweight estimation; and (iv) Gaussian blur. The results on these two manuscripts show that the methods are, in general, quite effective at bleed-through removal, but the method has to be selected according to the characteristics of the manuscript. For example, if there is no ink fading and the difference between bleed-through pixels and the foreground text is clear, a stronger model can be used without the risk of losing important information. Conversely, if the distinction between bleed-through and foreground pixels is less pronounced, it is better to use a weaker model to preserve useful details.
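
The abstract does not say how biweight estimation is applied to the page images, so the following is only a minimal sketch of the underlying robust statistic, not the authors' pipeline. It computes a one-step Tukey biweight location estimate for a list of pixel intensities. The function name `biweight_location`, the tuning constant `c = 6.0`, and the sample intensities are assumptions chosen for illustration.

```python
from statistics import mean, median


def biweight_location(values, c=6.0):
    """One-step Tukey biweight location estimate (illustrative sketch).

    Points far from the median, measured in units of c * MAD, receive
    zero weight, so a few extreme pixels barely move the estimate.
    """
    values = list(values)
    m = median(values)
    # Median absolute deviation: a robust measure of spread.
    mad = median(abs(v - m) for v in values)
    if mad == 0:
        # At least half the values equal the median; return it directly.
        return float(m)
    num = 0.0
    den = 0.0
    for v in values:
        u = (v - m) / (c * mad)
        if abs(u) < 1.0:
            w = (1.0 - u * u) ** 2  # biweight weighting function
            num += (v - m) * w
            den += w
    return m + num / den


# Hypothetical grey levels: bright parchment near 200 plus two dark ink pixels.
pixels = [200, 202, 198, 201, 199, 203, 197, 60, 55]
print("mean:    ", mean(pixels))
print("biweight:", biweight_location(pixels))
```

On this made-up sample the biweight estimate stays close to the parchment level of about 200, while the plain mean is pulled strongly toward the two dark pixels. That insensitivity to outliers is the general reason a robust estimator can be useful for characterizing a page background; how the paper actually uses it is not stated in the abstract.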

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d9/12112446/58199bf6d300/jimaging-11-00136-g001.jpg
