
Deep Aramaic: Towards a synthetic data paradigm enabling machine learning in epigraphy.

Affiliations

Faculty of Theology and Religious Science, University of Strasbourg, Strasbourg, France.

Faculty of Humanities, History, Ancient History, University of Amsterdam, Amsterdam, Netherlands.

Publication information

PLoS One. 2024 Apr 19;19(4):e0299297. doi: 10.1371/journal.pone.0299297. eCollection 2024.

DOI:10.1371/journal.pone.0299297
PMID:38640100
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11029639/
Abstract

Epigraphy is witnessing a growing integration of artificial intelligence, notably through its subfield of machine learning (ML), especially in tasks like extracting insights from ancient inscriptions. However, scarce labeled data for training ML algorithms severely limits current techniques, especially for ancient scripts like Old Aramaic. Our research pioneers an innovative methodology for generating synthetic training data tailored to Old Aramaic letters. Our pipeline synthesizes photo-realistic Aramaic letter datasets, incorporating textural features, lighting, damage, and augmentations to mimic real-world inscription diversity. Despite minimal real examples, we engineer a dataset of 250 000 training and 25 000 validation images covering the 22 letter classes in the Aramaic alphabet. This comprehensive corpus provides a robust volume of data for training a residual neural network (ResNet) to classify highly degraded Aramaic letters. The ResNet model demonstrates 95% accuracy in classifying real images from the 8th century BCE Hadad statue inscription. Additional experiments validate performance on varying materials and styles, proving effective generalization. Our results validate the model's capabilities in handling diverse real-world scenarios, proving the viability of our synthetic data approach and avoiding the dependence on scarce training data that has constrained epigraphic analysis. Our innovative framework elevates interpretation accuracy on damaged inscriptions, thus enhancing knowledge extraction from these historical resources.
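
The abstract describes the pipeline only at a high level: letters are rendered, then varied with texture, lighting, damage, and augmentations to give labeled images for 22 classes. The following is a toy sketch of that general idea, not the authors' implementation. The stroke-pattern glyphs, the function names (`make_glyph`, `degrade`, `synth_dataset`), the image size, and all parameter values are assumptions made for illustration; the paper's photo-realistic rendering is not reproduced here.

```python
import random

NUM_CLASSES = 22  # letter classes in the Aramaic alphabet, per the abstract


def make_glyph(label, size=16):
    """Hypothetical stand-in for a rendered letter: one vertical and one
    horizontal stroke whose positions depend on the class label."""
    img = [[0.0] * size for _ in range(size)]
    col = 2 + label % (size - 4)
    row = 2 + (label * 3) % (size - 4)
    for r in range(2, size - 2):
        img[r][col] = 1.0
    for c in range(2, size - 2):
        img[row][c] = 1.0
    return img


def degrade(img, rng, noise=0.1, max_patch=4):
    """Apply illustrative lighting, texture noise, and damage to a glyph."""
    size = len(img)
    # Lighting: a linear brightness ramp from left to right.
    lo, hi = rng.uniform(0.6, 1.0), rng.uniform(0.6, 1.0)
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            light = lo + (hi - lo) * c / (size - 1)
            # Texture: additive Gaussian noise, clipped to [0, 1].
            v = img[r][c] * light + rng.gauss(0.0, noise)
            row.append(min(1.0, max(0.0, v)))
        out.append(row)
    # Damage: erase one randomly placed square patch.
    p = rng.randint(1, max_patch)
    r0 = rng.randrange(size - p + 1)
    c0 = rng.randrange(size - p + 1)
    for r in range(r0, r0 + p):
        for c in range(c0, c0 + p):
            out[r][c] = 0.0
    return out


def synth_dataset(n, seed=0, size=16):
    """Return n (image, label) pairs with labels cycling over all classes."""
    rng = random.Random(seed)
    return [
        (degrade(make_glyph(i % NUM_CLASSES, size), rng), i % NUM_CLASSES)
        for i in range(n)
    ]
```

Calling `synth_dataset(250_000)` would give a class-balanced set at the scale the abstract mentions for training, though with these toy glyphs it would say nothing about performance on real inscriptions. A fixed `seed` keeps the generated set reproducible.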

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/6a4b7b9f48db/pone.0299297.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/8d86f88f1dc9/pone.0299297.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/173da303c081/pone.0299297.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/48fc03c3f888/pone.0299297.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/e3f78d1ab91a/pone.0299297.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/00e80798c880/pone.0299297.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/1088f476cecb/pone.0299297.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/7c234342a286/pone.0299297.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/9c5dee6eb943/pone.0299297.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/992692381ec6/pone.0299297.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/a68e5483d1b6/pone.0299297.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/6e71e19f0bb1/pone.0299297.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/4f6b2cd8d122/pone.0299297.g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/7efcb289d995/pone.0299297.g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/1219dcc748ba/pone.0299297.g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/6bba9da967b4/pone.0299297.g016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/d3b1a1621e9e/pone.0299297.g017.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c598/11029639/ccb07ac4fa10/pone.0299297.g018.jpg
