Do the Math: Making Mathematics in Wikipedia Computable.

Author information

Greiner-Petter Andre, Schubotz Moritz, Breitinger Corinna, Scharpf Philipp, Aizawa Akiko, Gipp Bela

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):4384-4395. doi: 10.1109/TPAMI.2022.3195261. Epub 2023 Mar 7.

Abstract

Wikipedia combines AI solutions with human reviewers to safeguard article quality. Quality control objectives include detecting malicious edits, fixing typos, and spotting inconsistent formatting. However, no automated quality control mechanisms currently exist for mathematical formulae. Spell checkers are widely used to highlight textual errors, yet no equivalent tool exists to detect algebraically incorrect formulae. Our paper addresses this shortcoming by making mathematical formulae computable. We present a method that (1) gathers the semantic information from the context surrounding each mathematical formula, (2) provides access to that information in a graph-structured dependency hierarchy, and (3) performs automatic plausibility checks on equations. We evaluate the performance of our approach on 6,337 mathematical expressions contained in 104 Wikipedia articles on the topic of orthogonal polynomials and special functions. Our system verified 358 out of 1,516 equations as error-free. It successfully translated 27% of the mathematical expressions and outperformed existing translation approaches by 16%. Additionally, it achieved an F1 score of 0.495 for annotating mathematical expressions with relevant textual descriptions, which is a significant step towards advancing the searchability, readability, and accessibility of mathematical formulae in Wikipedia. A prototype of the system and the semantically enhanced Wikipedia articles are available at: https://tpami.wmflabs.org.
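To make step (3) more concrete, the following is a minimal sketch of what a numeric plausibility check on an equation might look like. It is not the authors' pipeline: the abstract does not describe how the checks are implemented, so the helper names `chebyshev_t`, `is_plausible`, and `sample_point`, the tolerance, and the sampling ranges are all assumptions chosen for illustration. The example tests the well-known identity T_n(cos θ) = cos(nθ) for Chebyshev polynomials of the first kind, a member of the orthogonal polynomial family covered in the evaluation, and then a deliberately wrong variant with a sign error.

```python
import math
import random


def chebyshev_t(n, x):
    """Evaluate T_n(x) with the recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * x * curr - prev
    return curr


def is_plausible(lhs, rhs, sample, trials=50, tol=1e-9, seed=0):
    """Hypothetical helper: compare both sides of an equation at random points.

    Returns False as soon as one sample point disagrees beyond the tolerance.
    A True result only means no counterexample was found, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        args = sample(rng)
        if not math.isclose(lhs(*args), rhs(*args), rel_tol=tol, abs_tol=tol):
            return False
    return True


def sample_point(rng):
    """Assumed test domain: degree n in 0..8, angle theta in [0, pi]."""
    return rng.randint(0, 8), rng.uniform(0.0, math.pi)


# Correct identity: T_n(cos(theta)) = cos(n * theta)
correct = is_plausible(
    lambda n, t: chebyshev_t(n, math.cos(t)),
    lambda n, t: math.cos(n * t),
    sample_point,
)

# Same identity with an injected sign error on the right-hand side
broken = is_plausible(
    lambda n, t: chebyshev_t(n, math.cos(t)),
    lambda n, t: -math.cos(n * t),
    sample_point,
)

print(correct, broken)
```

A check of this kind can only flag an equation as suspicious or fail to find a counterexample. It presupposes that the formula has already been translated into something executable, which is the harder problem the abstract's 27% translation rate refers to.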
