A Formalization of SQL with Nulls.

Author information

Ricciotti Wilmer, Cheney James

Affiliation

Laboratory for Foundations of Computer Science, University of Edinburgh, 10 Crichton St, Edinburgh, EH8 9AB UK.

Publication information

J Autom Reason. 2022;66(4):989-1030. doi: 10.1007/s10817-022-09632-4. Epub 2022 Jul 27.

Abstract

SQL is the world's most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is written in ambiguous natural language rather than as a formal specification. Commercial SQL implementations interpret the standard in different ways, so that, given the same input data, the same query can yield different results depending on the SQL system it is run on. Even for a particular system, a mechanically checked formalization of all widely used features of SQL remains an open problem. The lack of a well-understood formal semantics makes it very difficult to validate the soundness of database implementations. Although formal semantics for fragments of SQL were designed in the past, they usually did not support set and bag operations, lateral joins, nested subqueries, and, crucially, null values. Null values complicate SQL's semantics in profound ways, analogous to null pointers or side effects in other programming languages. Since certain SQL queries are equivalent in the absence of null values but produce different results when applied to tables containing incomplete data, semantics that ignore null values can prove query equivalences that are unsound in realistic databases. A formal semantics of SQL supporting all the aforementioned features was only proposed recently. In this paper, we report on our mechanization of SQL semantics covering set/bag operations, lateral joins, nested subqueries, and nulls, written in the Coq proof assistant, and describe the validation of key metatheoretic properties. Additionally, we are able to use the same framework to formalize the semantics of a flat relational calculus (with null values), and show a certified translation of its normal forms into SQL.
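
The claim that some queries are equivalent without nulls but diverge once nulls appear can be illustrated with a small sketch. The example below is not taken from the paper: the tables `r` and `s`, the helper `run_both`, and the choice of SQLite (through Python's standard `sqlite3` module) are assumptions made only for illustration. It compares a `NOT IN` subquery with the `NOT EXISTS` form that is often treated as its rewrite.

```python
import sqlite3


def run_both(s_values):
    """Run a NOT IN query and its NOT EXISTS counterpart.

    Hypothetical setup: table r holds {1, 2}; table s holds s_values.
    Returns (rows from NOT IN, rows from NOT EXISTS).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (a INTEGER)")
    conn.execute("CREATE TABLE s (b INTEGER)")
    conn.executemany("INSERT INTO r VALUES (?)", [(1,), (2,)])
    # Python None is stored as SQL NULL.
    conn.executemany("INSERT INTO s VALUES (?)", [(v,) for v in s_values])

    not_in = conn.execute(
        "SELECT a FROM r WHERE a NOT IN (SELECT b FROM s) ORDER BY a"
    ).fetchall()
    not_exists = conn.execute(
        "SELECT a FROM r "
        "WHERE NOT EXISTS (SELECT 1 FROM s WHERE s.b = r.a) ORDER BY a"
    ).fetchall()
    conn.close()
    return not_in, not_exists


# Without nulls the two queries agree.
print(run_both([1]))        # ([(2,)], [(2,)])

# With a null in s they disagree: "2 NOT IN (1, NULL)" evaluates to
# unknown, so the row is filtered out, while NOT EXISTS still keeps it.
print(run_both([1, None]))  # ([], [(2,)])
```

A semantics that ignores nulls would accept rewriting one query into the other; on the second input that rewrite changes the result, which is the kind of unsound equivalence the abstract refers to.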

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/78da/9637088/cb2a2af34f6d/10817_2022_9632_Fig1_HTML.jpg
