Skip to content

Basic SQL fragment

Following Guagliardo and Libkin (2017), we start with the "fragment that [they] call basic SQL".

Covered syntax

The syntax below is an adaption of the proposed basic SQL fragment to ABNF.

The SQL syntax below is inspired on the BNF proposed by Foster and Godbole (2016).

Included features

Following the work of Guagliardo and Libkin (2017, p.29), this syntax covers only the following features:

  • TODO: provide the list of supported features!

Semantic interpretation

Queries

Query syntax
; utility definitions
TOKEN         = "foo"
Alias         = TOKEN

; tables
Table         = TOKEN
TableAccess   = Table ["AS" Alias]
              / TableAccess "," Table ["AS" Alias]

; columns
Column        = TOKEN
ColumnAccess  = Column ["AS" Alias]
              / ColumnAccess "," Column ["AS" Alias]

; query
Query  = "SELECT" ["DISTINCT"] ColumnAccess QueryFrom
       / "SELECT" ["DISTINCT"] "*" QueryFrom
       / Query
         ("UNION" / "INTERSECT" / "EXCEPT")
         ["ALL"]
         Query ";"

QueryFrom = "FROM" TableAccess ";"
          / "FROM" TableAccess "WHERE" Condition ";"

Typing Queries

  • terms
    • table access
    • column access
    • asterix access
    • constant access
    • aliases
    • term expressions
    • term aggregations
  • queries
    • from
    • from where
    • select
    • select distinct
    • aggregations
    • compound queries

Query type inference rules

Query type inference
-- Terms
-- `````
     -- TODO: currently no term "expressions" are typed
     --   - [ ] add support for full-names: e.g. R.C
     --   - [ ] add support for expressions: e.g. coalesce(X,Y,...)
     -- TODO: define type-operation `includes? XS YS`
(T0) G |- t : BT /\ includes? BT BaseTypes |= t : BT
(T1) G |- t_1 : BT_1 /\ ... /\ t_n : BT_n
       |= t_1, ..., t_n : BT_1, ..., BT_n
(T2) G |- t_1, ..., t_m : BT_1, ..., BT_m
       |= t_1 as N_1, ..., t_m as N_m : BT_1, ..., BT_m
-- Relations
-- `````````
     -- TODO: will this deal with subqueries?
(R0) G |- r : DbView <x> {YS} /\ includes? DbView <x> {YS} Schema 
       |= r : DbView <x> {YS}
     -- TODO: define type-operation (`U`)nion: `XS U YS`
(R1) G |- r_1 : DbView <x_1> {YS_1} /\ ... /\ r_n : DbView <x_n> {YS_n}
       |= r_1, ..., r_n : DbView <bag> {YS_1 U ... U YS_n}
(R2) G |- r_1, ..., r_n : DbView <x> {YS_1 U ... U YS_n}
       |= r_1 as N_1, ...,  r_m as N_m : DbView <x> {YS_1 U ... U YS_n}
-- Query sources
-- `````````````
(S0) G |- s : DbView <x> {YS}
       |= FROM s : DbView <x> {YS}
     -- NB: Trilean is the type of 3VL used in SQL
     --     see: https://en.wikipedia.org/wiki/Three-valued_logic
(S1) G |- s : DbView <x> {YS} /\ c : Trilean
       |= FROM s WHERE c : DbView <x> {YS}
-- Queries
-- ```````
(Q0) G |- t : BTS |= SELECT t ; : DbView <bag> {BTS}
(Q1) G |- t : BTS |= SELECT DISTINCT t ; : DbView <set> {BTS}
(Q2) G |- t : BTS /\ fw : DbView <x> {YS} /\ includes? BTS YS
       |= SELECT t fw; : DbView <bag> {BTS}
(Q3) G |- t : BTS /\ fw : DbView <x> {YS} /\ includes? BTS YS
       |= SELECT DISTINCT t fw; : DbView <set> {BTS}
     -- NB: Here we deviate from Guagliardo & Libkin (2017) by not
     --     making a distinction between the scope of `SELECT * ...`
(Q4) G |- fw : DbView <x> {YS} |= SELECT * fw; : DbView <bag> {YS}
(Q5) G |- fw : DbView <x> {YS} |= SELECT DISTINCT * fw; : DbView <set> {YS}
-- Query operations
-- ````````````````
(O0) G |- q_1 : DbView <x> {YS} /\ q_2 : DbView <x> {YS}
       |= q_1 UNION ALL q_2 : DbView <bag> {YS}
(O1) G |- q_1 : DbView <x> {YS} /\ q_2 : DbView <x> {YS}
       |= q_1 INTERSECT ALL q_2 : DbView <bag> {YS}
(O2) G |- q_1 : DbView <x> {YS} /\ q_2 : DbView <x> {YS}
       |= q_1 EXCEPT ALL q_2 : DbView <bag> {YS}
(O3) G |- q_1 : DbView <x> {YS} /\ q_2 : DbView <x> {YS}
       |= q_1 UNION q_2 : DbView <set> {YS}
(O4) G |- q_1 : DbView <x> {YS} /\ q_2 : DbView <x> {YS}
       |= q_1 INTERSECT q_2 : DbView <set> {YS}
(O5) G |- q_1 : DbView <x> {YS} /\ q_2 : DbView <x> {YS}
       |= q_1 EXCEPT q_2 : DbView <set> {YS}

Conditions

Condition syntax
; predicate
Predicate     = TOKEN

; condition
Condition     = "TRUE" 
              / "FALSE" 
              / Predicate
              / Column "IS" ["NOT"] "NULL"
              / Column "IS" ["NOT"] "IN" Query
              / "EXISTS" Query
              / Condition "AND" Condition
              / Condition "OR" Condition
              / "NOT" Condition

Typing conditions

  • Base cases: TRUE | FALSE | NULL
  • NULL comparison
  • IN value inclusion comparison
  • EXISTS non-empty query check
  • Basic boolean logic: AND, OR, NOT, IS, IS NOT
  • Extended predicate logic

Condition type inference rules

Condition type inference
-- Basics
-- ``````
(C0) |= <true>   : Trilean
(C1) |= <false>  : Trilean
(C2) |= <null>   : Trilean

-- Stanadrd logics
-- ```````````````
(L0) G |- x : Trilean /\ y : Trilean |= x AND y : Trilean
(L1) G |- x : Trilean /\ y : Trilean |= x OR y : Trilean
(L2) G |- x : Trilean |= NOT x : Trilean
(L3) G |- l : BTS /\ r : BTS /\ len? BTS 1
       |= l IS r : Trilean
(L4) G |- l : BTS /\ r : BTS /\ len? BTS 1
       |= l IS NOT r: Trilean
-- Query predicates
-- ``````````````
     -- TODO: define type operation `len? XS N`
(P2) G |- q : DbView <x> {YS}  |= EXISTS q : Trilean
(P4) G |- t : BTS_a /\ ts : BTS_b /\ len? BTS_a 1 /\ includes? BTS_a BTS_b
       |= t IN ts : Trilean
(P5) G |- t : BTS_a /\ ts : BTS_b /\ len? BTS_a 1 /\ includes? BTS_a BTS_b
       |= t NOT IN ts : Trilean
(P4) G |- t : BTS /\ q : DbView <x> {YS} /\ len? BTS 1 /\ includes? BTS YS
       |= t IN q : Trilean
(P5) G |- t : BTS /\ q : DbView <x> {YS} /\ len? BTS 1 /\ includes? BTS YS
       |= t NOT IN q : Trilean

References

  • Guagliardo, P., & Libkin, L. (2017). A formal semantics of SQL queries, its validation, and applications. Proceedings of the VLDB Endowment, 11(1), 27–39. https://doi.org/10.14778/3151113.3151116
  • Foster, E. C., & Godbole, S. (2016). Bnf syntax for selected sql statements. In E. C. Foster & S. Godbole (Eds.), Database Systems: A Pragmatic Approach (pp. 539–583). Apress. https://doi.org/10.1007/978-1-4842-1191-5_29
    • TODO: maybe this can be removed?

Extra reading