©Copyright by Kazimierz Subieta.

SBQL for the AS0 Store Model - Syntax

by Kazimierz Subieta

Back to Description of SBA and SBQL.

 

The Stack-Based Query Language (SBQL) is a formalized object-oriented query language in the SQL or OQL style. It contains semantic counterparts of all the constructs of these languages, but in essentially different configuration and semantic mechanisms. SBQL may concern many object store models, in particular AS0-AS3 models. Thus we claim that SBQL is the most complete and universal query language among all that can be found in the literature and practice.

The concrete syntax of SBQL is not fixed. In SBQL we strive to reduce syntactic considerations only to some abstract syntax. In concrete implementation any designer can invent an own concrete syntax that he/she considers the most adequate for the given purpose. So far the syntactic standard of SBQL does not exist. SBQL we consider as a theoretical frame or pattern for object-oriented languages, similar to the relational algebra or relational calculus. With no doubt, however, SBQL is much more powerful, strong, consequent and consistent in comparison to any such theory, including relational and object algebras. The definition of SBQL explicitly involves the concept of state that is absent in major database theories. Lack of the state concept is a severe flaw of these theories, because it reduces the possibility to define many useful operators and from the very beginning forces some false view on query languages’ semantics. As we argued previously, the concept of the state and the naming-scoping-binding issues must lead to the concept of environment stack, which is absent in major database theories too.

Semantics of SBQL is expressed operationally by a machine, which accomplishes abstract implementation of SBQL constructs. Essentially, this decision is motivated by didactics, because operational semantics for so complex artifacts as query languages is much easier to grasp by the average reader. In our early work we tried to use denotational semantics, which is apparently “more elegant” (according to some theoreticians). We abandoned this specification style because it was illegible, non-intuitive and caused additional barriers in the understanding of implementation.

In SBQL we introduce a new kind of query operators, which we call “non-algebraic” for some tradition in mathematics which treats some operators (such as quantifiers) as non-algebraic ones. Selection, projection/navigation, join, quantifiers, ordering and transitive closures are all the non-algebraic operators that we introduce. Some of them (selection, projection and join) were introduced in the relational algebra as “algebraic” operators. However, their ”algebraization” was at the cost of a bit unfair formal trick, where a part of their semantics (for instance, a selection predicate) was shifted to the informal meta-language of the mathematics. We have definitely rejected such a trick as frustrating, unacceptable and completely unnecessary. Mathematical solutions concerning semantics of query languages are not limited to such limited, simplistic and primitive theories as the relational algebra or relational calculus. In particular, the denotational semantics is much more adequate notion. However, as we have mentioned before, we do not follow strictly the mathematical method because essentially it has no advantages and has severe disadvantages. Our formal approach relies in very precise definition of an abstract machine that will perform operations implied by query operators. This level of formality is adequate to reason about any properties of query languages, including implementation of the language, possible methods of query optimization, strong typing, querying distributed databases and other issues.


SBQL Syntax

We assume that some elements of the set of atomic values V introduced in the AS0 model have the external form, which allows representing them as sequences of characters in a query source code. The representation is sufficient to distinguish such an element by some lexer and then, to assign to it a proper value of V (and an atomic type). Note that many elements of V, such as graphics and compiled procedures, have no such source code representation. Such external representation elements we traditionally call literals[1]. The set of literals we denote L. We assume that the lexer accomplishes the function:

lexval: LV

Usually there are few types of literals: integer numbers, real numbers, strings, boolean values and dates.

We assume that identifiers from the set I introduced in the AS0 model have no such external representations, hence they cannot be directly used in source queries. The only way to deal with such identifiers is to use in queries external names from the set N.  They are to be bound according to the rules explained in the previous section. In this way we are ready to define the simplest queries of SBQL:

·        Any literal belonging to L is a query. For instance,    2,     3.14,    “Doe”,    true    are queries.

·        Any name belonging to N is a query. For instance,    Emp,    sal,    worksIn,    e,    d     are queries.

One may be surprised that a query such as sal is the query of its own rights. We understand that this can be unusual for the SQL lovers. However note that each query is evaluated relatively to the state, in particular, to the state of ENVS. If ENVS contains a binder sal( x ), then the query sal makes a sense and its result can be formally determined as x.

Queries can be joined in more complex queries by operators. SBQL is perhaps the only practical query language which assumes that all operators are unary or binary. In this way the SBQL grammar is very simple. Moreover, the grammar in the maximal extent supports orthogonality and compositionality principles: queries can be freely connected by operators (providing typing constraints are not violated). We subdivide binary operators into algebraic and non-algebraic. In this grammar we neglect parentheses assuming that they can be freely inserted if there are doubts concerning their order or precedence during the parsing. Then, the syntactic rules of SBQL are as follows:

·        If is a symbol denoting a unary algebraic operator and q is a query, then    q    is a query. Examples of unary algebraic operators are: count, sum, avg, max, median, -, log, sqrt, not, etc. Because SBQL is an abstract language, we do not fix which unary operators belong to SBQL and which do not belong. We simply assume that SBQL accepts any unary operator if some designer wants to introduce it.

·        If is a symbol denoting a binary algebraic operator and q1, q2 are queries, then    q1 q2    is a query. Examples of binary algebraic operators are: =, <, >, +, -, *, /, and, or, intersection, concatenation, etc. As previously, we do not fix which binary algebraic operators belong to SBQL. Any of them will be accepted by us if it is necessary in examples or in a particular implementation. For binary operators we assume the traditional infix notation (an operator will be written between its arguments).

·        If θ is a symbol denoting a non-algebraic operator and q1, q2 are queries, then   q1 θ q2    is a query. Non-algebraic operators are the following: where (selection), dot (projection, navigation), join (dependent or navigational join), , (universal and existential quantifiers), order by (sorting) and operators of transitive closures (will be introduced later). For quantifiers we also assume the traditional prefix notation: q1 q2 and q1 q2, which is equivalent to the universal infix notation q1 q2 and q1 q2.

·        If q is a query and n N then   q as n    is a query. The operator as is a unary algebraic operator parameterized by a name. This operator will be used in most cases which require defining an auxiliary name within a query. The operator as in other languages allows to define “correlation variables” (SQL, OQL), variables bound by quantifiers, cursors in for statements, etc. In ODMG OQL there are eight situations when one can define an auxiliary name. Our definition of the operator as will cover all such cases and more. As will be shown, the operator is orthogonal to any other operators and its semantics is very simple.

·        If q is a query and n N then   q group as n    is a query. This is another naming operator similar to as, but with different semantics and pragmatics.

·        If  q1, q2 are queries, then     if q1 then q2    is a query. If q1, q2, q3  are queries, then    if q1 then q2 else q3       is a query. The last query has three arguments, but it is a shorthand - it can be substituted by the composition of queries having one-argument and two-argument operators:  if q1 then q2 else q3   is equivalent to bag((if q1 then q2), (if not q1 then q3)).

·        If q1, q2, q3, ... are queries, then    struct( q1, q2, q3,  ... )    is a query. The operator struct has many arguments, but as in the previous case, it can be understood as a superposition of several binary operators; for instance, struct( q1, q2, q3, q4 ) =  struct(struct (struct ( q1, q2), q3,), q4 ). In the SQL and OQL terminology the operator is called cartesian product; we do not use this term. The keyword struct can be skipped by default, ( q1, q2, q3,  ... ) is equivalent to struct( q1, q2, q3,  ... ).

·        If q1, q2, q3,  ... are queries, then    bag( q1, q2, q3,  ... )    and    sequence( q1, q2, q3,  ... )    are queries. As previously, the operators have many arguments, but can be understood as a superpositions of several binary operators; for instance, bag( q1, q2, q3, q4 ) =  bag( bag( bag( q1, q2), q3,), q4 ).

These syntactic rules will be further extended by several more advanced constructs related to higher-level store models (AS1-AS3) and introducing programming constructs and abstractions such as procedures. Fig.21 contains the summary of the SBQL syntax for the AS0 model.

query ::= literal

The set L

query ::= name

The set N

query ::= unaryAlgOperator query

Unary algebraic operators

unaryAlgOperator ::= count | sum | max | - | sqrt | not | ...

 

query ::= query binaryAlgOperator query

Binary algebraic operators

binaryAlgOperator ::=  =|<| >| +| -| *| /| and| or| intersect|...

 

query ::= query NonAlgOperator query

Non-algebraic operator

NonAlgOperator ::= where | . | join | |

 

query ::= query query | query query

Alternative (traditional) syntax for quantifiers

query ::= query as name

Name definition

query ::= query group as name

Grouping and name definition

query ::= if query then query

Conditional query

query ::= if query then query else query

Another conditional query

querySeq ::= query | query, querySeq

Sequence of queries

query ::= struct( querySeq ) | (querySeq)

Structure constructor

query ::= bag( ) | bag( querySeq )

Bag constructor

query ::= sequence( ) | sequence( querySeq )

Sequence constructor

 

Fig.21. The SBQL basic syntax

 

Examples of SBQL queries (c.f. Fig.3):

1.    

2000

Literal

2.    

Emp

Name

3.    

sal

Name

4.    

2+2

algebraic operator

5.    

sal > 2000

algebraic operator

6.    

Emp where (sal > 2000)

Non-algebraic and algebraic operator

7.    

Emp where (sal > 2000) . (name, (worksIn.Dept))

As before, plus projection and structure composition

8.    

((Emp as e) where ((e.sal) > (2000 + x + y))).e.name

Auxiliary name e

9.    

((Emp as e) join ((e.worksIn.Dept) as d)) .   (e.name, d.dname)

Dependent join with auxiliary naming, path expression, projection and struct

10.   

(Emp where sal < 1000) ((worksIn.Dept.dname) = “Ads”)

Universal quantifier with no “variable”

11.   

  (Emp as e ) count(e.address) = 0

Existential quantifier with a “variable” e and counting function

12.   

bag( 1, 1, 2, 3, 5, 8, 3, 13)

Bag constructor

 

Looking at these examples we can see that SBQL is a very powerful query language. In fact, although its syntax is extremely simple, it is much more powerful than OQL and XQuery. All the features and examples of SBQL will be presented after explaining its semantics and pragmatics.

Note that in comparison to SQL and OQL we do not introduce operators for processing null values and a group by (having) operators. In SBA we avoid null values at all. Following Date, we consider null values as dangerous feature, leading to more problems that it is worth. Null values are substituted by absent (optional) data. Concerning the group by operator, an attempt to define consistent and universal semantics firmly convinced us that the operator is problematic (leading to semantic reefs and difficulties with query optimization) and unnecessary for object-oriented query languages - it can be substituted by other operators. We will show this further on examples.


Last modified: December 31, 2007



[1] Note that the term literal is used in the ODMG standard in a different meaning, out of the traditional terminology of programming languages. We do not follow the  ODMG terminology, retaining the traditional one.