Query Evaluation Procedure eval
by Kazimierz Subieta (February 2006)
Back to Description of SBA and SBQL.
Previously, in the section devoted to the syntax, semantics and pragmatics of query languages we have described our approach to the operational semantics, in particular, to semantic rules implemented within an abstract evaluation machine. The machine has the form of the recursive procedure eval. In this section we provide more detailed description of this procedure.
· The procedure eval has a query as a parameter. The query is represented in the abstract syntax. In implementation the query has the form of a syntactic tree, which is the most convenient for quick and univocal recognizing its structure. The procedure eval makes no changes to the syntactic tree.
· In real implementation, before the procedure eval is called, the syntactic tree is processed by a strong type checker and by optimization modules, see Fig.25 (eval is some abstraction over the interpreter of queries/programs module). The strong type checker removes some ambiguities, for instance, maps literals into values of corresponding types, resolves automatic dereferences, coercions and ellipses, and makes other changes to the tree. Then, optimization modules can make significant changes of the tree aiming query optimization. All these changes, however, do not influence the result returned by the query, hence we neglect them during our description of eval.
· The procedure eval will operate on three basic data structures that are introduced in SBA: the object store, the ENVS stack and the QRES stack, see Fig.25.
· It may change the state of ENVS, but for queries with no side effects the initial state and the final state of ENVS is the same.
· Assuming no side effects in queries, the procedure makes no changes in the object store.
· The procedure makes no change in the initial part of the QRES state, i.e. in these sections that were on QRES when the procedure started. At the end of the procedure, QRES is augmented by the new top section with the result of the processed query, Fig.26.
· The result from the top of QRES is then “consumed” by some agent external to the query, for instance, by a print procedure, by a user graphical interface, by an update command, etc. After such operation the stack returns to the initial state.
Fig.26. QRES changed by the procedure eval and by a query result consumer
Fig.27 presents an example of a syntactic tree for a simple query. A node of the tree contains information on its kind and may contain the information from the query. Each node contains also some free space (a grey box) that can be filled in by a type checker or by optimization modules. We explain the kind and role of this information later. A tree may contain some elements that are implicit in the query, for instance, the algebraic dereference (deref) operator.
Fig.27. Syntactic tree of the query: Emp where name = “Poe”
The procedure eval follows the orthogonality and compositionality principles, which say (in the case of queries) that the semantics of a query is some function of the semantics of its direct subqueries. The principles imply simple and intuitive recursive definition of the semantics.
The definition of eval considers a few cases that are recognized in the syntactic tree. We present the procedure in the form of a pseudo-code. Our goal is to illustrate the formal semantics rather than to present details of all the operators that we would like to introduce in SBQL. For the purposes of standardization the procedure eval can be refined precisely, but this requires some agreement on the pseudo-code that will be used for the specification. The agreement is similar to the standardization of syntax in the form of BNF grammars.
The most simple SBQL queries are literal and names. During parsing or type checking literals are changed into values of proper types. This operation can be written as application of the function lexval: L → V, where L is the set of literals, V is the set of object store values. Concerning names occurring in queries, they are bound on ENVS and the corresponding function is bind: N → Result, where N is the set of all names, Result is the set of all query results, as defined previously. Other cases include unary algebraic operators, binary algebraic operators, naming operators (unary algebraic operators parameterized by a name), non-algebraic operators and structure/collection construction operators. The corresponding parts of the eval procedure will be explained subsequently for particular operators or their kinds. Below we present the beginning and some frame of the procedure. Stack operators push, pop, top will be parameterized by the name of the stack. We also distinguish symbols of operators occurring in queries and the operators themselves as mathematical beings. Operators will be underlined: if S is a symbol denoting an operator, the S is the operator itself.
In the following we subdivide SBQL query operators into algebraic and non-algebraic. The subdivision is based on a very simple rule:
· Algebraic operators do not deal with the ENVS stack; they act on the QRES stack only. Algebraic operators can be unary, binary and perhaps may have more arguments. In SBQL all operators having more than two arguments can be reduced to the composition of unary and binary operators. For instance, the ternary operator if condition then q1 else q2 can be understood as a composition of three binary and one unary operators: bag(if condition then q1, if not condition then q2); providing q1 and q2 return bags. Similarly, bag( q1, q2, q3, q4 ) can be understood as bag( q1, bag( q2, bag( q3, q4 ))). If an algebraic operator ∆ is binary, then in the query qleft ∆ qright the order of evaluation of qleft and qright does not matter. We would like to emphasize (due to frequent misunderstanding) that we are talking on the order of evaluation, not about the order of arguments. In general, qleft ∆ qright is not equivalent to qright ∆ qleft (this holds of course for some operators, but not for all), but this means that we can call the eval procedure first to qright and then to qleft and still the semantics of the query can be correctly determined.
· Non-algebraic operators deal with both ENVS and QRES. Non-algebraic operators are the core of SBA and SBQL. The fundamental difference with algebraic operators is that in the query qleft θ qright , where θ is a non-algebraic operator, the order of evaluating queries qleft, qright is fixed. The procedure eval must be first called on the qleft argument, and then, the procedure eval is called many times for the qright argument. We present this part of the eval procedure a bit later, after discussion and introducing some new notions.
The definition of other algebraic operators that are present in the above procedure eval body is simple and straightforward; we return to them a bit later.
Last modified: July 14, 2006