Stack-Based Approach (SBA) and Stack-Based Query Language (SBQL)
by Kazimierz Subieta (May 2006)
Back to Description of SBA and SBQL.
The Stack-Based Approach (SBA) is a formal approach to object-oriented database query and programming languages. In SBA we reconstruct query languages’ concepts from the point of view of programming languages (PLs). The approach is motivated by our belief that there is no definite border-line between querying and programming. All attempts to establish it failed; see the relational completeness, being essentially a random, poorly motivated concept on the scale of the universality of query languages. Query languages, as facilities for database programming, absorb a lot of PLs’ functionalities; see imperative programming extensions of SQL-92 (update, insert, delete, stored procedures, etc.), a new SQL standard known as SQL-99, Oracle PL/SQL, MS SQL Server Transact-SQL, the recent J2EE Hybernate tool, the Microsoft LINQ.NET project, and a lot of Rapid Application Development tools. Another stream of persistent and/or polymorphic database PLs follows this line through integrating queries with programming languages. SBA is an attempt to create a unified conceptual and semantic basis for queries and programs involving queries, including programming abstractions such as procedures, functions, classes, types, methods, views, etc.
SBA and SBQL are neutral with respect to data models. SBA covers all the database models that we are aware of, starting from the relational model, through XML-oriented data model, RDF-oriented data model, up to sophisticated object-oriented models with static and dynamic inheritance, collections, associations, polymorphism, etc. Our fundamental assumption is that SBA and SBQL address data structures rather than specific ideological assumptions and constraints known as data models. Once one would determine how particular concepts in a data model are to be mapped as abstract data structures, we could propose a corresponding subset of SBA/SBQL that will be able to handle these structures with full algorithmic universality and precision. In this way the discussion of query language we have shifted on another level: we can talk how particular features of data structures are to be served by SBQL rather than sticking a concrete query language with a concrete data model. For instance, when one will determine how XML files will be mapped as abstract data structures we can propose SBQL to serve these structures. In this way we achieved the unique universality, flexibility and performance optimization potential. In particular SBQL is the first in the history query language that deals with dynamic object roles and dynamic inheritance. Moreover, powerful query optimization methods that are developed for SBQL are prepared to work with such advanced features.
SBA can be considered as a theoretical approach with a strong and complete bridge to practice. Because development of SBA was preceded by several implementations of query languages, it can also be considered as a practical approach resulting in a consistent and universal theory.
The design of modern and universal database PLs having querying capabilities requires methods and principles that are already acknowledged by the common practice of developing compilers and interpreters. Practically useful PLs must deal with object-orientedness, procedures and functional procedures (including recursive ones), parameter passing, various control statements, binding strategies, scoping rules, modularization, typing, etc. They should follow software engineering principles such as orthogonality, modularity, minimality, universality, genericity, typing safety, and clean, precise semantics.
The above issues turn out to be very severe for theoretical concepts developed in the database domain for dealing with query languages, including the relational algebra, relational calculus and formal logics. SBA is an alternative to theoretical concepts emerging on the object-orientedness wave, such as nested relational algebras, object algebras, object calculi, F-logic, comprehensions, structural recursion, monoid calculus, functional approaches, etc. Careful analysis of these theoretical frames has led us to the conclusion that all of them are too limited (and sometimes totally inadequate, cf. object algebras) as the semantic basis for this kind of query languages that we intend to develop.
The SBA solution relies on adopting the classical run-time mechanism of PLs, and then, introducing to it necessary improvements. The main syntactic decision of our approach is unification of PL expressions and queries; queries remain as the only kind of PL expressions. For instance, in SBA there is no conceptual difference between expressions such as 2+2 and (x+y)*z, and queries such as
Employee where Salary = 1000
(Employee where Salary = (x+y)*z).Name
All such expressions (or queries, if you like) can be used as arguments of imperative statements, as parameters of procedures, functions or methods, as a return from a functional procedure, etc. Note that in our expressions/queries we avoid the extensive SQL-like syntactic sugar, which is non-orthogonal, sometimes illogical and makes complex queries illegible.
Concerning semantics, we focus on the classical naming-scoping-binding paradigm. Each name occurring in a query (or in a program involving queries) is bound to run-time programming entities (persistent data, procedures, actual parameters of procedures, local procedure objects, etc.) according to the actual scope for the name. The common PLs’ approach (that we follow in SBA) is that the scopes are organized in an environmental stack with the “search from the top” rule (thus the stack-based approach). Some extensions to the structure of stacks used in PLs are necessary to accommodate the fact that in a database we have persistent and bulk data structures and the fact that data are on a server machine, while a stack is on a client machine. Hence the stack contains data identifiers rather than data themselves (i.e., we separate the stack from a store of objects), and possibly multiple objects can be simultaneously bound to a name occurring in a query, which makes the many-data-at-a-time processing possible. The operational semantics (abstract implementation) of query operators, imperative programming constructs and procedures (functions, methods, views, etc.) is defined in terms of an abstract object store and operations on two classical stacks: environmental stack (abbreviated as ENVS) and query results stack (abbreviated as QRES).
Almost all the issues presented below are already discussed in detail in the Polish version of the SBA/SBQL book and in a lot of papers and reports. They will be consecutively inserted here as hyperlinks during the progress in writing the English version of the book. Majority of language concepts and constructs presented below are already implemented in our prototypes and commercial systems.
Last modified: July 30, 2006