Developed at the

Polish-Japanese Institute of Information Technology

Chair of Software Engineering

SBA and SBQL home pages

© Copyright by ODRA team, © Copyright by PJIIT

 

 

 

 

ODRA – Object Database for Rapid Application development

Description and Programmer Manual

 

 

 

by Radosław Adamus and the ODRA team

 

6. SBQL (Stack-Based Query Language) - Queries

ODRA and its query and programming language SBQL are based on the  Stack-Based Architecture[1] (SBA). It is a formal methodology addressing object-oriented database query and programming languages. SBA is a great come back from database theories such as (relational, object) algebras, calculi, etc. to the well-known concepts that are recognized in the programming languages domain for about 40 years. In SBA we reconstruct query languages’ concepts from the point of view of programming languages (PLs). The approach is motivated by our belief that there is no definite border line between querying and programming; thus there should be a universal theory that uniformly covers both aspects. SBA offers a unified and universal conceptual and semantic basis for queries and programs involving queries, including programming abstractions such as procedures, functions, classes, types, methods, views, etc.

6.1 Basic Pragmatic, Syntactic and Semantic Assumptions

The power of SBQL concerns a wide spectrum of data structures that it is able to serve and complete algorithmic power of querying and manipulation capabilities. At the same time, SBQL is fully precise with respect to the specification of semantics. SBQL has been carefully designed from the pragmatic (practical) point of view. We were struggling severely with parasite syntactic sugar, redundant operators and semantic reefs (when human intuitive semantics does not match machine semantics). The pragmatic quality of SBQL is achieved by orthogonality of introduced data/object constructors, orthogonality of all the language constructs, object relativism, orthogonal persistence, typing safety, introducing all the classical and some new programming abstractions (procedures, functions, modules, types, classes, methods, views, etc.) and following commonly accepted programming languages’ principles.

The SBA solution relies on adopting a run-time mechanism of PLs and introducing necessary improvements to it. The main syntactic decision is the unification of PL expressions and queries; queries remain the only kind of PL expressions. For instance, in SBA there is no conceptual difference between expressions such as

2+2

and

(x+y)*z

and queries such as

Employee where salary = 1000

or

(Employee where salary = (x+y)*z).name

All such expressions or queries can be used as arguments of imperative statements, as parameters of procedures, functions or methods and as a return from a functional procedure.

SBQL is the first query language that abandons big syntactic and semantic patterns of queries, such as  the SQL-like pattern select…from…where…groupby…having…orderby…. We have come back to the tradition of programming languages, where syntactic patterns of expressions are as small as possible and consist of unary or binary operators that can be freely combined. All combination of operators are possible, providing the combination has a sense for the user and does not violate typing constraints. Small syntactic patterns and full orthogonality concerning their combination much supports the robustness and power of the language, keeping at the same time its lean specification, easier learning, shorter documentation, much easier implementation and much more potential for query optimization.

Concerning semantics, we focus on the classical naming-scoping-binding paradigm. Each name occurring in a query is bound to run-time programming entities (persistent data, procedures, actual parameters of procedures, local procedure objects, etc.), according to the actual scope for the name. The common PLs’ approach that we follow in SBA is that the scopes are organized in an environmental stack with the “search from the top” rule. Some extensions to the structure of stacks used in PLs are necessary to accommodate the fact that in a database we have persistent and bulk data structures and the fact that the data is kept on a server machine, while the stack is kept on a client machine. Hence the stack contains references to data rather than data themselves (i.e., we separate the stack from a store of objects), and possibly multiple objects can be simultaneously bound to a name occurring in a query (for processing collections). The operational semantics (abstract implementation) of query operators, imperative programming constructs and procedures (functions, methods, views, etc.) is defined in terms of the three abstract data structures: object store, environmental stack (ENVS) and query results stack (QRES). All these structures have their static incarnation that is necessary for strong type checking, that is, a metabase, a static environment stack and a static result stack.

6.2 Strong Type Checking

In SBQL we assume strong type checking, as in many other programming languages, but we extend the checking to collection types that are determined by cardinalities. Because the cardinality lower number can be 0, e.g. [0..1] or [0..*], the store model assumed in SBQL covers so-called semi-structured data. In such cases ODRA is prepared to semi-strong type checking that assumes static checking of everything that can be statically checked, and dynamic (runtime) checking of all the cases that are impossible to check statically.

6.3 Results returned by SBQL Queries

Results of SBQL queries can be defined recursively as follows. A result returned by the SBQL query can be one of:

·      value of atomic types integer, real, string, boolean and date;

·      reference to an object (including attributes, sub-attributes, links, procedures, methods, views, etc.);

·      structure - a non-empty n-tuple of any query results (including named collections);

·      binder, that is, a pair name(value), where value can be any result; binder is a named value of any type.

·      collection (bag or sequence) of results.

 

6.4 Atomic SBQL Queries

The atomic SBQL queries are literals (integers, floating points, strings, dates, booleans) and names. Names can represent any data, procedures or methods stored in the object store. Names can also be auxiliary names defined in a query by as or groups operators, names of parameters, names of views, etc.

Example queries:

2

3.14

“Winnie the pooh”

2007-06-12 03:04:12

Person

salary

age

RichEmployee

6.5 SBQL Operators - Summary

SBQL provides a common set of predefined operators for arithmetic, string, logical and other operations. Unlike other query languages (in particular, SQL and OQL) SBQL follows the tradition of programming languages’ expressions (known e.g. from Pascal and Java) which avoids big syntactic templates (such as select…from…where…group by…having…). In SBQL almost all operators are unary or binary and can be freely combined (if necessary, with the use of parentheses), providing a combination does not violate typing constraints. This also concerns non-algebraic operators, such as selection, projection/navigation, join, quantifiers, etc.

The set of the operators is shown in the following table and described in details in the following sections.

 

Table 6-1. SBQL operators

Operator category

Operator symbols

Numerical

+ - * / %

Logical

and or not

String concatenation

+

Comparisons

=  <>  >  <  >=  <=  ~~  !~

Auxiliary naming

as groupas

Date operators

now dateprec

Aggregate functions

sum count min max avg

Algebraic operators on collections

bag union struct , subtract in contains intersect unique distinct exists

Non-algebraic operators on collections

. join where forall forsome orderby closeby leavesby closeuniqueby leavesuniqueby

Reference and dereference

ref deref

Conditional

if … then … else …

Range

[], rangeas

 

Precedence rules for query operators are described by the following table. Operators in the same row have the same precedence. Operators in higher rows have lower
precedence than operators from lower rows. For instance, * has higher precedence than + hence one can write a*b + c, what means (a*b) + c.

 

forall   forsome   

quantifiers

join   where   .

join, selection, projection

union intersect subtract

bag operators

orderby closeby closeuniqueby leavesby leavesuniqueby

 

,

structure constructor

or and

 

=  !=  <  >  =<  =>  ~~  !~

~~ !~ are string matching operators

in contains

 

+ -

binary operators

* / %

 

as groupas rangeas

 

not

 

avg sum min max

 

ref deref

 

bag struct unique count exists

unary operators

 

In the following we assume that:

query1 ::= query

query2 ::= query

query3 ::= query

sequenceOfQueries ::= query {, query}

 

6.6 Numerical and String Operators

All the numerical and string operators call automatic dereference if one or both sub-queries that an operator connects return references. For example, in the query x+1 the sub-query x returns a reference to an object x, which is automatically de-referenced to the value stored at x.

query ::= query1 + query2

query ::= - query1

query ::= query1 - query2

query ::= query1 * query2

query ::= query1 / query2

query ::= query1 % query2

The operators are defined for integer and real types. For numeric types their meaning is typical (according to the elementary arithmetic); operator % computes the reminder of dividing the first operand by the second operand. The operators require operands with the cardinality [1..1]; however, other cardinalities do not cause a type error but shifting the type checking to runtime to check whether eventually the argument is a single value. The operators call automatically the dereference of an operand if it returns a reference.

When one or both operands of + are string types, the result is the concatenation of the string operands.

The precedence of operators is typical, as in the elementary arithmetic; if necessary or in case of doubts parentheses can be used.

Example queries:

2 + 2

4.6 + 5.2

3 + 2.3

“My favourite number is: ” + 27

4 – 2

5.3 – 1.2

2 * 2

5.3 * 1.2

4 / 2

5.3 / 1.2

15 % 5

sal + 1000

sal + (10 * age)

"Winnie " + "the " + “Pooh”

fName + “ “ + lName

sal – 100

- (netSal + budget / 10000)

sal * 100

netSal * (age/10)

sal / 100 + budget / 1000

netSal / (age + 10)

sal – (sal % 1000) + 500

 

6.7 Comparison Operators

All the comparison operators call automatic dereference if one or both sub-queries that an operator connects return references.

6.7.1 = and <> operators

Binary comparison operator.

query ::= query1 = query2

query ::= query1 <> query2

The equality and inequality operators are defined for numerical values, dates, strings, references, and complex (structure) value types.  They return boolean values. The inequality operator returns the boolean negation of the equality operator. For numerical and string values the equality operator returns true if the values of its operands are equal, false otherwise. For references it returns true if its two operands refer to the same object. For structures it returns true if two operands have the same structure. The operator requires operands with the cardinality [1..1]; however, other cardinalities do not cause a type error but shifting the type checking to runtime to check whether eventually the argument is a single value. The operator calls automatically the dereference of an operand if it returns a reference.

The precedence of operators (including all other operators in a query) is typical, as in the elementary arithmetic; if necessary or in case of doubts parentheses can be used.

Example queries:

2 = 2

“Winnie" = “winnie”

(ref x) = (ref y)

(1, "ala") = (1, "ala")

salary = 1000

lName = “Doe”

“Winnie" <> “winnie”

fName <> “Bill”

(address.street) <> “Boogie”

location <> name

 

6.7.2 < , > , <= and => operators

Binary comparison operators.

query ::= query1 < query2

query ::= query1 > query2

query ::= query1 <= query2

query ::= query1 >= query2

The less_than, greater_than, less_or_equal_than and greater_or_equal_than operators are defined for numerical values and date values. The first one returns true if the first operand is less than the second, false otherwise. The second ones – just otherwise. The less_or_equal_than one returns true if the first operand is less or equal than the second, false otherwise. The greater_or_equal – otherwise.The operators require numerical operands with the cardinality [1..1]. The operators call automatically the dereference of an operand if it returns a reference.

The precedence of operators (including all other operators in a query) is typical, as in the elementary arithmetic; if necessary or in case of doubts parentheses can be used.

Example queries:

2 < 2

2 <= 2

3.5 < 3.1

16.4 >= 3

sal < 1000

sal >= 5000

budget >= sum(Emp.sal)

min(Emp.sal) >= 2000

6.7.3 ~~ and !~ operators

Binary string comparison operators.

query ::= query1 ~~ query2

query ::= query1 !~ query2

The matches and not_matches operators are defined for string values. They are very similar to the SQL like and not like operators. The first one returns true if the first operand is matches a regular expression given with the second, false otherwise. The other one – just otherwise. The operators require string operands with the cardinality [1..1]. The operators call automatically the dereference of an operand if it returns a reference.

Regular expressions accepted by these operators correspond to the SQL notation. i.e. % means any sequence of characters (including empty ones) and _ stands for a single character. The precedence of operators (including all other operators in a query) is typical, as in arithmetic comparison operators defined above.

Example queries:

“my string” ~~ “another string”

“my string” !~ “another string”

“my string” ~~ “%string”

“my string” ~~ “__ str%”

surname ~~ “%berg”

surname !~ “_oad%”

surname ~~ anotherName + “%”

 

 

6.8 Boolean Operators

6.8.1 and and or operators

Binary logical operators.

query ::= query1 and query2

query ::= query1 or query2

The operators perform a logical and and or on boolean operands. They should have the cardinality [1..1]. The operators call automatically the dereference of an operand if it returns a reference.

Example queries:

(2 = 2) and (2 <> 3)

sal = 1000 and job = “clerk”

(sal<3000 or job<>“programmer”) and (address.city) = “Rome

 

6.8.2 not operator

Unary logical operator.

query ::= not query1

The operator negates its boolean operand. It requires a boolean operand with the cardinality [1..1]. The operator calls automatically the dereference of an operand if it returns a reference.

Example queries:

not forall Emp (sal >= 2000)

not exists(sal)

not ((Emp where sal >= 1000) in (Emp where job = “clerk”))

 

6.9 Date Operators and Comparisons

6.9.1 now operator

The now operator returns the current day and time with the precision of milliseconds starting from the Unix epoch beginning, i.e. since 00:00:00 UTC of January 1, 1970. The operator returns a value of the ODRA date type (it corresponds to timestamps in other systems).

query ::= now()

6.9.2 dateprec operator

The dateprec operator allows for formatting date precision since no separate data types are introduced for expressing different time accuracies used in natural languages for expressing actual date/time semantics, e.g. for a birth date (only a calendar date), a meeting time (a calendar date plus and an hour with minutes, ignoring seconds and milliseconds), etc. The dateprec operator returns the date data type, minus the irrelevant date parts (e.g. an hour, minutes, seconds and milliseconds) are just set to 0. The syntax is following:

query ::= dateprec(query1, formatstring)

where query1 returns a date type value and formatstring can be one of:

-        "low" – date exact to a day.

-        "medium" – date and time exact to minutes.

-        "high" – date and time exact to seconds.

-        “full” – date and time exact to milliseconds (default).

Example queries:

dateprec(now(), “low”)

dateprec(now(), “full”)

Employee where dateprec(birthDate, “low”) >= dateprec((Employee as e where e.id = “ABC12345”).e.birthDate, “low”)

(Guest where name = “Smith”).checkInTime := dateprec(now(), “medium”)

(Event where dateprec(timestamp, “high”) = 2007-06-12 03:04:12).description

 

6.10 Algebraic Operators on Collections

6.10.1 union operator (bag constructor)

Binary collection operator.

query ::= query1 union query2

Alternatively:

query ::= bag(sequenceOfQueries)

The union operator returns a bag that is a result of ‘set-sum’ of the operands. If query1 returns bag{a1, a2, ...} and query2 returns bag{b1, b2, ...}, then the query bag( query1, query2) returns bag{ a1, a2, ..., b1, b2, ...}. If query1 or query2 returns an individual element, it is treated as a one-element bag. bag{ a, bag{b, c} } is equivalent to bag{ bag{ a, b}, c } } and is equivalent to bag{ a, b, c }. A bag with one element is equivalent to this element: bag{ a } = a. The strong typing requires that operands of union or bag operators must have the same collection types (with the structural type conformance).

Example queries:

1 union 3

3.4 union 2.0 union 5.1

bag(1, 3, 5)

bag((Emp where sal > 1000), (Emp where job = “analyst”) )

 

6.10.2 struct operator (structure constructor/ cartesian product)

Binary collection operator.

query ::= [struct](sequenceOfQueries)

The struct operator returns a structure that is the result of the Cartesian product of operands (providing the Cartesian product operator is naturally extended to bags). If query1 returns bag{a1, a2, ...} and query2 returns bag{b1, b2, ...}, then the query struct( query1, query2) returns bag{ struct{ a1, b1}, struct{ a1, b2},..., struct{ a2, b1}, struct{ a2, b2}, ...}. If query1 returns an individual element a, and query2 returns an individual element b, then struct(query1, query2) returns an individual element struct{a, b}. In all other cases individual elements are converted to one-element bags. struct{ a, struct{b, c} } is equivalent to struct{ struct{ a, b}, c } } and is equivalent to struct{ a, b, c }. A structure with one element is equivalent to this element: struct{ a } = a, and v/v.

Example queries:

(1 , 3)

struct(“Winnie”, “the”, pooh”)

Emp. struct(sal as s , 2.0 as x , (worksIn.Dept.dName) as d)

 

6.10.3 subtract operator

Binary collection operator.

query ::= query1 subtract query2

The subtract operator returns a bag that includes the elements from first operand that do not occur in the second operand. The strong typing requires that operands must have the same collection types.

Example queries:

(1 union 3 union 2) subtract (3 union 2)

(Emp where job = “clerk”) subtract (Emp where sal > 1000)

 

6.10.4 intersect operator

Binary collection operator.

query ::= query1 intersect query2

The intersect operator returns a bag that includes the elements from first operand that also occur in the second operand.

Example queries:

(1 union 3 union 2) intersect (3 union 2)

(Emp where job = “clerk”) intersect (Emp where sal > 1000)

 

6.10.5 in and contains operators

Binary collection operators.

query ::= query1 in query2

query ::= query1 contains query2

The in operator returns TRUE if the result returned by second operand query includes the result returned by first operand query. The contains operator – just otherwise.

Example queries:

(2 union 3) in (1 union 3 union 2)

(Emp where job = “clerk”) contains ((Emp where sal > 1000)

 

6.10.6 count operator

Unary collection operator.

query ::= count query1

The count operator returns the number of elements in the operand collection.

Example queries:

count(1 union 3 union 5)

count Emp

count(Emp where job = “clerk”)

count(Dept join employs.Emp )

 

6.10.7 exists operator

Unary collection operator.

query ::= exists query1

The exists operator returns TRUE if query1 returns at least one element, FALSE otherwise.

Example queries:

exists (Emp where address.city =Warsaw”)

Emp where not exists(address)

 

6.10.8 deref and ref operators

Unary collection operators.

query ::= deref query1

query ::= ref query1

The deref operator takes a single references or a collection of references returned by the operand query1 and returns a corresponding value or a collection of values stored in the referenced objects. The result of the dereference operator depend on the object type:

-        dereference of simple objects returns a simple value (e.g. integer, real, etc.)

-        dereference of a pointer object returns the reference of the pointed object (i.e. the value of the pointer object).

-        dereference of a complex object returns a structure with named fields (binders). Each field name corresponds to a sub-object name and each filed value corresponds to the (dereferenced) sub-object value.

The ref operator takes a single references or a collection of references returned by the operand query and set the flag informing that the dereference operator is not allowed. The ref operator is used to avoid automatic dereference and to compare references rather than values.

Example queries:

deref(Emp.lName)

ref(Emp where lName = “Kim”) = ref(Emp where id = 4326)

 

6.10.9 unique and distinct operator

Unary collection operators.

query ::= unique query1

query ::= distinct query1

The unique operator removes duplicate object references from the result returned by the operand query1. The result is the unique set of object references. The distinct operator removes duplicate values from the result returned by the operand query1. The result is a set of values; the operator automatically calls the deref operator.

Example queries:

unique (Emp where lName=“Kim” union Emp where (worksIn.Dept.dname) = “pr”)

distinct (Emp.lName)

 

6.11 Aggregate Functions

ODRA SBQL implements the following aggregate functions: count (already described), sum, avg, min and max. They are known from SQL. Aggregate functions are defined very generally thus orthogonal combination of them allows the programmer to achieve all the possibilities that are associated with the SQL group by and having clauses. In SBQL we do not introduce such clauses considering them redundant and unnecessary.

query ::= sum query1

query ::= avg query1

query ::= min query1

query ::= max query1

The sum function computes the sum of all the argument collection elements. The avg function computes the average value of a collection of numerical values. min and max return the minimal and maximal value of the argument collection elements, correspondingly.  If the collection has only one element, all the functions return the value of the element. An empty result of argument query1 causes a runtime error. Automatic dereferences are performed.

Example queries:

sum(1 union 3 union 5)

sum(Emp.sal)

avg(Emp.sal)

min(3.4 union 2.0 union 5.1)

max(Emp.sal)

sum(3.4 union 2.0 union 5.1)

avg(1 union 3 union 554)

(Dept as d) join avg(d.employs.Emp.sal)

min(Emp.sal)

max(Dept.count(employs))

 

6.12 Non-algebraic operators

Each non-algebraic SBQL operator is binary and its evaluation differs from the algebraic ones. The difference concerns the use of the environment stack. In effect, the non-algebraic operators, in their full generality, cannot be specified by any mathematically correct algebra built in the style of relational or object algebras. For non-algebraic operators the order of evaluation of operand queries is significant[2]. The first query is evaluated at the beginning. Then, for each result returned by it the second query is evaluated. The evaluation is performed in the following steps:

  1. a new environment is calculated for the currently processed element of the first query result.
  2. the second query is evaluated against the environment opened in first step. The result of evaluation depends on the non-algebraic operator and is saved as a partial result of a non-algebraic operator evaluation.
  3. The environment opened in first step is destroyed.

Finally, all the partial results are merged into the final result of evaluation. The process of constructing final result from the partial results depends on a non-algebraic operator.

The new environment that is opened for a processed element is calculated by the function nested. For complex object references the function returns the environment referring to all its internal subobjects. For a pointer object the function returns an environment that consists of a single reference to the object that the pointer points to. For a binder n(x) the function returns this binder. For a structure the function returns the union of environments calculated for all structure’s elements. For other elements the function returns an empty environment.

For processing classes the above scenario is a bit modified. If an object X is processed by a non-algebraic operator and X is a member of a class C1 that inherits from C2 that inherits from C3, etc. then the environment stack contains the following environments (starting from its top): the environment of X, the environment of C1, the environment of C2, the environment of C3, etc. More detailed description on how the environment and the qeuery result work for non-algebraic operators can be found at http://www.sbql.pl.

Note that this semantics should be understood in full generality and in combination with arbitrarily complex queries that can use all the SBQL operators.

 

6.12.1 Dot operator (navigation/projection)

Binary collection operator.

query ::= query1 . query2

The dot operator returns a bag that includes the union of results returned by second operand query. If first query returns an empty result, the final result will be an empty bag.

Example queries:

Get references of all salaries of the employees:

Emp.sal

Starting from the Toys department, get references of streets of its employees:

(Dept where dName = “Toys”).employs.Emp.address.street

Path expressions are composed from several binary dot operators, e.g. a.b.c.d.e is interpreted as (((a.b).c).d).e The programmer can freely combine path expressions with other SBQL operators.

 

6.12.2 where operator (selection)

Binary collection operator.

query ::= query1 where query2

The second operand query2 must return a boolean value. The where operator returns a bag that includes those elements from the first operand query result for which the second query returns true.

Example queries:

Get references to Emp objects with the salary greater than 1000:

Emp where sal > 1000

Get references to Emp objects with the salary lower than 1% of the budget of his/her department:

Emp where sal < ((worksIn.Dept.budget)/100)

 

6.12.3 join operator (dependent or navigational join)

Binary collection operator.

query ::= query1 join query2

The join operator returns a bag of structures that are build up from pairs (e1, e2) where e1 is an element of first query result and e2 is the result of second query evaluated against this e1. If the first query returns an empty result, the final result will be an empty bag. The operator is known as dependent join or navigational join. Semantically it is much different from the join operator known from the relational algebra, but is able to achieve the same power, and much more.

Example queries:

Get references to all employees with the references to their names:

Emp join lName

Get references to all employees with references to their departments:

Emp join worksIn.Dept

For each department get its reference and the maximum and average salary of its employees:

(Dept as d) join (avg(d.employs.Emp.sal)as a, max(d.employs.Emp.sal)as m)

 

6.12.4 forsome and forall operators (quantifiers)

Binary collection operators.

query ::= forsome(query1) query2

query ::= forall(query1) query2

Operator forsome returns true if the second operand query at least one returns true, otherwise false. Operator forall returns false if the second operand query at least one returns false, otherwise it returns true. Notice that for query1 returning an empty bag the operator forall always returns true.

Example queries:

Is it true that at least one employee earns less than 1000?

forsome(Emp)sal < 1000

Get departments where all employees are females:

Dept where forall(employs.Emp) sex = “female”

Get departments where at least one employee is female:

Dept where forsome(employs.Emp) sex = “female”

Is it true that each department employs an employee earning more than his/her boss?

forall Dept as d ( forsome employs.Emp.sal as s (d.boss.Emp.sal < s))

 

6.13 Auxiliary Naming Operators

ODRA SBQL introduces two auxiliary naming operators, as and groupas. Both are unary operators parameterized by a name.

6.13.1 Operator as

Operator as names each result of the argument query:

query ::= query1 as name

The operand query1 returns a single value r or a bag {r1, r2, …}. Values r, r1, r2, … can be of any type, in particular, they can be references. In the first case the operator returns the binder name(r). In the second case for each result ri in the bag the as operator creates a binder name(ri). The final result is a bag of binders. If the query1 returns an empty bag the result of as operator is an empty bag. There is a lot of contexts where the operator as can be applied, in particular, it can be used to make structures with named components, it can be used as a “variable” bound by a quantifier, as an “iterator variable” in foreach statements, for determining names in results of views, etc.

Example queries:

For each employee return a structure with reference to name attribute as first element and reference to salary attribute as second element. Name the elements N and S correspondingly.

Emp.(lName as N, salary as S)

Get references to employees associated with references to their departments.

Emp as e.(e, e.worksIn.Dept)

 

6.13.2 Operator groupas

Operator groupas names the whole result of the argument query.

query ::= query1 groupas name

The operand query1 may return any result. The groupas operator creates a binder name(result). The final result is a single binder. If query1 returns an empty bag, the result of groupas operator is a binder with the empty bag as a value.

Semantically, the groupas operator has almost nothing in common with the SQL group by operator, although in many cases it allows the programmer to achieve similar effects. Operator groupas is known as the nest operator known from other proposals. SBQL does not introduce the opposite unnest operator, as this role is performed by the implicit operator of binding names (which removes a name from a binder).

Example queries:

Get references to all employees working in the PR department. Name the whole result PR_Staff.

(Emp where worksIn.Dept.dName = “PR”) groupas PR_Staff

Get the name of the department and the names of its employees. The employee names should be grouped together and named Staff. The name of the department should be returned also for the departments with no employees.

Dept.(deref(dName), employs.Emp.name groupas Staff)

 

6.14 Explicit and Implicit Type Conversions and Dereferences

6.14.1 Implicit type conversion

Coercions are functions that change the types and representation of values. In many cases coercions are implicit, to avoid the annoying, too verbose style of programming. For instance, if x is an integer number and y is a real number, then in the query x + y the value returned by x is automatically coerced to a real number. Similarly, in the query:

Emp.(lName + “ earns “ + sal)

the operator + is recognized as concatenation of strings, hence an integer value returned by sal is implicitly coerced to a string.

Implicit type conversion might occur in context of arithmetic and conditional operators as well as in many other situations, including method invocations and assignment statements. Implicit conversion is allowed from integer to real and from integer or real to string when the operator + is recognized as concatenation of strings. The imlicit conversion rule is assigned to many operators, including equality (‘=’) and non-equality (‘<>’) operators.

Another kind of implicit coercions concerns changing bags or sequences into single elements, and v/v. For instance, in SQL a select clause can occur within a where clause, but in this case the result of the select clause is automatically coerced to an individual value. Such coercions are typical for SBQL, which unifies a structure, bag or sequence having one element x and the element x itself. For instance, one can use a query (get employees earning more than Kim):

Emp where sal > ((Emp where lName = “Kim”).sal)

The query assumes implicit coercion of the bag returned by the sub-query:

(Emp where lName = “Kim”).sal

into a single value (of the Kim’s salary). If this is impossible, because the company does not employ a person named Kim, or the company employs more than one person named Kim, the dynamic type check will return a typing error (exception).

6.14.2 Implicit dereference

Another commonly assumed kind of implicit coercions are implicit dereferences. Assuming an object <i, n, v>, the dereference of i returns v. For instance in the query:

 Emp where lName = “Poe”

the sub-query lName returns a reference; it is automatically dereferenced to a value of the object pointed to by this reference.

In some cases there is also a need for explicit dereference operators. This is possible by the operator deref (described previously).

6.14.3 Explicit coercions (casts)

The syntax for an explicit coercion is the following:

query ::= (type_name)query1

where type_name is the name of the simple type and operand query1 returns a simple type value.

The explicit coercion allows the programmer to perform type change other than the default. For instance in the query:

(string)2 + 2

the first operand is explicitly coerced to string. In consequence the operator + performs automatic coercion to string of the second operand and the result of the query is the string “22”.

The explicit simple type coercion follows the rules presented in the following table:

Table 6-2. Type coercions

Coerce to:

integer

real

string

date

boolean

integer

-

real

string

error

error

real

truncate fraction

-

string

error

error

string

integer (with runtime check)

real (with runtime check)

-

date (with runtime check)

boolean (with runtime check)

date

error

error

string

-

error

boolean

error

error

string

error

-

 

6.15 Conditional operator (if…then…else…)

The syntax of conditional operator is following:

query ::= if query1 then query2

query ::= if query1 then query2 else query3

The first operand query1 must return a boolean value with the cardinality [1..1]. If it returns true the second operand query2 is evaluated; otherwise the third operand query3 is evaluated. For the first version of the operator (without else) we assume that if query1 returns false, then the result of query is an empty bag.

Example queries:

If the average salary of employees is less than a 1000, get references to employees earning less than a 1000, otherwise get references to employees earning less than average.

avg(Emp.sal) as a.( if a < 1000

                    then Emp where sal < 1000

                    else Emp where sal < a )  

 

6.16 Ordering Operator

The ordering operator allows one to sort a query result according to a given key (keys). The syntax is following:

query ::= query1 orderby query2

The first operand query1 returns a bag – a subject to the order by operation. The second operand query2 determines a key. The result of query2 (the key) is a structure (generally a bag of structures) which elements represents sorting keys. The domain of key values must possess the property of linear ordering. If the structure contains one element, this is the only sorting keys. If the structure contains more than one element first the sorting is executed against first element in the structure (the first key), next the result is sorted against second element in the structure (the second key), etc. The process follows until the last structure element (the last key).

Example queries:

Order employees by age.

Emp orderby age

Order employees by age and then by last name.

Emp orderby (age, lName)

Ordering of strings is performed according to the lexical order of characters and strings in English. Till now ODRA does not support alphabetic ordering for other native languages, but such options are considered for further development.

 

6.17 Range operators

Range operators allow for selecting required elements from an operand treated as a sequence. ODRA has two operators that belongs to this group. The first one uses the traditional square brackets syntax known from many languages. The second one (rangeas) is based on a named numerical index.

 

6.17.1 Operator […]:

 

query ::= query1[query2]

 

The first operand query1 returns a sequence (or a bag; in this case it is randomly converted into a sequence). The second operand query2 returns a bag of integer values that determine indices of elements in the query1 result sequence. All sequences are indexed starting from 1 (unlike C, C++ and Java, which number sequence elements starting from 0). The result of the operator is a bag of sequence elements having given indices. If an index is negative or bigger then the sequence size then it does not contribute to the result (it is ignored). Thus the minimal cardinality of the result bag is 0 (the query2 result is empty or all the returned indices are out of range). The maximal cardinality of the result bag is equal to the maximal cardinality of the query2 result.

Example queries (c.f. Fig.2-4):

Take the first element from the integer sequence:

((bag(9, 1, 3, 5, 8, 7) as p) orderby p)[1] 

                              //result is: bag{p(1)}

 

Take the second and the fourth element from the integer sequence:

((bag(9, 1, 3, 5, 8, 7) as p) orderby p)[bag(2,4)] 

                              //result is: bag{p(3),p(7)}

 

Take the fifth and seventh element from the integer sequence:

((bag(9, 1, 3, 5, 8, 7) as p) orderby p)[bag(5,7)]

               //result is: bag{p(8)}

               //because the sequence has only six elements.

Get the last name of the best-paid employee:

(Emp orderby –sal)[1].lname

Get 3 departments with the smallest number of employees:

(Dept orderby count(employs))[bag(1,2,3)]

Get the median of salaries:

(Emp orderby sal)[(integer)(count(Emp)/2)].sal

 

6.17.2 Operator rangeas:

Unary operator parameterized with a name.

query ::= query1 rangeas name

The operand query1 returns a sequence (or a bag; in this case it is randomly converted into a sequence). If the result of query1 is seqence(e1,e2, …eN) then the result of rangeas query is the bag of structures

bag{ struct{e1, name(1)}, struct{e2, name(2)}, …, struct{eN, name(N) } }

where the first structure element is the element of parameter sequence and second one is a binder with the name name and the value being the index of an element in the parameter sequence.

In contrast to the square brackets operator, the use of the rangeas operator does not extracts given elements from the sequence. Its role is to prepare the sequence for further processing by other query operators.

Example queries (c.f. Fig.2-4):

Return 10 best paid employees:

((Emp orderby -sal) as e rangeas i where i <= 10).e

Return employees having the rank in the salary category (descending) at least on 10 higher from the rank in the age category (descending):

 

((((Emp orderby -sal) as e rangeas i)

    orderby -e.age) rangeas j )

    where i >= j+10).e

Return the average salary of employees, disregarding 25% of the worst paid and 25% of the best paid ones.

avg((((Emp orderby sal) rangeas r)

  where r > (integer)(0.25 * count(Emp))

    and r < (integer)(0.75 * count(Emp))).sal)

 

6.18 Transitive Closures

Binary collection operators.

query ::= query1 closeby query2

query ::= query1 closeuniqueby query2

query ::= query1 leavesby query2

query ::= query1 leavesuniqueby query2

A transitive closure in SBQL is an operator having the syntax q1 close by q2, where q1 and q2 are queries. Let final_result be the final result of q1 close by q2, let union denote the bag union and let dot denote projection/navigation (as usual in SBQL). Semantics of this query can be expressed as a least fixpoint equation:

final_result = q1 union (final_result.q2)

or as an infinite iteration (continued till some next component will be Æ):

final_result = q1 union q1.q2 union q1.q2.q2 union q1.q2.q2.q2 union ...

Note that the transitive closure concerns any query results returned by q1 and q2, thus the relation being the subject of the closure is calculated on-the-fly and need not be stored in the database. This implies that the operator can perform any computations.

SBQL offers the following variants of the transitive closure:

·         close by, as described above;

·         leaves by, which returns only leaf objects, i.e. objects which do not result in adding any further element to the result set;

·         close unique by which eliminates duplicate elements on the fly to avoid infinite cycles;

·         leaves unique by, which eliminates duplicate elements on the fly to avoid infinite cycles and returns only leaf elements;

Example queries:

Let’s consider a simple data schema concerning parts, similar to descriptions used in Bill of Material (BOM) applications. Each Part has name and kind. If the kind is “detail”, the part has also detailCost and detailMass (the cost and mass of this part) and has no assemblyCost, assemblyMass attributes. If kind is “aggregate”, the part has no detailCost and detailMass, but it has assemblyCost and assemblyMass. The attributes represent the cost of assembling this part and mass added to the mass of the components as the result of the assembly process. Aggregates have one or more Component sub-objects. Each Component has the amount attribute (number of components of specific type in a part), and a pointer object leadsTo, showing the part used to construct this part. A SBQL schema for this example is depicted in Fig.6.1.

6-1. A Bill-of-Material example schema

The simplest transitive closure SBQL query over this schema finds all components of a part named “engine”.

(Part where name = “engine“) closeby (Component.leadsTo.Part)

This query first selects parts having name attribute equal to “engine”. The transitive closure relation is described by the subquery (Component.leadsTo.Part). It returns all Part objects which are reached by the leadsTo pointer from already selected objects.

One of the basic BOM problems, i.e. “find all components of a specific part, along with their amount required to make this part”, may be formulated using the transitive closure as follows:

((Part where name=”engine”), (1 as howMany))

closeby (Component.((leadsTo.Part), (howMany*amount) as howMany))

The query uses a named value in order to calculate the number of components. The number of parts the user wants to assemble (in this case 1) is named howMany and paired with the found part. In subsequent iterations the howMany value from parent object is used to calculate the required amount of child elements. It is also named howMany and paired with the child object.

The above query does not sum up amounts of identical sub-parts from different branches of a BOM lattice. Below we present a modified query which returns aggregated data – sums of distinct components from all branches of the BOM tree:

((((Part where name=”engine”) as x, (1 as howMany)) closeby (Component.((leadsTo.Part) as x, (howMany*amount) as howMany)) )

groupas allEngineParts).

((distinct(allEngineParts.x) as y).

(y, sum((allEngineParts where x=y).howMany)))

This query uses grouping in order to divide the problem into two parts. First, all the components named x, along with their amounts named howMany are found. The pairs are then grouped and named allEngineParts. The grouped pairs are further processed, by finding all distinct elements and summing the amounts for each distinct element.

This query could be further refined, in order to remove all aggregate parts (so only the detail parts will be returned). There are many ways to accomplish this goal. One of them is to use the operator leaves by in place of close by. The operator leaves by returns only leaf objects, i.e. objects which do not result in adding any further objects to the result set:

((((Part where name=”engine”) as x, (1 as howMany)) leavesby (Component.((leadsTo.Part) as x, (howMany*amount) as howMany)))

groupas allEngineParts).

((distinct(allEngineParts.x) as y).

(y, sum((allEngineParts where x=y).howMany)))

Thanks to the full orthogonality (including orthogonal persistence) SBQL can perform calculations without referring to the database The query below calculates approximation of the square root of a, using the fixpoint equation x = (a/x + x)/2, starting from x=1 and making 5 iterations.

((1 as x, 1 as counter)

closeby (((a/x + x)/2 as x, counter +1 as counter) where counter <=5)).

(x where counter = 5)

If a queried graph contains cycles, to avoid infinite loop the programmer can use operators closeuniqueby (close unique by) and leavesuniqueby (leaves unique by). The operators remove duplicates on the fly after each closure iteration, thus cycles do not imply infinite loops. In all other aspects the operators are similar to closeby and leavesby.

 

6.19 Function, Procedure, and Method Calls

The syntax for a procedure and functional procedure call is the following:

    query ::= name([parameters])

where name is the procedure name and the parameters is an optional list of actual procedure parameters (separated by semicolons):

    parameters ::= parameter[; parameters]

    parameter ::= query

A method call differs form a procedure call by the context of calling. A method call requires a reference to an object thus can be called in the context of a non-algebraic operator (see: non-algebraic operators) or a non-algebraic-like operator (e.g. transitive closure, ordering, foreach) that operates on class instance references.

Example queries:

Call the procedure named add with two integer parameters:

add(12, 35)  

Sum salaries of employees working in the PR department. Assume that Emp are instances of EmpClass with defined methods getDepartmentName() and getSalary()

sum ((Emp where getDepartmentName() = “PR”).getSalary())

 

 

 

Last modified: June 28, 2008

 

 



[1] In other sources SBA is also referred to as Stack-Based Approach

[2] This statement is frequently confused with the assertion“the order of operands is significant”. We are talking about the order of evaluation of operands rather than the order of operands.