Environment
Stack, Name Scoping and Binding
by Kazimierz Subieta (January 2006)
Back to Description of SBA and SBQL.
The concept of environment stack
(a.k.a environmental stack or call stack) appeared in 50-ties or early 60-ties
of the previous century, together with compilers of oldest high-level programming
languages such as Algol-60. From that time the stack is a basic concept of
semantics and implementation of majority of well-known languages, including
Pascal, C, C++, Smalltalk, Java, C#, etc. The idea of the stack is well-known
for the developers of compilers. It is simple and obvious, but not always well
explained in manuals and textbooks. According to our experience, the database
community folks rarely realize what the stack is for, what problems it solves
and which properties it has. It is quite common confusing it with another stack
known as arithmetic stack or result stack which is necessary for
quite different reasons. Lack of proper knowledge, simplistic reasoning,
neglecting hard semantic problems and hurrah-theoretical attitude are probably
the reasons that developers of popular theories of query languages (such as
relational algebra, relational calculus and formal logic) totally ignore this
concept, or consider it an implementation issue having no relevance to
high-level semantics. Unfortunately for the world (and fortunately for us) such
theories are dramatically limited w.r.t. to a lot of semantic phenomena that
occur in query languages such as SQL. Moreover, richer database models, such us
object-oriented and XML-oriented models, show total inadequacy of these
theories. Even simplest queries, such as
2+2 or select
Salary - Tax from Employees are non-expressible in these theories.
However real problems with these theories appear when one would like to
describe unified semantics for languages that integrate querying and
programming. Nowadays there are many such languages, including recent SQL
standards, Oracle PL/SQL, MS Transact SQL, some 4GLs, and others. Therefore we
abandon these theories as totally worthless for our purposes. Instead of them
we introduce concepts with no such limitations. We will show that all
advantages of these theories over our concepts, including their potential for
query optimization, distributed query processing, etc., are overrated and inessential.
We do that in a way that is better, more general and corresponding to real
practical query languages.
In this way we are coming back to the idea of an environment stack. We
will consider its roles in programming languages to conclude that the same
reasons for introducing it occur in query languages. To this end, however, we
will need to modify the stack structure to accomplish new features stemming
from the properties of query languages.
Environment
Stack in Programming Languages
The concept of environment of
program execution denotes all the run-time entities (variables, constants,
objects, functions, procedures, types, classes, etc.) that are available at the
given point of the program control. The environment is not a flat structure and
is changing during the run of the program. The most convenient (and clear for
the programmers) assumption is that that the environment is subdivided into
sub-environments that appear and disappear during the program run. From the
ergonomic point of view there is a need to follow some principles that govern
the appearance and disappearance of the sub-environments, as follows:
These principles lead to the concept of environment stack, which is just
responsible for appearance and disappearance of sub-environments related to
particular procedures. The stack fulfills the following roles (in our
terminology functions and methods are procedures too):
A stack is a main memory (or virtual memory) data structure and is
assigned to a single client application program or to a single process or
thread. It is subdivided into sections,
where each section (also called activation
record) corresponds to a particular sub-environment. A section is
associated with a particular procedure
call or an executed program block.
When the control is shifted to a procedure call, a new section with all
entities local to this call is pushed on the top of the stack. The same is done
when the control is shifted to a module (except that a return track is
unnecessary). The section is popped from the sack when the procedure or program
block is terminated. For the procedure that is currently running all values of
parameters, local variables/objects and any other local entities are stored
within the top stack section.
Note that the environment stack has more functionality than a typical
stack understood as an abstract data type with four operators: push (push a new element on the top), pop (remove an element from the top), top (read top element) and isEmpty? (checking if the stack is
empty). Additional functionalities concern name binding (which implies the
search on the entire stack), scoping rule (skipping visiting some sections) and
(in some cases) inserting new sections in the middle of the stack. All these
additional functionalities stem from the definition of semantics and will be explained
later.
Name Binding
Binding is a compiler
or run-time action that for a given name occurring in a source program
subordinates a corresponding run-time entity, e.g. a main memory address, an object
identifier, a start addresses of an executable procedure code, etc.
For instance, binding of a variable X
is a substitution of this name by a main-memory address, where the value of X will be recorded. All occurrences of X in a source code are substituted by
this address.
Providing an environment stack stores environments together with all
source names of run-time entities, a binding action for some name X is performed according to the
following steps:
In fact, this search is usually done during compile time on the static
environment stack rather than on run-time stack; we will return to this issue a
bit later. In Fig.14 we show the case when name X occurs within a block b
that is a part of procedure p2 that
is called from procedure p1. The
control is within the block b. Arrows
show the order of searching the stack. The search is terminated when X is found.

Fig.14. Example situation on the environment stack
In Fig.14 we also show the scoping rule: variables and actual parameters
of the procedure p1, and perhaps former sections of procedures, are not seen.
The search is going directly to global variables. Procedures p1 and p2 can be written at different time, by different programmers who
do not and need not to communicate and make some agreement. The programmer
writing procedure p1 can use the name
X and the programmer writing
procedure p2 can use X, but these two X denote different things. Due to scoping rules X occurring within p1 is not seen when p2 is
processed, thus there is no conflict.
If a programming language introduces modules, classes, inheritance and
perhaps other notions the situation on the environment stack and scoping rules
can be much more complex. We will discuss this issue after we introduce AS0-AS3
store models.
The stack mechanism makes it possible to accomplish the following
features of programming languages:
As we will show, the above features have fundamental meaning for query
languages, allowing one to accomplish their properties in a simple, consistent
and universal way. This concerns, in particular, the possibility to define
auxiliary names (“correlation variables”) that are defined within
queries, the possibility of nesting queries, scope rules for all names
occurring in queries, combining in queries names of database properties with
names of applications’ properties, and so on. The stack mechanism is
absolutely inevitable if one seriously thinks on integration of queries with
programming constructs and abstractions. Lack of the environment stack concept
in artifacts such as various algebras (object algebras, in particular),
F-logic, Datalog, monoid calculus, OQL, XQuery, etc. in advance sentences them
to limitations and handicaps.
Static and Dynamic Environment Stack
In languages with early
(compile-time) binding names occurring in a program have the second-class
citizenship, i.e. they are lost after compilation. The names do not exist
during run-time. For such languages the environment stack must exists in two
forms: a static stack that is managed
during compilation, and a dynamic stack
managed during run-time. The binding of source program names is performed on
the static stack. During compilation this stack simulates the behavior of the
dynamic stack, i.e. it grows and shrinks according to the moving of the
syntactic analysis control point along the source program being compiled. In
such cases binding is subdivided into two phases. In the static phase binding
of variables is expressed relatively to the size of stack, i.e. for each
variable name the static binding returns so-called offset, e.g. the distance in bytes between the top of the dynamic
stack and the address of the run-time entity corresponding to this name.
Eventually the binding is performed during run-time, when the address of the
stack top is known. For instance, the address of the variable name is
calculated by subtracting the offset from the address of the stack top.
Due to some irregularities of data structures practically all languages
work with more advanced dynamic binding, when variable names (in some
representation) still exist on the dynamic stack. In interpreted (script)
languages the binding is fully dynamic, i.e. names of variables, together with
their values, physically exist on the dynamic stack. The subdivision between static binding (during compile time) and
dynamic binding (during run time) is
almost always a bit rough and implementation dependent: usually the static
binding is supported by calculations during run time, and dynamic binding
operations (for optimization) are partly done during compile time.
Query languages are interpreted hence for them the dynamic binding and
dynamic stack is relevant. To avoid considerations related to physical
implementation of the static and dynamic stacks we assume that all bindings are
dynamic. Hence at least at the beginning we avid the static stack. All run time
entities that are necessary for query processing will be stored at the run-time
stack together with their source names. In fact, majority of the semantics of
query languages we are able to explain without referring to the static
environment stack.
However, some issues in query languages require the static stack. These
are the following:
In our further discussion we present the stack in conceptual rather than
physical form. In this way our considerations will be still abstract and
high-level. Of course, physical implementation of the stack can much differ in details from our
conceptual view.
Go to Environment
Stack in SBA in the AS0 Store Model.
Last modified: December 31, 2007