SBQL for the AS1 Store Model
Back to Description
of SBA and SBQL.
Back to the AS1
store model.
In the AS1
store model we have introduced a set of objects C representing classes, a relation
CC representing inheritance among classes and a relation SC representing the
membership of objects in classes. In this section we specify how these new
features can be involved into the mechanism of query evaluation based on SBA.
In fact, the corresponding extensions of semantics and implementation
mechanisms are simple and natural. We have to consider possible corrections to
the features of SBA that we have introduced to the model AS0, namely:
·
The
domain Result of results returned by
queries;
·
The
construction of the ENVS stack;
·
Rules
for binding names on the ENVS stack;
·
The
construction of QRES stack;
·
Function
nested;
·
Procedure
eval for evaluation of queries.
Among six presented features only the last one
is affected by the change of the store model from AS0 into AS1. The idea is
that for a non-algebraic SBQL operator that acts on an object that is a member
of some classes the ENVS stack is augmented not only by the internal features
of this object, but also by the internal features of all the classes that the
object belongs to. The internal features of an object and of its classes are
calculated by the same function nested
and pushed on ENVS in the proper order, which reflects a psychological or
ergonomic view of the programmer: the object being processed is most close to
his/her imagination, then he/she considers its direct class, then its
superclass, etc. In this way invariants of the object that are stored within
its classes would become available for binding during processing of the object
by a non-algebraic operator. The situation on ENVS during processing an object
by a non-algebraic operator is presented in Fig.32.

Fig.32. A fragment of an object and class
diagram and the state of ENVS during
processing of an object by a non-algebraic operator
For instance, consider the situation presented
in Fig.9 and assume that we have a
query:
(Emp where
name = “Poe”) . (name, netSal, age)
In Fig.33 we present the situation on ENVS when
the dot operator processes the Poe’s object. In the subquery after the
dot name is bound on the top section
of the stack (binders to internal properties of the Poe’s object), netSal is bound in the section below the
top (binders to internal properties of the EmpClass),
and age is bound in the third section
from the top (binders to internal properties of the PersonClass).

Fig.33. State of ENVS during processing the
Poe’s object by the dot operator
Let us to assume that a function allClasses(e) acts on any query result. For e being a single object identifier it returns sequence{ c1,
c2, ..., ck } of identifiers of the
classes such that:
<e,
ck> Î SC, <ck, ck-1>
Î CC, ... , <c3, c2> Î CC, <c2, c1> Î CC
In other words, the function allClasses(e) returns the identifiers of all the classes that the object
identified by e is a member, in the
order from the most general class c1
till the most specific class ck.
For e being struct{ i1,
i2, ..., it} we assume that:
allClasses(e) = allClasses(i1) ° allClasses(i2) ° ... ° allClasses(it)
where ° is a concatenation of sequences. For all other forms of e the function allClasses(e) returns the
empty result. So far we do not consider multiple inheritance and the situation
when classes identified by e = struct{ i1, i2,
..., it} are in some
conflict. We discuss the cases a bit later.
The corresponding part of the procedure eval that processes a non-algebraic
operator can be described as follows.
procedure eval( q : query ) // change for the AS1 store
model
begin
.......
case q is
recognized as q1 θ q2: // the query q consists of a non- algebraic operator θ joining subqueries q1 and q2
begin
partialResults: bag of Result;
partialResult, finalResult, e: Result;
partialResults := Ø; //empty bag
eval( q1 );
//Evaluation of the first query; the result is at the top of QRES
for each e in top(
QRES ) do
begin
for each c in allClasses (e) do
push( ENVS, nested(c)); //pushing internals of classes at ENVS
push( ENVS, nested(e) ); //new section at ENVS
eval( q2 );
//Evaluation of the second query; the result is at the top of QRES
partialResult := partialResultOfθ (e, top(QRES) );
partialResults := partialResults ∪ { partialResult }; // new partialResult included to the partialResults bag
pop( QRES ); //removing the result(q2)
from QRES
pop( ENVS ); //removing the
top section with nested(e) from ENVS
for each c in allClasses (e) do
pop( ENVS ); //removing class sections from ENVS
end;
finalResult := mergePartialResultsθ( partialResults ); // forming the final result
pop( QRES ); //removing the result(q1)
from QRES
push( QRES, finalResult); //the final result of the
entire query is pushed at the top of QRES
end
.......
end
Old parts of the procedure are in blue, the new
parts are in black. These simple changes, plus the mechanism of invoking
methods (see below), are the only amendments to the SBQL query engine. However,
in the following we will show that the AS1 model implies difficulties with
multiple inheritance and with collections, hence some next decisions and
improvements can be necessary.
Our policy of pushing internals of classes on
ENVS automatically supports the property known as overriding. If for instance a method income is defined for both classes, PersonClass and EmpClass,
then in the query:
Emp where
income > 2000
the name income
is bound to the income method that is
a property of the EmpClass; the income method within the PersonClass is invisible (is
overridden). The policy in general supports the substitutability
principle. In particular, if in a correct query having name Person one substitutes this name by Emp, the query is still correct; for
instance:
Person where
age > 20
and
Emp where
age > 20
are both correct queries. Substitutability,
however, leads to some semantic problems with collections and with updating
operations. We will return to this issue later.
Invoking
Methods in AS1
Till now we have assumed that binding a name is
simply a search within ENVS for a proper binder (binders) and then, return its
value (values). In case of binding names of methods this action is extended by
invoking the method. So far we are unable to present all the semantics related
to invocation of methods; this will be done in the section devoted to
imperative (procedural) abstractions of SBQL. Below we roughly explain what
should happen when the name m of a
method is to be bound:
·
Before
the binding of the method its parameters are evaluated and stored at QRES.
·
The
name m is bound on ENVS as
previously.
·
After
the binding the method m is invoked.
The invocation requires a new section that is pushed at the top of ENVS
(so-called activation record).
·
The
new section contains binders to the local environment of the method m, which consists of binders to the
actual method’s parameters (taken from QRES) and binders to local
method’s objects. Local objects and (sometimes) values of parameters are
stored in a special (volatile) part of the object store. The section contains
also so-called return track, i.e.
information on the place of the program where the method was invoked.
·
The
program control goes to the method body.
·
When
the control leaves the method body, the mechanism performs the following
actions:
§
If
the method is terminated by a “return
q” command, where q is a query, result(q) is pushed on
QRES as usual.
§
The
volatile part of the object store dedicated to this method is cleaned.
§
The
return track is recorded, and then the new section is removed from ENVS.
§
The
control goes to the method calling program according to the return track.
The action is a bit more complex if the class
that the method m belongs to is
subdivided into public and private parts; we return to this issue when we
consider the AS3 store model. Note also that during executing of the
method’s body a part of the ENVS stack should be invisible due to the
lexical scope rules; this issue is also discussed later.
For instance, let us to consider a class MyClass that contains a method MyMethod having the following code:
method MyMethod(
p1: T1, p2: T2): T {
x1: T3; x2:
T4;
method body;
return q;
};
MyMethod has two parameters: p1 of type T1 and p2 of type T2, and returns
the result (determined by query q) of
type T. MyMethod declares two local
objects x1 of type T3 and x2 of type T4. Let MyMethod be invoked in a query:
q0 where MyMethod(q1, q2) = 1000
where q0 is a query returning identifiers of
objects being members of the class MyClass,
q1 and q2 are actual parameters of MyMethod, which is called in the context of the non-algebraic
operator where (the same reasoning
is for other non-algebraic operators and for any other query involving MyMethod after such an operator). Let iMC denotes the identifier of
the MyClass object and let result(q0) = bag{r1, r2, ...}.
Fig.34 shows some states of ENVS (above
the time line) and QRES (below the time line).

Fig.34. Evaluation of a method
The above picture will be discussed in the
section devoted to imperative abstractions, where we will consider methods of
parameter passing and general scope rules.
Multiple
Inheritance
The definition of the AS1 store model allows
for multiple inheritance among classes and multiple membership of objects in classes.
The multiple membership can be reduced to multiple inheritance, hence we do not
consider this case separately. If we allow that a class can inherits from two
or more super-classes, in general it is possible that there will be a conflict.
Let class C inherits from A and B. If A contains a method named m and B contain a method named m, then the class C can inherit only one
of the methods; the other one will be invisible. In the AS1 model there is no
cure for such a situation if we assume the open-close
principle; i.e. when classes A and B are developed independently, e.g. by
different external companies and their source codes are unavailable. In this
case the class C, which inherits both from A and B, but one of the methods
cannot be inherited, violates the substitutability
principle. For instance, let C inherits the method m from A, but not inherits the method m from B. In such a case objects of the class C cannot be
considered as objects of the class B. Some proposals allow in such a case for
changing (virtually) during inheritance the name of the method m for the class C; for instance, the
method m that is not inherited from B
during inheritance receives a virtual name mB.
This option solves the problem only partially, when a given object is
considered as a member of C. When an object of the class C is considered as an
object of the class B (due to substitutability) it still requires the method m, which will be taken from A rather
than from B.
Despite this anomaly, multiple inheritance is a
very useful option for conceptual modeling, hence it is not a good strategy to
forbid it. In particular, multiple inheritance makes it possible to involve
abstract classes, i.e. classes having no members, including so-called mixin classes, i.e. classes storing
invariants of many different classes. The model with multiple inheritance is
introduced in many practical proposals, including programming languages such as
C++, the CORBA standard, the ODMG standard, etc. Smalltalk and Java, however,
do not introduce multiple inheritance, what can be considered disadvantageous
for conceptual modeling. In programming languages such as C++ multiple
inheritance leads also to some problems with physical representation of
objects.
In our case the multiple inheritance can be
involved into the definition of the function allClasses, which returns the sequence of class identifiers for a
given object (objects), in the reverse order. In case of multiple inheritance
there are many such possible orders and the algorithm for calculating the
function must choose one of them. It is not important which of the orders is to
be taken, because in the case of conflicts between names of methods no good
order exists.
A problem with multiple inheritance is the
result of mixing in one environment properties coming from different
independently developed classes. Such a mix must sometimes lead to naming
conflicts. The problem has a radical solution in the AS2 store model, which
allows one to substitute multiple inheritance by dynamic object roles and
dynamic inheritance.
A problem similar to multiple inheritance appears
when the function allClasses(e) acts on e = struct{ i1,
i2, ..., it}, where i1, i2, ..., it
are identifiers of objects being perhaps members of different classes. In this
case the conflict between names of methods occurring in these classes is also possible.
This time, however, the conflict can be explicitly removed by the programmer by
using proper SBQL options, for instance, by auxiliary naming.
Collections
in the AS1 Store Model
The biggest problem with the AS1 model concerns
collections. The problem is recognized, in particular, in the ODMG standard.
There is a conflict between substitutability and collections. According to
substitutability, each Emp object is
also a Person object. However, in the
AS1 model each object has one name hence it belongs to one collection. For
instance, if some non-algebraic operator processes Person objects, one can expect that it automatically will take also
all Emp objects residing in the same
environment (e.g. a database). However, the standard binding mechanism
disallows for that. For example, (c.f. Fig.9)
the query:
Person where
age < 20
will take into account the Doe’s object
only. Poe and Kim will not be processed, because the name of corresponding
objects is Emp rather than Person.
One can consider in such a case that the query
can be formulated as follow:
bag(Person,
Emp) where age < 20
Such a solution is not only awkward, it still
does not solve the problem due to the open-close
principle. Assume that one writes a PersonCollectionClass
for the PersonClass, implementing all
the required methods. Then, much later someone defines a specialized class EmpClass class that inherits from PersonClass. After that, some amendments
must be introduced to the PersonCollectionClass
and the class must be recompiled. This can be impossible if the source code of
the PersonCollectionClass is
unavailable. Anyway, the open-close principle is violated.
There are several solutions of this problem,
but all of them require some additional assumptions concerning the AS1 model
and some changes to the binding mechanism. In particular, we can assume that
each object has as many external names as classes it belongs to. For instance,
(c.f. Fig.9), we can assume that
the Poe’s and Kim’s objects have additionally the external name Person. The binding mechanism is
modified in such a way that whenever a binder Emp( oid ) is put on
ENVS, the binder Person( oid ) is put
on ENVS too. For instance, the bottom section of ENVS presented at Fig.33 will
have the binders:
Person(i1), Person(i5), Person (i9),
... , Emp(i5), Emp(i9), ...
The strong typing mechanism should assure in
such cases that specific properties of Emp
cannot be used when the object is regarded as Person.
Alternatively, we can assume that each concrete
class contains a name of its members as an invariant. This method does not require
assigning many names for an object. For instance, the PersonClass requires that all the members have the name Person, and the EmpClass requires that all the members have the name Emp. The binding mechanism is modified
in such a way that, for instance, when the name Person is bound first the class containing such name as an
invariant is identified, PersonClass
in this case. Then, according to the CC relation, all classes that inherit from
the PersonClass are identified, in
particular, the EmpClass. Then, the bind function is extended to bindAll function that takes all
invariant names of objects from all the subclasses of the given class. For
instance, it holds:
bindAll( Person
) = bind( Person ) È bind( Emp )
bindAll( Emp
) = bind( Emp )
In further examples concerning SBQL for the AS1
model we assume this method. Classes with invariant names of their member
objects are implicitly assumed in UML class diagrams and in the ODMG standard.
Such a binding strategy solves the problem, but
the requirement that each concrete class must possess an invariant name for all
its members may introduce limitations for some applications. As some cure for
this inconvenience, we can assume the possibility that a specialized class may
“override” this invariant, in particular, it may drop it. For
instance, an EmpClass having Emp as the invariant object name can be
specialized by AnyEmpClass, which
inherits everything from EmpClass but
the invariant object name. Alternatively, we can assume that the name of an object
is determined by its class, but the programmer can change it for a concrete
object (after such a change the object is not seen as a member of an extent of
the class, although inherits from it all the properties). Another solution is
that an object name is determined by its class, but the programmer can assign
to the object an arbitrary alias. Yet another solution is that an invariant
name determined by a class concerns only persistent (database) objects and do
not concern volatile objects being local to a user session or belonging to
local environments of procedures or methods. There are perhaps other solutions.
Without experiments made for real applications is hard to say which is the best
for the further programmers.
In AS1 there is also a problem with typing of
pointer objects. In the tradition of programming languages, a pointer object is
specified by the type of an object
that it points too. The tradition of databases is that a pointer link is
specified by the name of an object
that the pointer link leads to. Such an assumption is necessary if one wants to
specify precisely a database schema, which (as we have discussed previously) is
an inherent pragmatic part of any query language. The assumption is explicitly
taken by the ODMG standard and implicitly by a lot of other proposals aiming
specification of query languages. Trying to make a coincidence of these two
ideas we can say that a pointer link should be specified both by the type of an
object it leads to and by the name of the object. Assuming that an object name
is an invariant for a class we can reconcile both ideas with no such pain:
pointer objects will be specified by the name of the class of an object that
the pointer leads too, and the class will contain both the type of the object
and its invariant name. More detailed discussion on typing issues for SBQL we
present later.
The AS1 has also other disadvantages, for
instance, lack of multi-aspect inheritance, lack of repeating inheritance, etc.
As in the case of the problem with multiple inheritance, the radical solution
to all the problems with the AS1 store model is the AS2 store model.
Examples of SBQL Queries for the AS1 Store Model
Fig.35 presents an object-oriented database
schema as an UML-like class diagram. Note that cardinalities are assigned not
only to relationships, but to any object declaration, to any attribute, etc.
Lack of cardinalities means the default cardinality [1..1]. Relationships such
as worksIn/employs should be
understood as declarations of pointer objects. Address is a complex attribute
(which cannot be directly expressed in UML). Person, Emp and Dept are invariant names of the member
objects for the corresponding classes. Binding of name from a super-class means
implies binding all names from its sub-classes; for instance, binding of Person implies additional binding of Emp. The method budget returns the annual budget of a department.
Fig.35. A database schema according to the AS1
store model
|
E.AS1.1 |
Get names of departments and the average age of
their employees (inheritance of the method age). |
|
SBQL: |
Dept . struct(dname, avg(employs.Emp.age() ) ) |
|
SBQL: |
Dept . (dname, avg(employs.Emp.age)) |
|
SBQL: |
Dept . (dname, avg(employs.Emp.age) as avgAge) |
In general, methods having no parameters can be
written without parentheses, for instance, age
is an equivalent to age(). The last
query returns a bag of structures struct{iDept, avgAge(some_real) }
consisting of the identifier of a department iDept and a binder with name avgAge and some real value.
|
E.AS1.2 |
Get name, the net salary and the boss name
for programmers working in departments located in “ |
|
SBQL: |
(Dept where $ loc as x (x =
“ join (employs.Emp
where “programmer” in job). (name as
name, netSal as netSal, (boss.Emp.name)
as boss ) |
The query returns a bag of structures having
three binders: struct{ name(iname1), netSal(some_real), boss(iname2)
}.
|
E.AS1.3 |
Get employees that for sure live in the
cities where their departments are located (inheritance of Address). |
|
SBQL: |
Emp where $ Address as a ($ (worksIn.Dept.loc) as l (a.city = l)) |
|
E.AS1.4 |
For each employee get the name and the
percent of the annual budget of his/her department that is consumed by
his/her salary. |
|
SBQL: |
Emp . (name
as n, (((if exists(sal) then sal else 0) as s). ((s * 12 * 100)/(worksIn.Dept.budget)) as
percentOf Budget) |
|
E.AS1.5 |
Imperative construct: for each person having
no salary give the minimal salary in his/her department. |
|
SBQL: |
for each (Emp where not exists(sal)) as e do
e.changeSal( min(e.works_in.Dept.employs.Emp.sal) ); |
In this example we have assumed that the method
changeSal inserts a subobject sal with a proper value into the given Emp object. We also assume that for each
department there is at least one employee having the salary. Note that the
program is not optimal - it counts the minimal salary for each employee having
no salary. More optimal program should calculate the minimal salaries for
proper departments, then get it to proper employees, for instance:
|
SBQL: |
for each (Dept where $ employs.Emp (not exists(sal)) as d join min(d.employs.Emp.sal) as
minSal do for each (d.Employs.Emp where not exists(sal)) as e do e.changeSal( minSal
); |
This manual optimization can be substituted by an
automatic optimization of the first SBQL program, but developing a proper
algorithm could be a challenge.
Last modified: December 31, 2007