|
|
Developed
at the Polish-Japanese Institute of Information
Technology © Copyright by ODRA team, © Copyright by PJIIT |
|
ODRA – Object
Database for Rapid Application development Description and Programmer Manual |
|
|
by Jan Murlewski and the ODRA team |
|
17. ODRA Distributed Database |
|
17.1 The Idea of Distributed DatabasesODRA provides advanced mechanisms
for creating distributed databases based on the parallel architecture. Such
an approach allows for creating a virtual repository for transparent
processing of distributed ODRA objects. Furthermore, an ODRA-based
distributed database is capable of handling simultaneous access and modifications
of data among connected nodes. Each node of an ODRA database in a
distributed environment can act as a client, a server, or both, depending on
the requirements, as shown in Fig.17-1.
17-1. Distributed
database Whenever a remote query is being
executed, a client sends a request containing a textual form of a query to a
server. After receiving the query by the server, the query is locally executed
and its result is sent back to the client. The request may also contain
additional parameters for a query or for a remote procedure call.
17-2. Distributed
database After receiving the result of a query,
remote objects references are pushed to the client environment stack and are implicitly
converted into remote object identifiers. Thus, references to the original
objects in a distributed database can be retrieved for updates or lazy
loading. The concept of a remote object identifier allows for unique object
identification and tracking in a distributed environment. In this way,
updates and remote procedure calls on a remote database objects are permitted
by transparent forwarding operations to original objects in a distributed
environment. This approach has benefit of simplicity and flexibility. Thus,
objects in the distributed environment can be transparently queried and
updated through SBQL with no regard to data location. 17.2 Database LinksThe key concept in ODRA
distributed database systems is a database link. The main idea behind database
links it to provide simple and transparent access to data and services offered
by distributed databases. The database
link is represented as a connection between two ODRA-based databases. The
main advantage behind the database link is possibility to take the advantage
of SBQL approach, while hiding the underlying network and communication
infrastructure. Moreover, database users do not have to know physical
locations of distributed databases. Thus, the database link mechanism is
responsible for making requests to an appropriate server, while clients are using
logical names instead of IP addresses, port numbers, etc. The database link connection is
only one-way in a sense it allows only one side to sent requests, while the
other side must acts as a client. In order to create fully bidirectional
communication between distributed databases it is necessary to create
database links on both sides. The database link is using for
communication a TCP/IP protocol, which ensures reliability of data transfers.
However, a link manager module is responsible for managing connections with
remote nodes. Whenever an established connection between two databases is
lost, the link manager will attempt to establish a new connection
automatically. Thus, the database user
does not have to be responsible for establishing and closing connections
explicitly. This capability greatly reduces the complexity and enhances software
maintainability. Users connect thought the database
link to the remote database using the username and password referenced in a
databases link. The username and password are set by the system administrator
during link creation. As so, a local user can access remote database
resources through a database link without even having been explicitly granted
access to data on the remote database. By using the database link, it is
possible for users to access remote database module, as they would be bounded
by the privileges set in the database link. The database link is a database
complex object. However, its notable feature is that is contains embedded
copy of remote module metabase to which it is attached. The remote module
copy, to which the database link is attached, is required for type checking
of remote queries. Such an approach for fetching remote module metabase is
also used for distributed queries optimizations. The process of synchronizing
remote module metabase is fully transparent and automated. In this way database
users does not have to be even aware of remote metabase synchronization. 17.2.1 Creating and removing database link
The database link is registered and
managed by a Link Manager. A new
database link is created in the user current module. For this reasons each database
link must have a unique name within current module. The syntax for creating the
database link is as follows: add link <linkname>
<username>.<modulename>/<password>@<hostname>:<port> where: ·
linkname - is a name of a database link
that is added in the current module, ·
username - is a name of user account at
remote database, ·
modulename - is a path to the linked module
relative to the user main module, ·
password - is a password for a given user
account at remote database, ·
hostname – is a host name/IP address
of the ODRA remote host, ·
port - ODRA
database instance port number (default is 1521). In case of creating, the database
link with a name that already exists in the current module an exception would
be thrown. Below is an example of creating a
database link named empLink, which provides
access to remote module emps on host with 192.168.1.3 IP address: add link empLink admin.emps/password@192.168.1.3:1521 The syntax for removing the
database link is as follows: remove
link <linkname> where: ·
linkname - is a name of a database link to
be removed. Sample command for removing a
database link with name empLink: remove
link empLink 17.2.2 Executing remote queries
Executing remote queries with ODRA-based
distributed databases is straightforward and flexible, while the database
link is treated as an ordinary database object within SBQL syntax. The
proposed architecture for database links allows for taking full advantage of
SBQL semantics. This approach has benefit of simplicity and flexibility as querying
remote databases does not introduce any special semantics. In this way, the
database link is treated as complex object, which queried returns bag
containing remote objects .The remote query that is to be executed on remote
database must be followed with a database link name. The same rule applies
for accessing remote views and procedures. It worth noticing that the
database link is regular database object and it might be a part of more
complex queries involving data from distributed and local databases within a
single query. For example, to retrieve remote
objects of employees from remote database module admin.emps using a database
link names empLink, one can issue: empLink.Emp Here is another sample query that returns
salaries of a person who is Smith and is male: (linka.Emp where lName =
"Smith" and sex =
“M” ).salary The sample query presented bellow allows
updating remote object data by seting new salary for ‘Smith’: (linka.Emp where lName = "Smith").salary := 3250; A database link mechanism provides
capabilities for calling remote procedures. The example below makes a call to
remote procedure getMaxSalaries
stored on remote database. The remote procedure returns maximum salaries for
employees: empLink.getMaxSalaries(); 17.3 Virtual RepositoryA virtual repository constitutes a
backbone for data intensive grid applications incorporating data and services
distribution. The virtual repository integrates ODRA-based distributed
databases into a single virtual database. The virtual repository assumes
autonomy of individual services treated as independent and self-contained
units, with respect to their maintenance and business logic. Regarding this,
local sites are fully autonomous, which means that it is not necessary to
change them in order to make their content visible to global users of the virtual
repository. The core technologies that enable
creation of the virtual repository are updatable
views and database links. An
updatable view provides full transparency of location for data and services
by introducing virtual objects, which encapsulate the location and fragmentation
of data. In this way, all contributory resources are integrated into a global
virtual store by facilitating updatable view mechanism. Moreover, database
links are responsible for data and services provisioning among connected nodes.
Another advantage of the virtual
repository is its extensibility and scalability. Plugging a new node into the
virtual repository does not affect other clients as so it does not imply any
changes in business logic of currently available applications. Such capabilities
are achieved by the updatable views and database links, which encapsulate the
complexity of data distribution. 17.3.1 Integration of Distributed ResourcesThe virtual repository must have its
schema designed. The schema is named integration
schema, as it is responsible for integrating all distributed resources. The integration
schema defines resources and services, which are available for clients. It must be ensured that local
servers expose their resources to virtual repository in a form, which
conforms to an integration schema. This
requirement is provided by contributory
views, as their main task is to handle transformation of local servers schema
into an integration schema form. It is local administrator responsibility for
creating and maintaining contributory views. First, the local administrator
must design contributory schema, which must conform to the virtual repository
schema. The next step involves creation of integration views, which provides
data and services to global clients. The integration
view transforms and glues together resources exposed by all contributory
views. This view is also responsible for transformation and integration of data.
This approach allows for data to be fragmented horizontally, vertically or even
in mixed mode. In this way, clients are available to make queries and updates
through an integration view in a transparent manner. The integration view solves
many issues as clients are not obliged to have any knowledge about data
location, fragmentation, replications, caching, etc. on the other hand, the
integration view designer must have sufficient information regarding such issues
as data location, replications, caching, etc. The integration view is
provided with access to contributory resources through database links mechanism.
17.3.2 Integration
Example
The example presented below is based
on a virtual repository that keeps and manages information on employees and
departments in the form of Employee
(fname, lname, position) and Dept
(name) objects. Objects are horizontally
distributed among ODRA distributed database servers A and B. The sample
distributed databases horizontally fragmented data and virtual repository are
depicted in Fig. 17-3. The virtual repository provides resources, which are the
union of two ODRA-based distributed databases. In this example, virtual
repository and local server schemas are similar. Thus, it is not required to
create any additional contributory views at local servers. To integrate distributed resources
into a virtual repository, according to the horizontal fragmentation, one must
use the union operator. The
proposed solution is to create an updatable view, which would allow combining
objects from remote ODRA databases exposed by contributory views.
17-3. Integration
of horizontally fragmented data from servers A and B Before creating the integration
view, it is required to create necessary database links referencing to
distributed ODRA databases. However, database links would not be explicitly
exposed by the integration view to clients. The database links definitions
referencing distributed ODRA databases are listed below: add link linkContrib1 admin.contrib1/admin@192.168.1.2:1521 add link
linkContrib2
admin.contrib2/admin@192.168.1.3:1521 The integration view listed below
is responsible for merging object from distributed databases and transforming
them to the required schema. The updatable views are created with respect to views
syntax presented in chapter 10. The updatable view DeptVRDef integrates Department objects from ODRA-based distributed
databases. The integration view definition for DeptVRDef is as follows: add view DeptVRDef { virtual
DeptVR: record{ id:integer; name:string; }[0..*]; seed:
record{d:linkContrib1.Dept;}[0..*]{ return(linkContrib1.Dept union linkContrib2.Dept) as d; } on_retrieve{ return d.(deref(id) as id, deref(name) as name); } on_delete{
delete d; } view
nameIntegrDef{ virtual name: string; seed: record{_name: linkContrib1.Dept.name;}{ return d.name as _name; } on_retrieve { return _name; } on_update { _name := value; } } view
idIntegrDef { virtual id: integer; seed: record{_id: linkContrib1.Dept.id;}{ return d.id as _id; } on_retrieve { return _id; } on_update { _id := value; } } } The updatable view EmployeeVRDef integrates Employee objects from ODRA-based
distributed databases. The integration view definition EmployeeVRDef is as follows: add view EmployeeVRDef { virtual
EmployeeVR: record{ fName:
string; lName:
string; salary:
integer; worksIn:
DeptVR; }[0..*]; seed:
record {p: linkContrib1.Employee;}[0..*]{ return (linkContrib1.Employee union linkContrib2.Employee) as p; } on_retrieve{
return
p.( deref(fName)
as fName, deref(lName) as lName, deref(salary) as salary, (DeptVR where id = p.worksIn.id) as
worksIn ); } on_delete{
delete p; } view
fNameIntegrDef { virtual fName: string; seed: record{_fName: linkContrib1.Employee.fName;}{ return p.fName as _fName; } on_retrieve{ return _fName; } on_update{ _fName := value; } } view
lNameIntegrDef{ virtual lName: string; seed: record { _lName:
linkContrib1.Employee.lName;}{ return p.lName as _lName; } on_retrieve{ return _lName; } on_update{ _lName := value; } } view
salaryIntegrDef{ virtual salary: integer; seed: record{_salary: linkContrib1.Employee.salary;}{ return p.salary as _salary; } on_retrieve{ return _salary; } on_update{ _salary := value; } } view
worksIntegrDef{ virtual worksIn:DeptVR; seed: record{_worksIn: DeptVR;}{ return (DeptVR where id = p.worksIn.id) as
_worksIn; } on_retrieve{ return _worksIn; } view worksNameIntegrDef{ virtual
name: string; seed:
record{_name: DeptVR.name;}{ return
(_worksIn.name) as _name; } on_retrieve{
return _name; } } } } 17.3.3 Usage Examples
Below are sample queries
concerning presented earlier virtual repository schema and integration view. Sample query for counting
employees who are earning more than 3000: count(EmployeeVR
where salary > 3000) Retrieve employee names with department
names where they work: (EmployeeVR join worksIn).(fName, name ) Retrieve employees who earn more
than 4500 with their department names: ((EmployeeVR
where salary > 4500) join (worksIn.Dept)).
(fName, name) Return all employees with first
name ‘Tom’ EmployeeVR where fName = "Tom" |
|
|
Last modified: July 10, 2008 |