Developed at the
ODRA – Object Database for Rapid Application development
Description and Programmer Manual
by Jacek Wiślicki and the ODRA team
The goal of the generic wrapper to relational databases is to enable integration of data stored in a relational database into a virtual repository. Wrapped relational tables can be transparently queried and updated with object-oriented features SBQL. They can also be mapped with updateable object-oriented views in order to comply with the global schema of a virtual repository. In this way relational data can be made indistinguishable from other ODRA objects. There is no limits in sophistication of the mapping between relational tables and ODRA objects, due to the fact that SBQL updatable views offer the full algorithmic power. This is a significant difference with many currently available Object-Relational Mappers (ORM-s). During the development of the wrapper we put much effort on effective query optimization. The optimization is performed both at the ODRA side and the wrapped relational database side. In particular, the wrapper is developed in such a way that major SQL optimization methods (fast joins, rewriting, indices) are fully utilized. This is a second significant difference with existing ORM-s, which offer proper performance for simple mappings between object-oriented queries and SQL, but totally compromise performance if the mapping is sophisticated.
The wrapper can realize two main goals:
· Storage of data presented in the virtual repository in relational databases (top-down approach).
The first goal strictly complies with the eGov-Bus project objectives, i.e. assembling, integrating and combining pre-existing information systems and business solutions (based mainly on relational databases) in the final environment. The other one is implied by the reasonable assumption that designers of future systems integrated with the virtual repository would still tend to use familiar relational databases, being somehow afraid of the new object-oriented database technology. Both usage scenarios can be successfully realized by the wrapper.
The wrapper is realized in a client-server architecture. The client is transparently embedded in the ODRA server, while the wrapper server is a standalone application. The schematic wrapper structure is presented in Fig.14-1:
An ODRA virtual repository uses object-oriented views to map virtual data delivered by the wrapper to the form assumed by the virtual repository canonical data model. The virtual repository itself does not see the wrapper, since the ODRA database is opaque – the wrapped relational schema is presented via a regular ODRA schema. The actual processing is executed between the wrapper server and the client. The client is responsible for issuing SQL queries (implied by SBQL queries from global virtual repository clients) to the server, receiving results and creating temporary ODRA results returned to the ODRA database. Such temporary results are then returned to the virtual repository and to the global client.
When the wrapper server is running and its host is available from the ODRA server machine, the wrapper can be used from within ODRA (the server configuration and startup procedures are described with illustrative examples below). The wrapper is realized as an ODRA database module, therefore the syntax is similar:
add module <modulename> as wrapper on <host>:<port>
where <modulename> is a name of a module to add, <host> is a wrapper server host (IP or name), and <port> is its listener port. A new wrapper module is created as a submodule of the current module.
A wrapper is instantiated when a wrapper module is created. A wrapper module is a regular database module, but it asserts that all the names within the module are “relational” (i.e. imported from a relational schema) except for automatically generated views referring to these names (this procedure is described in the following paragraphs). A wrapper instance contains a wrapper client capable of communication with the appropriate server. All the wrapper instances are stored in a global (static) session object and are available with a wrapper module name. Thus, once a wrapper is created, it is available to any session (including the ones initialized in the future) as its module is.
A wrapper module can be dropped with the same command as any other ODRA module.
All relational names appearing in the wrapped resource XML description are available with a View postfix (e.g. an employees table is visible as employeesView, its name column is nameView, etc.). The primary wrapper views are introduced automatically so that virtual pointers corresponding to primary-foreign key relations and integrity constraints are preserved. The wrapper objects can be queried as other ODRA objects. Queries combining regular ODRA objects and virtual objects delivered by the wrapper are allowed.
Before querying wrappers make sure that the current optimization sequence contains view rewriting and wrapper optimization. Otherwise queries are not sent to appropriate wrappers and empty results are returned.
Assume that some object-oriented schema is given (according to the virtual repository integration and contribution model) and its data is to be stored in a relational database. This is the case of the top-down design, i.e. the system designer is obliged to create a relational database schema capable for storing and retrieving data being a part of the virtual repository.
The main relational schema designing rule is that all the tables must have unique row identifiers – primary keys (the current prototype implementation supports only single-column keys, this limitation will be removed in the future wrapper development). The unique identifiers are used (usually in background) for data updates, as they allow for keeping data integrity.
The designer must also realize that the wrapper creates virtual pointers for each primary-foreign key pair. Therefore whenever a pointer appears in the assumed object-oriented schema, the foreign key must be created in the corresponding relational schema. Another temporary wrapper prototype limitation that is to be removed in the future is that only a single primary-foreign key constraint per table is supported.
The top-down designing procedure can be illustrated with the following simple example. Consider an object-oriented schema for people and their cars, Fig.14-2:
This simple schema corresponds to a two-table relational schema. Each table should have its primary key (some automatically incremented sequences are a good choice). The pointers (owns and isOwnedBy) should be realized as foreign keys. The resulting relational schema could look as the one shown in the figure below:
During the relational schema wrapping procedure the wrapper creates automatically views covering plain metaobjects. The procedure is described in details in the following. The resulting names differ from the ones in the relational database, thus the designer should cover the wrapped schema with additional views mapping the automatic wrapper views one-to-one to the desired names (assumed in the target object-oriented schema), including virtual pointers.
In case of more complex object-oriented schemata the designer may need to create a more sophisticated relational schema (introducing additional tables not resulting directly form the object-oriented schema) that can be adjusted to the desired object-oriented one with appropriate views – a single SBQL view can operate on an arbitrary join of wrapped tables so that the actual relational schema is not visible to the users.
Consider wrapping an existing (legacy) relational database, used e.g. by some company or public organization. In the eGovernment domain this is the main wrapper application. The approach we call bottom-up. In this approach some object-oriented model, preferably a set of views, must be designed to cover the existing relational schema.
We present the following very simple example. It consists of three tables: employees, departments and locations. The tables are related by primary-foreign key constraints: an employee works in some department that in turn is located in some town/location. Each table has a primary key column (named id); there are also non-unique (secondary) indices on employees’ surnames and salaries, departments’ names and locations’ names. The schema is presented in Fig. 14-4.
The primary step of creation of the schema description expressed as an XML document; technical issues are described in details below. This wrapper reads this description and it creates appropriate metadata in the ODRA metabase (one-to-one mapping applied, each table is represented as a single complex object). The corresponding object-oriented schema is presented in Fig.14-5.
14-5. Imported object-oriented schema
Names generated by the wrapper are prefixed with $, which prevents from using them in ad-hoc queries. Thus, the metaobjects are covered by automatically generated views. This is the final automatically generated stage for the wrapped relational schema. It can be already queried or covered by a set of views so it can contribute to the global schema of the virtual repository.
The views shown below are query-ready, however they do not support relational integrity constraints and they allow full access to wrapped data (including updating and deleting), which is not always a good choice.
14-6. Automatically generated views
The next stage is performed by the administrator or programmer who should design final end-user views.
Relational integrity constraints and table relations should be mapped as virtual pointers (notice that in these views subobjects corresponding to foreign key columns are expressed as virtual pointers). The resulting relational schema representation (available for querying and further processing) is shown in the Fig.14-7.
14-7. Final end-user relational schema
The end-user views provide virtual pointers instead of foreign-key columns, for simplicity of the example they do not modify the schema further. The assumed security constraints do not allow updating foreign-key columns (virtual pointers) and primary key columns. Sample code for the views is listed below:
The sample queries concerning directly the given schema (i.e. the views designed by the administrator/programmer) are presented below:
Retrieve names and surnames of employees earning more than 1000:
Retrieve employees with their departments (application of a join by a virtual pointer):
Calculate the sum of salaries of all
employees named Smith working in any department located in
Retrieve the surname and the department’s location name for the employee with the ABC12345 identifier:
The wrapped schema transformations for the global schema are performed by means of updateable object-oriented views. Below, there are shown a few sample views covering the wrapped sample schema. Notice that views’ definitions can completely rearrange the wrapped schema, also relational integrity constraints expressed as virtual pointers can be overridden (ignored) as other virtual pointers can be introduced in the upper-level views covering the presented wrapper schema.
The view retrieves full names, sexes and salaries of rich employees, i.e. employees earning more than 2000:
The next presented view presents employees’ full names and salaries with names of departments they work in:
The RichEmployee and EmployeeDepartment views can be queried directly or further referenced by other views, e.g. in the data integration process executed by the virtual repository. Some simple direct queries referring these views are presented below:
The next view example presents integration of two separate schemata – the “employees” schema is the same wrapped relational schema as the one used above, the “cars” schema is wrapped from another relational database whose model is shown below. The wrapping process description is skipped as it is performed analogically to the previous one. This example realizes a very simple case of integration of distributed data.
14-8. Logically related separate relational schema
The cars.owner_id column (marked with light gray) is logically related to the employees.id column in the other database, nevertheless both schemata are maintained locally in different locations and they are physically independent.
The EmployeeCar view combines both wrapped schemata and retrieves employees’ full names and salaries with their cars’ make names, model names, colours and manufacturing years:
Here are some simple queries targeting the EmployeeCar view:
Similarly, the wrapped data can be combined with native ODRA objects, including local declarations in views. All the queries, including views’ retrieved objects, are processed by the ODRA optimizers and their executed in the resources; the partial results are then composed and the final result is returned to the client.
The wrapping procedure requires some deterministic mapping between relational data types and primitive ODRA data types. The default type applied for an undefined relational data type (due to heterogeneity between various RDBMSs there might be some types not covered by the prototype definitions) is string. The string type is also assumed for relational data types currently not implemented in ODRA (including binary data types like BLOB).
The type mapping table is presented below:
Table 14-1. Type mapping between SQL and SBQL
The wrapper is realized in the client-server architecture. A client is embedded in an ODRA database; a server needs individual configuration and startup (usually it runs on a separate machine).
A wrapper server requires a JDBC
driver for a database to be connected. The currently supported databases are:
Axion, Cloudscape, DB2, DB2/AS400,
14.6.1 Resource Connection Configuration
A connection configuration file is connection.properties whose sample can be found the project root directory is the standard Apache Torque configuration file. Its content is listed below:
The sample file contains four data sources defined (named postgres_employees, firebird_employees, postgres_cars and sdsql) for different RDBMSs and schemata – the same configuration file can be used for different wrapped databases. However, a separate server must be started for each resource. A torque.database.default property defines a default database if none is specified as an input of an application (e.g., a wrapper server). The other properties mean:
The xxx word should be substituted with a unique data source name that is further used for pointing at the resource.
14.6.2 Relational Schema Description Generation
A schema description file in an XML document is similar to the one used by Apache Torque. Its DTD is available at http://jacenty.kis.p.lodz.pl/relational-schema.dtd. In most cases the file is automatically generated, nevertheless if such solution for some reasons is impossible or some changes must be introduced (e.g. only selected relational tables or views should be exposed to the wrapper), the file can be also created or edited manually. The schema description file is generated (typed) only once and it can be reused until resource schema changes. After, the wrapper server must be restarted too and a new description is to be loaded.
The automatic generation process requires the connection configuration file described above available. Once a configuration.properties is defined for a wrapped RDBMS, the schema generator can be launched by odra.wrapper.generator.SchemaGeneratorApp. The application can run without parameters (a configuration.properties file is searched in the application home directory) and the default database name is used. One can also specify an optional parameter for a configuration file path. If it is specified, also a database name can be provided as the second parameter.
The schema generator application standard output is as below:
Schema generation started...
Schema generation finished in 5875 ms...
As a result the schema description XML file is created in the application home (launch) directory. The file name is created according to a pattern <dbnam>e-schema.generated.xml, where <dbname> is a database name specified as an application startup parameter or a default one in the properties file.
14.6.3 Wrapper Server Running
The server (odra.wrapper.net.Server) is a multithreaded application (a separate parallel thread is invoked for each client request). It can be launched as a standalone application or as a system service.
A standalone launch should not be used in a production environment, its aim are only testing purposes. In order to start the server a system service, read the instructions in the next subsection.
If the server is launched without startup parameters, it searches for the connection.properties file in the application home directory and uses a default database name declared in this file. Other default values are a listener port (specified as 2000) and a verbose mode (specified as true). If one needs to override these values, use syntax as in the sample below:
odra.wrapper.net.Server -Ddbname -Vfalse -P5124 -C/path/to/config/
All the startup parameters are optional and their order is arbitrary:
The path denoted with a -C parameter must be a valid directory where all the configuration files are stored, including connection.properties and schema description XML document(s).
A server output at a successful startup is shown below:
Database model successfully build from schema in './postgres-schema.generated.xml'
SBQL wrapper listener started on port 2000...
SBQL wrapper listener is running under Java Service Wrapper
Big thanks to Tanuki Software http://wrapper.tanukisoftware.org
Running the server as a system service is realized with the Java Service Wrapper (JSW, http://wrapper.tanukisoftware.org). The JSW can be downloaded as binaries or a source code. It can be run on different platforms (e.g., MS Windows, Linux, Solaris, MacOS X) and the appropriate version must be installed in a system (binary download should be enough).
The following instructions refer to MS Windows environment (they are similar on other platforms). Detailed descriptions and examples of installation and configuration procedures are available at the JSW web site. Below, $JSW_HOME denotes a home directory of JSW.
The main JSW configuration is defined in $JSW_HOME/conf/wrapper.conf. The file example is listed below:
# TestWrapper Properties
# NOTE - Please use src/conf/wrapper.conf.in as a template for your
# own application rather than the values used for the
# TestWrapper sample.
# Java Application
# Java Main class. This class must implement the WrapperListener interface
# or guarantee that the WrapperManager class is initialized. Helper
# classes are provided to do this for you. See the Integration section
# of the documentation for details.
# Java Classpath (include wrapper.jar) Add class path elements as
# needed starting from 1
wrapper.java.classpath.2=C:/Documents and Settings/jacek/eclipse/EGB/dist/lib/jodra.jar
wrapper.java.classpath.3=C:/Documents and Settings/jacek/eclipse/EGB/lib/postgresql-8.1-405.jdbc3.jar
wrapper.java.classpath.4=C:/Documents and Settings/jacek/eclipse/EGB/lib/jaybird-full-2.1.1.jar
wrapper.java.classpath.5=C:/Documents and Settings/jacek/eclipse/EGB/lib/jdom.jar
wrapper.java.classpath.6=C:/Documents and Settings/jacek/eclipse/EGB/lib/zql.jar
wrapper.java.classpath.7=C:/Documents and Settings/jacek/eclipse/EGB/lib/commons-configuration-1.1.jar
wrapper.java.classpath.8=C:/Documents and Settings/jacek/eclipse/EGB/lib/commons-collections-3.1.jar
wrapper.java.classpath.9=C:/Documents and Settings/jacek/eclipse/EGB/lib/commons-lang-2.1.jar
wrapper.java.classpath.10=C:/Documents and Settings/jacek/eclipse/EGB/lib/commons-logging-1.0.4.jar
# Java Library Path (location of Wrapper.DLL or libwrapper.so)
# Java Additional Parameters
# Initial Java Heap Size (in MB)
# Maximum Java Heap Size (in MB)
# Application parameters. Add parameters as needed starting from 1
wrapper.app.parameter.2=-C"C:/Documents and Settings/jacek/eclipse/EGB/"
# Wrapper Logging Properties
# Format of output for the console. (See docs for formats)
# Log Level for console output. (See docs for log levels)
# Log file to use for wrapper output logging.
# Format of output for the log file. (See docs for formats)
# Log Level for log file output. (See docs for log levels)
# Maximum size that the log file will be allowed to grow to before
# the log is rolled. Size is specified in bytes. The default value
# of 0, disables log rolling. May abbreviate with the 'k' (kb) or
# 'm' (mb) suffix. For example: 10m = 10 megabytes.
# Maximum number of rolled log files which will be allowed before old
# files are deleted. The default value of 0 implies no limit.
# Log Level for sys/event log output. (See docs for log levels)
# Wrapper Windows Properties
# Title to use when running as a console
wrapper.console.title=ODRA wrapper server
# Wrapper Windows NT/2000/XP Service Properties
# WARNING - Do not modify any of these properties when an application
# using this configuration file has been installed as a service.
# Please uninstall the service before modifying this section. The
# service can then be reinstalled.
# Name of the service
# Display name of the service
wrapper.ntservice.displayname=ODRA wrapper server
# Description of the service
wrapper.ntservice.description=ODRA relational database wrapper server
# Service dependencies. Add dependencies as needed starting from 1
# Mode in which the service is installed. AUTO_START or DEMAND_START
# Allow the service to interact with the desktop.
The most important properties in wrapper.conf are:
· wrapper.java.command – which JVM use (depending on a system configuration one might need to specify a full path to the java program),
· wrapper.java.mainclass – an JSW integration method (with the value specified in the above listing it does not require a JSW implementation, do not modify this one),
· wrapper.java.classpath.N – Java classpath elements (do not modify the first classpath element, as it denotes a JSW JAR location, the other elements refer to libraries used by the ODRA wrapper server, including JDBC drivers),
· wrapper.java.additional.N – JVM startup parameters (in the example only -ea used for enabling assertions),
· wrapper.java.maxmemory – JVM heap size, probably it would require more than the default 64 MB for real-life databases,
· wrapper.app.parameter.1 – ODRA wrapper server main class (do not modify this one),
· wrapper.app.parameter.2 – a path to ODRA wrapper server configuration files directory (i.e. connection.properties and <dbname>-schema.generated.xml) passed as a server startup parameter,
· wrapper.app.parameter.2.stripquotes – important when a parameter name contains extra quotes,
· wrapper.app.parameter.3 – database name passed as a server startup parameter,
· wrapper.app.parameter.4 – server listener port passed as a server startup parameter,
· wrapper.app.parameter.5 – server verbose mode passed as a server startup parameter,
· wrapper.logfile.maxsize – a maximum size of a single log file before it is split,
· wrapper.logfile.maxfiles – a maximum number of log files until the old ones are deleted.
Notice that wrapper.app.parameter.[2...5] conform server startup parameters syntax described above. They are optional and their order is arbitrary. Other configuration properties' descriptions are available at the JSW web site.
In order to test a configuration one can run $JSW_HOME/bin/test.bat. The JSW is launched as a standalone application and runs the ODRA wrapper server (any misconfiguration can be easily detected). If a test succeeds, a JSW is ready to install as a system service. A service is installed with install.bat and deinstalled with uninstall.bat. A sample preconfigured JSW installation for MS Windows can be downloaded from http://jacenty.kis.p.lodz.pl/jsw.win.zip – only some paths need to be adjusted.
Last modified: July 9, 2008