In Part 1 of this article, I discussed the Object-Relational (O-R) impedance mismatch, the problems it causes, and the pitfalls of some O-R/M tools. In this part, I will examine the Object Database Management System (ODBMS) and compare it with the Relational Database Management System (RDBMS).
1. What is ODBMS
Basically, an ODBMS is a DBMS that stores objects, as opposed to the rows or tuples stored by a SQL database or an RDBMS [Wikipedia, Object Database]. It was born out of the need for transparent and non-intrusive persistence of complex object models, a task which an RDBMS cannot easily address because of the O-R impedance mismatch. As a DBMS, besides being a data repository for storing object graphs (together with their identities, attributes, associations, and inheritance information), an ODBMS would, at the very least, include a query engine, a concurrency management system, and a data recovery mechanism. (The very first effort to define the features of an ODBMS was the ODBMS Manifesto, first published in 1989 by Malcolm Atkinson et al.)
Standardization
Before examining the types of ODBMS, it is worth learning about the ODMG (Object Data Management Group), a standardization committee established in 1991 with the goal of promoting the adoption of ODBMS through standardized ODBMS specifications. In 1999, the latest version of the ODMG specification (3.0) was released with four major components:
- Object Model: defines the common data model (which is a common denominator for OO database systems and programming languages) to be supported by all ODMG-compliant ODBMSs. With this common data model, object definitions within object databases can be portable among different applications, programming languages, and platforms
- Object Specification Languages: include the Object Definition Language (ODL) and the Object Interchange Format (OIF). ODL is used to define the database’s object schema and is equivalent to the Data Definition Language (DDL) in the relational world. OIF, on the other hand, is a means to dump and load an object database’s state to and from files (e.g. XML files), for example to support the exchange of objects between different object databases
- Object Query Language: is a query language based on SQL 92 and is equivalent to SQL in the relational world. OQL supports the querying of complex objects, polymorphism and late-binding calls, and is interoperable with specific language bindings
- Language Bindings: written for C++, Smalltalk, and Java and expose a persistence API so that these languages can interact with ODMG-compliant object databases
Despite this standardization effort, as of 2001 no ODBMS was fully compliant with all ODMG standards [Barry, 2001]. In that same year, the ODMG disbanded as its member companies decided to concentrate their efforts on the Java Data Objects specification, which resulted from the ODMG Java Language Binding submitted to the Java Community Process. In 2006, the Object Management Group (OMG) announced that it would develop a new specification based on the ODMG 3.0 specification, but it has yet to release one. As a result, while many standards (including SQL and the mathematics-based relational model) have been consistently adopted by virtually all RDBMS vendors, widely adopted ODBMS standards simply do not exist yet.
2. Types of ODBMS
Depending on how an ODBMS implementation chooses to persist objects, there are two types of ODBMS: non-native and native.
a. Non-native ODBMS
In a non-native object database, there are two separate object models: that of the application and that of the database itself. ODMG-compliant ODBMSs are examples of non-native ODBMSs since they require a separate data schema to be defined, regardless of the existence of the application object model. In order to query objects from or persist objects to a non-native object database, a mapping between these two distinct models must be performed. For ODMG-compliant databases, the schema is defined in ODL, and the application object model can either be generated from that schema or be written manually by developers and then modified by a source-code or bytecode/CIL enhancer (part of the persistence API, such as JDO, for that particular database implementation) to add persistence behavior (e.g. to make the class an Active Record) and information (e.g. mapping information).
While the separation of the application object model from the database object model gives a non-native ODBMS the advantage of having its databases portable across applications, programming languages, and platforms, it is also a source of problems because application developers have to maintain both models as the application evolves.
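To make that maintenance burden concrete, here is a minimal Java sketch of the two-model arrangement. Everything in it is hypothetical: `Person` stands for an application class, `DbObject` for a database-side object described by a separate schema, and `PersonMapper` for the mapping code. None of these names come from the ODMG specification or from any real product, where such mapping is normally generated or injected by an enhancer rather than written by hand.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical hand-written mapping between the two models. */
final class PersonMapper {
    static DbObject toDb(Person p) {
        DbObject o = new DbObject("Person");
        o.attributes.put("name", p.name);
        o.attributes.put("age", p.age);
        return o;
    }

    static Person fromDb(DbObject o) {
        return new Person((String) o.attributes.get("name"),
                          (Integer) o.attributes.get("age"));
    }

    public static void main(String[] args) {
        DbObject stored = toDb(new Person("Ada", 36));
        Person loaded = fromDb(stored);
        System.out.println(stored.schemaType + ": " + loaded.name + ", " + loaded.age);
    }
}

/** Application-side model: a plain domain class. */
final class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

/** Database-side model (an assumption for illustration): a schema type plus named attributes. */
final class DbObject {
    final String schemaType;
    final Map<String, Object> attributes = new HashMap<>();

    DbObject(String schemaType) {
        this.schemaType = schemaType;
    }
}
```

Adding a field to `Person` in this sketch means touching three places: the class itself, the database-side schema, and both mapper methods. That is the double bookkeeping the paragraph above describes.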
b. Native ODBMS
In native object databases, objects are stored exactly as they are, without the need to map them into a different object model supported by the databases and vice versa. In other words, in the world of native object databases, there is just one single object model: the application object model and thus, unlike non-native ODBMS, no ODL and common object model are necessary. (Note that while no new object model is required, it does not mean that a native ODBMS cannot have its proprietary data format to represent the application object models in the data store.)
Interestingly, one can easily implement a simple native ODBMS in Java, Ruby, or a .NET language using the built-in serialization mechanism, which serializes objects into a byte stream that can be stored in a file or sent over the network, and deserializes objects from that same byte stream. With the serialization infrastructure, no extra work is necessary for storing objects’ attributes, associations, and inheritance information. One only needs to add an object identification mechanism (e.g. assigning an OID field to each object, either by hand-coding or, in a more sophisticated way, by bytecode/CIL enhancement, which is unnecessary in Ruby thanks to its "open-class" feature), a query API (e.g. query-by-example), and a simple concurrency system (assuming the database is used by just one application at a time, the built-in thread locking mechanism is sufficient) in order to have an ODBMS implementation (see endnote 2).
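The serialization-based store described above can be sketched in Java roughly as follows. This is a toy under stated assumptions, not a reference implementation: `TinyObjectStore`, `Customer`, and `TinyStoreDemo` are hypothetical names, the OID is kept in a map beside the object rather than injected as a field, a predicate filter stands in for query-by-example, and `synchronized` methods play the role of the simple concurrency system.

```java
import java.io.*;
import java.util.*;
import java.util.function.Predicate;

/** Demo: store a small object graph, reopen the file, and query it. */
final class TinyStoreDemo {
    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("tiny-store", ".db");
        file.deleteOnExit();

        TinyObjectStore store = new TinyObjectStore(file);
        Customer ada = new Customer("Ada");
        ada.orders.add("keyboard");
        long oid = store.store(ada);

        // A second store instance reads the same file back from disk.
        TinyObjectStore reopened = new TinyObjectStore(file);
        Customer loaded = (Customer) reopened.get(oid);
        System.out.println(loaded.name + " " + loaded.orders);
        System.out.println(
            reopened.query(Customer.class, c -> c.orders.contains("keyboard")).size());
    }
}

/** Hypothetical minimal object store built on Java serialization. */
final class TinyObjectStore {
    private final File file;
    private Map<Long, Serializable> objects = new HashMap<>();
    private long nextOid = 1;

    TinyObjectStore(File file) throws IOException, ClassNotFoundException {
        this.file = file;
        if (file.length() > 0) {
            load();
        }
    }

    /** Assigns a fresh OID, keeps the object, and rewrites the whole file. */
    synchronized long store(Serializable obj) throws IOException {
        long oid = nextOid++;
        objects.put(oid, obj);
        flush();
        return oid;
    }

    synchronized Serializable get(long oid) {
        return objects.get(oid);
    }

    /** In-memory scan over all stored objects of the given type. */
    synchronized <T> List<T> query(Class<T> type, Predicate<? super T> filter) {
        List<T> result = new ArrayList<>();
        for (Serializable o : objects.values()) {
            if (type.isInstance(o) && filter.test(type.cast(o))) {
                result.add(type.cast(o));
            }
        }
        return result;
    }

    private void flush() throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeLong(nextOid);
            out.writeObject(objects);
        }
    }

    @SuppressWarnings("unchecked")
    private void load() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            nextOid = in.readLong();
            objects = (Map<Long, Serializable>) in.readObject();
        }
    }
}

/** Example persistent class; its list of orders is stored along with it. */
final class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final List<String> orders = new ArrayList<>();

    Customer(String name) {
        this.name = name;
    }
}
```

The sketch loads every object into memory when the store is opened and rewrites the whole file on each `store` call, so it shows the idea rather than a usable design.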
In contrast with its non-native counterpart, a native ODBMS reduces the effort of querying and persisting the application object model to a minimum, but its databases are not easily portable across applications, programming languages, and platforms. In fact, for two or more applications to use the same database, they must bundle the exact same persistent classes (same name, same package/namespace, and same attributes with the same types). It is even harder for applications written in different languages to share the same database file because of differences in naming conventions and base types (framework classes) (see endnote 3).
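One way to see why the classes must match is to look at what Java’s built-in serialization, mentioned earlier as a possible building block, actually writes. The sketch below is only an illustration of that one mechanism, not of how any particular ODBMS stores data; `Invoice` and `StreamInspector` are hypothetical names. It serializes an object and checks that the class name and a field name are embedded in the resulting bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that inspects a serialized byte stream. */
final class StreamInspector {
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    /** Decodes the bytes one-to-one as ISO-8859-1 characters and searches for the text. */
    static boolean mentions(byte[] stream, String text) {
        return new String(stream, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new Invoice("INV-1"));
        // The class name and the field name are both part of the stream.
        System.out.println(mentions(bytes, Invoice.class.getName()));
        System.out.println(mentions(bytes, "number"));
    }
}

/** Example persistent class. */
final class Invoice implements Serializable {
    private static final long serialVersionUID = 1L;
    final String number;

    Invoice(String number) {
        this.number = number;
    }
}
```

Because the stream names the class and its fields, an application that does not have a compatible `Invoice` class on its classpath cannot read the object back with standard deserialization.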
3. ODBMS Versus RDBMS
Having looked at the basic features and types of ODBMS, let’s examine its advantages and disadvantages in comparison to RDBMS:
a. Advantages
- Rich domain model: since an ODBMS can store objects at any level of granularity and has built-in support for identity, association, and inheritance, OO developers can model their domain classes as richly and expressively as they want without being constrained as they are in the relational world
- Maintainability: since the application object model and database object model are closely related to each other (in non-native object databases) or even are the exact same model (in native object databases), it takes less effort to maintain these models as the application evolves
- Development effort: the ability of developers to implement a rich domain model with the least maintenance effort results in a significant reduction in development time and cost
- Performance: in systems with highly complex object models, an ODBMS is expected to perform much better than its RDBMS counterpart, regardless of whether O-R/M tools are used or not, since no complex queries (e.g. joins) and no mapping are required
b. Disadvantages
- Portability: data in an RDBMS can be shared by applications written in any paradigm and on any platform, while an ODBMS is tied to the OO world. The situation is even worse for native object databases since the data cannot be used by multiple applications with different domain classes, even if they share the same data. As a matter of fact, the clear separation between the relational model and the object model further assists the portability of RDBMS since the two can evolve independently of each other
- Legacy applications: there are so many applications built with an RDBMS back-end that it is impractical to migrate all their data into an ODBMS. In addition, not only does the data have to be migrated, but the applications which consume the data also need to be modified to use an OO data access mechanism
- Maturity: the ODBMS industry is relatively new (it emerged in the 1990s) and thus is still far behind the RDBMS world (which emerged in the 1970s) in terms of available system vendors (including compatibility among database systems from different vendors) and tool support (such as reporting, OLAP, data transformation, and clustering services)
Given the above analysis, an ODBMS is not the silver bullet that some people hope for, and thus the decision whether or not to use one in a project must be considered very carefully. However, once we have done our homework and decided that an object database suits our application, we can sit down and enjoy a huge productivity gain which cannot be achieved if we stick with relational databases. In the final part of this article, we will look at the DB4O object database system. Stay tuned!
Endnotes
1 The very first effort to specify the features of ODBMS was the work of Malcolm Atkinson and others in 1989 with the ODBMS Manifesto
2 This simplistic implementation, besides its lack of functionality, has several major drawbacks. First, searching serialized objects requires all of them to be deserialized into memory before an in-memory search can occur, and deserialization is an extremely costly operation. Next, databases created by this implementation are not portable across platforms (e.g. Java to .NET) since the proprietary serialization mechanism of the host programming language is used. Finally, the serialization infrastructure will break down as soon as the object model evolves with new attributes, associations, and data types. An alternative could serialize objects into a custom XML format so that searching can be performed quickly using XPath, with schema evolution and portability handled via the XML binding layer; but then this is no longer simple, and it is usually better to use an existing ODBMS, like DB4O, instead
3 While these are hard, they are not impossible in a native ODBMS such as DB4O, which allows developers to change the default mapping rules in code (although this is awkward and should be avoided as much as possible), as we will see in Part 3 of the article
References
- Database Systems, Paul Beynon-Davies, Palgrave Macmillan, 2004
- Java Data Objects, David Jordan and Craig Russell, O’Reilly, 2003
- [Barry, 2001], ODMG Compliance, Barry et al, Barry & Associates, Inc., 2001
- ODMG 2.0: A Standard for Object Storage, Doug Barry, Component Strategies, 1998
- ODBMS Manifesto, Malcolm Atkinson et al, University of Glasgow, 1995
- ODMG’s website: http://www.odmg.org/
- DB4O’s website: http://www.db4o.com/
- JDO’s website: http://java.sun.com/products/jdo/
- http://en.wikipedia.org/wiki/Object-oriented_database_management_system
- http://en.wikipedia.org/wiki/Java_Data_Objects