Archive

Posts Tagged ‘db4o’

Object-oriented database programming with db4o – Part 2

March 26th, 2008 Comments off

I finally managed to find some time to write the follow-up post about other interesting features of db4o, specifically its client-server mode and its transaction and concurrency support. You can read the article here: http://www.codeproject.com/KB/cs/oop_db4o_part_2.aspx.

Writing it also gave me a chance to learn about some cool new features of db4o 7.2 (currently the development version), such as LINQ integration, transparent activation, and transparent persistence. These are really big changes from the previous version I tried (6.3). I hope I can find some time to write about all these features. But don’t wait for me; just go ahead and try them yourself…

I am a dVP

September 19th, 2007 5 comments

I have just been recognized as a db4o Most Valued Professional (dVP) for the year 2008 and won a trip to Berlin next year to attend the ICOODB 2008 conference. It has always been a pleasure working with a great product like db4o, and I am delighted to receive this award.

BTW, to those who are expecting part 2 of my db4o article: I have been a bit overwhelmed with other stuff lately and have not had time to start working on it, but I will surely do so as soon as I can.

Categories: Technologies

Discussions around db4o

April 4th, 2007 1 comment

There are several interesting discussions at The Code Project regarding my article on db4o. I think it may be useful to compile them into a blog entry so that those who missed the comments section at The Code Project can read them.

Categories: .NET, Technologies

My article on The Code Project

March 9th, 2007 8 comments

I have edited my blog entry about db4o and submitted it to The Code Project. This is the first time I have published an article there. Let’s see how well (or badly) it is received :-).

Categories: .NET, Java

The Legend of Data Persistence – Part 1

February 11th, 2007 5 comments

1. Abstract

Have you ever felt frustrated at having to develop applications whose back end relies on a Relational Database Management System (RDBMS) such as MS SQL Server or Oracle?  Do you think it is a pain to write SQL (or stored procedures) to query some data and then manually map the result set to your object model and back?  Sure, you have Hibernate, EJB, iBATIS, and Active Record, but do they really make object-relational mapping (O-R/M) simple and completely transparent without compromising the richness and expressiveness of the object model?  If O-R/M is such a big problem, why do we not use an Object Database Management System (ODBMS) instead?  And if an ODBMS is a viable option for the applications we are developing, which ODBMS implementation can we start with?

In this three-part article, I will attempt to answer all of the above questions.  Please note that most of the concepts and tools described here would take far more than one or two pages to present fully (in fact, 500-page+ books have been written about several of them), so I will not discuss any particular concept or tool in depth.  Instead, the aim is to provide a high-level overview of the key points; interested readers are encouraged to learn the specifics through their own research (the References section can serve as a start).

Okay, with that in mind, the contents of the article are organized as follows:

  • In Part 1, I will discuss the object-relational (O-R) impedance mismatch, its consequences, and ORM tools as a rescue
  • In Part 2, I will introduce readers to ODBMS, its benefits, and the reasons why it still cannot replace RDBMS
  • In Part 3, I will introduce readers to db4o, one of today’s most popular ODBMS implementations

2. The O-R Impedance Mismatch

As OO languages such as Java and C# have become mainstream programming languages, the O-R impedance mismatch has become one of the biggest problems that application developers face today.  Of all the troubles caused by the O-R mismatch, the following are the most notorious:

  • Representation: in the OO world, classes are inherently represented in a nested, hierarchical structure (e.g. a Customer object consists of many Order objects, which in turn consist of many OrderLine objects, and so on).  In the RDBMS world, things can only be represented flatly in tables (relations), which consist of multiple rows (records, or tuples) and columns (attributes).  In other words, while classes can be represented at any level of granularity, a relational schema is limited to only four primitives: the table, the record, the column, and the cell (the intersection of a row and a column).  As a result, the richness of the object model is often compromised (inheritance trees are flattened out, associations are simplified or even removed) for the sake of mapping it easily to the relational model.  This representational difference between the object and relational worlds is at the core of all the other problems.
  • Object Identity: two objects, despite having exactly the same attributes (and even referencing the same nested objects), can be separate entities in the OO world because objects are identified by their location in memory.  An RDBMS, on the other hand, has no way to distinguish between two records with exactly the same data.  Imagine that two identical records in the DB are loaded into a result set and mapped to two distinct objects; when these objects are updated and persisted back to the database, the database cannot tell which record each update should go to.  To resolve this, the concept of a primary key, while not necessary in the object world, is introduced in the relational world to distinguish records within a table.
  • Association: while associations can easily be traversed in the OO world using the built-in object referencing mechanism of the host programming language, they are not as straightforward in the flat RDBMS world, in which tables can only be linked together through foreign keys.  To retrieve the record in one table that is associated with a record in another table, one must use SQL “join” statements instead of “object.attribute”.  (To retrieve a representation of deeply nested objects, multiple levels of joins are required.)  Finally, while a many-to-many relationship (e.g. Singer and Song) can easily be represented in the OO world, a link table is needed to represent it in the RDBMS world.
  • Inheritance: although inheritance can easily be modeled in the OO world (e.g. using the extends keyword in Java), it is much harder to represent in the RDBMS world, which does not have the concept of “table inheritance”.  Thus, work-arounds are required to represent inheritance, ranging from complete normalization (aka table-per-concrete-class, which has separate, independent tables for all sub-classes of an inheritance hierarchy), through complete denormalization (aka table-per-class-family, which has one big table containing all the possible attributes of all types in an inheritance tree, as well as a “discriminator” column to distinguish among the types), to hybrid solutions (such as table-per-class, which represents each class in an inheritance tree by a table, with the child tables linking to the parent tables via foreign keys).
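
To make the identity and association points above a bit more concrete, here is a minimal Java sketch. The Customer, Order, and OrderLine classes follow the example from the list; the helpers sameRow and countLines are hypothetical names I made up for illustration, and a plain list of strings stands in for a database row. It is only an illustration of the mismatch, not how any particular ORM tool works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IdentityAndAssociationSketch {

    /** Hypothetical domain classes: identity comes from the reference, not from field values. */
    static final class OrderLine {
        final String product;
        OrderLine(String product) { this.product = product; }
    }

    static final class Order {
        final List<OrderLine> lines = new ArrayList<>();
    }

    static final class Customer {
        final String name;
        final List<Order> orders = new ArrayList<>();
        Customer(String name) { this.name = name; }
    }

    /** A row (modeled here as a list of column values) is nothing but its data,
     *  so two rows with equal values cannot be told apart without a key column. */
    static boolean sameRow(List<String> rowA, List<String> rowB) {
        return rowA.equals(rowB);
    }

    /** Object navigation: follow references, where SQL would need joins across
     *  customer, order, and order-line tables. */
    static int countLines(Customer customer) {
        int total = 0;
        for (Order order : customer.orders) {
            total += order.lines.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Customer a = new Customer("Ann");
        Customer b = new Customer("Ann");
        System.out.println(a == b); // false: same data, two distinct objects

        System.out.println(sameRow(Arrays.asList("Ann"), Arrays.asList("Ann"))); // true

        Order order = new Order();
        order.lines.add(new OrderLine("book"));
        order.lines.add(new OrderLine("pen"));
        a.orders.add(order);
        System.out.println(countLines(a)); // 2
    }
}
```

The two Customer objects stay distinct even though their data is identical, while the two "rows" are indistinguishable, which is exactly why the relational side needs a primary key.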

3. The O-R Impedance Mismatch’s Consequences and ORM Tools

The most obvious consequence of the O-R impedance mismatch is that developers are tempted to create a simplistic object model so that the mapping from relational data sets to objects (and vice versa) can be done in a straightforward and less error-prone manner.  In fact, it is not hard to find projects in which domain classes and their attributes are simply one-to-one mappings of the database tables and columns respectively.  While that does make the data mapping task less painful, it is a huge sacrifice of the richness and expressiveness of the domain model, which in turn hurts the maintainability and extensibility of the system.  (Why a simplistic object model negatively affects the ability of a system [esp. a complex system] to evolve will be one of the main topics of my future post[s] about Domain-Driven Design.)
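
As a rough illustration of the difference, the Java sketch below puts a table-shaped class next to a slightly richer one. Everything in it is hypothetical: the ORDER_LINE table, the column names, and the business rule are assumptions I picked for the example, not taken from any real project.

```java
import java.util.ArrayList;
import java.util.List;

class DomainModelSketch {

    /** Table-shaped class: one field per column of a hypothetical ORDER_LINE table.
     *  Trivial to map, but it carries no behavior and enforces no rules. */
    static final class OrderLineRow {
        long orderId;
        String product;
        int quantity;
        long unitPriceCents;
    }

    /** Richer model: the order owns its lines and guards its own invariants. */
    static final class Order {
        private final List<Line> lines = new ArrayList<>();

        void addLine(String product, int quantity, long unitPriceCents) {
            if (quantity <= 0) {
                throw new IllegalArgumentException("quantity must be positive");
            }
            lines.add(new Line(product, quantity, unitPriceCents));
        }

        long totalCents() {
            long total = 0;
            for (Line line : lines) {
                total += line.quantity * line.unitPriceCents;
            }
            return total;
        }

        /** Nested value type: harder to map to flat tables than OrderLineRow. */
        private static final class Line {
            final String product;
            final int quantity;
            final long unitPriceCents;

            Line(String product, int quantity, long unitPriceCents) {
                this.product = product;
                this.quantity = quantity;
                this.unitPriceCents = unitPriceCents;
            }
        }
    }

    public static void main(String[] args) {
        Order order = new Order();
        order.addLine("book", 2, 1500);
        order.addLine("pen", 1, 200);
        System.out.println(order.totalCents()); // 3200
    }
}
```

With OrderLineRow, the quantity check and the total calculation would have to live somewhere outside the model; that is the kind of richness that gets traded away for an easier mapping.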

As object-oriented developers cry out for rich domain models, numerous ORM tools have been born to address that need.  Ideally, an ORM tool is expected to [1]:

  1. Make the mapping between the relational database and the object model as simple and transparent as possible
  2. Minimize the constraints imposed on the object model and the relational database schema and allow them to evolve as independently as possible

Unfortunately, these two goals often contradict each other: the simpler and more transparent the mapping is, the more constraints are imposed on the object model and the schema, and vice versa.  For example, Hibernate takes the Data Mapper approach [Fowler, 2002] and relies on mapping rules defined by developers to dynamically generate the SQL statements required for the mapping.  While this means simple usage and an almost transparent mapping, it does impose many constraints on the object model (e.g. it requires certain collection interfaces to be used for object associations so that dynamic proxies can be injected at runtime) and on the database schema (e.g. to represent inheritance).  Like Hibernate, an implementation of the JDO specification [2] for RDBMS would impose similar constraints on the object model and relational schema.  On the other hand, iBATIS [3] takes a hybrid SQL-map approach, which offers a configurable layer of indirection (expressed in SQL and XML) to “map the parameters and results (i.e., the inputs and outputs) of a SQL statement to a class” [Begin et al, 2007].  While iBATIS is very flexible in terms of the constraints placed on the model and schema (because developers still take ownership of writing the SQL), it requires more work from application developers than O-R/M solutions like Hibernate.  Next, Active Record, which builds on the power of the Ruby programming language and implements the Active Record pattern [Fowler, 2002], requires application developers to write the least data persistence code (in comparison with other full-scale O-R/M solutions such as Hibernate), but it imposes a lot of constraints on the domain model and database schema, especially through the many conventions that serve as an implicit contract between application developers and the framework (so that no XML configuration file or annotation is necessary).  And finally, it is worth mentioning EJB 2.x, once considered a silver bullet, which is not only hard to use but also significantly pollutes the domain model with all kinds of interfaces and conventions.  As a result, to this day there is still no O-R/M tool that can completely reconcile those two contradictory goals and really make developers’ lives as easy as they should be…
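
For readers who have not met the two patterns named above, the following Java sketch contrasts their basic shape. It is a toy under loud assumptions: an in-memory map stands in for a PERSON table, and the classes Person, PersonMapper, and PersonRecord are hypothetical. None of this is the Hibernate or Ruby on Rails API; it only shows where the persistence code lives in each pattern.

```java
import java.util.HashMap;
import java.util.Map;

class PersistencePatternsSketch {

    /** Stand-in for a PERSON table: primary key -> name column (assumption for the demo). */
    static final Map<Long, String> PERSON_TABLE = new HashMap<>();

    // --- Data Mapper: the domain class knows nothing about storage ---

    static final class Person {
        final long id;
        String name;
        Person(long id, String name) { this.id = id; this.name = name; }
    }

    /** A separate mapper moves data between Person objects and the "table". */
    static final class PersonMapper {
        void insert(Person person) {
            PERSON_TABLE.put(person.id, person.name);
        }

        Person findById(long id) {
            String name = PERSON_TABLE.get(id);
            return name == null ? null : new Person(id, name);
        }
    }

    // --- Active Record: the domain class carries its own persistence methods ---

    static final class PersonRecord {
        final long id;
        String name;
        PersonRecord(long id, String name) { this.id = id; this.name = name; }

        void save() {
            PERSON_TABLE.put(id, name);
        }

        static PersonRecord find(long id) {
            String name = PERSON_TABLE.get(id);
            return name == null ? null : new PersonRecord(id, name);
        }
    }

    public static void main(String[] args) {
        new PersonMapper().insert(new Person(1L, "Ann"));
        new PersonRecord(2L, "Bob").save();

        System.out.println(new PersonMapper().findById(1L).name); // Ann
        System.out.println(PersonRecord.find(2L).name);           // Bob
    }
}
```

The Data Mapper variant keeps Person free of storage concerns at the cost of an extra class, while the Active Record variant needs less code but ties the domain class to the shape of its table.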

That brings us back to the question: if O-R/M is such a big problem, why do we not use an ODBMS instead?  That will be the topic of the second part of this article, in which I will introduce readers to the concept of ODBMS, its benefits, and the reasons why RDBMS, despite all of the problems it causes for the object world, is here to stay.