Most of us are accustomed to using a relational database to store large volumes of data. We rarely look for alternatives unless we run into a bottleneck. Even then, we are likely to put far more effort into optimising the database than into stepping outside the relational model.
Non-relational databases have been around for many years. When object-oriented programming became popular, a number of object databases were created, but none captured any substantial mindshare. Object-relational mapping software such as Hibernate for Java, SQLAlchemy for Python and ActiveRecord for Ruby fulfilled the need to use relational databases within the object-oriented programming paradigm.
SQL is a wonderful tool for arbitrary queries on a relational database. However, you may overestimate the need for it. For example, when dealing with a content-management system, you are more likely to need a keyword-retrieval option, rather than a flexible SQL query.
I use keyword search with Gmail, and I have rarely felt the need to narrow the search to, say, the subject only. Even when I search on the subject line, I still need a keyword search. I can’t recall any search where an index on the subject, for example to match a prefix, would have been beneficial. Hence, a keyword-search tool like Apache Lucene along with any database, whether relational or not, can be a superb solution.
In the last few years, the need for Web-scale databases has increased the interest in NoSQL databases — a misleading term, which is now often interpreted as “not only SQL”. One category of such databases is object database management systems (ODBMS), and among them is a native object database for Python — ZODB.
Object databases provide ACID support. They reduce the friction of having to transform objects into relational table rows and vice versa — thus improving the efficiency of accessing and manipulating objects. There is no need to map all your information needs into a well-defined schema, which can be very difficult at times.
Imagine a shopping engine. Each category, or even each product group, may need its own unique combination of attributes. So do we create a table with a superset of all attributes, or do we store the attributes as key-value pairs? Or, better still, should we just dump them in a string description and interpret the string at runtime?
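With objects, each product can simply carry the attributes that make sense for it. The following is a minimal sketch of that idea using plain Python objects; the `Product` class, the `with_attribute` helper and the sample attributes are hypothetical, invented here for illustration, and nothing in it is specific to ZODB.

```python
class Product:
    """Hypothetical product model: each instance carries only its own attributes."""

    def __init__(self, name, **attributes):
        self.name = name
        # Attach whatever attributes this particular product needs.
        for key, value in attributes.items():
            setattr(self, key, value)


# Two products with entirely different attribute sets (made-up sample data).
catalogue = {
    'P1': Product('Laptop', ram_gb=16, screen_inches=14),
    'P2': Product('T-shirt', size='M', colour='blue'),
}


def with_attribute(products, attr):
    """Return the sorted names of the products that define the given attribute."""
    return sorted(p.name for p in products.values() if hasattr(p, attr))


sized = with_attribute(catalogue, 'size')
named = with_attribute(catalogue, 'name')
```

No superset schema is needed: a product that has no `size` simply lacks the attribute, and the code that cares about sizes checks for it.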
ZODB, in practice
ZODB is like a (Python) dictionary. It stores data as key-value pairs, where the value is a pickled (serialised) object. An object could be a container, which is like a dictionary for storing a very large number of elements. Let us look at a simple example that would be perfectly suitable for a relational database, and see how it may be implemented in ZODB.
We have a set of albums, and a set of tracks. You may wish to access a track and, from there, if need be, access the album of which it is a part. On the other hand, you may access an album, and then want to access the tracks that make up that album.
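Before turning to that example, the idea of a dictionary whose values are pickled objects can be sketched with the standard library alone. This is only an analogy: the `PickleStore` class is hypothetical, it keeps everything in memory, and it says nothing about how ZODB lays out its storage internally.

```python
import pickle


class PickleStore:
    """Toy key-value store that keeps each value as pickled bytes (analogy only)."""

    def __init__(self):
        self._records = {}

    def __setitem__(self, key, obj):
        # Serialise the object on the way in.
        self._records[key] = pickle.dumps(obj)

    def __getitem__(self, key):
        # Deserialise a fresh copy on the way out.
        return pickle.loads(self._records[key])


store = PickleStore()
store['U1'] = {'name': 'Ultimate Collection', 'tracks': ['Like a Rolling Stone']}

raw = store._records['U1']   # what is actually stored: a byte string
album = store['U1']          # what the caller sees: an ordinary Python object
```

The caller works with ordinary Python objects, while the store only ever holds byte strings.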
In the relational model, you would need a table each for albums and tracks, and a foreign key from a track to its album.
When you realise that a track can be in multiple albums, you’d have to replace the foreign key with a third table that maintains the many-to-many relationship between albums and tracks.
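For comparison, here is a rough sketch of that three-table layout using Python's built-in `sqlite3` module and an in-memory database. The table and column names (`albums`, `tracks`, `album_tracks` and so on) are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE albums (id TEXT PRIMARY KEY, name TEXT NOT NULL, year INTEGER);
    CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT NOT NULL, artist TEXT);
    -- The link table replaces a simple foreign key on tracks.
    CREATE TABLE album_tracks (
        album_id TEXT NOT NULL REFERENCES albums(id),
        track_id INTEGER NOT NULL REFERENCES tracks(id),
        PRIMARY KEY (album_id, track_id)
    );
""")

conn.execute("INSERT INTO albums (id, name) VALUES ('U1', 'Ultimate Collection')")
conn.executemany(
    "INSERT INTO tracks (id, title, artist) VALUES (?, ?, ?)",
    [(1, 'Blowing in the Wind', 'Bob Dylan'),
     (2, 'Like a Rolling Stone', 'Bob Dylan')])
conn.executemany(
    "INSERT INTO album_tracks (album_id, track_id) VALUES (?, ?)",
    [('U1', 1), ('U1', 2)])

# From an album to its tracks.
titles = [row[0] for row in conn.execute(
    "SELECT t.title FROM tracks t "
    "JOIN album_tracks link ON link.track_id = t.id "
    "WHERE link.album_id = ? ORDER BY t.id", ('U1',))]

# From a track back to its albums.
albums_for_track = [row[0] for row in conn.execute(
    "SELECT a.name FROM albums a "
    "JOIN album_tracks link ON link.album_id = a.id "
    "WHERE link.track_id = ?", (2,))]

conn.close()
```

Both directions of the relationship go through a join on the link table.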
Now, let us look at how to do this using ZODB. The initial step is to create/open the database, open a connection and access its root.
Let’s write this basic code in app_db.py, as you will need to use it in each script that uses the application database.
```python
from ZODB import FileStorage, DB


class app_db(object):
    def __init__(self, path='./Data.fs'):
        # Open (or create) the file-backed storage and wrap it in a database.
        self.storage = FileStorage.FileStorage(path)
        self.db = DB(self.storage)
        # Open a connection and fetch the root object of the database.
        self.connection = self.db.open()
        self.dbroot = self.connection.root()

    def close(self):
        self.connection.close()
        self.db.close()
        self.storage.close()
```
Let’s next write a script, create_containers.py, to create B-tree containers for albums and tracks.
```python
from app_db import app_db
import transaction
from BTrees.OOBTree import OOBTree

db = app_db()
dbroot = db.dbroot
# One B-tree container per kind of object, stored under the database root.
dbroot['Albums'] = OOBTree()
dbroot['Tracks'] = OOBTree()
transaction.commit()
db.close()
```
The next step is to define the models you need. Let’s write them in app_models.py. Each track can belong to multiple albums, and each album contains multiple tracks. The only noteworthy line is the assignment of 1 to the _p_changed attribute, which tells ZODB that a mutable structure like a list or a dictionary has changed.
```python
from persistent import Persistent


class Track(Persistent):
    def __init__(self, title, artist=None, albums=None):
        self.title = title
        self.artist = artist
        # Give each track its own list; a mutable default would be shared.
        self.albums = albums if albums is not None else []

    def add_album(self, album):
        self.albums.append(album)
        # An in-place list change is not detected, so flag it explicitly.
        self._p_changed = 1


class Album(Persistent):
    def __init__(self, name, year=None):
        self.name = name
        self.year = year
        self.tracks = []

    def add_track(self, track):
        self.tracks.append(track)
        self._p_changed = 1
```
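Why the explicit flag is needed can be seen in a toy model. The sketch below assumes that change detection hooks attribute assignment, which is how the ZODB documentation describes `Persistent`; the `ChangeTracked` base class and its `changed` flag are hypothetical stand-ins written for this illustration, not part of ZODB.

```python
class ChangeTracked:
    """Hypothetical stand-in for a persistent base class (not ZODB's Persistent)."""

    def __init__(self):
        object.__setattr__(self, 'changed', False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Any attribute assignment, other than to the flag itself, marks the object dirty.
        if name != 'changed':
            object.__setattr__(self, 'changed', True)


class Album(ChangeTracked):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.tracks = []
        self.changed = False  # pretend the object has just been saved


album = Album('Ultimate Collection')

# Mutating the list in place never goes through __setattr__ ...
album.tracks.append('Like a Rolling Stone')
flag_after_append = album.changed

# ... whereas rebinding the attribute does.
album.tracks = album.tracks + ['Blowing in the Wind']
flag_after_assign = album.changed
```

After the `append` the flag is still unset, because the album object itself never saw an assignment; only the rebinding on the later line sets it. Setting `_p_changed` by hand in `add_album` and `add_track` covers the first case.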
Let us create a simple script, store_data.py, to add some tracks and an album.
```python
from app_db import app_db
from app_models import Album, Track
import transaction

db = app_db()
albums = db.dbroot['Albums']
tracks = db.dbroot['Tracks']

tracks['Blowing in the Wind'] = Track('Blowing in the Wind', artist='Bob Dylan')
tracks['Like a Rolling Stone'] = Track('Like a Rolling Stone', artist='Bob Dylan')
# the key can be any unique id
albums['U1'] = Album('Ultimate Collection')

# add relationships
album = albums['U1']
track = tracks['Blowing in the Wind']
track.add_album(album)
album.add_track(track)
track = tracks['Like a Rolling Stone']
track.add_album(album)
album.add_track(track)

transaction.commit()
db.close()
```
Finally, print the data to see how to access it in ZODB. Iterate over each album and each track, and print the attributes of each object. The details flag prevents infinite recursion, since albums refer to tracks and tracks refer back to albums.
```python
from app_db import app_db
from app_models import Track, Album


def print_album(album, details=True):
    print('Name: %s in %s ' % (album.name, album.year))
    if details:
        for track in album.tracks:
            print_track(track, details=False)
        print('')


def print_track(track, details=True):
    print('Title: %s by %s' % (track.title, track.artist))
    if details:
        for album in track.albums:
            print_album(album, details=False)
        print('')


db = app_db()
# iterate over albums and tracks
print('List of Albums')
for album in db.dbroot['Albums'].values():
    print_album(album)
print('List of Tracks')
for track in db.dbroot['Tracks'].values():
    print_track(track)
db.close()
```
Working with ZODB is almost as easy as dealing with dictionaries. You can use the Python built-in function isinstance to determine the type of an object you are dealing with, and write very versatile and flexible code. ZODB has been around for over a decade, and has been used in various production environments, though the Zope community does not seem to have been successful in marketing it to developers for use outside the Zope (or Plone) environments.
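As a small illustration of type-based dispatch with `isinstance`, here is a sketch that handles a container holding both albums and tracks. The classes are plain-Python stand-ins for the models in app_models.py (they do not subclass Persistent), and the `describe` helper and the mixed container are hypothetical.

```python
class Track:
    def __init__(self, title, artist=None):
        self.title = title
        self.artist = artist


class Album:
    def __init__(self, name, year=None):
        self.name = name
        self.year = year


def describe(obj):
    """Return a one-line description chosen by the object's type."""
    if isinstance(obj, Album):
        return 'Album: %s' % obj.name
    if isinstance(obj, Track):
        return 'Track: %s by %s' % (obj.title, obj.artist)
    return 'Unknown: %r' % (obj,)


# A single container may hold objects of different types.
mixed = {
    'U1': Album('Ultimate Collection'),
    'T1': Track('Like a Rolling Stone', artist='Bob Dylan'),
}
lines = [describe(obj) for obj in mixed.values()]
```

The same loop works unchanged if a new type is added to the container; only `describe` needs another branch.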