Berkeley DB – intro to concepts

So let’s dive a little deeper into Berkeley DB.

Berkeley DB is a general-purpose embedded database engine. It is extremely fast. It is compiled and linked into your application and runs in the same process space as your application. A database can store up to 256 terabytes of data, and individual keys and data items can each be up to 4 gigabytes.

There are four different access methods: BTree, Hash, Queue, and Recno. BTree and Hash are both for fast indexing and retrieval. BTree is good for data that has locality of reference, where the elements relate to each other in some way and related keys sort near each other. Hash is good for extremely large data sets. Queue is used for fast inserts at the tail of the queue and is good for high degrees of concurrency. Recno provides support for databases whose permanent storage is a flat text file.
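To make the BTree versus Hash distinction a bit more concrete, here is a small in-memory analogy. It is not Berkeley DB code: java.util.TreeMap stands in for a sorted, BTree-style index and java.util.HashMap for a hashed one, and the key names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory analogy only: TreeMap stands in for a sorted (BTree-style) index,
// HashMap for a hashed index. None of this is Berkeley DB API.
class AccessMethodAnalogy {

    // Returns every entry whose key starts with the given prefix, in key order.
    // A sorted structure can answer this with one contiguous range scan.
    static SortedMap<String, String> rangeByPrefix(TreeMap<String, String> index, String prefix) {
        // '\uffff' is used as an upper bound that sorts after ordinary characters.
        return index.subMap(prefix, true, prefix + '\uffff', false);
    }

    public static void main(String[] args) {
        TreeMap<String, String> sorted = new TreeMap<>();
        Map<String, String> hashed = new HashMap<>();
        String[][] rows = {
            {"vendor:0002", "Second vendor"},
            {"item:0001", "First item"},
            {"vendor:0001", "First vendor"},
        };
        for (String[] row : rows) {
            sorted.put(row[0], row[1]);
            hashed.put(row[0], row[1]);
        }
        // Related keys sit next to each other in the sorted index.
        System.out.println(rangeByPrefix(sorted, "vendor:"));
        // A hashed index is built for exact-key lookups and keeps no key order.
        System.out.println(hashed.get("item:0001"));
    }
}
```

The point of the sketch is only the access pattern: if you tend to read neighboring keys together, a sorted structure keeps them together, while a hash spreads them out and is best used for exact-key lookups.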

Inserting data works differently from the other databases offered by Oracle. If you look at the ExampleDatabaseLoad.java that comes with the binary distribution, you can see that entries are stored in the database with a put call.

   myDbs.getVendorDB().put(null, theKey, theData);

In this call, we reference an already opened set of databases, get the database handle known as VendorDB, and call the put function to insert data. The data is inserted as a key/data pair: the key identifies the record, and the data holds the record's contents, which can pack several fields. The function getVendorDB returns a handle to a database that we create using the BTree construct in a file. This is done with the

   new Database("file", null, DbConfig)

constructor. The file parameter names the directory and file in which the data is stored.
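Going back to the key and data handed to put: as far as I know, the Java API wraps each of them in a DatabaseEntry, which is essentially a byte array, so packing your fields into bytes is your job. Here is a minimal sketch of one way to do that with nothing but the standard library. The class name and the street/city/phone layout are hypothetical and are not taken from ExampleDatabaseLoad.java.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical record layout for illustration only.
class VendorRecordCodec {

    // The key is a single value: the vendor name as UTF-8 bytes.
    static byte[] encodeKey(String vendorName) {
        return vendorName.getBytes(StandardCharsets.UTF_8);
    }

    // The data packs several fields into one byte array, in a fixed order.
    static byte[] encodeData(String street, String city, String phone) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(street);
            out.writeUTF(city);
            out.writeUTF(phone);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the fields back in the same order they were written.
    static String[] decodeData(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new String[] { in.readUTF(), in.readUTF(), in.readUTF() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = encodeKey("Example Vendor");
        byte[] data = encodeData("1 Main St", "Houston", "555-0100");
        System.out.println(key.length + " key bytes, " + data.length + " data bytes");
        System.out.println(String.join(" | ", decodeData(data)));
    }
}
```

The two byte arrays are what you would wrap and pass as theKey and theData; a get hands the data bytes back, and decodeData turns them into fields again.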

Note that with Berkeley DB, you need to manage where everything is, how things are created, and how elements are added to and searched for in the repository. It does not differ much from keeping records in a file, but it gives you a good way of indexing and searching files that could potentially contain large amounts of data. Data is read with a get function, which retrieves a record as if it were an element of a structure. The key is used to point to the right element so that the right data is accessed.

Records can also be deleted using the delete function. It is important to remember that records are not truly deleted until the cache has been written to disk. This can be forced manually with a sync or a close function call.
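If the cache behavior sounds abstract, a toy model may help. The class below is my own illustration of a write-back cache, not how Berkeley DB is actually implemented: a delete is visible to the application right away, but the "file" (just another map here) only changes when sync is called.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a write-back cache. Berkeley DB's real cache is far more involved.
class WriteBackStore {
    private final Map<String, String> disk = new HashMap<>();   // stands in for the file
    private final Map<String, String> dirty = new HashMap<>();  // writes not yet flushed
    private final Set<String> deleted = new HashSet<>();        // deletes not yet flushed

    void put(String key, String value) {
        deleted.remove(key);
        dirty.put(key, value);
    }

    void delete(String key) {
        dirty.remove(key);
        deleted.add(key);
    }

    // What the application sees: pending changes override the file contents.
    String get(String key) {
        if (deleted.contains(key)) {
            return null;
        }
        String cached = dirty.get(key);
        return cached != null ? cached : disk.get(key);
    }

    // What is physically in the "file" at this moment.
    boolean isOnDisk(String key) {
        return disk.containsKey(key);
    }

    // Flushes pending writes and deletes, as a sync or close would.
    void sync() {
        disk.putAll(dirty);
        disk.keySet().removeAll(deleted);
        dirty.clear();
        deleted.clear();
    }

    public static void main(String[] args) {
        WriteBackStore store = new WriteBackStore();
        store.put("vendor:0001", "First vendor");
        store.sync();
        store.delete("vendor:0001");
        System.out.println("visible after delete: " + store.get("vendor:0001"));
        System.out.println("on disk before sync:  " + store.isOnDisk("vendor:0001"));
        store.sync();
        System.out.println("on disk after sync:   " + store.isOnDisk("vendor:0001"));
    }
}
```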

Cursors can be used to iterate over records in a database. If a database allows duplicate records for one key, then the cursor is the easiest way to access something other than the first record. Records are read using the cursor.getNext() or cursor.getPrev() function. Data can be written through a cursor with the putNoDupData, putNoOverwrite, putKeyFirst, or putKeyLast functions. Updates are done with cursor.putCurrent and deletes are done with cursor.delete.
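Here is a rough sketch of the cursor idea, again as a plain in-memory model rather than the real Cursor class. ToyCursor and its two methods are made up for this post; they only mimic the way a cursor walks records in key order and steps through every duplicate stored under a key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy stand-in for a cursor over a database that allows duplicate records per key.
class ToyCursor {
    private final List<String[]> records = new ArrayList<>();
    private int position = -1; // starts before the first record

    // Flattens the key -> duplicates map into key-ordered {key, value} pairs.
    ToyCursor(TreeMap<String, List<String>> db) {
        for (Map.Entry<String, List<String>> entry : db.entrySet()) {
            for (String value : entry.getValue()) {
                records.add(new String[] { entry.getKey(), value });
            }
        }
    }

    // Moves forward one record; returns null when there is nothing left.
    String[] getNext() {
        if (position + 1 >= records.size()) {
            return null;
        }
        return records.get(++position);
    }

    // Moves back one record; returns null when there is no earlier record.
    String[] getPrev() {
        if (position <= 0) {
            return null;
        }
        return records.get(--position);
    }

    public static void main(String[] args) {
        TreeMap<String, List<String>> db = new TreeMap<>();
        db.put("apple", Arrays.asList("red", "green")); // two records under one key
        db.put("pear", Arrays.asList("yellow"));
        ToyCursor cursor = new ToyCursor(db);
        for (String[] rec = cursor.getNext(); rec != null; rec = cursor.getNext()) {
            System.out.println(rec[0] + " -> " + rec[1]);
        }
    }
}
```

A plain keyed lookup on "apple" would only hand back one record; walking with the cursor is what reaches the second one.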

It is important to remember that you can open multiple databases at the same time and run them separately in different threads and even join data between multiple databases. I will not go into this detail here. The intention of this blog entry is to introduce the concept of Berkeley DB and how to insert, delete, update, and search for database elements.

the value of experience

Yesterday I did something that some would say was an incredible waste of time. Others would say that it was one of those MasterCard moments, you know, “priceless”. It turns out that our local Major League Soccer team won its playoff games and qualified for the MLS Cup championship. The game was scheduled for Pizza Hut Park in Dallas, which is 250 miles north of here.


Let me set the stage here. My wife is a big soccer fan. She works for a men’s soccer league in town. She plays on two or three teams. We played together for a while on a co-ed team before I was advised not to play any more if I wanted to walk when I got older. My oldest son played for a while with us as well. My daughter played for a while before she figured out that she did not like running that much. Swimming and volleyball are more her sports. My youngest son plays and his mom is the coach. Her family is also big into soccer, so much so that they go out of town almost every weekend for a tournament or a game. I have been volunteering for the Houston Dynamo all season as a stats keeper for home games. I have sat in the stands for three games and watched as my wife kept stats, so both of us know the Dynamo staff and a few of the players.


Did I say we have a vested interest in soccer? Well, the championship game was yesterday in Dallas. We paid $45 per ticket to go to the game. We got up at 8am to make a 2:30pm game. Yes, 4 1/2 hours of driving to get to the field with some time to tailgate before the game. The game was tied at the end, 0-0. After a 15-minute overtime, still 0-0. After a second 15-minute overtime, 1-1. After penalty kicks, 4-3.


When I looked at my daughter, who was wearing her orange team shirt with the team scarf wrapped around her and her team hat signed by all of the starters, she was smiling. She knew the importance of the day. She knew that she would remember this season for a long time. After the game she wanted a copy of the special release of the Houston Chronicle pronouncing “Dynamo Wins” so that she could put it up in her school locker next to the picture of the top two stars with their arms around her.


10 hours driving in the car: $175


5 minutes while we get a speeding ticket: $125


3 hours at the game: $180 (tickets)


2 minutes of smiles, yelling, and cheering: priceless


some things are worth it.