Phantasmal MUD Lib for DGD

Phantasmal Site > DGD > Writing a Library > Object Database

Using an Object Database

A question that appears frequently on the MUD-Dev list and occasionally on the DGD list is whether it would be a good idea to use an external database to store MUD objects and have DGD access that database. Normally, DGD stores the objects itself in its own internal database.

(Note: you might not 'really' mean a database. Maybe you're just talking about having a bunch of master data and then instances of it, like if you had a 'database' of monsters and then all your actual in-MUD critters pointed to that master list. If so, skip down to 'Internal Databases' and skip the rest of this page.)

Storing DGD Objects in a Database

You might quite reasonably think, 'well, MySQL or PostgreSQL must be better databases than DGD.' And, in general, you'd be right. However, there are some specific integration issues with DGD's dialect of LPC that can cause problems.

There's a reason that DGD doesn't normally do callout to other languages. DGD's LPC dialect has features that are very difficult to implement cross-language. For instance, rlimits() and atomic functions are found in basically no other language, and are not part of almost any cross-language callout interface. To use an external database, you'd need to create a module for the DGD server to handle interruptions gracefully, and figure out how many ticks different database operations took, and convert the errors they raised to DGD errors. That's not impossible, but it's quite difficult. You could declare that rlimits() and atomic functions don't work correctly for database operations. But since those features are all about guaranteeing uniform behavior in worst cases, that means you're giving up some of DGD's most powerful and compelling features.

Another of DGD's unique features, upgrade-in-place, may present difficulties. Since DGD can have multiple versions of the same object with different code and different behavior, you must figure out how (and whether) to represent that in the external database's data. It's possible to reflect that by recording the issue number of each object's code, and being able to look up objects by issue number in order to call the appropriate function. However, this will require extensive and subtle changes to your object daemon. If you don't know what any of this means (and most people don't), just believe me that you will sometimes get weird behavior from DGD if you don't understand its upgrade-in-place mechanism very well and interact with it carefully. Being able to recompile an object on the fly has some potentially sticky consequences.

You could just store all DGD objects as DGD objects and make them query their field data out of the database. That will give you the worst of the two behaviors in most cases, though, and require a lot of database queries, which won't give you good database performance. Databases usually prefer a small number of larger queries so that they can optimize properly.

It's quite possible to do all this, but it's another reason that DGD is simply a poor fit for the idea of storing objects in an external database. The fact that DGD already stores them so well and so quickly means that the possible benefit of using an external database is much smaller as well. In general, DGD is simply poorly suited to the use of an external database.

Dworkin has mentioned the possibility of eventually modifying DGD to be able to use an external object database. However, he's not planning to do it at least until multiprocessor support is done. When⁄if DGD is upgraded that way, the 'poor fit' comments above may be obsolete. Then again, they may not.

Non-LPC Information Databases

You can get around some problems by not storing each DGD object individually in a database, but just querying big blocks of information in less-structured form in them. This is possible, but is essentially using the database as an external server, one on which no actual LPC objects are stored. This is useful, for instance, if web pages are there and DGD will be looking at them and maybe processing or serving them, but will not be upgrading them directly or otherwise modifying them as LPC objects. In that case, the database can be accessed like any other external server. Doing so may require the network patch if outgoing connections must be made from the DGD server.

The problem here is that you still need to do all the usual management of DGD objects, and that's where you're storing almost everything. Anything that's in a DGD object isn't in the database, so you can't do big complicated queries on most of your MUD's information. You're just storing some other random stuff in a database. It works fine, but it probably won't do what you want it to.

Internal Databases

Well, what about a database written in LPC? You could have an object that redirects all calls to get descriptions or other textual data to a large database daemon. Would this cut down on memory and increase processor use?

Usually, yes. Some considerations:

DGD makes sure that strings are shared within objects. Constructed strings (such as the results of string addition or taking a substring) are not shared with identical strings created in a different manner until the object is swapped out. Basically, if you want to be sure you don't have lots of copies of the same string, swap the object(s) out. Normally you don't need to manually do this, it'll get done for you reasonably quickly.

Since DGD does string sharing for you, you may get less benefit from doing your own string sharing than you think. When DGD is doing this well, it does it better than you can.

Putting many shared strings in a single object creates a huge object that would have to be in memory all the time, quite possibly <increasing> memory use, even though it would decrease the total space (memory plus disk) used. So if you're going to make an LPC in-memory database, make sure to distribute it across many DGD objects, not just a few. That way only the currently active objects need to be in your server's memory at a given time.

Simple Cases

So what if you're here because you think it'd be cool to have a master list of monsters or objects or rooms and just return references to that shared information? Well, for starters you probably didn't understand much of the above. It's way more than you need.

Just returning the master copy of the string or array every time will do pretty well for you. It'll take very little more CPU time, and you'll save a lot of memory. It's a good trade-off. Make sure each of your master objects is its own DGD object (note: this is not the same thing as the master object of a clone, and using clones won't work here). Basically, don't make a big list of short descriptions and a big list of long descriptions and so on, and keep them all in arrays. If you do, the whole array will need to be in memory all the time. Far better to keep the data per-monster, with each master monster object having a short, a long, some combat statistics, and so on. Then only the monsters you're using need to be in memory.

You can do the same thing with objects or rooms, which would be good if you're going to have a lot of some object (rocks or swords, say), or if you're going to make a procedural wilderness for the players to wander through. The more copies you'll have of the objects in question, the better an idea this is.

Note that in the 'procedural wilderness' thing there's a whole other set of questions about whether you have rooms existing at all when players aren't in them. That's beyond the scope of this specific document. But yes, it's a good question and there are various interesting answers to it.

From: DGD Mailing List (Noah Gibbs)
Date: Wed Nov 20 10:01:01 2002
Subject: Speculative Execution -- WAS:[DGD] Using MySQL C++ API

On Wed, 20 Nov 2002, Erwin Harte wrote:
> It's only right out with MySQL if you use auto-commit features or use
> tables that lack a rollback feature, I would think?

  Correct, assuming there are no side-effects and you can break out
interruptibly on command.  However, you can't just use the existing C++
API unless you integrate it pretty closely into DGD -- just doing callout
won't work.  Even if MySQL can support those features, they have to be
integrated very carefully and very specifically in a way that currently
requires putting them into a new, modified DGD server.

  Conceivably, DGD could allow a callout interface where DGD code can call
into other code, and the other code would just have to supply all kinds of
hooks for interruptability and atomicity.  In practice, so few other
pieces of code of any kind allow this that there's no reason for it --
it's less work to just integrate MySQL into DGD once (and possibly add
more layers of rollback, which you'd have to either way) than to build a
general interface like this and then only use it for MySQL.  I can't think
of very many other pieces of code you can do those things with, and
adding them pretty much eliminates the other code's speed advantage over
DGD.  Plus you've gotta deal with things like tick counts for non-DGD
code.  It all gets to be a mess.

  Granted, it'd be cool to be able to write in Perl and do the atomicity,
interruptability and CPU usage stuff.  But then why get DGD involved?

> I would describe speculative executation as "Let's start executing
> this thread, keeping track of the changes we've made so far, and just
> see where we run into trouble".  Trouble is defined as two threads
> modifying the same thing, since only one should be able to succeed in
> doing so.  If this happens, the one detecting the conflict will be
> rolled back and reattempted from the start.
> 
> At least, I hope I described that accurately. :)

  That's my understanding.  In case I haven't said before:  I think that's
a *really* cool approach to multithreading!

- Noah Gibbs