Archive for the ‘web’ Category

Database Transplant Successful

Wednesday, January 31st, 2007

OmniBase is dead.

I killed it.

The replacement with GLORP has been successful. Rate of error discovery has stabilized and, after about 3 weeks of production experience and many small fixes, is now below OmniBase’s on its best day.

A few things I’ve learned along the way:

1) Put limits on all search queries. All of them. Too many times in the last couple weeks the system became unresponsive for a couple minutes as it fetched every blessed row in the database. If you’re getting that much stuff, you’re not gonna find it. I’ve limited all searches to 20 rows.

2) Use plain old objects wherever possible. OODBs have a history of being rotten at schema migration, so all objects used dictionaries for ivar storage. This makes it easy to add or remove ivars without getting messed up in the OODB, as the object’s ‘format’ is always the same. However, all that hashing, fetching, and searching isn’t free. I didn’t realize how expensive it was until I ripped the dictionaries out and replaced them with vanilla ivars. Zooooom!

Because GLORP maps the objects to the RDBMS attribute by attribute rather than blitting a whole object as a chunk, the format of the objects is decoupled from the database representation. Changes in the object’s binary format have no bearing on the database representation.

3) When generating accessors – add lazy initialization where nil is not an acceptable value.

4) Keep it meta. The meta model is the most important thing we have. With a good meta model all else can be rapidly replaced. Consequently, I think the next step in evolution, converting the meta model to use Magritte and then replacing the UI to use Magritte as well, will radically improve agility.
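
Points 1 and 3 above can be sketched in a few lines. The application itself is Squeak with GLORP; this sketch uses Python and its built-in sqlite3 module purely for illustration. The names (`MAX_ROWS`, `search_people`, `Person`, the `people` table) are hypothetical and not taken from the real application.

```python
import sqlite3

MAX_ROWS = 20  # hard cap applied to every search (point 1)


def search_people(conn, name_fragment, limit=MAX_ROWS):
    """Return at most MAX_ROWS matching rows, never the whole table."""
    limit = min(limit, MAX_ROWS)  # callers cannot raise the cap
    cursor = conn.execute(
        "SELECT id, name FROM people WHERE name LIKE ? ORDER BY name LIMIT ?",
        (f"%{name_fragment}%", limit),
    )
    return cursor.fetchall()


class Person:
    """Plain object with ordinary attributes instead of a dictionary of ivars."""

    def __init__(self, name):
        self.name = name
        self._phone_numbers = None  # stays empty until first asked for

    @property
    def phone_numbers(self):
        # Lazy initialization (point 3): callers never see None here.
        if self._phone_numbers is None:
            self._phone_numbers = []
        return self._phone_numbers


def demo():
    """Fill an in-memory table with 500 rows and run one capped search."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO people (name) VALUES (?)",
        [(f"person{i:03d}",) for i in range(500)],
    )
    rows = search_people(conn, "person")
    conn.close()
    return rows
```

Clamping the limit inside the search function, rather than trusting each caller, means a forgotten limit somewhere in the UI can no longer pull back every row.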

Meanwhile, to prepare, I’m updating the version of Seaside that the application is built on. This is turning out to be harder than I thought as Seaside has moved on quite a bit and many classes in my old version have been simply replaced in the newer one.

The really great thing about using meta facilities like GLORP and Magritte is that the application continues to shrink until it is mostly just the meta model.

Less code, fewer bugs.

Object Oriented Databases (OODBMS)

Wednesday, December 20th, 2006

I used to be something of an expert on Object Oriented Database Systems, how to use them, etc. The way you get to be an expert at a technology is to simply get ahead of the adoption curve and spend a bunch of time figuring out how to make the technology work before it becomes common knowledge. At that point, you can charge premium rates as you have scarce knowledge in your possession.

I got to this position by working for some adventurous folks that were willing to take a chance on the hype. The first one I came across was ObjectStore with C++. The project didn’t launch, but we built lots of prototypes and I got a good indoctrination into the ups and downs of OODBMS lore. Later on I was exposed to some others, Versant, Poet, and some lesser known ones. Like relational databases, once you get the hang of one, the rest are pretty similar. The key concepts are:

1) OODBs allow you to create one or more named “roots”. A root is basically a variable – you ask for the object at root “foo” and get it back. Some only give you one root. If you only get one root, then almost always you just stick a hash/map/dictionary at the root and pretend you have several anyhow. The root is your entry point to the data.

2) All object manipulations/accesses must be done within a transaction context. So you end up digging through your app looking for sensible transaction boundaries. For a web app, you typically begin a transaction at the beginning of a request and commit it just before sending the response. You want to keep transactions short so as not to have other users waiting on locks.

3) Objects become part of the database via “reachability”. The OODBMS will “trace” your object graph starting at the root upon commit, calculate changes to the graph, and then write the changes to the database. Any new objects reachable from the root object automatically become part of the database. While this might sound expensive, it generally is quite cheap.

So you generally open a transaction, look up an object from a root, navigate to the object of interest, make changes, and then commit the transaction. Many also let you hang onto an object reference across transactions. The object reference can only be accessed within a transaction – trying to read data from it outside of a transaction will fail with an exception. This makes redrawing user interfaces problematic.
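
The root, transaction, and reachability ideas above can be illustrated with a toy in-memory model. This is only a sketch in Python, not the API of ObjectStore, Versant, Poet, or any other real OODBMS. `ToyStore`, `Customer`, and every method name are invented for the example, and nothing is actually written to disk.

```python
class ToyStore:
    """Toy stand-in for an OODB: named roots plus commit-time tracing."""

    def __init__(self):
        self.roots = {}       # named entry points into the object graph
        self.persisted = {}   # id(obj) -> obj, standing in for what is on disk
        self.in_transaction = False

    def begin(self):
        self.in_transaction = True

    def root(self, name):
        # All access has to happen inside a transaction.
        if not self.in_transaction:
            raise RuntimeError("object access outside a transaction")
        return self.roots.setdefault(name, {})

    def commit(self):
        # Trace the graph from the roots; everything reachable is persisted.
        seen = {}
        stack = list(self.roots.values())
        while stack:
            obj = stack.pop()
            if id(obj) in seen:
                continue  # already visited, also guards against cycles
            seen[id(obj)] = obj
            stack.extend(self._children(obj))
        self.persisted = seen
        self.in_transaction = False

    def is_persisted(self, obj):
        return id(obj) in self.persisted

    @staticmethod
    def _children(obj):
        if isinstance(obj, dict):
            return list(obj.values())
        if isinstance(obj, (list, tuple, set)):
            return list(obj)
        if hasattr(obj, "__dict__"):
            return list(vars(obj).values())
        return []  # strings, numbers, etc. have no outgoing references


class Customer:
    def __init__(self, name):
        self.name = name
        self.orders = []


store = ToyStore()
store.begin()
customers = store.root("customers")   # look up a root
alice = Customer("alice")
customers["alice"] = alice            # now reachable from the root
orphan = Customer("bob")              # never attached to anything
store.commit()
```

After the commit, `alice` and her `orders` list count as persisted because they are reachable from the `customers` root, while `orphan` does not. Calling `store.root(...)` again without a new `begin()` raises, which mirrors the access-outside-a-transaction failure described above.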

OODBs come from the CAD world where you have a network of a zillion objects, all slightly different, where mapping them to a regular container like a db table would be really expensive. They’re really good at this object persistence game.

OODBs are seductive. They are easy to get started with. For one thing, you don’t have to do a data model, just your object model. Your code is your model. You make objects, stick them in containers, and forget about them. Sounds great, right?

But as anyone who has lived with an OODBMS for any period of time knows, Object databases are great, until they’re not, and then they truly suck. Here’s why:

1) Concurrency is very poor. As I mentioned, OODBs come from the CAD world and work well for storing complex CAD models. But CAD models are seldom updated concurrently by large numbers of people. As you modify objects within a transaction, the OODB has to obtain locks on your modified objects to guarantee consistency. Unfortunately, none of them (that I know of) implement object level locking. Most implement locking at the memory page level. Spurious lock conflicts where two unrelated objects share a memory page can be common. Resolving these conflicts can be expensive. Because all work must happen within a transaction, transactions tend to be on the long side.

2) Constant re-fetching of data every transaction makes keeping user interface elements up to date very expensive. There is no user level in-memory caching without writing user level code to create transient copies.

3) Schema migration is hard, if not impossible. Your object defines your format. Adding a field to a class makes your in-memory model inconsistent with the slabs of bits you wrote out before you added the field. There are ways around this. The usual one is to have one ivar that is a dictionary. Otherwise, there are usually some very user-unfriendly scripts that have to be run. In many cases, the database must be taken offline to do this. So much for your three nines availability.

4) Death by a trillion bug fixes. I can’t speak for all, but ObjectStore would require the database be taken offline and an update script be run for every upgrade. For a site that is supposed to be up all the time, this isn’t acceptable. So upgrades were deferred. When we did this, we found that

5) OODBMS providers have limited resources and will only support versions up to one year old. If you get too far out of date and your db goes down, you are flat out of luck. The support people won’t help you. Only a really large organization could afford to keep up with all the little point fix releases ObjectStore made in a year – we couldn’t afford the manpower or the downtime.

6) Bugs are forever. If you put a bug into your program that damages the object model, it becomes enshrined in the database. Subsequent read code that finds the malformed chunk of the object model will usually fail. Subtle corruptions build up over time, making a full database walk harder and harder to complete. Conventional databases can avoid this by implementing appropriate constraints.

7) No security. Any screwball developer can destroy your reference data (usually stored in ordered collections off of the root). A conventional relational database can safeguard important data with roles, permissions, and constraints.

8) Garbage Collection is not universally available. Orphaned junk is common. Some OODBs provide GC utilities, however they can fail if there is corrupt data (see items 6 and 7).

9) No ad hoc query capability. You have to write a new program to view any data at all. You need to write programs to update reference data. You need a program to do anything at all with your data. No fixing problems with a quick line of SQL. Searching for unanticipated patterns is difficult.
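
For contrast with items 6 and 9, here is a minimal sketch of what a conventional database gives you for free, using Python’s built-in sqlite3 module. The `customers` and `orders` tables and the helper names are hypothetical. SQLite has no roles or permissions, so item 7 is not covered here.

```python
import sqlite3


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
        );
        """
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 1250)")
    conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 300)")
    return conn


def try_insert(conn, customer_id, total_cents):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
            (customer_id, total_cents),
        )
        return True
    except sqlite3.IntegrityError:
        # A buggy writer is stopped here instead of being enshrined in the data.
        return False


def total_per_customer(conn):
    # An ad hoc question answered with one line of SQL, no new program required.
    return conn.execute(
        "SELECT c.name, SUM(o.total_cents) FROM customers c "
        "JOIN orders o ON o.customer_id = c.id GROUP BY c.name"
    ).fetchall()
```

With this schema, `try_insert(conn, 1, -5)` and `try_insert(conn, 99, 100)` are both rejected: the first by the CHECK constraint, the second by the foreign key.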

I’ve been bitten by all of these issues at one time or another and have recently inherited an application written using a Smalltalk OODB called OmniBase. Debugging this application is extremely painful because launching a debugger results in the transaction being terminated and all object references becoming invalid. Thus, the data that might provide a clue as to the source of the error is gone. Additionally, while the author claims to provide support, he simply collects fees and then tells you that your application doesn’t run in his environment, blames you for writing rotten code, and declines future contact.

So this dog has to go.

Fortunately, you can get most of the benefits of an OODB without the drawbacks by using an Object Relational Mapping framework. I’ve selected GLORP, an open source mapping framework that is improving all the time, and found that I can implement the part of OmniBase’s API I need with very little change to the user interface, which is written in Seaside under Squeak.

Next time, I’ll talk a little bit about how this works.

BadPage.info moved to BadPage.net

Tuesday, August 29th, 2006

Spammers have been hammering BadPage.info (also hosted on this machine) with click fraud traffic in hopes that the proxy will forward it. It did for a few days – but I’ve locked it down and it won’t anymore.

That didn’t stop the spammers from continuing to pound the machine, so I instituted dynamic countermeasures. This helped for a while, but now the machine seems to be spending all of its time figuring out which traffic to drop. Bandwidth dropped to about 20k/s. Enough is enough. I parked the domain and moved the app to BadPage.net. Let NetworkSolutions handle the traffic for a while.

This server is under attack

Wednesday, August 23rd, 2006

And has been for the last week. I was using mod_proxy to proxy requests to BadPage.info and, due to the lack of sensible documentation (which I still don’t quite understand – a few examples would be useful, Apache Foundation), my web server was turned into a spam forwarding machine.

I’ve disabled proxies (and thus BadPage.info is off the air just now). I’m still getting hammered from indousa.us sending malformed HTTP requests in hopes of them turning into spam. Since I’ve set up explicit deny rules for them, I’m hopeful they’ll check their logs soon and get the hint.

If you’re in the mood to test a server loading script, indousa.us seems like a dandy test bed. Give them my regards.

Update: As soon as I deny that domain, a dozen others arise to take its place. Anyone with extensive skill in the use of mod_rewrite and mod_proxy who cares to help, please give me a shout.

Update: Solved it, I think. BadPage.info is back on the air but all other proxy requests are being denied. I am still being hammered though.

Wikis – The new generation.

Wednesday, August 16th, 2006

Scoble is investigating collaborative tools – primarily chat and wiki tools. Seaside and Squeak are powering some really cool new capabilities. Like LogoWiki, a wiki that allows people to embed executable Logo programs. Useful for making educational sites about geometry and introductory programming. LogoWiki is built upon Pier, a wiki so rich in features that it approaches the level of a content management system. Given that the wiki was invented by a Smalltalker, this seems like a return to wiki’s roots.

Gnomedex is imminent

Wednesday, June 28th, 2006

And I am psyched. Should be lots of fun. I’ll be staying at Bell Harbor Marina aboard my sailing yacht Aurora on B-Dock. It’ll be the one with the hammock on the foredeck.

I’m still looking for the killer startup idea and ideal partner(s). Perhaps I’ll stumble onto something there. If you are going to Gnomedex and want to talk about doing something cool, find me. Look for the loudest Hawaiian shirt you can find.

BadPage.info has a new purpose?

Wednesday, May 17th, 2006

Scoble claims that the MS Word team finally generates clean HTML. I’ll believe it when it passes checks at http://badpage.info with no warnings or errors.

It will be nice for people to actually care about web standards. The web 2.0 people have to because bad DOMs lead to broken JavaScript. But so many large websites are still awful.

So I’ve met the queen of the VC’s

Wednesday, May 10th, 2006

of Seattle and she was very helpful and a lot of fun to talk with. Wish we had more time.

Now I’ve got a huge list of names to follow up with.

How to pick a startup idea

Thursday, May 4th, 2006

Something that was clear from Startup School was what kinds of ideas seem to take off.  A few things I’m looking for:

  • It should change basic user behavior – like Flickr has become the way to share photos on the web.
  • It must be “head slapping easy” to use.
  • It must encourage formation of a community.
  • It must provide embeddable content and thus be viral, like Flickr photos on a blog.
  • It should be hackable/employ web services, like Google Maps.

Thinking, thinking, thinking.  I’ll be talking to lots of people in the next couple weeks.  Keeping those things in mind.

Too many choices

Friday, April 28th, 2006

This weekend I can

1) Go to Seattle Mind Camp again and meet Scoble, Winer, and lots of others
2) Go to Smalltalk Solutions and hear about how Avi and Andrew launched DabbleDB
3) Go to Stanford and do Startup School.

I choose 3. Tough choice. Of course, there was nothing going on last week, or next week. I wish these people would coordinate.

Based on the BS review I got from my very clueless manager yesterday, I’m pretty ready to chuck that and join something new and interesting.