Archive for the ‘programming’ Category

Object Oriented Databases (OODBMS)

Wednesday, December 20th, 2006

I used to be something of an expert on Object Oriented Database Systems, how to use them, etc. The way you get to be an expert at a technology is to simply get ahead of the adoption curve and spend a bunch of time figuring out how to make the technology work before it becomes common knowledge. At this point, you can charge premium rates as you have scarce knowledge in your possession.

I got to this position by working for some adventurous folks that were willing to take a chance on the hype. The first one I came across was ObjectStore with C++. The project didn’t launch, but we built lots of prototypes and I got a good indoctrination into the ups and downs of OODBMS lore. Later on I was exposed to some others, Versant, Poet, and some lesser known ones. Like relational databases, once you get the hang of one, the rest are pretty similar. The key concepts are:

1) OODBS allow you to create one or more named “roots”. A root is basically a variable – you ask for the object at root “foo” and get it back. Some only give you one root. If you only get one root, then almost always you just stick a hash/map/dictionary at the root and pretend you have several anyhow. The root is your entry point to the data.

2) All object manipulations/accesses must be done within a transaction context. So you end up digging through your app looking for sensible transaction boundaries. For a web app, you typically begin a transaction at the beginning of a request and commit it just before sending the response. You want to keep transactions short so as not to have other users waiting on locks.

3) Objects become part of the database via “reachability”. The OODBMS will “trace” your object graph starting at the root upon commit, calculate changes to the graph, and then write the changes to the database. Any new objects reachable from the root object automatically become part of the database. While this might sound expensive, it generally is quite cheap.

So you generally open a transaction, look up an object from a root, navigate to the object of interest, make changes, and then commit the transaction. Many also let you hang onto an object reference across transactions. The object reference can only be accessed within a transaction – trying to read data from it outside of a transaction will fail with an exception. This makes redrawing user interfaces problematic.
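To make the three concepts concrete, here is a toy, in-memory sketch in Java. It is not the API of ObjectStore, Versant, or any other real product; `ToyObjectStore`, `Node`, `root`, `begin`, and `commit` are all hypothetical names invented for illustration. It only models the rules described above: named roots, access restricted to a transaction, and persistence by reachability at commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for an OODBMS. Every name here is hypothetical.
class ToyObjectStore {
    // A persistence-capable object: a name plus outgoing references.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    private final Map<String, Node> roots = new HashMap<>();
    private final Set<Node> persisted = Collections.newSetFromMap(new IdentityHashMap<>());
    private boolean inTransaction = false;

    void begin() {
        if (inTransaction) throw new IllegalStateException("transaction already open");
        inTransaction = true;
    }

    // Roots are the entry points; they may only be fetched inside a transaction.
    Node root(String name) {
        if (!inTransaction) throw new IllegalStateException("no open transaction");
        return roots.computeIfAbsent(name, Node::new);
    }

    // Commit walks the graph from every root and persists whatever it can reach.
    // Returns the number of objects that became persistent in this commit.
    int commit() {
        if (!inTransaction) throw new IllegalStateException("no open transaction");
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Node root : roots.values()) trace(root, visited);
        int added = 0;
        for (Node n : visited) if (persisted.add(n)) added++;
        inTransaction = false;
        return added;
    }

    boolean isPersistent(Node n) { return persisted.contains(n); }

    private static void trace(Node n, Set<Node> visited) {
        if (!visited.add(n)) return;          // already seen during this commit
        for (Node child : n.refs) trace(child, visited);
    }

    public static void main(String[] args) {
        ToyObjectStore db = new ToyObjectStore();
        db.begin();
        Node order = new Node("order-1");
        db.root("orders").refs.add(order);     // reachable from a root
        Node orphan = new Node("orphan");      // never attached to anything
        System.out.println("newly persistent: " + db.commit());
        System.out.println("order persistent? " + db.isPersistent(order));
        System.out.println("orphan persistent? " + db.isPersistent(orphan));
    }
}
```

Note that nothing ever calls a "save" method on `order`; attaching it to something reachable from the root is the whole persistence story. A real OODBMS does this against disk pages with locking, which is where the trouble described below comes from.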

OODBs come from the CAD world where you have a network of a zillion objects, all slightly different, where mapping them to a regular container like a db table would be really expensive. They’re really good at this object persistence game.

OODBs are seductive. They are easy to get started with. For one thing, you don’t have to do a data model, just your object model. Your code is your model. You make objects, stick them in containers, and forget about them. Sounds great, right?

But as anyone who has lived with an OODBMS for any period of time knows, Object databases are great, until they’re not, and then they truly suck. Here’s why:

1) Concurrency is very poor. As I mentioned, OODBs come from the CAD world and work well for storing complex CAD models. But CAD models are seldom updated concurrently by large numbers of people. As you modify objects within a transaction, the OODB has to obtain locks on your modified objects to guarantee consistency. Unfortunately, none of them (that I know of) implement object level locking. Most implement locking at the memory page level. Spurious lock conflicts where two unrelated objects share a memory page can be common. Resolving these conflicts can be expensive. Because all work must happen within a transaction, transactions tend to be on the long side.

2) Constant re-fetching of data every transaction makes keeping user interface elements up to date very expensive. There is no user level in-memory caching without writing user level code to create transient copies.

3) Schema migration is hard, if not impossible. Your object defines your format. Adding a field to a class makes your in-memory model inconsistent with the slabs of bits you wrote out before you added the field. There are ways around this. The usual one is to have one ivar that is a dictionary. Otherwise, there are usually some very user un-friendly scripts that have to be run. In many cases, the database must be taken offline to do this. So much for your three nines availability.

4) Death by a trillion bug fixes. I can’t speak for all, but ObjectStore would require the database be taken offline and an update script be run for every upgrade. For a site that is supposed to be up all the time, this isn’t acceptable. So upgrades were deferred. When we did this, we found that

5) OODBMS providers have limited resources and will only support versions up to one year old. If you get too far out of date and your db goes down, you are flat out of luck. The support people won’t help you. Only a really large organization could afford to keep up with all the little point fix releases ObjectStore made in a year – we couldn’t afford the man power or the down time.

6) Bugs are forever. If you put a bug into your program that damages the object model, it becomes enshrined in the database. Subsequent read code that finds the malformed chunk of the object model will usually fail. Subtle corruptions build up, making a full database walk harder and harder to complete over time. Conventional databases can avoid this by implementing appropriate constraints.

7) No security. Any screwball developer can destroy your reference data (usually stored in ordered collections off of the root). A conventional relational database can safeguard important data with roles, permissions, and constraints.

8) Garbage Collection is not universally available. Orphaned junk is common. Some OODBs provide GC utilities, however they can fail if there is corrupt data (see items 6 and 7).

9) No ad hoc query capability. You have to write a new program to view any data at all. You need to write programs to update reference data. You need a program to do anything at all with your data. No fixing problems with a quick line of SQL. Searching for unanticipated patterns is difficult.

I’ve been bitten by all of these issues at one time or another and have recently inherited an application written using a Smalltalk OODB called OmniBase. Debugging this application is extremely painful because launching a debugger results in the transaction being terminated and all object references becoming invalid. Thus, the data that might provide a clue as to the source of the error is gone. Additionally, while the author claims to provide support, he simply collects fees and then tells you that your application doesn’t run in his environment, blames you for writing rotten code, and declines future contact.

So this dog has to go.

Fortunately, you can get most of the benefits of an OODB without the drawbacks by using an Object Relational Mapping framework. I’ve selected GLORP, an open source mapping framework that is improving all the time, and found that I can implement part of OmniBase’s API with very little change to the user interface, which is written in Seaside under Squeak.

Next time, I’ll talk a little bit about how this works.

Education vs Training

Wednesday, November 29th, 2006

Scoble points to a post by Steve Sloan who has been teaching a podcasting/new media class at SJSU. It seems the administration would prefer to teach tools rather than theory.

I taught CS at University of Colorado, Denver for several years. Just one course per semester. We had many discussions about just this problem. UCD is a satellite campus. Most courses are taught at night. Students often have full time jobs. They want better jobs. They go to school specifically to get better jobs and they don’t have a lot of spare resources to spend on school. They just want to learn the latest hot technologies like Java and .NET and be marketable. In other words, they want job training.

The problem with job training is that the information is perishable. In the mid-90s all the rage was C++. Many universities, ours included, altered their curriculum to use C++ as the teaching language in order to be more appealing to students. After all, you can teach theory using pretty much any general purpose programming language.

Except that C++ is a terrible teaching language. (Actually, it’s just a terrible language). It is too complicated and I wasted many classroom hours helping students cope with the quirks in the language instead of focusing on the content. And now, the C++ knowledge is mostly useless. Nearly all C++ work has been supplanted by Java work. So the students need to retrain.

A better idea is to use languages that can most clearly illustrate the concepts with minimum extraneous complexity. More languages means more viewpoints, and tends to make students understand that the language or tool isn’t that important. It is the underlying concepts.

The University is between a rock and a hard place. With education funding cuts, they need to attract students to survive. To attract students, they need to offer classes the students want to take. Students find training classes most attractive as they offer instant gratification. But training classes are like candy. They’re not good for you in the long run and the fix is short lived. Education is more like vegetables. It is good for you, but maybe not so pleasant to digest. The University would like to stick to vegetables, but if no one orders them, they have to sell candy too.

Which is unfortunate. The state of the software industry is deplorable. I think 90% of the people programming out there ought to be doing something else. They suck at their job and aren’t even educated enough to understand that they suck, much less how they suck. One trick ponies, they flounder if given a problem that isn’t pre-solved in their platform. FACT: the best indicator that a candidate is likely to fail the interview process at big river books is if they characterize themselves as a “Java Architect” or “Java Developer”. It usually means they don’t know anything else. They think Java (or .NET) is the pinnacle of software achievement. Without a proper education, they can’t conceive of anything else.

The real solution is to properly fund universities as institutes of higher learning and stick to education. If the universities want to also offer vocational training, that’s fine. Just don’t cheapen the academic programs by offering “degrees”. Certificates of completion should be adequate.

Downriver

Monday, November 27th, 2006

For the past 2 and a half years I worked at Amazon.com. It was fun for the first year – so many old assumptions and prejudices shattered. But Amazon is a special case. For most normal sized systems, my old design sense was pretty solid.

Still, it was a horizon broadening experience and I enjoyed that. I managed teams of people and we built software and I liked that as a change from the endless parade of crummy short term Java contracts I was getting.

But I left last month. I joined as something of a new manager. My pay grade was commensurate with my lack of experience in that area. But eventually I grew weary of it and was itching to get back to doing nifty code if I could find a way to do it on my terms. Which means dynamic expressive languages and I get creative control of the technology. No “You’re the architect – so you’ll use this language and that vendor’s solution”. Huh, I thought I was the architect.

The other main driver to leave is no work/life balance. This isn’t Amazon specific. This is US company specific. In the US, if you work for an established company, this is just how it is. You get 2, maybe 3 weeks of vacation and a few holidays here and there. You are expected to put in 50 hours a week. With ever rising property values and congested highways, you have to live about an hour away from work, meaning you lose 2 unbillable hours a day just travelling to work. You’re working your butt off, but you can’t enjoy the fruits of your labor.

I lived in France for about half a year. I’ve seen how Europeans live. They take 5-6 weeks of paid vacation. They can take long leaves of absence. They are able to travel the world. In the US, you can’t get enough days off to drive across the country, much less travel abroad. No wonder we are such an ignorant xenophobic lot.

I have a boat. I’d like to take the boat in the summer and explore Puget Sound, where I live. I’ll need about 4 contiguous weeks to do it. I couldn’t get the time off. Why have a boat if I can’t take the time to enjoy it?

I have friends abroad. I can never get the time to go see them. I have the money. Just not the time. Again, this is lame. So I walked. I give up on work camp America. US companies say they can’t find qualified workers. We’re around. But your terms stink. Improve them or go pound sand.

I left the big company to work for myself. I build software using tools I like. Unconventional, but productive and low-cost tools like Squeak and Seaside. I use other things too, depending on requirements. I work when I want to, from anywhere I like.

I think this is the future as more and more of my colleagues are opting for this kind of situation. The big company life holds no attraction for the seasoned employee.

On Database Initialization Scripts

Friday, November 10th, 2006

When building a new system that has a database, there will be a collection of tables that are reference tables. Lists of possible values that contain human readable versions of numeric codes. An example might be a status code for an order. The statuses might be OPEN, PENDING, CANCELLED, SHIPPED. Well written applications will make use of these reference tables to populate the user interface pick lists, thus allowing a new status, like RETURNED, to be added quickly by just adding a row to the database.

Order Status
ID Name
1 Open
2 Pending
3 Cancelled
4 Shipped
5 Returned

Of course, that means that those tables must be populated. There are right ways and wrong ways to go about this. The worst thing you can do is to simply sit down at a terminal and start typing INSERT statements right to the database. If you do this, you have no reproducibility. You may have several instances of this database, for instance a test version and a production version. Being able to build an application database from a virgin install is critical. So write scripts, keep them in version controlled files, and keep them up to date. If, somewhere down the line, you do add a status code for RETURNED, you should just update the script for that table and run it.

Which brings me to the part that is frequently overlooked and that messes up a lot of people. Database initialization scripts must be written to be IDEMPOTENT. Idempotency is that property that guarantees that performing an action multiple times has exactly the same result as performing it once. Each table initialization script should be IDEMPOTENT.

This means checking for the table and if it is missing, create it. Check for each record and if it is missing, insert it. Hard code the primary key in your reference script so it will not change. If there is a sequence involved, be sure to reinitialize it appropriately.
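As a rough illustration of the idea, here is a small Java sketch. It uses an in-memory `Map` as a stand-in for the order status table rather than a real database connection, and the names `OrderStatusInit`, `REFERENCE_ROWS`, and `initialize` are hypothetical. Repairing a row whose name has been altered, rather than only inserting missing rows, is my own assumption about what "appropriate" means here. The point is only the shape: hard-coded keys, check before you write, and running it twice changes nothing the second time.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an order status reference table.
class OrderStatusInit {
    // The script's source of truth: primary keys are hard-coded so they never change.
    static final Map<Integer, String> REFERENCE_ROWS = new LinkedHashMap<>();
    static {
        REFERENCE_ROWS.put(1, "Open");
        REFERENCE_ROWS.put(2, "Pending");
        REFERENCE_ROWS.put(3, "Cancelled");
        REFERENCE_ROWS.put(4, "Shipped");
        REFERENCE_ROWS.put(5, "Returned");
    }

    // Makes sure every reference row exists with the expected name.
    // Returns how many rows had to be inserted or repaired.
    static int initialize(Map<Integer, String> table) {
        int changed = 0;
        for (Map.Entry<Integer, String> row : REFERENCE_ROWS.entrySet()) {
            String existing = table.get(row.getKey());
            if (!row.getValue().equals(existing)) {   // missing or damaged
                table.put(row.getKey(), row.getValue());
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new HashMap<>();
        System.out.println("first run changed:  " + initialize(table));
        System.out.println("second run changed: " + initialize(table));
        table.remove(3);                               // someone fat-fingers a DELETE
        System.out.println("repair run changed: " + initialize(table));
    }
}
```

Against a real database the same check-then-write logic would live in the SQL script itself; the sketch only shows the property to aim for.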

The reason for this is that, one day, some idiot will accidentally delete or damage part of your database’s reference data, either through coding error or just mistyping commands. Probably, the idiot will be you. You’ll look like much less of an idiot if you can just whip out your trusty database initialization script and know you can instantly repair the damage because your scripts are all written to be IDEMPOTENT.

I’m a topic on Lambda The Ultimate

Thursday, November 2nd, 2006

It seems Blanchard’s Law is gaining traction.

It is now a discussion topic at Lambda the Ultimate, a discussion group for language and programming freaks.

Pretty cool.

Got Class?

Tuesday, October 31st, 2006

In Object Oriented languages, the class is the definition of a kind of object. Classes are arranged in tree-like hierarchies with the roots of the tree being very general and the leaves being most specific. So a dessert is a kind of food, and cake is a kind of dessert. Because the class serves as the definition of an object, it makes sense that objects are created by using the class as a template. In other words, classes are used to create objects of the class’s type.

All of the languages discussed so far, C++, Java, Smalltalk, and Objective C have the notion of a class. What is different is how the languages represent the class itself.

C++ has no runtime representation of a class at all. Instead, all class information is soaked up by the compiler and represented using standard functions and globals with the class name prepended to their “local” names. The way a programmer designates a function as belonging to a class is to declare it within the class’s declaration using the very overworked keyword ‘static’.

class CPlusPlusClass : public BaseObject
{
public:
    static void classMember(); // declare the function
};

void CPlusPlusClass::classMember() {…} // define the function

CPlusPlusClass::classMember(); // really just a long name for a global function

Ditto for class variables. There’s no such thing in C++ really. Instead the language fakes it by letting you declare the variable in the class declaration’s namespace and in the end what you’ve got is a global with a very long name. Since there’s no class object, there’s no “this” variable and no way to do anything the least bit object oriented in class methods without explicitly naming super classes.

There’s also no built-in way to abstract object creation. When one wishes to create an object in C++, one uses the ‘new’ operator in conjunction with the class name, which coincidentally is also the required name of the initialization function. So in C++ we write:

BaseObject *c = new CPlusPlusClass();

Notice that here – at the moment of object creation, it is necessary to reveal the actual type of the object. If one desires a more abstract method of instance creation, it is necessary to adopt a factory idiom. Of course, converting to a factory creation mechanism from the use of operator new requires extensive modification of code, similar to the extensive work required to handle a new kind of exception.

Smalltalk takes the opposite extreme. Classes are represented as regular objects and are arranged into an inheritance hierarchy. The special variable ‘self’ refers to the class object itself and ‘super’ refers to the object that represents the class’s superclass. Smalltalk class methods exhibit full polymorphism and class variables are made available to class methods and instance methods in the class’s instances. The Smalltalk class takes responsibility for instance creation and is thus a factory by default.

For instance, in the Squeak implementation, ImageReadWriter is an abstract class for reading and writing image files. Concrete subclasses exist for JPEG, GIF, PNG and so forth. The abstract base class makes use of class hierarchy navigation to find the correct subclass to render the image data. It looks something like this (error handling omitted):

"Find the first subclass that claims to be able to understand the binary stream’s data"

readerClass := self withAllSubclasses detect: [:subclass | binaryStream reset. (subclass new on: binaryStream) understandsImageFormat].

"Instantiate a reader from the class"
reader := readerClass new on: (binaryStream reset).

"Return the image from the reader"
^reader nextImage.

The beauty of this approach is that one can always write a new subclass of ImageReadWriter for a new image format and the new format is instantly supported without having to deal with registries or factory idioms.

Techniques like this demonstrate the power of using the object paradigm to represent classes.

Objective C uses a similar approach to Smalltalk with a couple of odd constraints. Objective C class objects lack support for class member variables. Instead one fakes it with static variables (static in the C sense: variables with internal linkage). Also, Objective C performs lazy loading of libraries which means that not all subclasses are necessarily present at any given time. So the subclasses trick is a little hard to perform.

On the other hand, Objective C uses the same object creation mechanism as Smalltalk. So it’s still possible that the actual class of the object is not the same as the class that created it for you. Thus, the power of using objects to represent classes is mostly preserved.

Java’s approach is closer to C++ than to Smalltalk. While Java has ‘objects’ of type java.lang.Class for representing classes, these ‘objects’ have more in common with C structs than with objects in any other sense. Object creation is not performed by the class. Rather Java slavishly copies C++’s use of operator ‘new’.

Java also makes use of a ‘static’ keyword to denote ‘class methods’. But these ‘methods’ are actually nothing but functions with long names. In fact, it’s not possible to do anything polymorphic in a class method. There isn’t even a way to get ahold of the class object representing the class.

public class A
{
    // something that doesn’t work – no ‘this’
    static void printClassName1() { System.out.println("" + this.class); }

    // this doesn’t work either
    static void printClassName2() { System.out.println("" + getClass()); }

    // this does – but in subclasses it will be wrong
    static void printClassName3() { System.out.println("" + A.class); }
}
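The lack of polymorphism in static methods can be seen in a compilable sketch like the following. The class names `Base` and `Derived` and their methods are made up for illustration. A static method redeclared in a subclass merely hides the original, so a call made from inherited code still lands on the superclass version, while the equivalent instance method dispatches on the actual object.

```java
// Hypothetical classes contrasting static hiding with instance overriding.
class Base {
    static String className() { return "Base"; }   // "class method": bound at compile time
    String instanceName() { return "Base"; }       // instance method: dispatched at run time

    // Inherited code that calls each flavor.
    String viaStatic() { return className(); }     // always resolves to Base.className()
    String viaInstance() { return instanceName(); }
}

class Derived extends Base {
    static String className() { return "Derived"; } // hides Base.className(), does not override it
    @Override String instanceName() { return "Derived"; }

    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.viaStatic());    // prints "Base"
        System.out.println(b.viaInstance());  // prints "Derived"
    }
}
```

The subclass has no way to make `viaStatic` pick up its own version short of overriding `viaStatic` itself, which is exactly the kind of call-site surgery a real class object would make unnecessary.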

So crippling is the decision to not provide proper class objects, that the Enterprise Java Bean designers found it necessary to simulate class objects with the notion of “Home” objects. The “Home” object is intended to provide the ability to find existing objects or create new ones for a given class. This is definitely behavior that would ordinarily be put into a class object.

If we had one.

Painters

Friday, October 20th, 2006

About three or four years ago, I was playing around with a concept I called Bricks. It was a way of factoring drawing operations out of UI components to make it easy to create new looks. The drawing operations were encapsulated into objects called Painters. I was doing it in Squeak using Morphic. It looked like this:

It must have been a good idea because I just ran across something similar from the Java Swing people.

So I’ve decided to dust it off and pick it up again. There have been a lot of changes in Morphic and I’ve given up on reworking the event delivery system, choosing to work with Morphic’s system, warts and all. I still think splitting drawing and layout will be valuable.

I Love You – Now Change

Monday, October 2nd, 2006

Which company would you prefer to invest in?

Company A:

“Our company is a dynamic organization with the agility to quickly adapt to new market conditions.”

or

Company B:

“Our company is stable, well structured, and organized. What we are doing now is a perfect basis for everything we will do in the future.” (Sounds a little like the Bush administration)

The dynamic organization that can change quickly is going to be more successful than a static organization that is set in its ways.

So how come the software industry pundits continue to try to push static programming systems over dynamic ones when dynamic systems are generally more successful? It’s senseless.

In C++ and Java, the assumption is that the superclass designer knows best. Whatever interface the original developer has exposed is expected to be the perfect and complete interface for all time. There is no way to extend an existing interface without owning the source code to it.

The implementation, it is recognized, might not be exactly correct in all cases, so the implementation is left open to extension via the one mechanism made available – subclassing (maybe – Java actually allows the most arrogant developers to forbid subclassing via the ‘final’ keyword).

The problem with leaving only subclassing is that subclassing, by itself, only provides for extension of the system, not for modification. Fans of Robert C Martin or Bertrand Meyer might recognize this as The Open Closed Principle. Sadly, The Open Closed Principle only works if you happen to work for a static company like Company B.

The harsh reality is that organizations are organic – they evolve and grow to adapt to new environment conditions. Failure to evolve is death. How can you modify your organization if the software that runs your organization is closed to modification by design? Worse, the underlying tools and technology on which your software is built actually work to enforce The Open Closed Principle.

So how else to evolve your system? Objective C has a construct called a Method Category or more commonly just “category”. A category is a collection of methods for a class that may be loaded dynamically – or not. These are collections of additional methods to be added to existing classes. These additions may be made part of the organization’s core software assets, or they can be application specific extensions that are too specialized for general consumption.

For instance, a web services application may find it convenient to add some methods to the string class for parsing up web requests, but the billing system doesn’t need this category of methods and so doesn’t bother to load it.

Categories can also make adapting an existing class to a new protocol easy. Not having a number of separate adaptor classes all over the place keeps the number of classes low and the conceptual size of the application architecture smaller.

Finally, categories can allow the user of a class to replace a buggy or inappropriately implemented method with a new implementation without having the source code.

Another useful tool is the ability to replace one class with another. The Objective C tool for this is known as “posing”. One writes a subclass of the original class and then says

[NewClass poseAs: [OldClass class]];

Now saying [OldClass new] actually constructs an instance of NewClass. This can be handy for sneaking superclasses into the class hierarchy and also for debugging around code you don’t own.

Using these techniques, along with message forwarding and delegation, subclassing takes a back seat to application assembly and drops from the most used tool to the mechanism of last resort. After all, it’s much better to simply arrange the classes you already have into the right structure than to create entirely new code with entirely new bugs.

Smalltalk has similar mechanisms and results in similar designs. Method categories exist and can be loaded as packages. Posing is done quite easily by replacing a class in the Smalltalk class dictionary with another class and executing a become: on all of the old class’s instances. It’s easy to insert new classes anywhere in the hierarchy and all of the code is easily accessible and modifiable. Subclassing in Smalltalk application development is a relatively rare event.

Of course, if you’re sure you know what you’re doing – perhaps Java and C++ are the right languages. If you’re sure, that is.

Relax, It’s Nothing!

Tuesday, September 12th, 2006

Originally published August 14, 2001

What’s the most frequently seen message produced by Java programs?

It’s got to be the NullPointerException. In efforts to avoid this dreaded message, programmers have adopted an idiom that looks something like:

if(object != null) object.doSomething();

Which basically means – only send the message if there’s something there to receive it. That’s how you keep the exception from being thrown. The idiom must be burdensome as NullPointerExceptions continue to pop up at unexpected times and the fix is invariably to put this sort of test on the line from which the exception is raised.

This is an idiom adapted from C and later C++ where null is memory address zero by convention. Since important chunks of the operating system live down in that memory region, modern operating systems protect that region from fiddling and view trying to reference anything off of memory address zero as a likely programmer error. So it’s not allowed in the name of protecting the operating system.

C and C++ programmers, in efforts to get their programs to live long enough to let the program report what’s going on, check pointers to make sure that they aren’t null before using them. This is because the penalty for using a null pointer in these environments is death.

These days, applications are developed using higher level languages, like Java and Smalltalk. These have no pointers at all. Instead we have object references and it’s not possible for the programmer to endanger the operating system by messaging a null object reference. (Objective C uses pointers as object references, but the programmer never directly dereferences them – it’s all done by the Objective C runtime).

So now, released from the need to steer clear of the operating system’s defense mechanisms, we can take a step back and think about what sending a message to nothing means.

Shouting across the street to someone who isn’t there may make you feel silly – but nothing really happens. Same for sending a letter to a non-existent address. If there’s no return address, the post office will eventually give up and toss it. Nothing too terrible there. There’s simply no one to receive the message.

So why does Java take the position that messaging null is so great a catastrophe that the programmer must be burdened with handling an exception? Especially when most of the time, the programmer’s method of avoiding the unwanted exception is to add a test for null before sending the message. In other words, the programmer will fix it with:

if(object != null) object.doSomething();

which is just a way of saying “if there’s no one to receive the message – do nothing”. Worse, the programmer has to add this check to every single location in the code that tries to make use of the object reference.

This is the same nuisance condition we identified with typing of object references. Recall that the dynamic languages provided a means of handling this condition in a central location via the doesNotUnderstand, while the Java version required the programmer to handle it at each call site.

A similar situation exists with the use of null in Java. It must be checked for and handled at each call site. There must be a better way.

On the dynamic side of the world things are simpler. In Smalltalk, null (actually, in Smalltalk it’s referred to as ‘nil’) is a global singleton object of the class UndefinedObject. It implements hardly any messages and so messaging nil results in nil receiving doesNotUnderstand. The default behavior of doesNotUnderstand in nil is to halt the program and throw a debugger around it. Many deployed systems change this behavior on deployment to log the behavior and stack trace, or to simply return nil.

In Objective C, messaging nil results in nil being returned by default. This behavior can be changed in the runtime by adding a hook function to the runtime that could do logging, or raise an exception.

In either case, the consequences of messaging nil are under the control of the programmer and can range from totally benign to fatal, depending on the developer’s preference and the application domain. Experience with developing applications in this environment has shown that messaging nil is nearly always harmless, and not having to place tests for nil before every message send results in smaller, cleaner, and easier to understand code.
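Java has no equivalent hook, but the flavor of "messaging nothing is harmless" can be approximated by hand with a do-nothing receiver. The sketch below is only that, an approximation under my own assumptions: `Listener`, `NoListener`, `RecordingListener`, and `Model` are hypothetical names, and unlike Smalltalk or Objective C the do-nothing object has to be written separately for each interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message protocol.
interface Listener {
    void notifyChanged(String what);
}

// A real receiver that remembers what it was told.
class RecordingListener implements Listener {
    final List<String> seen = new ArrayList<>();
    public void notifyChanged(String what) { seen.add(what); }
}

// The "nobody home" receiver: accepts the message and does nothing.
final class NoListener implements Listener {
    static final NoListener INSTANCE = new NoListener();
    private NoListener() {}
    public void notifyChanged(String what) { /* intentionally empty */ }
}

class Model {
    private Listener listener = NoListener.INSTANCE;   // never null

    // The one place a null check lives, instead of one per call site.
    void setListener(Listener l) {
        listener = (l == null) ? NoListener.INSTANCE : l;
    }

    void rename(String newName) {
        listener.notifyChanged("renamed to " + newName); // no test needed here
    }

    public static void main(String[] args) {
        Model m = new Model();
        m.rename("first");                 // nobody listening, nothing happens
        RecordingListener r = new RecordingListener();
        m.setListener(r);
        m.rename("second");
        System.out.println(r.seen);
    }
}
```

It works, but notice how much ceremony it takes to get what the dynamic languages give you from a single hook in the runtime.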

Plus, the applications don’t crash nearly as often. This is yet another example of how a feature in Java that is intended to improve software reliability, actually undermines it.

Function Calling vs Message Sending

Thursday, August 24th, 2006

“What happens if your reference refers to a different type of object than you expect?”

It’s probably a programming error.

Or not.

In C++, because of the way vtable dispatching works, the program will either crash (hopefully) or the wrong member function will be called. Either way – the end result is probably fatal.

Java is slightly better. In order to call a member function, the Java compiler checks to make sure that the function is defined in the class or one of the interfaces that make up the declared type of the object reference. The problem arises when the object reference has been widened to a more general type that does not define the desired member function.

How can this happen?

public class A extends Object { public void anAThing() {} … }

A myA = new A(); // make an A
someList.add(myA); // put it in a list
…
someList.get(0).anAThing(); // error – List.get(int) returns Object.

so we have to downcast the Object reference returned by someList.get() to an A reference

((A)someList.get(0)).anAThing(); // Might be OK if someList has an A

Casting is completely type unsafe, although in Java an incorrect cast at least raises an exception. So we could write:

try { ((A)someList.get(0)).anAThing(); }
catch(ClassCastException ex) { /* handle or ignore the bad cast */ }

which allows us to handle the casting error and continue. This is a huge step up from the C++ behavior of crashing in that it allows the programmer some control over what to do if the cast is wrong. If we desire a more conventional means of writing this, it is possible to use the instanceof operator and do the check before the cast.

if(someList.get(0) instanceof A) ((A)someList.get(0)).anAThing();

which is pretty much the same thing. Of course, this forces programmers to clutter their code with all sorts of tests or try/catch blocks. It seems to me that getting away from that sort of thing was precisely the reason everybody wanted to switch to object oriented programming in the first place. Polymorphism replaced an awful lot of if ladders and switch statements and here we are putting them back in to work around a runtime system that throws an exception if we guess incorrectly about the type of an object.

Not to mention that it’s quite possible to have this situation:

public class A extends Object { public void doAThing() {} …}
public class B extends Object { public void doAThing() {} …}

Object myB = new B();
((B)myB).doAThing(); // fine
((A)myB).doAThing(); // ClassCastException at run time – object is not an A!

which seems just silly. We have an object that we are pretty sure implements the operation doAThing – but that’s not good enough. We have to know exactly which interface this particular object implements in order to call that method. Thus, type defines protocol in the statically typed world.
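A runnable version of the same situation, as a small sketch. One assumption to flag: doAThing returns a String here (rather than void) purely so the outcome can be printed, and ProtocolDemo and callAsA are invented names.

```java
// A and B are unrelated classes that happen to implement the same operation.
class A {
    public String doAThing() { return "A did it"; }
}

class B {
    public String doAThing() { return "B did it"; }
}

final class ProtocolDemo {
    // The compiler only lets us call doAThing through a typed reference,
    // so we cast to A first. The cast is checked at run time.
    static String callAsA(Object o) {
        try {
            return ((A) o).doAThing();
        } catch (ClassCastException ex) {
            // The object may well implement doAThing, but it is not an A.
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(callAsA(new A())); // A did it
        System.out.println(callAsA(new B())); // ClassCastException
    }
}
```

B has a perfectly good doAThing, yet the call never reaches it because the reference was routed through the wrong type.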

The problem is that such an arrangement assumes the existence of what Bart Kosko calls crisp sets and hierarchies. The world isn’t nearly so neat. It’s fuzzy. B might well be capable of performing some of A’s operations, and in some (but perhaps not all) circumstances B might be an excellent stand-in for an A. A fuzzy model is the more accurate one. To quote Kosko again: “fuzz up – precision up”.

OK, so what about the dynamically typed languages? How are they better? Assume we have the same classes A and B derived from Object and each implements doAThing.

| anA |

anA := A new.
anA doAThing.

anA := B new.
anA doAThing.

anA := 'this is a string'.
anA doAThing.

This all works except for the last line when anA refers to a String rather than an instance of A or B. String definitely doesn’t implement doAThing. So what happens?

First, it helps to know that the runtime systems for Smalltalk and Objective C are “message sending” rather than “function calling”. When the programmer tries to send a message to an object that doesn’t respond to that message, the runtime packages the message up as an object and sends a catch-all message instead. In Smalltalk the catch-all is doesNotUnderstand:, which receives the packaged message as its argument.

In Objective C, a special message called forwardInvocation: is sent to give the programmer a chance to pass the message to some other object such as a delegate. The default implementation of forwardInvocation: doesn’t do any forwarding. Instead it just calls another method, doesNotRecognizeSelector:, which raises an exception.

The programmer may choose to respond to these messages in a class specific way – polymorphically, by overriding forwardInvocation: or doesNotUnderstand:

Doing this moves the error handling to a central location rather than forcing the programmer to scatter it throughout the code at the call locations. The end result is cleaner, smaller code.
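For readers who think in Java, the mechanism can be imitated with reflection. The following is only a sketch of the idea, not how either runtime is actually implemented: MessageReceiver, MessageSend, Greeter, and Forwarder are all hypothetical names, and only no-argument messages are handled.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical hook: a receiver implements this to handle unknown messages itself.
interface MessageReceiver {
    Object doesNotUnderstand(String selector);
}

class Greeter {
    public String greet() { return "hello"; }
}

// Implements no domain messages of its own; everything is passed to a delegate.
class Forwarder implements MessageReceiver {
    private final Object delegate;

    Forwarder(Object delegate) { this.delegate = delegate; }

    @Override
    public Object doesNotUnderstand(String selector) {
        return MessageSend.send(delegate, selector);
    }
}

final class MessageSend {
    // Looks the selector up at run time. When the receiver has no such method,
    // the message goes to the receiver's catch-all instead of failing at the call site.
    static Object send(Object receiver, String selector) {
        try {
            Method m = receiver.getClass().getMethod(selector);
            return m.invoke(receiver);
        } catch (NoSuchMethodException e) {
            if (receiver instanceof MessageReceiver) {
                return ((MessageReceiver) receiver).doesNotUnderstand(selector);
            }
            // Default behavior when nobody overrides the catch-all: raise an error.
            throw new UnsupportedOperationException("does not understand " + selector);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(send(new Greeter(), "greet"));                // hello
        System.out.println(send(new Forwarder(new Greeter()), "greet")); // hello
    }
}
```

The callers never test the receiver's type. The decision about what to do with an unexpected message lives in one method per class.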

Plus there’s a bonus. Being able to forward messages provides a clean mechanism for building chain of responsibility patterns and allows an object to be “decorated” with new behaviors dynamically.

It’s also easy to do distributed computing by having the forwardInvocation: method perform remote procedure calls over the network, without the clumsy code generation of proxies and stubs common in C, CORBA, and Java RMI programs. A single proxy class can stand in for any kind of object.

The doesNotUnderstand: message can also provide a trigger for database fetching and object faulting. When a message is sent to a simple database query object that implements almost no messages, doesNotUnderstand: is invoked, the database query is executed, and the object replaced with the results of the fetch. The message is then delivered to the newly fetched object. Such faulting mechanisms can simplify programming and virtually eliminate the need for application programmers to directly interact with a database API.
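The faulting sequence described above can be sketched the same way. Everything here is an assumption made for illustration: Fault and Faulting are invented names, a Supplier stands in for the database query, and only no-argument messages are handled. Java cannot replace an object in place, so in this sketch the placeholder caches the fetched object instead.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.function.Supplier;

// Placeholder that knows how to fetch the real object but implements
// no public domain messages of its own.
final class Fault {
    private final Supplier<Object> fetch; // stands in for the database query
    private Object resolved;
    int fetchCount = 0;                   // exposed only so the demo can observe it

    Fault(Supplier<Object> fetch) { this.fetch = fetch; }

    // Runs the query the first time and reuses the result afterwards.
    Object resolve() {
        if (resolved == null) {
            resolved = fetch.get();
            fetchCount++;
        }
        return resolved;
    }
}

final class Faulting {
    static Object send(Object receiver, String selector) {
        try {
            Method m = receiver.getClass().getMethod(selector);
            return m.invoke(receiver);
        } catch (NoSuchMethodException e) {
            if (receiver instanceof Fault) {
                // The "does not understand" path: fetch, then redeliver the message.
                return send(((Fault) receiver).resolve(), selector);
            }
            throw new UnsupportedOperationException("does not understand " + selector);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Fault row = new Fault(() -> "fetched row");
        System.out.println(row.fetchCount);      // 0, nothing fetched yet
        System.out.println(send(row, "length")); // 11
        System.out.println(row.fetchCount);      // 1
    }
}
```

The caller just sends length and gets an answer. Whether a fetch happened along the way is invisible to it.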

These extra capabilities are nearly impossible to implement in the statically typed environments, and this is clearly a case where the dynamically typed environment yields simpler application code (we don’t have all those try/catch blocks or instanceof tests). Simpler application code means greater reliability with reduced programmer effort. This all translates to faster development times and lower costs.