Archive for August, 2006

BadPage.info moved to BadPage.net

Tuesday, August 29th, 2006

Spammers have been hammering BadPage.info (also hosted on this machine) with click-fraud traffic in hopes that the proxy will forward it. It did for a few days, but I’ve locked it down and it won’t anymore.

That didn’t stop the spammers from continuing to pound the machine, so I instituted dynamic countermeasures. This helped for a while, but now the machine seems to be spending all of its time figuring out which traffic to drop. Bandwidth dropped to about 20k/s. Enough is enough. I parked the domain and moved the app to BadPage.net. Let NetworkSolutions handle the traffic for a while.

Closures for Java?

Thursday, August 24th, 2006

Gilad Bracha of Strongtalk fame has a post talking about the proposed introduction of closures to Java.

I consider Java to be so broken at this point that I view the proposal as “lipstick for a pig”. I agree with Bracha’s position that it’s unfortunate closures were delayed to the point that all the common idioms that might have been built on them have been established using less elegant techniques. In other words, I think it’s too late.
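
For anyone who hasn’t lived with it, here is a minimal sketch of one such idiom: sorting with an anonymous inner class where a closure would otherwise do. The class name ByLength and the sample words are mine, made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ByLength {
    // Return a copy of the words sorted by length, shortest first.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        // The stand-in for a closure: a whole anonymous class to carry one expression.
        Collections.sort(copy, new Comparator<String>() {
            public int compare(String a, String b) {
                return a.length() - b.length();
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        // prints [pig, Java, closure]
        System.out.println(sortByLength(Arrays.asList("closure", "pig", "Java")));
    }
}
```

In Smalltalk the comparison is just a block handed to the collection, something like: words asSortedCollection: [:a :b | a size <= b size].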

It’s a chain of constraints produced by a couple of root decisions. First was the insistence on mandatory manifest typing. The Strongtalk system and the Objective-C language both support optional manifest typing. Objective-C does it to provide compile-time warnings that help programmers catch mistakes. Strongtalk actually uses the type information to call type-optimized versions of operations. This technique is part of the foundation of the HotSpot JVM.

The second design problem is Java’s choice of C style function calling semantics vs Smalltalk’s message sending. Message sending is the more flexible and uniform technique. You have an entity at some distance, you send it a message asking it to do something. If it knows what you are asking, it does it and replies with your result. If it doesn’t know what to do, some default action occurs. This is how network available services perform and is also how Smalltalk objects interact.

Java, C++, and C take the position that the compiler has assured that the receiver is of a type that MUST understand the message you are sending. They do this by constraining the messages you are allowed to send at compile time. This is meant to prevent errors, but it also needlessly limits program flexibility and forces the developer to deal with a second interaction model: one for network resources and another for local ones. The Smalltalker has only one interaction model to contend with.

This is important and profound. In fact, Alan Kay, the man who coined the term “Object Oriented”, laments the choice of words, saying: “Object-oriented programming is about messages, not the objects. We worry about the objects, but it’s the messages that matter.”

But Java is function-call oriented. Consequently, the closures end up looking like local function declarations – which means they are awkward and ugly to declare and use inline and they are littered with extraneous type information. And they will most likely not catch on.

Java is what it is. Like a mutant frog it is trying to make it to land, asymptotically approaching Smalltalk, a language first released in 1980. And yet it’s clear that it can’t get there, and I suspect it will die trying. At least, I hope it will.

Function Calling vs Message Sending

Thursday, August 24th, 2006

“What happens if your reference refers to a different type of object than you expect?”

It’s probably a programming error.

Or not.

In C++, because of the way vtable dispatching works, the program will either crash (hopefully) or the wrong member function will be called. Either way – the end result is probably fatal.

Java is slightly better. In order to call a member function, the Java compiler checks to make sure that the function is defined by the declared type of the object reference (its class or one of the interfaces it implements). The problem arises when the object reference has been widened to a more general type that does not define the desired member function.

How can this happen?

public class A extends Object { public void anAThing() {} … }

A myA = new A(); // make an A
someList.add(myA); // put it in a list
…
someList.get(0).anAThing(); // error – List.get(int) returns Object.

So we have to downcast the Object reference returned by someList.get() to an A reference:

((A)someList.get(0)).anAThing(); // Might be OK if someList has an A

Casting is completely type unsafe, although in Java the result of an incorrect cast is at least an exception. So we could write:

try { ((A)someList.get(0)).anAThing(); }
catch(ClassCastException ex) { /* not an A: handle it here and carry on */ }

which allows us to handle the casting error and continue. This is a huge step up from the C++ behavior of crashing in that it allows the programmer some control over what to do if the cast is wrong. If we desire a more conventional means of writing this, it is possible to use the instanceof operator and do the check before the cast.

if(someList.get(0) instanceof A) ((A)someList.get(0)).anAThing();

which is pretty much the same thing. Of course, this forces programmers to clutter their code with all sorts of tests or try/catch blocks. It seems to me that getting away from that sort of thing was precisely the reason everybody wanted to switch to object oriented programming in the first place. Polymorphism replaced an awful lot of if ladders and switch statements and here we are putting them back in to work around a runtime system that throws an exception if we guess incorrectly about the type of an object.
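
Pulled together into one compilable file, the two workarounds look something like this. It is only a sketch: CastDemo, the nested A, and the helper method names are made up for illustration, and anAThing() returns a String here just so there is something to print.

```java
import java.util.ArrayList;
import java.util.List;

class CastDemo {
    public static class A {
        public String anAThing() { return "A did its thing"; }
    }

    // Workaround 1: cast first, catch the exception if we guessed wrong.
    static String tryCast(List<Object> someList) {
        try {
            return ((A) someList.get(0)).anAThing();
        } catch (ClassCastException ex) {
            return "not an A";
        }
    }

    // Workaround 2: test with instanceof before casting.
    static String checkFirst(List<Object> someList) {
        Object first = someList.get(0);
        if (first instanceof A) {
            return ((A) first).anAThing();
        }
        return "not an A";
    }

    public static void main(String[] args) {
        List<Object> good = new ArrayList<Object>();
        good.add(new A());
        List<Object> bad = new ArrayList<Object>();
        bad.add("this is a string");

        System.out.println(tryCast(good));    // A did its thing
        System.out.println(checkFirst(good)); // A did its thing
        System.out.println(tryCast(bad));     // not an A
        System.out.println(checkFirst(bad));  // not an A
    }
}
```

Either way, every call site that touches the list has to carry its own copy of the "not an A" branch.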

Not to mention the idea that it’s quite possible to have this situation:

public class A extends Object { public void doAThing() {} … }
public class B extends Object { public void doAThing() {} … }

Object myB = new B();
((B)myB).doAThing(); // fine
((A)myB).doAThing(); // error – object is not an A!

which seems just silly. We have an object that we are pretty sure implements the operation doAThing – but that’s not good enough. We have to know exactly which interface this particular object implements in order to call that method. Thus, type defines protocol in the statically typed world.

The problem is that such an arrangement assumes the existence of what Bart Kosko calls crisp sets and hierarchies. The world isn’t nearly so neat. It’s fuzzy. B might well be capable of performing some of A’s operations and in some (but perhaps not all) circumstances, B might be an excellent stand-in for an A. It’s a more accurate model. To quote Kosko again: “fuzz up – precision up”.

OK, so what about the dynamically typed languages? How are they better? Assume we have the same classes A and B derived from Object and each implements doAThing.

| anA |

anA := A new.
anA doAThing.

anA := B new.
anA doAThing.

anA := 'this is a string'.
anA doAThing.

This all works except for the last line when anA refers to a String rather than an instance of A or B. String definitely doesn’t implement doAThing. So what happens?

First, it helps to know that the runtime systems for Smalltalk and Objective-C are “message sending” rather than “function calling”. When the programmer tries to send a message to an object that doesn’t respond to that message, the runtime packages the message up as an object and sends a catch-all message instead. In Smalltalk this catch-all is the doesNotUnderstand: message.

In Objective-C, a special message called forwardInvocation: is sent to give the programmer a chance to pass the message on to some other object such as a delegate. The default implementation of forwardInvocation: doesn’t do any forwarding. Instead it just calls another method, doesNotRecognizeSelector:, which raises an exception.

The programmer may choose to respond to these messages in a class specific way – polymorphically, by overriding forwardInvocation: or doesNotUnderstand:

Doing this moves the error handling to a central location rather than forcing the programmer to scatter it throughout the code at the call locations. The end result is cleaner, smaller code.
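
To make the idea concrete for Java readers, here is a rough imitation built on reflection. It is not how Smalltalk or Objective-C implement dispatch; send and doesNotUnderstand are hypothetical helpers I made up, and only no-argument messages are handled.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class MessageSend {
    public static class A { public String doAThing() { return "A did it"; } }
    public static class B { public String doAThing() { return "B did it"; } }

    // Hypothetical hook: the single, central place where unknown messages land.
    static Object doesNotUnderstand(Object receiver, String selector) {
        return receiver.getClass().getSimpleName() + " does not understand " + selector;
    }

    // "Send" a no-argument message by name instead of calling a function.
    static Object send(Object receiver, String selector) {
        try {
            Method m = receiver.getClass().getMethod(selector);
            return m.invoke(receiver);
        } catch (NoSuchMethodException e) {
            // The receiver has no such public method: fall back to the catch-all.
            return doesNotUnderstand(receiver, selector);
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        } catch (InvocationTargetException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(send(new A(), "doAThing"));              // A did it
        System.out.println(send(new B(), "doAThing"));              // B did it
        System.out.println(send("this is a string", "doAThing"));   // String does not understand doAThing
    }
}
```

A and B share no interface here, yet both answer the message, and the String case is handled once rather than at every call site.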

Plus there’s a bonus. Being able to forward messages provides a clean mechanism for building chain-of-responsibility patterns and allows an object to be “decorated” with new behaviors dynamically.

It’s also easy to do distributed computing by having the forwardInvocation: method perform remote procedure calls over the network without the need to do clumsy code generation of proxies and stubs common in C, CORBA, and Java RMI programs. A single proxy class can stand in for any kind of object.
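
The nearest thing Java offers is java.lang.reflect.Proxy, and it only stands in for declared interfaces, not arbitrary objects. The sketch below uses it to record and forward every call; Greeter, English, Forwarder, and wrap are names I invented for the example, and the forwarding is local where a real proxy would go over the wire.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ForwardingDemo {
    // Hypothetical service interface; a Java proxy needs one to stand in for.
    public interface Greeter { String greet(String name); }

    public static class English implements Greeter {
        public String greet(String name) { return "hello, " + name; }
    }

    // One handler class for every interface: it logs each message, then forwards it.
    static class Forwarder implements InvocationHandler {
        final Object target;
        final List<String> log = new ArrayList<String>();

        Forwarder(Object target) { this.target = target; }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            log.add(method.getName());
            try {
                return method.invoke(target, args); // a remote call could go here instead
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        }
    }

    // Build a proxy that implements iface and routes every call through the handler.
    static <T> T wrap(Class<T> iface, Forwarder handler) {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler));
    }

    public static void main(String[] args) {
        Forwarder handler = new Forwarder(new English());
        Greeter greeter = wrap(Greeter.class, handler);
        System.out.println(greeter.greet("Alan")); // hello, Alan
        System.out.println(handler.log);           // [greet]
    }
}
```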

The doesNotUnderstand: message can also provide a trigger for database fetching and object faulting. When a message is sent to a simple database query object that implements almost no messages, doesNotUnderstand: is invoked, the database query is executed, and the object replaced with the results of the fetch. The message is then delivered to the newly fetched object. Such faulting mechanisms can simplify programming and virtually eliminate the need for application programmers to directly interact with a database API.
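
A loose Java approximation of faulting, again leaning on java.lang.reflect.Proxy, might look like this. Everything in it is hypothetical: Customer, LoadedCustomer, fetch, and fault are illustration names, and fetch only pretends to run a query. Unlike the Smalltalk version, the stand-in cannot replace itself with the fetched object, so it stays in the middle and delegates.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class FaultDemo {
    // Hypothetical business interface; no real database is involved.
    public interface Customer { String name(); }

    static class LoadedCustomer implements Customer {
        private final String name;
        LoadedCustomer(String name) { this.name = name; }
        public String name() { return name; }
    }

    static int fetchCount = 0; // counts simulated database round trips

    // Stand-in for executing the query; a real one would hit the database here.
    static Customer fetch(int id) {
        fetchCount++;
        return new LoadedCustomer("customer-" + id);
    }

    // The fault: nothing is loaded until the first message arrives.
    static Customer fault(final int id) {
        InvocationHandler handler = new InvocationHandler() {
            private Customer real; // null until first use

            public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
                if (real == null) {
                    real = fetch(id);
                }
                try {
                    return m.invoke(real, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause();
                }
            }
        };
        return (Customer) Proxy.newProxyInstance(
                Customer.class.getClassLoader(), new Class<?>[] { Customer.class }, handler);
    }

    public static void main(String[] args) {
        Customer c = fault(42);
        System.out.println(fetchCount); // 0, nothing fetched yet
        System.out.println(c.name());   // customer-42
        System.out.println(fetchCount); // 1
    }
}
```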

These extra capabilities are nearly impossible to implement in the statically typed environments and this is clearly a case when the dynamically typed environment yields simpler application code (we don’t have all those try/catch blocks or instanceof tests). Simpler application code means greater reliability with reduced programmer effort. This all translates to faster development times and lower costs.

This server is under attack

Wednesday, August 23rd, 2006

And has been for the last week. I was using mod_proxy to proxy requests to BadPage.info and, due to the lack of sensible documentation (which I still don’t quite understand; a few examples would be useful, Apache Foundation), my web server was turned into a spam forwarding machine.

I’ve disabled proxies (and thus BadPage.info is off the air just now). I’m still getting hammered by indousa.us, which is sending malformed HTTP requests in hopes of them turning into spam. Since I’ve set up explicit deny rules for them, I’m hopeful they’ll check their logs soon and get the hint.

If you’re in the mood to test a server loading script, indousa.us seems like a dandy test bed. Give them my regards.

Update: As soon as I deny that domain, a dozen others arise to take its place. Anyone with extensive skill in the use of mod_rewrite and mod_proxy who cares to help, please give me a shout.

Update: Solved it, I think. BadPage.info is back on the air but all other proxy requests are being denied. I am still being hammered though.

Marooned on Gilligan’s Island

Thursday, August 17th, 2006

Just sit right back and you’ll hear a tale….

Gilligan’s Island ran for 3 years and nearly every episode was roughly the same. Some discovery (read “new technology”) provides the castaways with the seed of a new approach towards getting rescued. The approach is tried, there are inevitable problems with the execution (generally introduced by Gilligan), failure ensues, and the plan is forever abandoned in favor of a completely new approach.

In 3 years of regular television, 98 episodes, the castaways worked their butts off on an amazing variety of novel ideas and still they never managed to get anywhere.

If you develop software for a living, this starts to sound familiar.

A more logical approach for the castaways would be to pick a promising approach (building some sort of sea-going vessel perhaps) and refine it until it works. It’s not nearly as entertaining, but it’s much more likely to succeed. Of course, the writers and producers are actually to blame: they introduce the plot devices and plans to keep the audience interested in trying to solve the same problem every week.

Of course, the big name producers in the software industry (Sun, Microsoft, IBM, HP) are doing the same thing to their audience (the software developers and IT managers). The new and novel is preferred over the tried and true because… well, because new things are just so much more entertaining! It’s much more fun to do something new. Even if the old way is predictable and known to work!

Some of the shiny new things we’ve adopted and abandoned: Computer Aided Software Engineering (CASE), Artificial Intelligence (AI), Object Oriented Programming (OO), Virtual Machines, Garbage Collectors, Electronic Data Interchange (EDI), List Processing (LISP), Client-Server, n-tier, O/R Mapping tools, SGML, HTML, Web Applications, XML, .NET, the list goes on and on.

Some of these are languages, some core technologies, and some are architectural approaches. What they have in common is that the hype accompanying their introduction spurred large numbers of developers to drop whatever they were doing just as they began to master it, and run to embrace the next shiny toy. So much for the benefits of experience.

Interestingly, none of these falling stars has been completely abandoned. In fact, many of them are still around. OO programming was the panacea of choice for a while. While the component world hasn’t quite emerged (actually, several competing worlds exist: COM, CORBA, RMI, PDO…), OO programming is now nearly ubiquitous.

AI, also a grossly overhyped technology and now considered a ‘failure’ by the purveyors of ‘conventional wisdom’ – is still making contributions towards the development of intelligent applications in the areas of manufacturing control, loan application evaluation, and consumer fraud detection. Certainly it didn’t quite live up to its hyped image, but the techniques developed to enable automated reasoning are extremely valuable where they are applicable.

Programming languages are another frequently used plot device. This used to be less of an issue. Some time before C++ started to proliferate, the developers of most operating systems defined a standard binary object file format that all compilers would generate. It was common to use FORTRAN math libraries from C, Pascal, or COBOL programs. You could mix or match to your heart’s content. The object formats were the same so the linker didn’t care.

C++, in a misguided effort to make use of existing linker technology, introduced a technique called name mangling and essentially broke the guarantee that object files were interchangeable regardless of which compiler was used. C++ code compiled with different compilers couldn’t even interoperate.

The end result was that projects now had to select a single language for the entire project and stay with it. The cost of calling out to code written in other languages went up. Thus began the porting wars.

Companies now spend money to port significant portions of their existing software assets to the current darling in the language world. Even conservative aerospace companies have risked introducing new defects by porting reliable and fast math libraries written in FORTRAN to C++ just because “everything is going towards C++”.

Suckers.

They did it again with Java for the same reasons. I expect they’ll do it yet again with C# and .NET.

The real fact of the matter is that there has been no real progress in the software industry in something like 10 years. In fact we’ve lost some interoperability capabilities among programming languages.

As a side effect of changing the rules every year, experience is marginalized and hiring low-cost, unskilled programming labor begins to look attractive. After all, even the old timers are going to have to learn everything all over. Thus, the endless march of new stuff for the sake of being new stuff nullifies the value of experience and hinders the development of masters in the field.

Of course, it also means that we are spending too much for each piece of functionality because everything is being done by brute force. Some of the truly elegant and productivity enhancing programming techniques have fallen into disuse because they are too mind expanding for novice developers to grasp quickly, and the top developers who can grasp them are too busy running in place to impart the advanced knowledge.

The only people benefiting from all this ‘new’ technology every year are the producers. It’s all about trying to corral developers into their particular play pen and lock them in. Sun and IBM have been doing most of the rounding up lately. Microsoft is kicking off their own cattle drive with their .NET initiative.

The only way off the island is to identify a technology that works for you that you can control. This probably means open source development tools. Build up your expertise with this technology, continue to refine your approach, and stick with it. You’ll be sailing home before you know it.

Two Steps Forwards One Step Back

Thursday, August 17th, 2006

The first commercial software I ever worked on produced reports from data files. I had to open the file, parse its contents, write a new file, and then close all the files. Something like 80 percent of the program was just I/O processing.

The first database system I worked on used indexed files. It was more hierarchical than relational, and unexpected system failures would regularly have you rebuilding your tables and indexes for hours on end to recover from partial writes. The data storage routines were in a library and were pulled in by the linker. There was no concept of locking because you knew you were the only user of those files.

It was data processing with stone knives and bear skins and we liked it fine. OK, actually it sucked but at least we didn’t have to open and close the files or parse their formats ourselves. Application development went a little faster because we could focus less on file I/O and more on code that called the data storage routines and processing steps.

That was a step forward.

Later on we got Oracle and we experimented with embedded SQL. That’s where you write bits of SQL right into your source code and a preprocessor turns the SQL into C code and function calls. It had the effect of moving the programmer one level away from the data processing, and you could specify things in terms of set operations, which let you get away from writing loops. One more set of bugs eliminated.

Of course, we now had to know three programming languages: C, SQL, and the weird embedded C/SQL/Pro*C dialect where the two collided. Code was structured into function libraries based on business functions. A library might contain the implementation of a number of important “edits”. Basically code that checked data to make sure it obeyed certain logical constraints. Since the libraries were reusable, so were the edits and thus business logic was mostly centralized and relatively easy to maintain. The libraries were then organized into “systems”. Billing System, Order Entry System, Payment Processing System.

Step forwards.

Object Oriented Programming arose as a means of refactoring code and it gained widespread acceptance primarily because it made the newfangled graphical user interfaces so much easier to write.

Unfortunately, OO also had the effect of turning the data processing world on its ear. Numerous articles from a number of OO “Methodologists” appeared bemoaning the so-called Object-Relational “Impedance Mismatch”. The idea being that, because the two processing and representation models were so different, the translation between the models became difficult and somewhat computationally expensive.

Several brave companies jumped into the breach with products designed to map database tables directly into user interface elements. Thus, client server was born. It was a huge mistake. Client-server was a two-tier solution. You had your data storage tier which was your relational database, and then there was the user interface tier. Client server encouraged user interface programmers to build the data edits and business logic right into the user interface. Thus, the same business rule would be embodied over and over again in various screens and other user interface elements.

Step backwards.

Meanwhile, the people using Smalltalk had come up with the Model View Controller (MVC) Architecture. The model represented the business logic and entities that were common to the business. It once again centralized the edits and business rules into a single code base that could be reused across a number of applications. The views represented the user interface elements and the controller mapped the data between the model and view and provided application specific actions.
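
A bare-bones sketch of that division of labor, written in Java rather than Smalltalk, follows. The Order model, its quantity rule, and the two views are invented for illustration; the point is only that the edit lives in one place no matter how many screens sit on top of it.

```java
class MvcSketch {
    // Model: owns the data and the one copy of the business rule (the "edit").
    static class Order {
        private int quantity;

        void setQuantity(int quantity) {
            if (quantity <= 0) {
                throw new IllegalArgumentException("quantity must be positive");
            }
            this.quantity = quantity;
        }

        int getQuantity() { return quantity; }
    }

    // View: only knows how to present a model.
    interface OrderView { String render(Order order); }

    static class ScreenView implements OrderView {
        public String render(Order order) { return "Qty: " + order.getQuantity(); }
    }

    static class ReportView implements OrderView {
        public String render(Order order) { return "quantity," + order.getQuantity(); }
    }

    // Controller: turns user input into a model update, then asks the view to redraw.
    static class OrderController {
        private final Order model;
        private final OrderView view;

        OrderController(Order model, OrderView view) {
            this.model = model;
            this.view = view;
        }

        String userTyped(String text) {
            try {
                model.setQuantity(Integer.parseInt(text.trim()));
            } catch (IllegalArgumentException e) { // also covers NumberFormatException
                return "Rejected: " + text;
            }
            return view.render(model);
        }
    }

    public static void main(String[] args) {
        Order order = new Order();
        OrderController screen = new OrderController(order, new ScreenView());
        OrderController report = new OrderController(order, new ReportView());
        System.out.println(screen.userTyped("3"));  // Qty: 3
        System.out.println(report.userTyped("-1")); // Rejected: -1
        System.out.println(report.userTyped("5"));  // quantity,5
    }
}
```

Neither view contains the "must be positive" check, so neither can get it wrong.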

Step forwards.

Mapping the model to the database was still a problem though. The most enlightened developers realized that it was possible to apply MVC principles with the relational database being the model and the business model taking the role of the view. The controller was known as an Object To Relational (O-R) Mapping layer. This specific mapping was relatively difficult to do and only a few companies produced really good mappers. Several others produced some rather bad ones. The good ones were exemplified by Persistence, TopLink, and NeXT’s Enterprise Objects Framework (EOF). They allowed a developer to specify an object model, an entity relationship model, and mapping between them without writing any special code. Business logic was associated with business objects, database consistency edits were stored in the database, and application developers only needed to write views and application controllers on top of the model to provide new business applications.

Step forwards.

Since every computer platform had a different user interface programming API, every application had to be rewritten once for each platform. This made it difficult to mix Macintoshes, Wintel machines, and Unix/X Windows machines in an enterprise. There was also the difficulty in doing remote administration of thousands of clients every time a new version of an application was completed.

Enter the world wide web.

Because web browsers behave mostly the same on all platforms, it was possible to move all the application processing to a server and serve up views rendered in HTML. This had the advantage of eliminating the client configuration and distribution issues at a cost of a loss of interactivity. Web applications could only be forms based. Things like collaborative drawing tools were not good candidates for web applications but things like order entry and online banking were mostly forms based and a good fit.

Both the two tier and three tier approaches were taken to the web. The two tier had all the same problems it always had with scattered business logic located in the user interface layer. The three tier approach simply created a hierarchy of view objects that would “draw” themselves by emitting HTML. It was better but it was no longer possible for the application to be very interactive because of the nature of HTTP request response processing. So applications got dumber and slower.

Step sideways.

The two tier people caught hold of XML and XSL which appear to address some of HTML’s shortcomings. They began to look at web requests as document processing rather than events for a user interface. Except that databases aren’t really documents. So to get the database to participate in the XML oriented applications, new mapping models to go from relational databases to XML documents were created. Of course, unlike objects, XML documents are essentially inanimate data lacking behavior. The behavior is supposed to be provided by applying style sheets and transformations to the XML to produce new XML.

Giant step backwards.

Mapping relationally structured data into intelligent objects eliminates work and ultimately bugs because the objects enforce their own constraints and business rules. Mapping relationally structured data into hierarchically structured data has limited value other than as a streaming format. The data remains independent of the rules that transform the data and maintain its integrity.

There are lots of other inappropriate applications of XML. I’ll discuss more of them later.

Blog Migration

Thursday, August 17th, 2006

Once upon a time I had another blog.  I hated the software that ran it and eventually stopped using it.  I have a few good posts on programming there though, so I’ll be bringing them over here as new posts.

Wikis – The new generation.

Wednesday, August 16th, 2006

Scoble is investigating collaborative tools – primarily chat and wiki tools. Seaside and Squeak are powering some really cool new capabilities. Like LogoWiki, a wiki that allows people to embed executable Logo programs. Useful for making educational sites about geometry and introductory programming. LogoWiki is built upon Pier, a wiki so rich in features that it approaches the level of a content management system. Given that the wiki was invented by a Smalltalker, this seems like a return to wiki’s roots.

Vonage is a Rip-Off

Tuesday, August 8th, 2006

And I sincerely hope they go out of business soon.

I decided to give them a shot when I canceled my Qwest DSL for unreliability and switched to Comcast (which has been SOLID). They promised that it would be easy to transfer my phone number, and they sent me some paperwork and a catalog of interface boxes/routers. I selected one made by Linksys. It arrived, I set it up, and I got a dialtone.

Two days later we had a wind storm and power outage and the Linksys completely died. Bad experience. I conclude that having my phone service dependent on electricity is a bad idea and set out to cancel the phone number transfer.

This is where I learn that Vonage is the roach motel of phone companies. They have sales people working around the clock, but intentionally don’t put their customer service extension in the menu on their phone system. This is dishonest and I decide not to do further business with them.

Unfortunately, they won’t or can’t cancel the number transfer. I contact Qwest to see if they can stop it and at first they think so, but eventually they say no, but let it go through and in a week they can request it back. So we do. This leaves me without the ability to receive calls for over 3 weeks. I tell the reps at Vonage that I want my number transfer cancelled and plan to cancel service – they warn me not to cancel until the number safely returns to Qwest. Qwest gives me the same advice.

It takes 20 days for a number to transfer – so this entire fiasco is over 40 days long. Cleverly, Vonage provides a 30 day money back guarantee. Criminal.

I learn this last fact when phoning to cancel my Vonage “service” (which never worked) and am told that 1) I’m over the 30 days so I have to pay $30 in cancellation fees and 2) I am charged an additional $50 for hardware that I bought up front for $89 in the first place and that failed within days of power up and 3) they won’t even replace the dead hardware (although the rather smug-bitchy rep in cancellations is pleased to tell me that “if you had stayed we’d replace it but since you’re cancelling you’re out of luck”).

There’s no option to simply not pay the bill – Vonage gets your credit card number up front. However, I have disputed all charges from them with my bank and returned the defective equipment so I’m hopeful to escape with my cash intact (while sticking them with chargeback costs as well).

So much for customer service. Avoid Vonage at all costs!