The software crisis
I recently got a chance to read Scott Rosenberg's "Dreaming in Code," a story about the ups and downs (mostly downs in this case) of the software development process of Chandler at OSAF. The book is a great introduction to the challenges of software engineering, especially for outsiders, though it is nowhere near as good as Tracy Kidder's "The Soul of a New Machine," despite the fact that most reviewers seem intent on comparing the two.
One of the central themes that Rosenberg tackles is that of the "software crisis," or simply put, software engineering's inability to scale the production process in the same way that other engineering disciplines have: most notably hardware engineering, but also mechanical engineering, chemical engineering, etc. In all of these other fields "scale" refers both to being able to apply more manpower and get linear or super-linear boosts in productivity, and to producing less error-prone and more predictable output.
This question is of course near and dear to any software person who has ever missed a schedule or over-scoped a feature, so naturally I was drawn to the bibliography, where I found an incredibly interesting, albeit dated, book by Brad Cox (the creator of Objective-C) called "Superdistribution" on exactly this topic of scaling software engineering. The book was published in 1996 and is now out of print, but I would highly recommend it to anyone interested in software engineering as a fledgling discipline.
The core of Cox's argument is that the notion of "Software ICs" (software integrated circuits, i.e. reusable components analogous to chips in hardware) has failed to take off not for technical reasons (as he learned after developing Objective-C for just this purpose) but for economic ones. In essence, he argues that there is no economic advantage to the designer/developer in building a software component that can easily be re-used (through good interfaces, cross-language bindings, good documentation, etc.), because the licensed-library model does not provide adequate compensation to create a thriving ecosystem, and because other models (such as per-unit royalties) have not been popular in the software industry for various reasons. Hence capable programming teams are much more motivated to re-invent the wheel in the hopes of building the next MS Office, and are less likely to make a good living at some lower layer (for instance, the word-wrap object, the pagination object, etc.) in the way that their hardware counterparts have been able to.
His solution turns on a clever attribute of software but is nonetheless completely impractical. Software has no control over its being copied, Cox argues, but it has very tight control over its being executed, and as such, it could establish a payment mechanism based on "useright" rather than "copyright," metering its own use and charging accordingly. This seems counter to the grain of personal computing to me, and in fact the only examples I have ever seen of this model have been with really expensive software ($100K+) where the vendor leaves no other alternative. And of course this is to say nothing of the administrative nightmare that a system like this would be. Current DRM systems would look like "Hello World" by comparison.
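To make the "useright" idea a bit more concrete, here is a minimal sketch of what a self-metering component might look like. This is my own toy illustration, not anything taken from Cox's book: the UsageMeter class, the "wordwrap-co" vendor name, and the two-cents-per-call price are all made up, and a real system would also need tamper resistance, billing, and a way to settle up between vendors.

```python
from collections import defaultdict
from functools import wraps


class UsageMeter:
    """Hypothetical ledger: counts calls and accrues charges per vendor."""

    def __init__(self):
        self.calls = defaultdict(int)       # vendor -> number of invocations
        self.owed_cents = defaultdict(int)  # vendor -> amount owed, in cents

    def metered(self, vendor, cents_per_call):
        """Decorator that charges `cents_per_call` each time the function runs."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                # Copying this code costs nothing; executing it is what is billed.
                self.calls[vendor] += 1
                self.owed_cents[vendor] += cents_per_call
                return func(*args, **kwargs)
            return wrapper
        return decorator


meter = UsageMeter()


@meter.metered(vendor="wordwrap-co", cents_per_call=2)  # assumed vendor and price
def word_wrap(text, width):
    """Toy stand-in for a low-level 'word-wrap object' sold by a component vendor."""
    lines, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    print(word_wrap("the quick brown fox", 10))
    print(dict(meter.calls), dict(meter.owed_cents))
```

Even in this toy version you can see where the administrative pain would come from: every call site of every component ends up writing to a ledger that somebody has to trust, audit, and reconcile.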
So I don't like his solution, but I do agree with his conclusion. To scale we do need more modularization, and the reasons we are not there yet are most likely economic and social at this point, not that we're still looking for the one runtime to rule them all.
Incidentally, it is interesting to see how, in the intervening decade, two trends have helped combat the coming software crisis that Cox predicts in his book. The first, the mainstreaming of open source, has created an alternative economic model for encouraging software re-use, though I suspect Cox would argue that this alternative economic model is not robust enough to survive or scale to what he is imagining. Still, you'd be a moron to try to write an RDBMS or a webserver from scratch at this point, and this productivity boost comes directly as a result of the open source production process. The second trend, which seems much more in line with what Cox argues for, is that of metered web services. Amazon S3, EC2, and their brethren are great examples of a pay-per-use model at work, and the fact that the bits are not being pushed down to the client seems to be a great way to solve the language/runtime interop problems that he discusses in the book.
Economic arguments aside, there is a ton of good stuff in Superdistribution (if you can get past its datedness and its wonky illustrations). For instance, Cox covers how time-sharing systems have left us with a legacy of independent processes, each with its own memory address space, which are thus difficult to integrate in a component-based way, while the futuristic systems of the 1970s (Smalltalk in particular) tried to eliminate the distinction between the machine's runtime and the processes running on it, in part to solve this very problem. This was sort of mind-bending to me, as I've always seen process isolation as one of UNIX's greatest features (at least when compared to DOS, Windows, or pre-X Mac OS). He also makes a wonderful attempt at building a vocabulary for layers in software and often uses rich analogies from other engineering disciplines to make his arguments.
But best of all, it is clear that Cox is an in-the-trenches engineer who has deep knowledge of languages, platforms, and building big systems. Thus the book avoids that "architecture astronaut" feel that you sometimes get from this kind of effort (sorry, but the much beloved Go4 book still feels like that to me). If you are interested in this sort of thing, it's well worth reading.


Hi, I'm Antonio, living in Boston and working this whole net thing out...
