The discovery that Web developers need to think in cosmic architectural terms for even relatively small projects comes as something of a shock. Then you discover that the tools for managing architectural complexity, even at the conceptual level, are primitive to nonexistent. Where does that tornado known as the Internet leave us? Definitely not in Kansas.
I love to program on instinct. You probably know the feeling. You know in your head how a program should run. You see in your mind's eye the screen and how it should look. With such a vision, you can sit down at the computer, flex your fingers, stroke the mouse, and start coding. A few hours later, punctuated by a pizza break, you've got a nicely polished piece that runs smoothly.
Obviously this is a dream or a flashback to the programming of my youth. It's been years since I've written a complete piece of software, much less a full-scale application that could be programmed in such a continuous, all-in-the-head fashion. Short pieces (functions, methods, a few classes), sure, these can be programmed in one sitting and with a clear goal in mind. But real applications are usually far too complex for a programmer to retain details from beginning to end. Also, applications usually take so long to develop that you can hardly remember what you did when the project started.
I'm not lamenting. However, I do recognize that there is sometimes a barrier to good programming created by the gap between what my mind wants (nice, comprehensible chunks of code) and what the applications require (sophisticated integration of very complex components). I know the complexity is there, but I'm not sure I want to deal with it, and of course, deadlines sometimes make dealing with complexity almost impossible.
When I started developing for the Web, I realized that for all but the most trivial applets or applications, there was a lot more to take into consideration than there was for the client/server programming I had been doing. On top of all the usual concerns (network performance, security, help systems, error trapping, multiuser data access, and data integrity), the Web introduced some concerns of its own: browser limitations, Web server performance, site management, HTML integration with CGI and applets, data connectivity, and statelessness. Above all, the presence of the Web server meant that, by definition, Web development would take place on at least three tiers (client, Web server, and database server) and usually three discrete pieces of hardware. This was inherently more complex than the simple two-tier client/server applications I was accustomed to writing.
I also realized that middleware took on an expanded role in Web applications at a much smaller scale than in typical client/server development. I've always associated application servers and transaction processing monitors with large, enterprise-class applications. This is also true for the Web, but you may need some of these middleware pieces for much smaller applications. This generally increases application complexity in a hurry. On the positive side, there are more opportunities on the Web to shift code between client, middleware servers, and database server to balance processing load, improve multiuser control, and guard data integrity. Of course, these options create a complexity of their own.
In short, it didn't take very many projects for me to understand that for Web application development, doing almost anything by the seat of the pants was equivalent to skiing off an unfamiliar cliff. I've always believed that analysis and design were parts of almost every application and that a professional programmer should create a complete application development framework to accelerate programming, but on the Web a new area of attention seems to be in order: architecture.
How many times have you heard the word architecture slung around like it explained something? It has lots of connotations. I associate it with big projects, comprehensive designs, and major elements crafted into a whole framework with the implication that a rational, organizing mind is at work. Snickering is appropriate. It's very easy to be cynical about "architectures" in software. They are so often surrounded by marketing hype that their track record (they seldom last more than a year or two) gives the whole concept something of the air of a dirigible, a big gassy balloon that's liable to blow up.
But then there is the reality of the Web. I've got an example fulminating in my mind as I write this: a project that started out as a client/server application but has taken a different turn as geographical distribution became a factor. The original project was to create a laboratory information management system (LIMS) for a government regulatory lab that recorded incoming samples of various kinds, assigned and tracked testing, and reported results to clients. All this took place in one building and with a single database.
Now, however, I've been asked to extend the information gathered by the LIMS to and from roughly 30 field inspectors and their headquarters. These people are dispersed throughout a state, spend most of their day on the road, and thus have no permanent connection to any kind of WAN or LAN. The most obvious way to get data to and from them was to have them dial into a central phone number and connect to the state WAN. Unfortunately, related experiences showed this to have important drawbacks, namely the expensive long-distance phone calls and a continuing problem with unreliable connections. These problems also meant that scaling from 30 to perhaps 400 inspectors (which was part of the plan) probably would be extremely difficult and expensive.
Of course, this being the age of the Internet, the majority of the field inspectors were already connecting to the Web, so the thinking was, why not use the Web as the pipeline to and from the field? Inspectors could use any ISP of their choice, preferably local, and there would be a backup ISP or state connection if for some reason their local providers were unavailable.
At first glance, this seemed like a simple matter of operating a Web site outside the firewall that would shuffle data for incoming samples and outgoing test results to and from the field and lab: a basic three-tier architecture. At second through fifth glance, it got more complicated. We had to add a routing of some information to the client's office, which also had to participate in the data traffic to the field. We needed to add messaging (via email) that would synchronize with the data flow. A new centralized data warehouse was being implemented by the state, and we had to channel some data through that. There were more complications, but it doesn't take much imagination to understand that we had data being pulled in more directions than taffy. At the very least, we needed an application server and perhaps a transaction monitor to track and manage data transmissions as well as to handle at least some of the business rules that guided who got what and when.
At some point, it occurred to us that it would be helpful to organize the growing number of distribution elements so that we could explain them to the people who would pay for them, and so that we could have some confidence that we were going about the project in a workable (if not perfect) way. Because most of the software already in place was a Microsoft Windows something or other, we decided to take a good look at what Microsoft offered to structure a project like this. Oh my. It so happened that Microsoft had just announced (in September 1997) its Distributed interNet Applications Architecture (DNA). Although we were like babes before the gaping maw of a deep dark cave, we decided to wander in and see how far we could get.
DNA is Microsoft's framework for describing the various elements of network applications (all networks: Internet, LAN, and WAN) and a master blueprint for where Microsoft thinks it is going with its application-development product line. In the latter respect, the word architecture does seem appropriate. (The acronym DNA is easy to remember, but it raises a lot of inappropriate associations.)
DNA is ambitious, nothing less than a complete integration of the Web and client/server application-development models. I say ambitious because even though there are plenty of conceptual linkages between the Web and the client/server models (partitioning comes immediately to mind), as I've pointed out, the real-world differences are considerable. We found that in terms of performance (among other things), the Internet does not behave like most corporate networks (at least not yet), and it will demand a lot of software workarounds and compromises to achieve results similar to client/server applications. This is especially true for transaction processing.
Microsoft sees DNA in terms of services (read products and OS services). These fall into some general categories:
The general idea for Web and client/server integration is to enable existing applications for the Web (in other words, to make their data and user interfaces available through a Web browser). New applications can be designed with the Web in mind and can take advantage of services offered in the middle tier, that is, transaction processors, application servers, and database connectivity. The implication is that Microsoft expects the Internet and its kissing cousins, the intranets, to be the dominant form of networking in the future, and the one for which most software should be designed.
The most important element of DNA is the underlying use of Microsoft's Component Object Model (COM). COM specifies how objects behave in relation to each other through messaging. Essentially, a COM object presents itself to the world through a well-defined interface, one that looks more or less like a normal function call. Other objects can read the interfaces, and they in turn can ask for services or information from the object. For objects that will be distributed among multiple servers, Microsoft has augmented the COM specification to create the Distributed Component Object Model (DCOM).
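To make the interface idea a little more concrete, here is a minimal sketch in Python. It is an analogy only, not COM code: the interface names (ISampleTracker, IReporter), the LabComponent class, and the query_interface helper are all hypothetical, invented for this illustration and loosely modeled on the way a caller asks an object whether it supports a given interface.

```python
from abc import ABC, abstractmethod


class ISampleTracker(ABC):
    """Hypothetical interface: a named set of operations with no implementation."""

    @abstractmethod
    def status(self, sample_id: str) -> str: ...


class IReporter(ABC):
    """A second hypothetical interface that the same object may also support."""

    @abstractmethod
    def report(self, sample_id: str) -> str: ...


class LabComponent(ISampleTracker, IReporter):
    """One object exposing two interfaces; callers never touch its internals."""

    def __init__(self):
        self._samples = {"S-1": "tested"}  # made-up sample data

    def status(self, sample_id):
        return self._samples.get(sample_id, "unknown")

    def report(self, sample_id):
        return f"{sample_id}: {self.status(sample_id)}"


def query_interface(obj, interface):
    """Return obj if it supports the requested interface, otherwise None."""
    return obj if isinstance(obj, interface) else None


component = LabComponent()
reporter = query_interface(component, IReporter)
if reporter is not None:
    print(reporter.report("S-1"))
```

The point of the sketch is that the caller depends only on the interface it asked for, which is what lets the object behind it live in another process or, with DCOM, on another server.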
It's interesting that while Java has taken on the mantle of cross-platform champion, COM is claiming the title for cross-language ecumenism. It's Microsoft's contention that no one language has all the features and attributes necessary to satisfy every aspect of Web development (as framed by DNA). That's why it developed COM as an umbrella to allow almost any object-oriented language to produce COM components. The newest version of COM, imaginatively called COM+ (scheduled for release in the first half of 1998), extends the language neutrality even further.
COM+ defines a standard set of types and the means by which all objects describe themselves so they can be read by other COM objects. It's almost as if there are two classes that produce a COM object: one that produces the object, the other that describes it. In fact, the metadata is defined with the coclass keyword. This separation of description from implementation allows a COM object to be built with very language-specific characteristics, such as data typing, while still being usable by other languages and other environments.
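The split between an object and its description can be sketched in Python using the standard inspect module. This is only an analogy for the idea of self-describing components, not the COM+ metadata format; the SampleLog class and the describe function are hypothetical names made up for the example.

```python
import inspect


class SampleLog:
    """The implementation: ordinary code with language-specific typing."""

    def __init__(self):
        self._weights = {}

    def record(self, sample_id: str, weight_grams: float) -> bool:
        self._weights[sample_id] = weight_grams
        return True


def describe(cls):
    """Build a plain-data description of a class's public methods.

    The result contains only names and type names, so a tool written
    in another language could read it without running the class.
    """
    description = {}
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip constructors and private helpers
        signature = inspect.signature(member)
        params = {
            p.name: p.annotation.__name__
            for p in signature.parameters.values()
            if p.name != "self"
        }
        description[name] = {
            "params": params,
            "returns": signature.return_annotation.__name__,
        }
    return description


print(describe(SampleLog))
```

The description is just data, which is the property that matters here: anything that can read it can call the object without knowing what language it was written in.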
Microsoft has also added a kind of event handler called an interceptor to COM+ that extends its ability to react to object-generated events (as contrasted to the more common user-generated events such as mouse clicks). Interceptors provide the mechanism to manage transactions, system monitoring, and other distributed services. As long as other elements of DNA can also use COM+, they will be able to share services and function as part of the unified architecture. At least that's the theory. At press time, we don't have COM+ (as a software development kit), much less DNA. There are plenty of questions to be answered, such as how DNA will handle elements that are not objects, particularly those from relational databases.
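Since COM+ was not available to us, the following is only a guess at the general shape of the interceptor idea, written as a Python sketch. The Interceptor and ResultStore classes and the begin/commit/rollback log entries are assumptions of mine, not anything taken from the COM+ specification. The wrapper sits between the caller and the object and brackets each method call with transaction-style bookkeeping.

```python
class Interceptor:
    """Hypothetical wrapper that brackets every method call on a target object."""

    def __init__(self, target, log):
        self._target = target
        self._log = log

    def __getattr__(self, name):
        # Called only for names not found on the wrapper itself,
        # so lookups are forwarded to the wrapped object.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def wrapped(*args, **kwargs):
            self._log.append(f"begin {name}")
            try:
                result = attr(*args, **kwargs)
            except Exception:
                self._log.append(f"rollback {name}")
                raise
            self._log.append(f"commit {name}")
            return result

        return wrapped


class ResultStore:
    """Made-up business object that knows nothing about transactions."""

    def __init__(self):
        self.results = {}

    def save(self, sample_id, value):
        if value < 0:
            raise ValueError("test result cannot be negative")
        self.results[sample_id] = value
        return True


log = []
store = Interceptor(ResultStore(), log)
store.save("S-1", 4.2)
print(log)
```

The business object stays free of monitoring and transaction code, and the same wrapper could be placed in front of any other object, which is the appeal of handling such services in one shared layer.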
OLE DB, another major component of DNA, is an example of Microsoft taking one step back (or up, in terms of abstraction) in order not to leave out some important pieces of the big picture. In this case, Microsoft starts with the eminently successful ODBC, which is so widely distributed that even the Java equivalent, JDBC, went out of its way to have a JDBC-ODBC bridge available from the outset. The architectural problem with ODBC, however, is that it is primarily designed for relational databases and uses SQL as its common language. While this covers a substantial majority of the world's data, it certainly isn't the whole story. In particular, object-oriented databases fall outside the purview of ODBC, which is something Microsoft knew must change. Microsoft's answer is OLE DB, which defines how to access and manipulate all kinds of data including multimedia. ODBC then becomes a subset of OLE DB. OLE DB elements can be incorporated into COM objects and thus integrated into DNA.
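The step up in abstraction can be illustrated with a small Python sketch: one consumer reading rows from a SQL source and from a non-relational text source through the same interface. This is not the OLE DB API. RowSource, SqlSource, TextSource, and collect_ids are hypothetical names, and the standard sqlite3 module stands in for a relational database reached through SQL.

```python
import sqlite3
from typing import Iterable, Iterator, Protocol


class RowSource(Protocol):
    """Hypothetical common interface: any source that can yield rows as dicts."""

    def rows(self) -> Iterator[dict]: ...


class SqlSource:
    """Relational data reached through SQL, the case ODBC already covers."""

    def __init__(self, connection, query):
        self._connection = connection
        self._query = query

    def rows(self):
        cursor = self._connection.execute(self._query)
        columns = [column[0] for column in cursor.description]
        for record in cursor:
            yield dict(zip(columns, record))


class TextSource:
    """Non-relational data: comma-separated lines with no SQL engine behind them."""

    def __init__(self, lines, columns):
        self._lines = lines
        self._columns = columns

    def rows(self):
        for line in self._lines:
            yield dict(zip(self._columns, line.split(",")))


def collect_ids(sources: Iterable[RowSource]):
    """A consumer that neither knows nor cares where each row came from."""
    return [row["sample_id"] for source in sources for row in source.rows()]


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE samples (sample_id TEXT, kind TEXT)")
connection.execute("INSERT INTO samples VALUES ('S-1', 'water')")
database = SqlSource(connection, "SELECT sample_id, kind FROM samples")
field_notes = TextSource(["S-2,soil"], ["sample_id", "kind"])
print(collect_ids([database, field_notes]))
```

In this picture the SQL source is just one provider among several, which mirrors the way ODBC becomes a subset of OLE DB.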
As we explored the concepts behind DNA with the LIMS project, it struck us how DNA and COM help to explain how the many elements of a Web application could be organized (given that you accept the Wintel context to begin with). The problem we had was with how the elements are organized, at least for now. At the cosmic level of DNA, there are no tools for visualizing the whole or for estimating how various elements should be implemented. There is a conceptual framework, but the interpretation and monitoring are left up to you. At the enterprise level of development, where budgets can be big and so can the support staff, this kind of work can be left to specialists. However, as we're discovering, with relatively modest Web development, you need some sort of architectural organization and insight into the good/better/best ways of implementing it. This will be a heavy load for a small development staff.
Given the fragmented development history of Microsoft's product line, it seems legitimate to suspect that one motivation for DNA is the desire to provide a comprehensible framework to link seemingly unrelated products. Whatever the motivation, it's a good idea to put a big intellectual wrapper around something as complex as Web application development, although I suspect we're going to be struggling to describe the "right" way to do this for some time to come. Microsoft's attempt makes it easier to compare its approach with those of others, especially IBM, Netscape, and Sun, which also have aspirations to be comprehensive providers of networking solutions.
One of the aspects of DNA that I feel may be trouble is the degree to which Microsoft indicates it wants the elements bound to the underlying operating system (Windows NT in most cases). This is "bound" as in fully integrated, so you can't have one without the other. It's one thing for Microsoft to produce software products that are designed to run on Windows NT (or Windows 95 for that matter) but still remain independent products that you can use or disregard. When many of these same products become tightly integrated with the operating system, they will be much more difficult to disregard.
The situation, as it is unfolding at the moment with Internet Explorer and Netscape Navigator, may be a test case (in more than one sense). In most situations, I'm a believer in the best-of-breed approach to acquiring software. In its literature, Microsoft says DNA will "encourage best-of-breed suppliers." Good. However, if Microsoft makes that kind of selection more difficult, then I will cheerfully join the folks already on the barricades.
In this column, I've profiled just the Microsoft approach to architecture, leaving other stories for the future. Microsoft is hardly alone in realizing that we must deal with the architecture of our applications. Mainframe software development came to the same conclusion decades ago. Microsoft's Distributed interNet Applications Architecture, which is apparently a creature of necessity under heavy evolutionary pressure, provides some insights but few tools. We'll have to wait for some of the plumbing, such as COM+, to become real, and then hope that Microsoft (or another company) will provide tools that help us keep the architecture under control. Maybe I'll never return to programming applications on the fly, but at least I'd like to be able to remember what an application is supposed to look like.