It seems simple. If middleware is nothing more than a layer above the existing network, then all middleware should perform the same way. Right? Wrong.
Middleware technology has grown in many different directions, which results in many trade-offs when considering middleware performance. Typically, software development project leaders select middleware without considering performance. This kind of impetuosity can lead to disaster, including failed projects, upset users, or both.
Selecting the right middleware for your distributed application has a major drawback: It's difficult to determine performance until the application is up and running. By that time, the damage has been done. I can provide you, however, with some performance guidelines for selecting middleware. In short, what's best is not always fastest, and what's fastest is not always best. The trick is to find the best and the fastest. Got it? OK, now I'll explain.
What drives me crazy about this business is that the application architects I know select a middleware layer based on religious reasons, without considering the application requirements. It's dysfunctional architecture at its best. You need to be diligent here.
Selection of the communications mechanism is a key issue when designing and building a distributed application. If things were simple, we would communicate application to application using a direct synchronous paradigm. We would call the remote application (or resource), and the remote application would return status or data. Pretty simple.
However, the real world is not that simple. There are many different communications paradigms these days. For our purposes here, we can limit them to synchronous, asynchronous, connection-oriented, connectionless, direct, and queued communications.
You need to remember that some middleware products may use one, two, or all of these communications paradigms. Thus, it's a bad idea to group these concepts in terms of middleware categories.
In synchronous communications, the calling program sends a request to a remote program and waits for the response. A remote procedure call (RPC), such as the one that exists within products like the Open Software Foundation's Distributed Computing Environment (DCE), is the best example of a synchronous middleware layer. Synchronous communications means that the calling application must stop processing or is blocked from proceeding until the remote procedure produces a response.
Asynchronous communications are nonblocking; they do not stop the program from proceeding. The program can make the request and continue processing before a response occurs. Most message-oriented middleware layers support asynchronous communications through the point-to-point messaging or message queue models. (Some do it in strange ways, so be careful.)
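To make the difference concrete, here is a minimal sketch in Python. It is not how any particular middleware product works. The `remote_lookup` function is a hypothetical stand-in for a remote program, and a thread pool plays the part of the asynchronous layer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_lookup(customer_id):
    """Hypothetical remote program; the sleep simulates network and server delay."""
    time.sleep(0.05)
    return {"id": customer_id, "status": "active"}


def call_synchronously(customer_id):
    # The caller is blocked on this line until the "remote" program answers.
    return remote_lookup(customer_id)


def call_asynchronously(executor, customer_id):
    # The request is handed off and a Future comes back immediately.
    return executor.submit(remote_lookup, customer_id)


def sync_async_demo():
    log = []
    log.append(("sync reply", call_synchronously(1)["status"]))
    with ThreadPoolExecutor(max_workers=1) as executor:
        pending = call_asynchronously(executor, 2)
        # The caller keeps working here while the request is still in flight.
        log.append(("caller kept working", True))
        # The response is collected later, when the caller actually needs it.
        log.append(("async reply", pending.result()["status"]))
    return log
```

In the synchronous path nothing else can happen until the reply arrives. In the asynchronous path the caller does other work first and picks up the reply only when it needs it.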
Connection-oriented communications means that two parties first connect, exchange messages, and then disconnect. Typically this is a synchronous process, but it can be asynchronous.
Connectionless communications means that the calling program does not enter into a connection with the target process. The receiving application simply acts on the request, responding if required.
In direct communications, the middleware layer accepts the message from the calling program and passes it directly to the remote program. You can use either direct or queued communications with synchronous processing; however, direct is usually synchronous in nature and queued is usually asynchronous.
When using queued communications, the calling process, typically through a queue manager, places a message in a queue. The remote application retrieves the message either shortly after it's been sent or at any time in the future (barring time-out restrictions). If the calling application requires a response, such as a verification message or data, the information flows back through the queuing mechanism. (See Figure 1.) The advantage of the queuing model over direct communications is that the remote program does not need to be active for the calling program to send a message to it. What's more, queuing communications middleware typically does not block either the calling or the remote programs from proceeding with processing.
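The queued model can be sketched in a few lines of Python. This is an illustration under assumptions, not a real queue manager: `QueuedLink` and its methods are hypothetical names, and two in-memory queues stand in for the request and reply paths.

```python
import queue


class QueuedLink:
    """Hypothetical stand-in for a queue manager with request and reply queues."""

    def __init__(self):
        self.requests = queue.Queue()
        self.replies = queue.Queue()

    def send(self, body):
        # The caller returns immediately; nobody needs to be listening yet.
        self.requests.put(body)

    def serve_pending(self):
        # The remote program drains whatever accumulated while it was inactive.
        handled = 0
        while True:
            try:
                body = self.requests.get_nowait()
            except queue.Empty:
                return handled
            # The verification flows back through the queuing mechanism.
            self.replies.put(f"verified:{body}")
            handled += 1


def queued_demo():
    link = QueuedLink()
    link.send("order-1")  # the remote program is not running yet
    link.send("order-2")
    waiting = link.requests.qsize()
    handled = link.serve_pending()  # the remote program starts later
    replies = [link.replies.get_nowait() for _ in range(handled)]
    return waiting, replies
```

Both messages wait in the queue until the remote program gets around to them, and the caller never had to stop.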
So which paradigm should you use? The answer depends on both the functional and performance requirements of the application. Let's look at the trade-offs of each.
RPCs provide a mechanism where, by simply making a function call from a program, you actually invoke a process on a remote machine. The great thing about RPCs is their simplicity, but you need to consider performance as well.
The major issue with RPCs is that they require a lot more processing power than a local call. In addition, many exchanges must take place across a network to carry out the request. In other words, they suck the life out of a network or a computer system. For example, a typical RPC may require 24 distinct steps to complete a request as well as several calls across the network. It's not a good idea to make RPCs across slower networks, such as the Internet.
The estimated processor performance cost of RPCs is high. RPCs require 10,000 to 15,000 instructions to process a remote request, and that's several hundred times the cost of a local procedure call (a simple function call). For those of you who have used RPCs before, this is nothing new, and I personally wish that I had back all the time I've spent waiting for an RPC to complete.
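A quick back-of-the-envelope calculation shows how fast that cost adds up. The sketch below uses only the 10,000 to 15,000 figure quoted above. The scenario (one call per field versus one call for the whole record) is my own assumption for illustration, and it treats the per-request cost as fixed regardless of payload size, which is a simplification.

```python
# Per-request instruction estimates quoted in the text.
RPC_INSTRUCTIONS_LOW = 10_000
RPC_INSTRUCTIONS_HIGH = 15_000


def rpc_instruction_cost(calls):
    """Return the (low, high) instruction estimate for a number of RPCs."""
    return calls * RPC_INSTRUCTIONS_LOW, calls * RPC_INSTRUCTIONS_HIGH


# Hypothetical scenario: fetch a 50-field record one field per call,
# or fetch the whole record in a single call.
chatty = rpc_instruction_cost(50)
coarse = rpc_instruction_cost(1)
```

Under these assumptions the chatty design burns 500,000 to 750,000 instructions where the coarse one burns 10,000 to 15,000, before counting any time spent on the network.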
As in the case of DCE, one of the most famous (or infamous) RPCs around, the RPC software also has to make requests to security services to control access to remote applications. Moreover, there may be calls to naming services and translation services as well. All of this adds to the overhead of RPCs.
The final argument against RPCs is the fact that they are almost always synchronous in nature. Thus, both the calling and remote portions of the application are bound to the RPC or are blocked from processing until the RPC is complete. I covered this earlier.
Clearly the advantages of RPCs are the sheer simplicity of the mechanism and the ease of programming. However, RPCs have a huge performance cost and don't scale well unless combined with other middleware mechanisms such as a transaction processing (TP) monitor or message queuing middleware.
The problem with RPCs is that you never really know when you're using them because they are bundled into so many products and technologies. For example, CORBA-compliant distributed objects are simply another layer on top of an RPC and thus rely on synchronous connections to communicate object-to-object. The additional layer means additional overhead when processing a request between two or more distributed objects. This is why distributed objects, while architecturally elegant, typically don't scale or provide good performance. However, the Object Management Group (OMG) and CORBA vendors are working to solve the performance problem. It's about time.
While message-oriented middleware (MOM) is one of the newest players in the middleware world, it's now a mature technology with some performance advantages over traditional RPCs. There are two models supported by MOM: point-to-point and message queuing (MQ), which I'll focus on here.
When comparing MQ to the standard RPC, you can see several performance advantages to MQ. First of all, MQ lets each participating program proceed at its own pace without interruption from the middleware layer. The calling program can post a message to a queue and go on with its life. If a response is required, it can get it from the queue later. Another benefit is that the program can broadcast the same message to many remote programs without waiting for the remote programs to be up and running.
What's more, because the MQ software (for example, IBM's MQSeries or Microsoft's MSMQ) manages the distribution of the message from one program to the next, the queue manager can take steps to optimize performance. There are many performance enhancements that come with these products, including prioritization, load balancing, and thread pooling.
Don't be concerned that some messages may be lost during network or system failure. Most MQ software lets you declare a message as persistent or stored to disk during a commit at certain intervals in order to recover from such situations.
TP monitors provide the greatest performance advantage over both MQ and RPCs. Of course, it depends on what you're doing. Several features of TP monitors, such as BEA's Tuxedo, IBM's CICS, and Microsoft Transaction Server (MTS), enhance performance as well as provide the ultimate in scalability.
When it comes to support for many clients and a high transaction processing load, nothing beats a good TP monitor. TP monitors perform such tricks as using a queued input buffer to protect against peaks in the workload. If the load increases, the engine is able to press on without having an effect on response time. TP monitors can also use priority scheduling to prioritize messages and support server threads, thus saving on the overhead of heavyweight processes. Also, the load balancing mechanisms of TP monitors make sure that no one process takes on an excessive load.
When an application uses these features, TP monitors are able to provide performance as well as availability and scalability. If you've been reading my column for a while, you already know I'm sweet on TP monitors.
TP monitors also provide queuing, routing, and messaging features, which let distributed application developers bypass the TP monitor's transactional features. Here is where you can assign priorities to classes of messages, letting the higher priority messages receive server resources first.
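Priority classes are easy to picture with a small sketch. This is not Tuxedo, CICS, or MTS code. `PriorityMessageQueue` is a hypothetical name, and the convention that a lower number means a higher priority is my assumption.

```python
import heapq
import itertools


class PriorityMessageQueue:
    """Hypothetical queue that serves higher-priority message classes first."""

    def __init__(self):
        self._heap = []
        # A running sequence number keeps arrival order within one priority class.
        self._sequence = itertools.count()

    def post(self, priority, body):
        # Assumption: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._sequence), body))

    def next_message(self):
        # Pops the highest-priority message; raises IndexError if the queue is empty.
        _priority, _sequence, body = heapq.heappop(self._heap)
        return body

    def __len__(self):
        return len(self._heap)


def priority_demo():
    q = PriorityMessageQueue()
    q.post(2, "report")
    q.post(0, "payment")
    q.post(2, "audit")
    q.post(1, "inquiry")
    return [q.next_message() for _ in range(len(q))]
```

The payment jumps the line even though it arrived second, while the two priority 2 messages keep their arrival order.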
The real performance value is the TP monitor's load-balancing feature. Load balancing lets TP monitors respond gracefully to a barrage of transactions. An example is end-of-the-month processing. As the demands increase, the transaction manager launches more server processes to handle the load and kills processes that are no longer required. What's more, the manager is able to spread the processing load among the processes as the transaction requests occur.
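The two halves of that behavior, sizing the pool of server processes and spreading requests across it, can be sketched as follows. Real TP monitors use their own policies. The function names, the capacity figure, and the limits here are all assumptions chosen for illustration.

```python
import math


def servers_needed(pending, per_server_capacity, min_servers=1, max_servers=8):
    """Decide how many server processes to run for the current backlog.

    Assumption: each server can comfortably handle `per_server_capacity`
    pending requests; the result is clamped between the two limits.
    """
    wanted = math.ceil(pending / per_server_capacity)
    return max(min_servers, min(max_servers, wanted))


def assign_least_loaded(loads):
    """Send the next request to the server with the fewest in-flight requests.

    `loads` is a list of in-flight counts, one per server; it is updated
    in place and the chosen server's index is returned.
    """
    target = min(range(len(loads)), key=loads.__getitem__)
    loads[target] += 1
    return target
```

With a capacity of 10 requests per server, a backlog of 35 calls for four servers, and the count drops back to one when the month-end rush is over.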
Of course, if there were one perfect middleware layer providing all the features we need along with the best performance, life would be simple. However, you need to look at all the issues before hooking your application to a middleware technology.
While RPCs are slow, their blocking nature provides the best data integrity control. For instance, if you're using an asynchronous layer to access data, you can't ensure that the update occurs in a timely manner. An update to a customer database could be sitting in a queue waiting for the database to free up while the data entry clerk is creating a sales transaction using the older data. When using RPCs, updates are always applied in the correct order. So if data integrity is more important than performance, RPCs may still be your best bet.
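The stale-data problem looks like this in miniature. The dictionary standing in for the customer database, the `credit_limit` field, and the values are all hypothetical.

```python
import queue


def queued_update_scenario():
    """The update waits in a queue while the clerk reads the old value."""
    database = {"credit_limit": 1000}
    pending = queue.Queue()
    pending.put(("credit_limit", 250))  # update posted but not yet applied

    seen_by_clerk = database["credit_limit"]  # the clerk works from stale data

    # Later, the database frees up and the queued update is finally applied.
    while not pending.empty():
        key, value = pending.get()
        database[key] = value
    return seen_by_clerk, database["credit_limit"]


def blocking_update_scenario():
    """The caller blocks until the update is applied, so later reads see it."""
    database = {"credit_limit": 1000}
    database["credit_limit"] = 250  # stands in for a blocking remote call
    seen_by_clerk = database["credit_limit"]
    return seen_by_clerk, database["credit_limit"]
```

In the queued case the clerk sees 1000 even though the database ends up at 250. In the blocking case both values are 250, which is the integrity guarantee you pay for with the wait.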
On the asynchronous side of the house, MOM vendors contend that synchronous middleware cannot support today's event-driven applications. Programs just cannot wait for other programs to complete their work before proceeding.
RPCs could provide better performance than traditional store-and-forward messaging layers in some instances. However, messaging could provide better performance because the queue manager offers sophisticated performance-enhancing features such as load balancing.
I don't mean to drive you in a single direction, but you should think about middleware in terms of performance. As application integration looks to scale to the enterprise, performance could be the problem to solve in 1999. I've never lost sight of it.

Figure 1. When using queued communications, the calling process, typically through a queue manager, places a message in a queue.