DBMS

Moving Forward with Replication, Part 2

By Glenn Froemming
DBMS, April 1996

The conclusion of our series on the complex replication issues inherent in mobile, sales force applications.


In last month's installment of this two-part series, I tried to provide a general introduction to the replication process, and offered some general guidelines for determining the appropriateness of replication for an application. In particular, I reviewe d the following replication topics.

In this article, I address some of the more subtle issues associated with replicated applications -- especially the issues involved with using replication in mass-deployed, disconnected, mobile applications. As I'll explain, you must address several addi tional scalability and availability issues in such applications.

Availability, Scalability, Synchronization, and Performance Issues

Most of the available replication services are based on a direct-connect model. When a subscriber connects for synchronization, a direct database-to-database connection is established with the publisher, and each site processes commands against the datab ase engine for the duration of the synchronization process. This is a very logical approach and works well in relatively low site count applications (5, 10, or 15 sites) connected by relatively high bandwidth connectivity. When hundreds or thousands of s ites utilize low bandwidth connectivity, it doesn't work as well. In a data warehouse or LAN-based, distributed client/server environment, a relatively high number of transactions go to a few sites, while in the mobile environment, a relatively low numbe r of transactions go to a very high number of sites through very slow communications mechanisms.

The sales force of a Fortune 500 company can easily consist of 800, 1000, or even 2000 field users. These salespeople typically connect to the main office either at the start of the work day or at the end of the work day or both. Although these sa lespeople may be spread across four time zones, it's logical to assume that 100-200 users/sites could try to connect at about the same time. What kind of database horsepower do you suppose would be required to support hundreds of direct database-to-datab ase connections through slow, modem-based connectivity, or through even slower wireless communication?

Considering these issues, my associates and I made a very conscious decision several years ago to move from a direct-connect architecture to a more batch-oriented model. In the batch model, replicated changes are not applied until after the "comms sessio n." Our operating definition of a comms session includes the scriptable set of activities that occurs prior to the actual communications connection (precomms), the actual communication session, and the scriptable set of activities that run after the comm unications line has been released (postcomms). This reduces the billable communication time to file-transfer time (that is, the time needed to transfer the transaction file physically), and greatly reduces the horsepower requirements of the server databa se engine because it does not have to process the transactions while the communications line is being held.

Batch-Oriented (Asynchronous) Synchronization

With a batch-oriented, asynchronous, disconnected mechanism, each side (publisher and subscriber) of the replication process simply delivers a change file to the other side for processing after the comms session completes. This approach results in minima l connection time and provides the most flexibility at the hub sites in spreading the processing load of incoming change files. Transaction files can accumulate into a change file queue to be processed by one or more replication agents (as time permits). This model also provides the highest availability because the comms system can be independent of the replication system. Therefore, a hub site database does not have to be running for a remote user to resynchronize. As long as the comms system is runnin g, the remote user can still deliver its changes to the hub and receive changes that had occurred through the hub up to that point. Therefore, the field user can continue to connect in order to drop off his or her changes, and get email, reports, and so on, even if the replication server is down. If the replication server is down, the field users are not able to receive changes from that server, but they are not impeded in any other way.

Of course, batch-oriented synchronization is not without its shortcomings. It can introduce quite a few nasty design issues, but it also provides the highest overall system availability and lowest system cost in terms of processing power and comms time.

Some Synchronization Dangers

Many replication services perform synchronization while the "publishing" site continues to be active. (I put publishing in quotes here because the publisher/subscriber metaphor breaks with the idea of bidirectional replication. Both sides are really publ ishers of changes.) While it is not desirable to lock out the database at the publishing site whenever a "subscriber" calls in for synchronization (particularly in the direct-connect model), there are certain risks associated with allowing update activit y in the database while synchronization is in progress.

First, if the replication process is table-based instead of transaction-based, you may pick up pieces of transactions based on the race condition that is established between the change table scanning process and whatever "external" activity may be adding entries to the change tables. ("Race condition" is a fairly common term in software engineering that refers to situations in which several processes want to access a particular resource, and the resultant behavior varies depending upon the sequence.) Th is could be especially problematic when a remote site is being completely restored. In this case, the table scan window may be open for an extended time period, increasing the likelihood of getting disjointed pieces of transactions.

If the partitioning scheme provides for reference to tables other than the base (replicated) tables, you must consider another risk. If the distribution (table partitioning) mechanism allows other tables to be referenced in order to determine that the pa rtitioning and database update activity is occurring in the background while replication is being performed, a "distribution list" may change, causing the same kind of data to be distributed differently at different times. For example, say there's a dist ribution list of Top3SalesPeople, which has entries {Mary,Peter,Glenn} at time T1, but, due to a reevaluation during the day, has {Mary,Peter,Foobar} at time T2. Maybe that's OK, maybe not. It's handy to have distribution "cycles" in order to feel confid ent that during a specific cycle, particular data was distributed with a particular distribution state.

Of course, these race conditions and distribution-determination dynamics can act in combination. In a table-based replication scheme, you can end up with the "pieces of transactions" scenario if the distribution list associated with data for tables A, B, and X changes between the time that the table A extract executes and the table X extract executes.

Complex Transaction Semantics

You must pay special attention to the use of database procedures or other complex transaction mechanisms in replicated systems. Even if you have consistent semantics across sites, you should not assume that a complex transaction that succeeds at site A w ill also succeed at site B. Given that complex transactions affect more than one table, it is quite possible that some condition that was true in the table at site A may no longer be true in that same table by the time the transaction gets to site B. Thu s, the entire transaction is rolled back at site B, and the problem must be reported to someone who can fix it. But maybe the reason the transaction fails at site B is because a change was made at site B to one of the underlying tables, and that change i s in the process of being replicated back to site A. For example, perhaps an update to a child table of Account 123 fails at site B because site B just deleted Account 123, and that delete is on its way back to site A. In this case, the net effect when e verything has settled out will be that Account 123 no longer exists at sites A or B. But the failed update error will be reported and someone will have to understand what really happened.

In some cases, you may not want the exact same functionality at all sites. I can point to delete handling as an example in my company's Sales Force Automation (SFA) applications. We frequently want to perform physical deletes on the remote units and logi cal or archival deletes at hub sites. For reporting purposes, it is often appropriate to retain the data physically at the server sites, while making it transparent to the daily application reference. The SFA applications also have knowledge of account s haring and account reassignment, whereby an account can be "deleted" from one user and "granted" to another. In this case, the operation looks like a physical delete on one remote site, an insert on another remote site, and an update to the AccountOwner field at the server site. In our replication system, the Open Data Delivery System, a proprietary database engine, uses an independent form of complex transaction processing to accomplish this. However, this process can also be performed with database pr ocedures (stored procedures). In this example, the interesting question relative to currently available replication products is:

If the result of changing the value of the "territory" field at a publisher site is "the account has been reassigned from territory A to territory B," how can you get the replication engine to send a delete to the losing territory and the necessary in sert information to the receiving territory?

The ability to generate transactions that didn't physically occur at a particular site into the replication tables for distribution to other sites can be very helpful. Therefore, you should be able to send to other systems those transactions or messages that result in actions (such as logical and physical deletes) that have a slightly different behavior at the originating site. What you really need here is a disconnected remote procedure call mechanism.

Cascaded Deletes

Cascaded deletes are common database procedures that let you, by issuing a delete at the root of a subtree, delete the entire subtree. In a replication environment, it's important to know whether the replicator will emit a change record only for the init iating row of the cascaded delete, or for every deleted row involved in the cascade. If all sites have cascaded delete procedures and the replicator sends the entire delete list, the child rows will be eliminated after the first delete triggers the local cascade.

At first, it would seem that one way of regulating the depth of the cascade would be through the replication rules. However, it is not guaranteed that the replication rules provide granularity down to the operation type (insert, update, or delete). Furth ermore, can you define replication rules for lower-level tables, but have them applied only to deletes at the "root" of the delete, and not as part of a lower-level cascade? For instance, say an account has child tables Call and Sales, and Call has the c hild table, CallDetail. A cascaded delete on Account would also include Call, CallDetail, and Sales. A cascaded delete on Call would also include CallDetail. If, however, on an Account delete, you do not want Call, CallDetail, and Sales deletes to replic ate out to other sites, you can simply not replicate any Call, CallDetail, or Sales deletes. That way, the subscriber site, which presumably has some sort of cascaded delete procedure on Account, will receive only the Account delete -- not the deletes to Call, CallDetail, and Sales (these records will be gone by the time the cascaded delete completes). Therefore, it's important to know exactly how the replicator deals with cascaded deletes and other forms of complex transactions and database procedures. Unexpected output from the replicator can lead to extremely unexpected behavior at the receiving sites.

Error Notification and Handling

Error notification and handling is another issue that is exacerbated in mobile-based replication applications. As you can see from the previous scenarios, there are plenty of opportunities for errors, although a rigorously tested application can run with relatively low reject levels in the replicated transactions. Following are a few things to consider: The first thing to consider in either LAN-based or mobile replicated systems is the impact of a rejected replication transaction on the ongoing transaction flow. Will a rejection at a subscriber site halt further replication to that node until the reject is corrected? Can your application tolerate the outage? The safest thing to do if any transaction fails as it moves from node A to node B would be to simply stop the transaction flow. After all, in a typical client/server application, the user is preven ted from proceeding further along the errant path until the error is corrected.

Therefore, you must understand the potential for data corruption, and weigh that against the impact of an unavailable or out-of-phase system. Then you must determine what options you have for detecting and correcting rejected transactions. Does your repl ication service provide the opportunity to allow the application to proceed?

You must also consider the following error-handling issues:

Where notification occurs may be an even more important issue in mobile applications than how notification is provided. In mobile applications, the subscriber nodes are operated by end users -- typically salespeople, managers, or executives -- not DBAs. These people are not going to correct rejected replicated transactions. There really doesn't seem to be much value in notifying a terminal subscriber node of such problems. It might seem like the right or polite thing to do, but if there's nothing they can do except get mad, what's the point? If the error is so catastrophic that the application should be disabled, then do that, but it's not necessary to display a list of error messages if the application is going to proceed anyway. The notificati on must be provided to someone who can correct the problem. In most cases, error correction should be performed at the publishing site by trained support personnel. These support people must be familiar with the application as well as the replication ser vices. Interesting enough, most rejects are caused by data, configuration, or sequencing issues --not by the replication services.

We have also found it productive to handle rejected transactions at the publisher or hub nodes. Rejects from the mobile units are recorded in a set of database tables, which are replicated and returned to the publisher in the next comms cycle. Replicated transactions from the field units that are rejected at the hub are recorded in the same set of database tables. A reject-correction tool edits and resubmits the transactions. If the reject is a transaction from the field that was rejected at the hub, it will be reapplied locally. If a transaction was rejected at one of the field units, the corrected transaction is simply placed in the outgoing transaction queue for distribution during the next distribution cycle.

Handling Interruptions During Synchronization

As with typical client/server applications, interruptions during processing are issues that the replication service must address rigorously. In a LAN-connected client/server application, if the application has been designed in a transactionally consisten t environment (that is, an environment in which interdependent database operations that are viewed as a single "business operation" are encased as a single database transaction to ensure that the operations succeed or fail as a unit), a service interrupt ion such as a power failure or full disk should affect only one transaction. In this case, you need to be concerned only with the following questions: Server and mainframe operating system environments usually provide robust mechanisms to deal with these issues. However, the operating systems typically employed on laptop and other forms of mobile computers are not nearly as resilient. This is particula rly unfortunate when you consider that because these mobile devices are battery operated, have limited disk storage, and are operated by end users and not system administrators, the power failure and full-disk scenarios will occur quite frequently.

Another more subtle, and potentially more damaging, scenario occurs if the change queue or change log is somehow corrupted. If it is wiped out, will it automatically be recreated? Is there a mechanism that can detect such corruptions? What backup mechani sm is provided? The answers to these questions will vary depending on the service provider. For a list of questions to ask your potential replication vendor, see the sidebar.

You must know precisely how the replication service will handle such interruptions. If the replicator is transaction-based instead of table-based, it will be much less of an issue. A table-based scheme incurs a much higher risk of transactionally inconsi stent data because pieces of a transaction in an unreplicated table will not be received until service resumes. Under such conditions, you must determine if the applications that use the replicated data should be allowed to proceed.

Sequence Generators

Sequence generators (provided by database vendors) are useful mechanisms for generating unique key and sequence values. They work very nicely in a mainframe or classic client/server architecture, in which several front ends connect to one database instan ce. Unfortunately, because sequence generators were designed to operate in a single environment, they don't work as well when that operating environment is distributed across many, perhaps thousands, of independently operating nodes. Originally, the fron t ends had sequence value access that was serialized through one database instance, but now each node has its own sequence generator that is totally unaware of sequence values being generated at other nodes. In effect, each node has the entire range avai lable to it. Even in a homogeneous replication environment, the centralized serialization of the sequence number generation is circumvented.

Obviously, the loss of centralized management of the sequence values can lead to problems. Therefore, my company has provided a sequence generator that operates at a level above the database engine, and has a built-in notion of a node or site identificat ion. The mechanism returns to the application a 32-bit integer value that is guaranteed to be unique for a given data object across the network of nodes. Thus, the sequence range is partitioned across the user node space.

Until database vendors supply a similar mechanism, you can take the following approaches:

There is another issue to consider here: What happens in the case of a catastrophic database failure? In classic client/server, single-database systems, the centralized database is regularly backed up or protected through checkpoint and recovery processe s. But there are no DBAs backing up databases on mobile computers. If the database is lost, it's gone. My company has circumvented this problem by enabling the sequence values to be replicated to a publisher site for backup.

Data Collisions

In symmetric or updatable replicated systems, in which several sites can manipulate separate copies of the same data concurrently, independently, and transparently, concurrent access problems are time-delayed to an appropriate synchronization time. Such problems with shared data have been called "data collisions."

The crux of the problem is that one or more users or nodes has updated the same row at about the same time. The fundamental dilemma here is timing. By definition, in replicated systems there will be some level of latency between the time that site A prov ides its updates to the publisher and when site B provides its updates.

In classical systems, the locking model enforces a "first in wins" policy. If a second user enters data five minutes later, thus avoiding the lockout condition, the last-in policy would be implemented. So which data is correct? There is an implicit, de f acto, almost cultural policy that last-in is "right." While this may be generally true, the locking mechanism is just the agreed-upon policy for resolving conflicts in single-engine systems.

In replicated systems, each node has its own database engine. Therefore, the collision scenario is delayed until the changes are applied at some integration point (such as the publisher site). If the data is shared (that is, it can be updated by multiple sites), a check must determine if changes have already been recorded against that row. If so, a collision may be signaled. Quite a variety of schemes have been proposed to detect collisions, ranging from row version numbers to checks against the change record. I believe the most common scheme is to check the target database change log for an entry with a key match.

Once a collision has been signaled, the replication service must activate some type of collision-handling mechanism. Some of these mechanisms are time based. However, a time-based mechanism does not make sense for any multisite distributed system. Think about it. Ask five of your friends what time it is. Odds are, you'll get five different answers. Won't that also be the case on five different computers? How about 1500 different battery-powered computers?

Let's assume that you could somehow guarantee that all the clocks are synchronized. On what time should you base the collision handling? Time of data entry or time of data transmission? What if Bob enters changes on Monday, but doesn't communicate them u ntil Friday? Is this time-based mechanism really going to try to roll back hundreds of transactions from hundreds of different sites? And what about all the transactions that followed and were potentially dependent upon the ones that will be rolled back?

The point is, time is a very relative concept in replicated systems, and it is not a concept upon which I would recommend basing any sort of collision-detection or resolution scheme. Furthermore, I would suggest that rollback mechanisms are not practical in such systems. My associates and I take the following approach in our SFA systems:

Our SFA systems provide a mechanism that detects collisions based on comparing the "before" values from the change transaction to the current values on the target system. If a value differs, a collision is signaled. The system performs collision detectio n on the terminal (subscriber) nodes only. On the server (publisher) site, all changes are serialized and the net effect on the database server is that the "last-in wins." At the remotes, when a collision is detected, the values at the remote "win" (beca use, in these systems, the remotes are the originators of most of the data). In other words, it will not overwrite remote values with server data values that are probably older than the value at the remote. Importantly, if the collision detector goes off , it most likely means that the local system has made a change to this value that has not yet been applied to the server.

In your systems, the publisher may be the data generator. If so, you want a mechanism that allows you to enforce that policy. Many of the replication services are now providing a selection of handling mechanisms, including user exits. The important thing is to understand what mechanisms the replication service provides for detection and handling, and determine if they will be adequate to enforce the policy appropriate for your application(s).

One final issue regarding collision detection and handling: Will the replication service stop replication between the affected nodes on a collision? This extreme approach ensures that no further transactions are processed that would be potentially incons istent or inappropriate relative to the collision. This may be the right thing to do, but can your application afford to stop completely until someone corrects the situation?

Moving Forward

In this replication series, I have tried to provide a general introduction to the replication concept, and to enumerate issues associated with the use of replication in various types of applications. I hope this information will be helpful in determining the appropriateness of replication for your applications, and that it will help you understand what features may be required of a replication service for your applications.


Glenn Froemming has been in the computer industry for more than 20 years. For the past eight years, he has served primarily as a system architect, designer, and technical development manager. He currently works for Sales Technologies Inc. (Atlanta), a su bsidiary of Dun & Bradstreet, as chief architect and technical development manager for the company's Open Data Delivery System. You can email Glenn at 103147.374@compuserve.com.


Questions You Need to Ask Your Replication Vendor

General Replication Questions
What is the fundamental unit of replication?
a) table row
b) transaction

What mechanism is used to provide replication?
a) log scraping
b) database procedures
c) other (please specify )

If replication is log-based, could replicated transactions that have not yet been picked up by all subscribers affect log purge or rollover? Yes/No

If not, how is this situation avoided?

Are replicated transactions in a different log? Yes/No

Would you characterize the replication service as:
a) push-oriented (publisher driven)
b) pull-oriented (subscriber driven)
c) hybrid (please specify)

Is bidirectional/symmetric/update-anywhere replication available? Yes/No

If yes, are there any limitations on who can update what (that is, subscriber/publisher)? Yes/No

Replication Rules
Are replication rules, or replication specifications, required for each replication target site? Yes/No

Can replication rules be reused or parameterized to allow several subscribers to use the same rule? Yes/No

Can replication rules contain substitution parameters? Yes/No

If replication rules can contain parameters, how and when are values for those parameters obtained (or extracted and captured) from a replicated transaction?

Some replication rules provide ways to partition tables horizontally or vertically. Can the replication rules reference tables other than the replicated table? Yes/No

Complex Transactions
In a cascaded delete, can replication of the delete be limited to the uppermost table of the delete, while still allowing deletes at the lower levels to be recorded independently? For example, given the structure A with children B1 and B2, where B1 has c hildren C1a and C1b, can a cascaded delete of A be limited to a delete on A and only A, while also being able to record a cascaded delete of B1 as a delete of B1 and only B1?

Error Notification and Handling
Given an error while replicating between Publisher-P and Subscriber-S, where will notification of the collision occur? Check all that apply:
a) Publisher-P
b) Subscriber-S

Check all forms of error handling supported:

Check all forms of collision detection supported: Check all forms of collision handling supported: Heterogeneity
Indicate the forms of replication for which your product supports heterogeneous operation; that is, the ability to support replication services across disparate DBMSs (not just different variations of same engine):
a) single updater (snapshots)
b) bidirectional

Indicate mechanisms used to support heterogeneous replication services:
a) gateway
b) automated generation of database triggers/procedures for a specific DBMS
c) log-scraping mechanisms for different DBMSs
d) other

Heteromorphism
Indicate all forms of disparity in data element addresses and interpretations supported by the replication service:
a) name space mapping: same data element, different table/ column name
b) formatting/sizing differences
c) denormalization: one item to multiple targets
d) flattening: normalize data to repeating group form
e) merge: combine multiple items into a single item
f) aggregate: summarize multiple entries into a single item
g) other

Configuration Management
Are mechanisms provided to detect configuration-level mismatches? Yes/No

Suitability for Mobile Applications
Is the current version of the product well-suited to hundreds or thousands of subscriber applications? Yes/No

If there are plans to provide a version for mobile applications, what is the timeframe for that release?

Will the mass-deployment or mobile version implement the same direct-connect-during-synchronization architecture that most of the LAN-based replication servers use? If not, describe the architecture that will be used.

-- Glenn Froemming


Subscribe to DBMS and Internet Systems -- It's free for qualified readers in the United States
April 1996 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS and Internet Systems (http://www.dbmsmag.com)
Copyright © 1996 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated Monday, November 4, 1996.