You can develop a custom application to perform the same functions
as the rs_subcmp utility. The application’s
complexity depends on the number of different data server types,
the complexity of the tables to be compared, the amount of data
translation involved, and so forth.
The following list describes the major issues that a database
comparison application must accommodate to be successful in a heterogeneous
replication environment:
- Connectivity – the application must be able to communicate with both the primary and
replicate databases. If multiple database vendors are involved,
ODBC or JDBC drivers can provide a common interface and functionality.
- Sort order – the default sort order may
differ between databases. The application may need to
force the same sort order on both sides so that rows can be compared in sequence, which improves comparison performance.
- Character sets – some primary and replicate
databases may store character data in different character sets.
Your custom application may need to support these translations.
- Object identification – primary and replicate
tables may not have identical names or exactly the same schema or
column names. The comparison application may need to accept very
explicit instructions for location, database, and table and column
names to be referenced.
- Subset comparison – the application may
need to compare only a portion of a table. The ability to specify
a where clause on the select statement for
both the primary and replicate tables may be important.
- Latency – in a replication system, there
is always some latency (a measure of the time it takes a primary
transaction to appear in a replicate table). A comparison application
must include some tolerance to distinguish between rows that are “not
there” and “not there yet.”
- Data transformation – the application must
be able to handle differences in precision and format between different
databases, the same way Replication Server supports class-level
translations. To simplify processing, you may want to allow certain columns
to be excluded from the comparison process, based on datatype (for
example, do not compare the DATE datatypes
of different database vendors).
- Large object (LOB) data – large object
(for example, LOB, CLOB, TEXT, or IMAGE)
datatypes cause additional processing issues because of their size.
To improve performance, limit the number of bytes used for comparison,
as long as a “non-match” can still be reliably detected
from the bytes that are compared.
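Several of the issues above (sort order, column exclusion, LOB truncation, and latency) can be illustrated with a minimal Python sketch. It is not how rs_subcmp is implemented. It assumes that the rows have already been fetched from both databases as dictionaries, sorted by the same key column in the same order, and that a caller-supplied function can re-query the replicate for specific keys. All function and parameter names (compare_sorted, confirm_missing, exclude, lob_prefix, fetch_replicate_keys) are hypothetical.

```python
import time


def normalize(row, exclude=(), lob_prefix=None):
    """Drop excluded columns and truncate long string/bytes values."""
    out = {}
    for col, val in row.items():
        if col in exclude:
            continue  # for example, vendor-specific DATE columns
        if (lob_prefix is not None and isinstance(val, (str, bytes))
                and len(val) > lob_prefix):
            val = val[:lob_prefix]  # compare only the leading bytes
        out[col] = val
    return out


def compare_sorted(primary, replicate, key, exclude=(), lob_prefix=None):
    """Merge-compare two row lists that are sorted by `key`.

    Returns three lists of key values: rows missing from the replicate,
    rows found only in the replicate, and rows whose values differ.
    """
    missing, orphaned, mismatched = [], [], []
    i = j = 0
    while i < len(primary) and j < len(replicate):
        pk, rk = primary[i][key], replicate[j][key]
        if pk == rk:
            left = normalize(primary[i], exclude, lob_prefix)
            right = normalize(replicate[j], exclude, lob_prefix)
            if left != right:
                mismatched.append(pk)
            i += 1
            j += 1
        elif pk < rk:
            missing.append(pk)
            i += 1
        else:
            orphaned.append(rk)
            j += 1
    missing.extend(row[key] for row in primary[i:])
    orphaned.extend(row[key] for row in replicate[j:])
    return missing, orphaned, mismatched


def confirm_missing(missing, fetch_replicate_keys, retries=3, delay=1.0):
    """Re-check 'missing' keys to allow for replication latency.

    `fetch_replicate_keys` is an assumed callback that takes a set of keys
    and returns those that now exist in the replicate table.
    """
    pending = set(missing)
    for _ in range(retries):
        if not pending:
            break
        time.sleep(delay)
        pending -= set(fetch_replicate_keys(pending))
    return sorted(pending)
```

Because the merge walks both inputs once, it depends on both queries returning rows in the same key order. Keys that confirm_missing still reports after the retries are the rows that are “not there” rather than “not there yet.”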
See the Replication Server Administration Guide and
the Replication Server Reference Manual > Executable Programs > rs_subcmp for more information on rs_subcmp.