Designing an efficient extraction procedure

It is very inefficient to create a large number of remote databases by running the Extraction utility for each one. You can make the process much more efficient. This section describes one way of making the process more efficient.

There are several potential causes of inefficiency in a large-scale extraction process:

  • The Extraction utility extracts one database at a time, including the schema and data for each user. Commonly, many users share a common schema, and only the data differs. The brute force method of running the Extraction utility for each user repeats large amounts of work unnecessarily. Extracting schema and data separately can help with this problem.
  • Running from Sybase Central, the Extraction utility creates a new database for each user. If subscribers share a common schema, you could create a single database, with schema but no data, and copy the file.
  • By default, the Extraction utility runs at isolation level zero. If you are extracting a database from an active server, you should run it at isolation level 3 to ensure that data in the extracted database is consistent with data on the server. See Extraction utility.

    Running at isolation level 3 may hamper others' turnaround time on the server because of the large number of locks required. It is recommended that you run the Extraction utility when the server is not busy, or run it against a copy of the database.

An efficient approach to extracting many databases

One approach that avoids these problems is as follows:

  1. Make a copy of the consolidated database, and at the same time start the subscriptions from the live database. Messages will now start being sent to subscribers, even though they have no database and will not receive them yet.

    To start several subscriptions within a single transaction, use the REMOTE RESET statement.

  2. Extract the remote databases from the copy of the database. As the database is a copy, there are no locking and concurrency problems. For a large number of remote databases, this process may take several days.
  3. As each remote database is created, it is out of date, but its user can receive and apply messages that have been being sent from the live consolidated database, to bring themselves up to date.

This solution interferes with the production database only during the first step. The copy must be made at isolation level three if the database is in use, and uses large numbers of locks. Also, the subscriptions must be started at the same time that the copy is made. Any operations that take place between the copy and the starting of the subscriptions would be lost, and could lead to errors at remote databases.