Failed RLV Recovery

You encounter a recovery problem such as a checksum error reading a page from disk, a mismatched sequence number on the head / tail of a page, or an OS exception reading a page from disk.

Recovery occurs in four high-level phases:
  1. Initialization (the SYSIQRVLOG table is scanned and the log identity block is loaded. Tables are added to the recovery list if log pages exist).
  2. Commit log analysis
  3. Table log analysis
  4. Operations which belong to committed transactions are redone.
Recovery errors in phases 1, 3, or 4 result in an IQ server shutdown. An error in phase 2 is handled by performing an extended phase 3.
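The error-handling rule above can be sketched as a small lookup. This is only an illustrative model of the behavior described in the text, not the IQ server's implementation; the names Phase and on_recovery_error are hypothetical and do not correspond to any product API.

```python
from enum import Enum


class Phase(Enum):
    """The four recovery phases, numbered as in the list above."""
    INITIALIZATION = 1
    COMMIT_LOG_ANALYSIS = 2
    TABLE_LOG_ANALYSIS = 3
    REDO = 4


def on_recovery_error(phase):
    """Return the outcome described in the text for an error in `phase`.

    Hypothetical helper: it models the documented rule only.
    """
    if phase is Phase.COMMIT_LOG_ANALYSIS:
        # A phase 2 error does not stop the server; phase 3 is extended instead.
        return "extended table log analysis"
    # Errors in phases 1, 3, and 4 shut the IQ server down.
    return "server shutdown"
```

For example, on_recovery_error(Phase.REDO) returns "server shutdown", which is the situation the recommendations in this topic are meant to address.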

Recommendations
  1. Use two server startup switches to restrict access:
    • Use -gd DBA so that only users with the SERVER OPERATOR system privilege can start and stop databases on a running server.
    • Use -gm 1 to allow a single connection plus one DBA connection above the limit so that a DBA can connect and drop others in an emergency.
  2. Set -iqrvrec_bypass = 1 to bypass all RLV recovery. This option is intended for emergency repairs, such as dropping a problematic RLV table. As currently implemented, it disables further logging, but no other checks in the code prevent general RLV operations. This mode is therefore likely to be unstable if non-DBA users are allowed general access.
  3. Manually establish / correct the consistency of the database.
  4. Truncate the RLV portion of a table.  This may leave the database inconsistent, but will allow a subsequent recovery.
    Note: All data in the RLV portion of the table will be lost.
  5. Restart the server with normal recovery.