Partial online recovery using RFI

Implementing RFI gives the DBA many more choices in the event of failure during online recovery. Before opting for log suicide, consider these advantages of RFI over log suicide:

  1. Isolated pages are known and can be examined. You can thus make an informed decision on whether to repair the faults or restore from backups.

  2. You can set thresholds to determine at what level page faults are unacceptable, and at which the whole database should remain unrecovered.

  3. You can make the database available to users while conducting repairs. The database can be configured to allow updates or to allow read-only access.

  4. Faults on system table pages cause recovery to fail for the entire database.

  5. You can implement a limited form of suicide recovery by disregarding all or some of the suspect pages and onlining them even if they are corrupt. The suicide is limited in the sense that only transactions associated with those pages are suspect. Recovery rolls forward (or back) other transactions in the log properly.


Implementing Recovery with RFI

The default granularity of recovery is at the database level. Take the following steps to implement page level granularity:

  1. Check or implement page granularity on desired databases using the sp_setsuspect_granularity stored procedure:

    sp_setsuspect_granularity [dbname [,{"database" | "page"} 
    [, "read_only"]]]
    

    If you set the granularity to page level, you have the option to set the database to read_only mode when recovery detects suspect pages. By default, all available pages are accessible for both reads and writes.

    NoteWherever possible, use the read_only mode. If a query attempts to access an offline page, the server raises error messages 12716 and 12717 regardless of whether the database is read_only. For more information on these errors see the chapter titled “Error Message Writeups.”

  2. Set the threshold for escalating page level granularity to database granularity using the sp_setsuspect_threshold stored procedure:

    sp_setsuspect_threshold [dbname [,threshold  ]]
    

    Once the number of offlined pages reaches this threshold value, recovery marks the entire database suspect. The default threshold value is 20 pages. It is unlikely that setting it much higher will be of much use since 20 corrupt pages is very likely to indicate corruption at a level than cannot be effectively repaired.

  3. Bring the suspect pages or database on line. You can print a list of pages or databases that are suspect after recovery using the sp_listsuspect_db and sp_listsuspect_page stored procedures:

    sp_listsuspect_db
    sp_listsuspect_page [dbname]
    

    You can bring these pages or database online using the sp_forceonline_db or sp_forceonline_page stored procedures:

    sp_forceonline_db dbname
      {"sa_on" | "sa_off" | "all_users"}
    
    sp_forceonline_page dbname, pagenumber
      {"sa_on" | "sa_off" | "all_users"}
    

    sa_on and sa_off toggle the database or page online and offline, and allow access to the database or page only to those with the sa_role set on. This permits the DBA to examine and repair the suspect database or pages without other users being able to access them.

WARNING! The all_users option is irreversible and makes the database or page available to all users. If no repairs have been made, this may result in some level of logical inconsistency.