Using RFI

RFI allows partial recovery only on user objects. If it encounters corruption on system tables, recovery fails for the entire database.

RFI allows the database administrator (DBA) to select the granularity of recovery for each user database. The choices are:

Mark the whole database suspect on any recovery failure. This is the default behavior and it is how recovery worked in previous versions.
Set the number of pages that can be offlined during recovery and still allow the database to be opened. The DBA can decide whether the partially recovered database is to be open for updates or for reads only.

Finally, the DBA can set the database to be marked suspect on any recovery failure, then change the setting to recover all but the corrupt pages. In this mode users cannot access the database, enabling the DBA to determine the appropriate course of action and proceed accordingly.

There is a significant difference between RFI's page-level and database-level granularity. Database granularity mandates that all transactions in the database should either be completed and rolled forward, or rolled back and all changes backed out. In either case the database is in a logically consistent state at the end of recovery. In short, recovery is all or nothing. Any interruption in recovery that makes this impossible causes recovery to fail entirely, and the only sure way to guarantee a consistent state is to restore from backups. This can be problematical, depending on how many backups are available, their validity, and how recent they are.

Page level granularity, on the other hand, allows the server to offline corrupt pages in a transaction while onlining other pages. Since recovery has not been able to complete and verify the transactions, this will leave some of the transactions only partially available and all other transactions completely recovered as usual. There is no way to determine whether transactions that involved offlined pages are complete except by manual examination.

If, for example, a transaction changes rows on three pages and the changes on two pages are written to disk before the server stops, recovery would normally assure that the third page also was written to disk. If, however, recovery marks as suspect the page to which the third update is to be made, there is no way to determine whether the transaction is complete or incomplete; that is, whether all three pages are updated or only the first two. A transaction in this state is deemed to be partially available, as the changes to the first two pages are available while the change to the third page is unavailable, and it is not known whether it was changed.

At another level, consider a case where a page from a specific table is marked offline. Subsequent work is dependent on this page but only at an implicit level, meaning that it is assumed that business rules will be handled without explicitly coding referential checks. If the code were to explicitly check for the offline data, an error would be raised; but if this is not done and the work proceeds with only an implicit dependence on the offline pages (which cannot be restored to a consistent state), it may result in logical inconsistencies in the database. This is yet another reason we recommend that all dependencies between data be explicity coded via declared referential constraints, triggers or existence checks.

It is important to understand that while it is possible to bring corrupt pages online, doing so without first repairing the pages will result in logical and data inconsistency. When restoring a database by repairing offline pages (or by restoring objects to which the offlined pages belonged), therefore, the DBA must explicitly determine the degree to which logical consistency of the database may be suspect according to business rules and coding practices. Of course, restoring the database from a database backup and incremental transaction backups assures both the logical and physical integrity of the database through the last successful load of a transaction dump.

It is also important to run dbcc tablealloc or dbcc indexalloc with the fix option on any objects with suspect pages because the allocation information for these objects is also suspect.