To me making sure your database is backed up is the most important task for a dba. But not only is it backed up, but it’s backed up correctly.
I just encountered a situation where a client started receiving numerous errors in their production system. There were almost a hundred ORA-600 and 7445 errors in the alert log, all pointing to different issues. Upon analyzing the alert log it was decided to rebuild the database using RMAN backups. The client stated the current hardware was old and giving them problems and they wanted to move the database to a new server. With a properly backed up database this should be an easy task to accomplish.
Without going into much detail the basic steps would be to:
- Install the Oracle binaries on the new server.
- Setup the directories the same as on the old server or as close as possible.
- Make the RMAN backups available to the new server either by coping them over to the new server or through a shared directory.
- Setup the init.ora and control files making any necessary edits to the init.ora file.
Now you should be able to do a full restore and recovery using RMAN and bring the database up on the new server.
A simple restore and recover, right?
Only if the database has been backed up correctly. The client had faced space issues on the old server and configured the backups to only do certain tablespaces. The tablespaces they were not backing up were index tablespaces and the sysaux tablespace. When the restore had finished the recovery failed because of the missing tablespaces. To get past this problem the database was brought up into mount mode and the missing datafiles were dropped using the offline drop command. The database could then be recovered, but there were still issues to be resolved with the missing datafiles. In most cases the tablespaces could be dropped, rebuilt and then the data restored using export and import. Errors were encountered when trying to drop the index tablespaces because a lot of the indexes were for primary keys and could not be dropped without first disabling them. All of the indexes had to be traced down and disabled. Once this was accomplished the tablespace were finally dropped.
The sysaux tablespace was a problem all on its own. Oracle documentation, ID 243246.1, states that the sysaux tablespace is mandatory, but then the Oracle Database Administrator’s Guide, 10g release 1 (10.1) part number B10739-01 states
‘If the SYSAUX
tablespace becomes unavailable, core database functionality will remain operational. The database features that use the SYSAUX
tablespace could fail, or function with limited capability.’
Dropping the tablespace was full of roadblocks. Researching on Metalink several documents can be found outlining how to handle different problems with the sysaux tablespace and steps you should follow. The only problems with these are that they require the tablespace to be available and this wasn’t possible in this scenario. The major problem in dropping the sysaux tablespace involved the AQ$ objects. When trying to drop the objects an error is encountered stating you have to use the DBMS_AQADM procedure, another catch 22. The DBMS_AQADM procedure would fail because it couldn’t read from datafile 3. I finally found a post by dpbradley that stated setting the trace level will let you drop the AQ$ tables. The command I used was alter session set events ‘10851 trace name context forever, level 2. After this was done the AQ$ tables were dropped and then I was able to drop the sysaux tablespace. There are still other issues with the database, but it is up and running and the client is going to create a new database and export/import the relevant schemas over to it.
What should have been a 5 hour restore and recovery, turned into a very time consuming and tedious task.
So remember, back up your database, but do it correctly.