Saturday, July 14, 2007

Lesson learned - Recovering from corruption

As if the previous lesson learned wasn't enough, it led to another painful lesson.

Our Microsoft PSS engineer had us take a backup of the mailbox stores that wouldn't mount, along with the transaction logs, so that we'd have something to fall back on should things not go well. Our mailbox stores are large enough that the backup would take several hours. Rather than have new mail bounce, he had us create empty databases by pointing the mailbox stores at file names different from the originals. This gave new messages a place to go.

After the backup completed, we restored one mailbox store's files to the previous day's state (which took a couple of hours). We then went through several rounds of replaying the logs until we figured out where the corruption started. When that process finally completed, we had a restored mailbox store as good as we could get it, plus a newly created mailbox store with a couple of days' worth of messages. The final step was to merge the two. This is done by placing one of the stores in the Recovery Storage Group and merging the new data with the old.

I'll interrupt the story here to say that because of the length of time involved, we ended up doing this with two different groups of people. The first group completed the merge and all appeared well. The second group went through the same process on another mailbox store, working with a different PSS engineer.

The bad news

When the second group got to the merge, it took a very long time, much longer than it had taken the first group. We could do nothing but watch the merge wizard slowly process each mailbox. Our mailbox store was large enough that the entire process took over a day to complete. Our users were patient and understanding, but at the same time displeased.

Lesson learned

Some of you may have recognized that what we were doing was a Dial-tone Restore. Henrik Walther wrote an excellent set of articles on this subject at MSExchange.org. The first group did exactly what Henrik described, but the second group left out one important step, which would have saved us many hours of frustration. Before performing the merge, you need to swap the mailbox stores so that you are merging the small store into the large one. The second group instead merged the large store, somewhere between 75GB and 100GB, into the small one, and that is why it took so long!
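
To make the point about merge direction concrete, here is a minimal Python sketch of the decision. It is not an Exchange tool and it doesn't touch any databases. The Store class, the plan_merge function, and the 2GB figure for the dial-tone store are made up for illustration; only the 75GB to 100GB range comes from our incident. It simply assumes the merge time grows with the amount of data that has to be copied.

```python
from dataclasses import dataclass


@dataclass
class Store:
    """A mailbox store described only by a label and its size (hypothetical)."""
    name: str
    size_gb: float


def plan_merge(restored: Store, dial_tone: Store) -> dict:
    """Pick the merge direction: copy the smaller store into the larger one.

    The smaller store is the merge source (the one to put in the Recovery
    Storage Group); the larger store stays in production as the target.
    """
    # Sort by size so the first element is the smaller store.
    source, target = sorted([restored, dial_tone], key=lambda s: s.size_gb)
    return {
        "merge_source_in_rsg": source.name,
        "merge_target_in_production": target.name,
        "approx_gb_to_copy": source.size_gb,
    }


# Assumed sizes: 90GB is inside the 75GB-100GB range from the post,
# 2GB for a couple of days of new mail is a guess.
plan = plan_merge(Store("restored", 90.0), Store("dial-tone", 2.0))
print(plan)
```

With those assumed sizes, the plan keeps the restored store in production and copies only the small dial-tone store, which is the swap the second group skipped.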

I strongly urge everyone to read through Henrik's articles to familiarize yourself with the process. You never know when you'll need it!

2 comments:

skippy99 said...

Excellent article Dean. I still have nightmares about my Exchange servers and never sleep well at night!!

Zack said...

Hey, have you ever run into a script or tool that will allow you to find a forwarding address? I know you can go into the profile and see what address the user is forwarding to, but is there a way to make a list of all people who have forwarding addresses set and who they are forwarded to?