Chris Kernaghan

Trial Cutover 2 (TC2) completed and the lessons we learnt from it (CUUC Part 6)

Christmas was long over, the January blues were closing out the month, and I had just spent another weekend at the Hilton at the Birmingham NEC performing TC2, which went very well. So I have decided to spare you the details of the process. Instead, I want to document how the analysis of TC1 influenced the upgrade process and whether those tweaks worked.

So what did we ‘tweak’ between TC1 and TC2? The table below details the items discussed in the last post and the updates from TC2.

As you can see from the table above, we made several changes to how we executed processes rather than changing the processes themselves. This was to provide the consistency that is necessary in the later stages of an upgrade project.

In terms of timings, the tables below show how TC2 ran compared to TC1 and the PoC CUUC process, and how our mitigation techniques worked for us, especially on the Unicode conversion.

So as you can see, the overall time has come down quite dramatically in both the Upgrade Uptime and the Unicode phases. This is primarily due to the index defragmentation done on the database over Christmas and our own improvements in running tasks.

As well as technical measures to improve performance, we also implemented a number of soft measures around people and upgrade management.

  • The shift patterns were improved. We found that the other upgrades were running at paces which were not accurately reflected in the plans. This caused problems for the team, as I wanted each person to have a ‘buddy’ to check upgrade inputs and reduce errors.
  • The previous post talked about the cutover communications. These were difficult during TC1 because the shifts were not optimal. After reworking the shifts we found that the communication methods worked quite well, but communicating with a geographically distributed team across timezones is not easy with so much at stake.
  • SAP StreamWork was a very useful repository for the upgrade documentation. The team was able to keep their documentation up to date and available to the whole project team. I also uploaded the project plan to StreamWork daily so my team could keep up with the plan revisions, as they did not have file system access at the client.
  • The technical team did a project plan walk-through, counting out each hour and detailing the tasks they would perform. Its usefulness is a difficult one to call. My team found it very useful, but it takes about 4-5 hours to get through, and on simpler projects I am not convinced of its value.

Coming out of TC1 and TC2 was a recurring issue around both backups and the fallback scenario, both of which are vital for a customer to ensure they can Return to Operation (RTO) as quickly as possible. The main issue that we had going into TC2 was the actual scheduling of the post-Unicode backup. This is a vital checkpoint, as the system has to be re-introduced into the backup schedule as soon as possible and the backup also needs to capture all the changes from the Upgrade, the Unicode conversion and the transports. To achieve this, we decided to run an online backup of the system as soon as the transports were imported and before an SGEN was executed. As a result, the backup would be running whilst the users were testing the system, and because an SGEN had not been run they would experience degraded performance. This was something they would have to live with, because creating a recovery point for the system was more important than a temporary performance issue.
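The ordering described above can be written down as a small check on a cutover plan. This is only an illustrative sketch, not the tooling we used: the step names, the `MUST_PRECEDE` list and the `order_violations` helper are all hypothetical, and the ordering of upgrade before Unicode conversion is assumed from the order in which the changes are listed.

```python
# Hypothetical step names; each pair means "the first step must come before the second".
MUST_PRECEDE = [
    ("upgrade", "unicode_conversion"),             # assumed ordering
    ("unicode_conversion", "import_transports"),   # assumed ordering
    ("import_transports", "online_backup_start"),  # backup starts once transports are in
    ("online_backup_start", "sgen"),               # SGEN only after the backup has started
]


def order_violations(plan):
    """Return the (before, after) pairs that the planned step order breaks."""
    position = {step: index for index, step in enumerate(plan)}
    required = {step for pair in MUST_PRECEDE for step in pair}
    missing = required - position.keys()
    if missing:
        raise ValueError(f"plan is missing steps: {sorted(missing)}")
    return [(a, b) for a, b in MUST_PRECEDE if position[a] > position[b]]


# User testing overlaps with the running backup, so it sits between backup start and SGEN.
good_plan = [
    "upgrade",
    "unicode_conversion",
    "import_transports",
    "online_backup_start",
    "user_testing",
    "sgen",
]

# A plan that runs SGEN before the recovery point has been started.
bad_plan = [
    "upgrade",
    "unicode_conversion",
    "import_transports",
    "sgen",
    "online_backup_start",
    "user_testing",
]

if __name__ == "__main__":
    print(order_violations(good_plan))
    print(order_violations(bad_plan))
```

A check like this could be run against each revision of the project plan, so that a rescheduled SGEN does not quietly slip ahead of the recovery point.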

The fallback scenario was another challenge. During the project the client had a requirement for an additional Pre-Prod environment to allow testing of a separate project stream which was falling behind and would not go live with the other work streams. At the time, providing this system was a major pain and caused a great deal of stress, but it would become a great opportunity for the fallback. The diagrams below show how we used the additional Pre-Prod system to restore the Pre-Upgrade backup on to it and, in the event of a fallback, re-present the storage to return to the previous version.

In the event of invoking a fallback, the process would change the SAN presentation of disks to re-present the restored database to the Production source, allowing a much shorter RTO than a straight restore from tape.

Now that we had the database backups and the fallback scenario constructed, all that was left to do at this point was to get ready for Production in a few weeks' time.
