Some migrations are easy and pain free and almost an afterthought. But eventually, we all get to the fun ones – the mission critical ERP implementation that is the life-blood of the business. You know the one – the one where, if it has even a minor hiccup, the CEO will be livid because your customers are severely inconvenienced. I have seen a lot of these projects succeed…..and I have watched others struggle mightily. From it all, I have learned that there are 5 critical success factors for migrating business critical ERP systems to ASE.
- Use the Best Practices for Migration and supplement with an OS DB migration expert and expert DBA resources.
- Have a performance & capacity team of DBAs, ABAP and OS admins that are aligned and committed to the success of the project.
- Make sure you set up and use a current version of DBA Cockpit
- Tune and monitor the statement cache
- Attack the top n problem queries in iterations
I have found that if you follow these guidelines, most of what is left is fine tuning and if any problems arise, they can be quickly resolved.
1 – Best Practices for Migration
Okay, this is not a shameless plug….it is more of a mea culpa. It has been a while since this has been updated. Sorry. The first iteration of the SAP on ASE Best Practices was actually split into two – one for migration (SAP Note 1680803) and one for runtime (SAP Note 1722359). At the time, it was felt that, with all of the runtime tuning optimizations, having it all in a single document was a bit much. In these days of Twitter, apparently anything longer than 140 characters or 3 pages is considered too difficult to expect people to read. Hmmmmmm…..back in my day, we had books. We actually read them.
However, subsequently, a lot of the runtime recommendations have been moved to their own SAP Notes – which should be considered by any customer. As we have started updating the last version for the newer ASE releases and incorporating the lessons we have learned and the features added to support SAP, we are probably going to pull it back into one document – but still with the links to some of the optional optimizations, etc. Hopefully, we should have it out in a few more weeks (it is in review).
However, the version that is there does contain a lot of useful information. Some of it may be dated and there may be faster/better ways to accomplish step whatever using new feature X in ASE service pack Y – but that is why you need an expert DBA resource (or more) that can review it. Now, by expert DBA resource, I am not talking about someone that can merely spell ASE or has worked on it some – I am talking about someone who will likely be a key part of the performance and capacity team (next topic) and thoroughly understands ASE performance with respect to memory/caches, query performance and parallel processing. Their key role here is to look at which features will ease your migration effort and downtime and facilitate the initial performance analysis in stabilizing the system. If you do not have such a person within your company, you may wish to hire someone or contract for a decent period of time. Yeah, both these options are likely expensive as really good DBAs don’t come cheap….but….think about it. If the system you are migrating controls the life of your company and is responsible for 100s of millions or billions of dollars in revenue…..outsourcing to a cheap resource just to reduce the cost of the migration is likely going to result in high risk. Outsourcing for supplemental operational experience (performing backups, monitoring error logs, etc.) is a workable solution to providing short term or long term supplemental staffing – however, you should make sure they are using DBA Cockpit.
The second resource that is invaluable is an OS/DB migration expert. I am not talking about merely certified – I am referring to someone who has done this multiple times. I mention this because of an experience I had working with someone from whom I learned a ton of invaluable information about migrations – e.g. leveraging DISTMON and other aspects that could dramatically reduce the time and ease the complexity. Again, much like the expert DBA resource, an OS/DB migration expert is likely not cheap – but the good news is that unless you have a plethora of landscapes, this is a more temporary staffing requirement.
2 – Performance & Capacity Team
I mention this based on my experiences. One of my first experiences with this was that it was obvious that while the executives were committed to the migration for the cost savings, the company’s core ABAP developers were adamantly opposed to a migration. While the migration happened – getting resolution to problems was an absolute nightmare. In addition, SAP ERP systems can often be best debugged when the team works together – in this case, there was none of that. It was almost a constant badgering of “well on platform X, we never had to do that”. First, like many folks, they missed the fact in the big picture that the migration to ASE almost always requires an upgrade to the SAP BASIS and kernel layers as well as a DBMS migration – and any changes to the application layer are much more intrusive. The second problem was that they had outsourced the hardware and OS implementation to a group who sort of did a fire-and-forget implementation without any consultation with the DBA staff or even the ABAP team. As a result, the customer struggled with environment issues for months. Unfortunately, the latter experience has been repeated all too often when OS admins simply install the system and strongly resist any OS tuning recommendations from either the ABAP or DBA staff and tend to argue the merits of their implementation based on a generic “best practices” vs. understanding that SAP NetWeaver or ASE can benefit from specific OS tuning.
By contrast, I worked on a project where the OS vendor was committed and provided onsite support and did a ton of knowledge transfer and tools skills to the project staff. They weren’t there every day – but they were there periodically and on the actual migration date. It made a real impression and changed my opinion of that company. They made sure that the tools and graphs were not hidden deep in the bowels of the sysadmin lair where they never see the light of day – but rather they were used. I remember walking into a meeting by the project manager where he pulled up the OS tools and showed the HW utilization graph by component (CPU, network, disk) during one of the migration test runs and challenged us with what we could do to utilize the HW more effectively to reduce the migration time. Not only was it extremely useful as we noted a common HW configuration issue (10GbE running at 1GbE) but it helped us to see the need to have a separate migration configuration of more cores/memory vs. a runtime configuration and we worked together on how to best do this to minimize downtime and effort.
This extends beyond the migration and long into production monitoring and capacity. As workload increases, all need to be aware of the trends of HW utilization by the various components and how that utilization has changed over time. One of the worst things I have seen is that often after the system is in production, no one seems to be able to come up with even the simplest of graphs/charts highlighting performance trends – whether HW utilization, key transaction times, or DB key performance indicators and trends.
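Even the simplest trend does not need much tooling. As a minimal sketch (not something taken from DBA Cockpit or any OS tool), the following fits a straight line through periodic utilization samples and estimates how many intervals remain before a ceiling is reached. The function names, the sampling interval, and the 80% ceiling are all assumptions made for illustration.

```python
def trend_per_interval(samples):
    """Least-squares slope of utilization (in percent) per sampling interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_bar = (n - 1) / 2                      # mean of 0..n-1
    y_bar = sum(samples) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(samples))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den


def intervals_until(samples, ceiling=80.0):
    """Intervals left before the fitted line crosses `ceiling` (assumed value).

    Returns None when utilization is flat or falling, 0.0 when the fitted
    line is already at or above the ceiling.
    """
    slope = trend_per_interval(samples)
    if slope <= 0:
        return None
    n = len(samples)
    intercept = sum(samples) / n - slope * (n - 1) / 2
    current = intercept + slope * (n - 1)    # fitted value at the last sample
    if current >= ceiling:
        return 0.0
    return (ceiling - current) / slope
```

Fed with, say, weekly average CPU utilization, this gives a rough "weeks of headroom" figure. A straight line is a crude model for a workload with month-end or seasonal peaks, so treat the result as a conversation starter for the team rather than a forecast.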
3 – Use DBA Cockpit
This is puzzling to me. SAP has spent a lot of time hiding the DBMS complexities and command differences behind a single GUI that focuses on SAP application requirements….and people don’t even try it…and often don’t even set it up. Now, I have to admit to being an old, grey-haired geezer that thinks the ultimate editor is ‘vi’ – but I, myself, am impressed with DBA Cockpit – especially as two of the most critical and common performance issues (statement cache and query tuning) are simplified and easy to get to. My first real experience with DBA Cockpit was helping out during an internal SAP IT migration of one of SAP’s internal systems…..people who know me know all too well my preference and heavy use of command line tools and scripts…and to say I was sold on DBA Cockpit would likely stun them with disbelief. There are still some aspects (such as deep query diagnostics) that I like to do manually – but from a performance monitoring/problem query isolation perspective – DBA Cockpit to me is a no brainer.
The other reason this is puzzling to me is that a lot of the drudgery DBA tasks such as update stats, reorg, etc. are automated in DBA Cockpit via the ATM jobs. Even if you find the ATM implementation insufficient for your largest tables (as one customer complained), having it manage 59990 of the tables in the schema while you develop a more focused solution for the top 10 largest tables reduces the amount of effort needed to develop the custom solution.
Also, make sure you are using a current version. A recent customer decided in their migration to simply upgrade their SAP layer to just the minimums needed to support ASE as this represented the closest release to their current version. The result was they got DBA Cockpit v1.0. While it had a lot of the functionality they were after, the current release at the time was something like v23. It had a LOT more functionality, fixes, and improvements and was a lot easier to use. Now, one of the things I often suggest in a large mission critical system is that there be a DBA host where DBA monitoring and custom DBA scripts reside – a separate system from the DBMS host. It should be a simple task to implement a more current version of the SAP NetWeaver kernel on that host so that a more current release of DBA Cockpit can be used. Lacking such a DBA host and assuming that all DBA work will be done on the DBMS host is clearly a recipe for catastrophic failure….one oooopps at the filesystem and something reeeallllly bad happens next. Isolation is the key to stability.
4 – Tune and Monitor Statement Cache
My first exposure to this problem was at a customer that was having debilitating CPU spikes. We spent hours looking for the usual culprits – bad queries, application loops, etc. And then the person with the least experience with ASE in the room points out the statement cache….ughhhh. In a matter of minutes and several iterations of increasing it – problem solved.
The statement cache in ASE is probably one of the most critical areas of ASE from a performance perspective for SAP applications. Not only is sizing it correctly a key factor – but also it is the entry point for all of the query tuning diagnosis. Unfortunately, the default is 100MB. For most mission critical systems, this is about 10% of the real requirement of ~1GB. Some sites may need 2GB. Keeping in mind that the correct size will depend on a lot of factors such as the breadth of different queries as well as the transaction rate, I might start with 250MB as a minimum for most small system installs and 1GB for larger systems and 2GB for VLDB. Then monitor and increase the size when you see considerable turnover occurring. By considerable, I mean any consistent turnover of 10% of the statements in cache.
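The starting sizes and the 10% rule above can be written down as a small helper. This is only a sketch: the tier names are hypothetical labels, and the two counters (statements currently cached, statements removed during the sampling interval) are assumed to come from whatever monitoring source you use; the code does not query ASE itself.

```python
# Starting sizes quoted in the text, in MB. The tier names are hypothetical.
STARTING_SIZE_MB = {"small": 250, "large": 1024, "vldb": 2048}

# Turnover level the text calls "considerable".
TURNOVER_THRESHOLD = 0.10


def starting_size_mb(tier):
    """Suggested initial statement cache size for a system tier."""
    return STARTING_SIZE_MB[tier]


def turnover(statements_cached, statements_removed):
    """Fraction of the cached statements that were removed in one interval."""
    if statements_cached <= 0:
        raise ValueError("statements_cached must be positive")
    return statements_removed / statements_cached


def is_considerable(statements_cached, statements_removed):
    """True when one interval's turnover reaches the 10% threshold."""
    return turnover(statements_cached, statements_removed) >= TURNOVER_THRESHOLD
```

A single interval over the threshold is not yet "consistent" turnover; the point of the check is to flag intervals worth watching before deciding to increase the cache size.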
This is where DBA Cockpit excels. If you simply look at the statement cache, the first chart you will see is something like the following:
It shows us that there are about 14,500 statements in cache with a single cache hit ratio drop to ~80%. Now, those that know me also know how much I hate cache hit ratio metrics. It may have been a really good metric several decades ago when memory was expensive and we didn’t have much of it. But in today’s large memory systems, it is almost useless. An in-memory table scan is an incredible waste of both CPU and memory – but yet will drive cache hit ratio to 100%. Consequently, I like to focus on cache volatility and try to find out what is causing cache fluctuations. To do this, I prefer to see how many statements were removed and inserted – more a graph similar to:
This of course points out one of the other neat features of DBA Cockpit – the ability to create and save custom user-defined graphics. Now, with 14,500 statements a sudden turnover of 2000 of them could be a concern if it continues. It could be due to someone running sp_recompile on a lot of tables, so a single spike such as the above is not a source for immediate concern – however, it is a bell ringer to start paying close attention.
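The difference between a one-off spike and turnover that continues can be sketched as a simple classification over a series of samples. The tuple layout, the function name, and the choice of three consecutive intervals as the meaning of "continues" are assumptions for illustration; only the 10% level comes from the discussion above.

```python
def classify_turnover(samples, threshold=0.10, sustained_intervals=3):
    """Classify statement cache volatility over a series of intervals.

    samples: list of (cached, removed, inserted) tuples, oldest first.
    An interval is "hot" when removals or insertions reach `threshold`
    of the statements in cache. Returns "quiet", "spike" or "sustained".
    """
    hot = [
        max(removed, inserted) / cached >= threshold
        for cached, removed, inserted in samples
        if cached > 0
    ]
    if not any(hot):
        return "quiet"

    # Length of the longest run of consecutive hot intervals.
    longest = run = 0
    for flag in hot:
        run = run + 1 if flag else 0
        longest = max(longest, run)

    return "sustained" if longest >= sustained_intervals else "spike"
```

A "spike" result is the bell ringer: worth checking whether someone ran sp_recompile, but not yet a reason to resize. A "sustained" result is the case where increasing the statement cache is the likely next step.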
5 – Top N Problem Queries
During the initial migration testing, someone should have identified key business transactions and compared the pre-migration to post-migration execution times and resolved any of those problems prior to migration. However, once the migration is complete and the system is in production, you need to focus on the rest of the system and how it is impacted by the migration. Reviewing every query in every module simply isn’t possible. The best approach I have found is to simply start tackling the top 5 or so bad queries – resolve them – and then repeat. After about 3-4 iterations, suddenly the system performance is so stable/predictable that it becomes almost boring – but still a good practice aspect for up and coming members of the performance and capacity team.
Again, this is where DBA Cockpit shines – one of the (in my mind) key displays it has is the top N bad queries. Now, I like to customize this by adding in columns such as SSQLID, Hashkey, TotalLIO, TotalCPUTime, and TotalElapsedTime and then saving that as the new default for the display. Once you have identified the top 5 queries, you may need to consider one of several actions:
- Add a better index. The fact that a query uses an index is not good enough – it should use an effective index and the more often the query is executed the more closely there should be an index that matches the predicates. It is fine for a query executed once a day to use partial index techniques…..a query that is executed millions of times a day needs an index to support it.
- Add histogram steps. It is a bit much to go into here, but the default histogram steps of 20 is too low for SAP – 50 is better – and for tables with large IN() or UNION clauses, 100 might be needed.
- Better histogram cell or density stats than the default options collect.
- Slight query modifications (to aid in costing or execution speed)
- Altering the query to add a hint via a PLAN clause
The last one is a last resort solution – although it may be a temporary solution implemented quickly to work-around the problem and restore business execution while diagnosing the real issue. However, these query optimization techniques are likely best left as a topic for another day.
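For anyone who wants to reproduce the "top 5, resolve, repeat" loop outside of DBA Cockpit, a minimal sketch follows. It assumes the query metrics have already been exported as dictionaries keyed by the column names mentioned above (SSQLID, TotalLIO, TotalCPUTime, TotalElapsedTime); the function names and the default ranking by TotalLIO are hypothetical choices, not how DBA Cockpit itself orders the display.

```python
def top_n(queries, n=5, key="TotalLIO"):
    """Return the n queries with the highest value for the given metric."""
    return sorted(queries, key=lambda q: q[key], reverse=True)[:n]


def next_batch(queries, resolved_ids, n=5, key="TotalLIO"):
    """Top n queries for the next iteration, skipping ones already resolved.

    resolved_ids: set of SSQLID values fixed in earlier iterations.
    """
    remaining = [q for q in queries if q["SSQLID"] not in resolved_ids]
    return top_n(remaining, n=n, key=key)
```

In practice you would refresh the metrics after each round rather than reuse the old export, since fixing the worst offenders usually changes the ranking of everything below them. Ranking by TotalCPUTime or TotalElapsedTime instead is just a matter of passing a different key.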