I recently attended the SAP Cloud and Virtualisation week at the SAP Labs in Palo Alto. It was my first visit to Palo Alto and I have to say it did not disappoint.
As the week progressed, the attendees heard more and more about the technology trends and the operational best practices for that technology. I was struck by how often people referred to abstraction of the various layers of technology using virtualisation. This abstraction led me to think about whether administrators would abstract or abdicate their responsibilities in certain areas.
Every vendor had an abstraction layer and a tool set to work with that layer. These tool sets were designed to enable the platform to do things like:
1. Reduce the TCO of a server landscape through better utilisation of compute resources
2. Reduce the compute wastage inherent in today's landscapes due to over-powered servers
3. Increase the flexibility of landscapes through abstraction of application resources
4. Increase the effectiveness of administration staff through automation and increased use of modular items, such as stock O/S images and appliance factories
The points above require a little more than new architectures to become reality. They require automation and orchestration software: the ability to take a rule base and apply it to a homogeneous architecture in order to make the best use of your landscape, then periodically check the landscape and external stimuli and make adjustments accordingly.
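To make the idea of a rule base concrete, here is a minimal sketch in Python. It is not how any particular vendor's orchestration product works; the metric names (`cpu_pct`, `app_servers`), the thresholds and the action labels are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical metric snapshot for the landscape; the keys are illustrative only.
Metrics = Dict[str, float]

@dataclass
class Rule:
    name: str
    condition: Callable[[Metrics], bool]
    action: str  # a label for the adjustment, not a real orchestration call

# Assumed example rules: scale out when busy, scale in when idle.
RULES: List[Rule] = [
    Rule("scale out", lambda m: m["cpu_pct"] > 80, "start_app_server"),
    Rule("scale in", lambda m: m["cpu_pct"] < 20 and m["app_servers"] > 1, "stop_app_server"),
]

def evaluate(metrics: Metrics, rules: List[Rule] = RULES) -> List[str]:
    """Return the actions whose conditions match the current snapshot."""
    return [rule.action for rule in rules if rule.condition(metrics)]
```

An orchestration loop would call something like `evaluate` on a schedule and hand the resulting actions to whatever tooling actually starts or stops servers. The rules themselves still have to be written by someone who understands the landscape.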
So far so good and the technology is amazingly cool, but I see lots of issues with this approach.
1. “Don’t worry about that layer, we abstract that using this technology and you do not need to deal with it – the hypervisor does it”
I have a big problem with this type of thinking – of course I need to know what it is doing, that is my job. I don’t need to interfere with it, but I need to understand how it works and what it is doing so that when things go wrong I can understand the event cascade that forms the issue.
2. Horizontal scaling
As you start to plan your ability to auto-scale with performance demands, it becomes clear that horizontal scaling is easier than vertical: it is less disruptive to your landscape to start a new application server than to increase the CPU and RAM of an existing machine. The issue that I have is that you now need to manage more systems, with all the corresponding issues that go with that. How do you plan for patching a landscape that regularly changes in size?
Planning for this type of activity takes practice and consideration, something that becomes more difficult as time progresses – not because of technology but because of the human factors involved.
3. Lazy administrators
I fear lazy administrators; they are the stuff of nightmares for me. Perhaps I take my job too seriously or my responsibilities too far, but I believe that if you are paid to administer a landscape then you should ‘know’ your landscape.
Automation and orchestration systems are unlikely to be built by the admin teams. The complexity of the process mapping software and the time taken to develop the end-to-end processes mean that a project team will be formed to put in the system. This has two probable effects:
- The current admin team will have one or two people involved. If the admin team are lucky, these will be motivated and skilled people capable of holding their own against the consultants, ensuring their needs are met and communicating effectively with their admin colleagues about the new system and how to use it.
I have seen this scenario happen, twice in over 10 years.
- The admin team will become the 2nd generation users. As technology moves so rapidly, so too can the turnover of staff in larger teams. 2nd generation users are effectively working from the Standard Operating Procedures (SOPs) set down when the systems were originally built, and have little training. They lack a critical understanding of how something works, knowing only that it should provide this outcome. By the time you reach the 3rd generation of staff, people are divorced from the original project and need training on the system. Of course, by this time it’s due for an upgrade anyway, using another project team.
During the V-Week, I saw two other developments that I conceptually love and realistically fear.
1. Appliance factories – SAP have been working on an offering that, through a subscription charge, will give customers access to a ‘factory’ that will build SAP landscapes to order and provide images of these SAP landscapes for implementation.
As a junior Basis person, I learnt how SAP systems worked by installing them. I came to understand the interplay and dependencies of the components by observing the installations. From the log files of the 40b installations I worked out that you could interrupt the installation before the database build, restore the Oracle database, skip a few steps and continue the install to completion. This is now known as a homogeneous system copy, and I devised this method a full 2 years before the SAP note supporting it was released.
Later I learned many other people had also worked it out, and it felt less awesome somehow. The point is that a critical learning tool will be homogenised and made SAP standard, leaving our junior staff poorer for it.
2. Automated Daily checks – I really had a bad feeling about this one and could not put my finger on it. Systems like Tivoli and CCMS/SolMan already provide real-time automated monitoring, which in many cases can remove the need for daily checks. But as I thought more and more about it, I found my concern was about two things.
a. Signal versus Noise
When receiving alerts as they happen, it is sometimes difficult to discern a pattern, because you can be receiving alerts in many different forms: e-mail, SMS, screen prompts. Without an aggregated view over 24/36 hours of the systems and the connected satellites, patterns will be difficult to pick up, and it will be difficult to determine what is signal and what is noise in relation to a particular problem. I know monitoring software will have the ability to provide this view. The issue I see is how many people will go and view it on a regular basis unless they are either made to build it themselves or sent it by the system.
b. Rhythm of a system
As an administrator, performing the daily checks was a pain, but I developed a good system which enabled me to whip through them quickly and effectively. The checks allowed me to develop a sense of the SAP landscape’s rhythm. You may now stop laughing. Every system has a rhythm and a sequence of events that play out on a regular basis, because the system is based on rules and processes. For example, every month the month-end processing runs and the background processes are swamped. As a result, Mr Jones’ stock report falls outside its job slot and gets cancelled, so 1 week out of 5 Mr Jones’ report gets cancelled and he reruns it. These types of items are the things that make up the rhythm of a system, and I am not sure how automated daily checks will affect this ability to ‘know’ your system. It is unlikely that your daily checks system will be able to discern the ‘normal’ monthly event from something more sinister.
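As a rough illustration of what "knowing the rhythm" means in practice, the sketch below labels a cancelled job as expected only if it is a job already known to be squeezed out at month-end and the cancellation falls in that window. The job name `STOCK_REPORT`, the 3-day window and the function names are all assumptions for the example, not features of CCMS, SolMan or any other monitoring product.

```python
import calendar
from datetime import date
from typing import Set

def in_month_end_window(day: date, window_days: int = 3) -> bool:
    """True if `day` is within the last `window_days` days of its month."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day > last_day - window_days

def classify_cancellation(job: str, day: date, known_rhythm: Set[str]) -> str:
    """Label a cancelled job as part of the known rhythm or worth a look."""
    if job in known_rhythm and in_month_end_window(day):
        return "expected"
    return "investigate"

# Hypothetical usage: Mr Jones' report is known to be cancelled at month-end.
known = {"STOCK_REPORT"}
print(classify_cancellation("STOCK_REPORT", date(2024, 1, 31), known))
print(classify_cancellation("STOCK_REPORT", date(2024, 1, 15), known))
```

The `known_rhythm` set has to come from somewhere, and in this sketch it comes from an administrator who has already noticed the pattern. The tool only encodes knowledge that someone has to acquire first.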
So far this has been a fairly negative post about how technology can be used inappropriately. Now I want to concentrate on how the technology can be used to enhance the work of the administrator.
Jobs change, as do the requirements to be fulfilled within jobs, and the move to automated systems is a progression along that road. Just because I learnt something a particular way does not mean that it should remain that way. Practice evolves to meet the needs of the business and provide the services it requires. The ability to automate provides a true 24x7 service that is auditable and free from bias, because it is always on and based on pre-defined rules.
The number of technologies in use within our landscapes is growing and no one person can ever hope to be an expert in all of them, so we use alerting technologies to provide timely information to assist in resolving issues. The abstraction of technology also gives us more freedom as we interact with other teams. For example, provisioning a disk volume may previously have been outside our control, but with the right automation tools and scripts it is possible to extend an Oracle volume automatically when a storage threshold is reached. This cuts the turnaround for the collective teams from a possible 2 days to 30 minutes.
As a former administrator, my life was a constant struggle between compliance rules and the goal of making my life easier through automation, so I believe in the technology but not that the technology is the goal. I knew my landscape and its rhythm, and I could deploy a simple monitoring architecture to provide key metrics, aggregated over 24 hours, which would show me any trends. This made me much more effective without a corresponding decrease in my awareness of the systems.
So will this technology lead to administrators’ abdication of responsibility for their landscape? I think it will for many, for many reasons, and especially as time goes on. This is a subject I feel passionately about and I hope to have a debate about it with @Steverumsby and @Tomcenens, which will be refereed by the ever insightful @Jonerp.
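The threshold check behind that kind of automation can be very small. The sketch below uses Python's standard `shutil.disk_usage` to decide whether a mounted volume has crossed a threshold; the 85% figure and the function names are assumptions, and the actual extension step is deliberately left out because it depends entirely on the storage tooling in use.

```python
import shutil

def needs_extension(used_bytes: int, total_bytes: int, threshold: float = 0.85) -> bool:
    """True once the used fraction of the volume reaches the threshold."""
    return total_bytes > 0 and used_bytes / total_bytes >= threshold

def check_volume(path: str, threshold: float = 0.85) -> bool:
    """Check the filesystem that holds `path` against the threshold."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free bytes
    return needs_extension(usage.used, usage.total, threshold)

# Hypothetical usage: the mount point would be whatever holds the database files.
if check_volume("."):
    print("threshold reached: hand over to the storage automation")
```

Keeping the decision (`needs_extension`) separate from the measurement (`check_volume`) means the rule can be reviewed and tested without touching a real filesystem.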
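A 24-hour aggregated view of this kind does not need much machinery. Here is a minimal sketch, assuming alerts arrive as simple `(timestamp, system, alert type)` tuples; that shape, the system IDs and the alert names are invented for the example rather than taken from any monitoring tool.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple

Alert = Tuple[datetime, str, str]  # (timestamp, system, alert type)

def summarise(alerts: Iterable[Alert], now: datetime, hours: int = 24) -> Counter:
    """Count alerts per (system, alert type) within the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    return Counter(
        (system, kind) for ts, system, kind in alerts if cutoff <= ts <= now
    )

# Hypothetical usage with made-up alerts.
now = datetime(2024, 1, 2, 12, 0)
alerts = [
    (datetime(2024, 1, 2, 11, 0), "PRD", "job_cancelled"),
    (datetime(2024, 1, 2, 3, 0), "PRD", "job_cancelled"),
    (datetime(2024, 1, 1, 11, 0), "PRD", "job_cancelled"),  # older than 24 hours
    (datetime(2024, 1, 2, 10, 0), "QAS", "tablespace_full"),
]
for (system, kind), count in summarise(alerts, now).most_common():
    print(system, kind, count)
```

Repeated alerts of the same type on the same system rise to the top of the summary, which is where a trend becomes visible that individual e-mails or SMS messages would hide.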