Part 2 – Unit Testing – Poltergeist: The Legacy
ABAP Unit is the most fantastic tool in the ABAP developer’s toolbox that they never use. Apart from maybe debugger scripting but let’s let that one go. Anyway why does no-one in ABAP world do unit testing, and how can I convince them otherwise?
At the end of the last blog….
…Casanova Frankenstein had thrown me into a vat of sulfuric acid filled with acid resistant piranhas and released a swarm of killer bees in case I try to jump out of the acid and he has an atom bomb timed to go off in ten minutes time surrounded by electrified barbed wire guarded by starving lions with laser beams coming out of their heads.
Luckily life keeps handing me lemons, so I used those lemons to neutralize the sulfuric acid, but not before jumping up and down a lot to make sure the acid killed all the bees. The specially bred piranha fish could no longer live in a neutral solution, so I used all the dead fish and bees to make a yummy fish/bee pie which tempted the lions to come and eat it, and whilst they were distracted I used the last of the acid to melt the barbed wire and defuse the atom bomb. Easy really – if only convincing people to use ABAP Unit was that easy.
Unit Testing – what’s that then?
In the ABAP world it seems to be 100% impossible to convince people of the worth of unit testing, or test driven development. It cannot be done, it’s impossible, all I am doing is banging my head against a six feet thick wall made of diamond.
Doctor Who, as portrayed by Peter Capaldi, managed to smash through such a wall, though it took him several billion years, and I am not convinced I have all that time available to me.
Nonetheless it does not stop me. In the novel “Legend” by the late David Gemmell, the invader Ulric asks one of the defenders “Why do you seek to stop me? You cannot prevail, my army is far too mighty” or words to that effect. The response came back “We cannot stop you, we knew that from the start. This was never what this battle was about”.
Ulric was confused, as well he might be. “So what IS IT about then?” he asks. Back comes the answer “We are TRYING to stop you.”
I am fighting the same battle with unit testing as a concept. No-one in the ABAP world (apart from me) wants to do this, which is coming through crystal clear, I am obviously – obviously – wasting my time, so why do I keep going? Have I not got something better to do?
No I don’t. The advantages are so great, and so obvious once you are even one step down the path, that if I spend the rest of my life shouting out about this, and convince even one ABAP person, the exercise will have been worthwhile.
Please find below a link to a blog I wrote on the subject some years back:-
I then wrote a book about ABAP programming. On the internet you get a free sample chapter, and that is about unit testing as well:-
What has happened here, to use another analogy, is somewhat like Cassandra in the books of Homer. She was cursed by the gods to always speak the truth and have no-one ever believe her. At this time I think I have hammered the point home enough.
NB from this point on I am probably going to be using the terms “unit testing” and “test driven development” interchangeably. They are different things, but the latter depends on the former, and I think they go together like, well like in the song from Grease:-
We go together
Like rama lama lama ka dinga da dinga dong
As shoo-bop sha wadda wadda yippity boom de boom
Chang chang changitty chang sha-bop
Dip da-dip da-dip doo-wop da doo-bee doo
Boogedy boogedy boogedy boogedy
Do you know the publishers at SAP Press suggested I did not use that exact analogy in my book? I cannot imagine why.
Anyway, earlier I mentioned about people at work thinking this (unit testing / test driven development) was some sort of radical idea I had come up with all on my own – I got comments such as “This is just a theory right? No-one has ever actually done this?” Sadly when it comes to the ABAP world they may well be correct in making such an assumption.
Again, just because it is new to the ABAP world, doesn’t mean unit testing is new to everyone else, and to prove this I went looking on the internet for its history.
USA programming guru Kent Beck, who worked on the Chrysler Comprehensive Compensation (C3) payroll system in the 1990s, is credited with “rediscovering” unit testing (TDD), and wrote it up in book form around 2003. When asked what he meant by “rediscover” he said:-
The original description of TDD was in an ancient book about programming. It said you take the input tape, manually type in the output tape you expect, then program until the actual output tape matches the expected output. After I’d written the first xUnit framework in Smalltalk I remembered reading this and tried it out. That was the origin of TDD for me. When describing TDD to older programmers, I often hear, “Of course. How else could you program?” Therefore I refer to my role as “rediscovering” TDD.
The book in question was “Program Checkout” in Digital Computer Programming (D.D. McCracken, 1957).
I was then interested in learning about the first practical use of test driven development (unit testing) in a computer context in real life. It turned out to be in 1942, when computers were gigantic and full of valves and gears and things, and like most advances in computer programming it was the female of the species behind it. In this case a group of women known as “Top Secret Rosies”.
In the 1940s, Top Secret Rosies were doing Test First with artillery tables on ENIAC – a group of female mathematicians were recruited to complete secret research for the US Army. Top Secret Rosies: The Female Computers of WWII is a one hour documentary that shares the little known story of the women and technology that helped win the war and usher in the modern computer age. Even to this day there are so very few female programmers, at least in the countries where I work e.g. Australia / UK / Germany, which is quite surprising considering the history of computer programming e.g. Lady Lovelace and so on.
Here I go again on My Own, down the only road I’ve ever known
So, let us get back to unit testing as a concept. What’s it all about Alfie?
12 – Theory of Unit Testing
As mentioned above, the concept of “test driven development” in its current form was first written about in 2003.
The idea is to embed within a program a series of low level tests which (a) prove that a certain routine does what it is expected to do and (b) that a change to one part of a program does not break another part of the program.
An automated framework for unit testing appeared for Java in the late 1990s, and for ABAP in the mid-2000s.
The industry term is “xUnit framework” and for each language you replace the “x” with the name of the language, so in Java this is JUnit and the SAP framework is called “ABAP Unit”. In the programming language invented by Frank Zappa the unit testing framework is called “Moon Unit”.
I was at the cheese counter at Tescos the other day and who should I meet but Lord Lucan and the Loch Ness Monster. Neither of them had ever heard of unit testing, and when I started to explain that the tests ran automatically they stopped me and said “A-Ha! A-Ha-Ha-Ha! We have no need of such a thing, as we already have HP Quality Centre (HPQC) in use!”
At first glance that seems reasonable – why would you need TWO automated testing tools? The fact is, they (ABAP Unit and HPQC) are very different beasts.
13 – ABAP Unit vs HPQC
The purpose of both tools is the same – regression testing – but how and when they go about it is very different as can be seen from the slide above. I think they are different enough that there is virtually zero overlap and if you can get to a stage where you have both in your development process then you win the entire contents of Bruce Forsyth’s conveyor belt. Didn’t you do well? What a lot he’s got. Nice to see you – to see you nice. Give us a twirl.
Anyway, enough of that – the point is there would be no point in using both tools if they were the same. After all, you don’t get anything for a pair – NOT IN THIS GAME! You’ll never beat Bruce Forsyth on a Saturday night because he is so chummy! Isn’t he chummy! Isn’t he chummy!
OK, time to calm down. The next point is in regard to what unit tests look like – the code should read just like the specification the functional person (hereafter business analyst) gave you. They should be able to have a look at the unit test code and say “that is just like the test script I wrote” or more importantly call you up if the tests look nothing like the test script / specification.
14 – What do ABAP Unit Tests look like?
To re-iterate – you write the tests, in such a way they could be understood by a business user if they should happen to read the code. This is a type of test driven development called “behaviour driven development”.
Unit Tests should be “executable specifications”. That is, the aim is that the code which executes the tests looks just like the written specification document….. That way you know you have met the specification because it executes…. Hence the name….
I will now consider a more real world example. There is nothing that drives me up the wall and across the ceiling more than examples where the variables and routine names are called A and B rather than actual business related names. The reason it is bad being driven up the wall and onto the ceiling is that when you get there you have to dance with Lionel Richie.
15 Real World Example
We start with a so called “use case” – a real person needs the computer system to do something for them for a real business benefit and the use case spells out that need. Then we describe the steps in the process to be encoded in the computer system.
Typically you need at least two tests for each such use case – you test the happy path where everything works out fine, and also that if things go wrong some sort of error is generated.
Now, how would such code look inside SAP? Let us look at some “we have always done it this way” code:-
It was mentioned above that unit tests only test the business logic, leaving tools like HPQC to test all the other stuff like interfaces.
Traditionally you need real people to do testing, and realistic data in the database, and a test instance of your external system. Testing also takes a long time, and there is so much you need to cover you can’t get anywhere near enough coverage. Moving production data to a test or development system is not easy, which is why so many custom solutions are on the market (including one from SAP itself).
Traditionally each subroutine in a program – even an OO program – did many things like reading from the database, asking the user a question etc. at the exact point in the program flow it was needed, mixed in with the business logic. This is because it is so incredibly easy to do.
When a hero in an action movie says “that was almost TOO easy” you know something really bad is about to happen to them. It is the same here – it is too easy to add authority checks, database reads, three bags of flour and a jellied eel all into the same routine. It is just like when a child mixes up lots of different sorts of brightly coloured plasticine into a ball and gets puzzled when the whole thing turns SAP grey.
To put it another way, most routines in a program are responsible for doing many things at once.
This makes it impossible to test the business logic independently from the database, external system or whatever.
Thus the first step is to separate out each “dependency” e.g. database access into its own class which just does one thing e.g. read from the database. You then replace direct reads with methods of the database access class. Let us see how that would look after this “refactoring”:-
17 Breaking Dependencies
As can be seen, every time we have a dependency a call is made to a class which has a specific (and single) responsibility for dealing with that dependency. We are back to the “single responsibility principle” once again.
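The slide itself is not reproduced here, so what follows is only a minimal sketch of the idea, written in Python rather than ABAP to keep it short. All the names (TruckDatabase, Weighbridge, TruckLoadingCheck) are hypothetical, borrowed from the truck weighing example used further down. The point is just that each dependency sits in its own class, and the business logic is handed those classes from outside rather than reading the database directly.

```python
from abc import ABC, abstractmethod


class TruckDatabase(ABC):
    """Single responsibility: fetch truck master data (hypothetical)."""

    @abstractmethod
    def maximum_weight(self, truck_id):
        ...


class Weighbridge(ABC):
    """Single responsibility: talk to the external weighing system (hypothetical)."""

    @abstractmethod
    def actual_weight(self, truck_id):
        ...


class TruckLoadingCheck:
    """Business logic only; every dependency is passed in from outside."""

    def __init__(self, database, weighbridge):
        self._database = database
        self._weighbridge = weighbridge

    def is_overloaded(self, truck_id):
        # No database read and no interface call in here, only calls
        # to the classes that own those responsibilities.
        maximum = self._database.maximum_weight(truck_id)
        actual = self._weighbridge.actual_weight(truck_id)
        return actual > maximum
```

Because TruckLoadingCheck only ever talks to the two dependency classes, anything that offers the same methods can be handed in instead of the real thing.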
Don’t get me wrong – as unit testing expert Chrissie Hynde would say – this is a lot of effort. It is well worth it though. Code that has been changed in the way described above is able to be tested using unit tests, because you can replace all the dependencies with “mocks” during the test run.
18 Mock Objects
For each of your “dependency” classes e.g. class that deals with the database, create a local “mock” – a class which looks like the original but does not really interact with the database or external system or whatever. It just sends back fake data.
In production mode the real classes are used. During the automated test run the “mock” subclasses are used.
You can do a test on the actual production code using the fake classes, and the routine under test will never know the difference.
That is the Liskov Substitution Principle at work – you can substitute a subclass and the calling code will not be able to tell the difference.
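Here is a rough sketch of that substitution, again in Python rather than ABAP and with made-up names. The production class is assumed to read the real database (it simply refuses to run here), the mock subclass returns a canned answer, and the routine under test is given whichever one the caller chooses.

```python
class TruckDatabase:
    """Production class: would read the real database (hypothetical)."""

    def maximum_weight(self, truck_id):
        raise RuntimeError("no real database available in a unit test")


class MockTruckDatabase(TruckDatabase):
    """Mock subclass: looks the same from outside, just sends back fake data."""

    def __init__(self, fake_maximum):
        self._fake_maximum = fake_maximum

    def maximum_weight(self, truck_id):
        return self._fake_maximum


def is_overloaded(database, truck_id, actual_weight):
    """Routine under test; it cannot tell which subclass it was handed."""
    return actual_weight > database.maximum_weight(truck_id)


# During the test run the mock is injected instead of the real class.
fake_database = MockTruckDatabase(fake_maximum=20)
result = is_overloaded(fake_database, "TRUCK-1", actual_weight=15)
```

The routine is_overloaded is the real production code in both cases; only the object passed in changes between production and test.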
Then you write a test that looks just like the specification – in our example:-
GIVEN TRUCK_MAXIMUM_WEIGHT( 20 )
GIVEN ACTUAL_TRUCK_WEIGHT( 15 )
WHEN TRUCK_IS_WEIGHED( )
THEN TRUCK_IS_NOT_OVERLOADED( )
You would then do another test for the opposite result i.e. when the truck is overloaded.
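As a rough sketch of how both tests might look as code, here they are using Python's unittest rather than ABAP Unit. The TruckWeighing class is a hypothetical stand-in for the real production logic, and the helper method names are simply the specification wording turned into method names.

```python
import unittest


class TruckWeighing:
    """Hypothetical stand-in for the production logic under test."""

    def __init__(self):
        self.maximum_weight = 0
        self.actual_weight = 0
        self.overloaded = None

    def weigh(self):
        # The only business rule: heavier than the maximum means overloaded.
        self.overloaded = self.actual_weight > self.maximum_weight


class TruckWeighingTest(unittest.TestCase):
    def setUp(self):
        self.truck = TruckWeighing()

    # The test methods read like the specification.
    def test_truck_under_maximum_is_not_overloaded(self):
        self.given_truck_maximum_weight(20)
        self.given_actual_truck_weight(15)
        self.when_truck_is_weighed()
        self.then_truck_is_not_overloaded()

    def test_truck_over_maximum_is_overloaded(self):
        self.given_truck_maximum_weight(20)
        self.given_actual_truck_weight(25)
        self.when_truck_is_weighed()
        self.then_truck_is_overloaded()

    # Helper methods hide the technical details from the reader.
    def given_truck_maximum_weight(self, weight):
        self.truck.maximum_weight = weight

    def given_actual_truck_weight(self, weight):
        self.truck.actual_weight = weight

    def when_truck_is_weighed(self):
        self.truck.weigh()

    def then_truck_is_not_overloaded(self):
        self.assertFalse(self.truck.overloaded)

    def then_truck_is_overloaded(self):
        self.assertTrue(self.truck.overloaded)
```

A business analyst only needs to read the two test methods at the top; the helpers underneath are where the programmer wires the wording to the production code.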
You end up with a whole bunch of such descriptive tests, each outlining a “use case” or “feature” of the application being developed. When doing proper test driven development you write the test first, and know you have the feature working when you have written enough production code to get the test to pass.
Moreover, the real payoff comes later, every time you make any sort of change to the program.
19 Running Unit Tests
Getting the code in a state where it can be tested is an enormous effort, like climbing a mountain, but just like climbing a mountain, once you are on the top you have a view (of the state of your code) that is second to none.
Once the framework is set up, every time you make any change anywhere in the program you run the “unit test” option from the SE80/SE38 menu. There is no database access, so within a few seconds you can see if you have broken anything and, if so, what broke and what the problem is. You can then navigate to the broken code directly from the above screen.
You can even set up such tests as regular batch jobs to check important programs (or everything). In addition you can trigger the tests as part of the Code Inspector checks that run when a transport is released.
The benefit of this should be obvious – in the same way you do a syntax check after every code change, to make sure the program still compiles, you can also do a unit test run, which is just as fast as the syntax check, and you can be sure changes to one area of the program have not broken anything else. In other words, you have covered your bottom.
In the words of Robert Martin “QA should find NOTHING!” In other words by the time the code change gets to testing the unit tests have proven beyond doubt that the code works (or at least does what it did before). This is the ultimate bottom covering exercise.
This is wonderful, this is the Holy Grail, this is the greatest thing since sliced bread. The mere existence of such a concept should cause spontaneous street parties throughout the world. So, why does no-one ever do this in real life?
20 Twice as Long
The unit test framework does not appear to be used very much. Part of this is people just don’t know it exists, but there are also some commonly used arguments against using it.
The main one is … it takes twice as long to write a program with unit tests. When managers hear that they will be horrified, because that implies you will be half as productive.
However since 95% of effort on programs is on maintenance, over the entire life cycle of the program the benefit becomes a lot more obvious. But how many people can think in the long term or even in the medium term? It is rather like large companies being unable to plan for the long term because the management is judged only on the last three months (by the media and the so called analysts) and if they have a good three months they get a huge bonus and if they have a bad three months they get sacked, so there is no incentive at all to take a long term view.
Hereby lies the problem – it is obvious common sense to think for the long term (from pensions, to climate change, to doing something properly in IT rather than focusing on the need for speed) but there are enormous social pressures forcing people to behave stupidly for short term gain. The global financial crisis of 2008 was caused by that sort of short term thinking – give someone a loan they cannot possibly pay back, give them a really low interest rate for five years, get a huge bonus right here and now, and then not think about what would happen after five years.
I am amazingly lucky that where I work IT management – right up to the CIO – do indeed think about the future, and so have given me a green light to go ahead with adding such unit tests to our existing core programs, as well as any new ones. In fact I have been doing unit tests for years now, so I know for a fact this is going to work, and won’t people (end users) be glad when it does?
Actually probably not – bugs are like internal organs, you only notice them when something goes wrong. It is rather like the news – you never get front page headlines like:-
“In Venezuela, everyone is happy and content!”
“Company President is really nice guy, has never done anything wrong in his life!”
“Economy has never been better, Government doing really good job!”
When the constant stream of bugs caused by program changes dries up due to unit tests choking them at birth then people will start to forget how bad it was before. However I don’t care – I will remember.
Getting back on track, there are three ways to introduce unit tests to programs:-
- For new programs, use test driven development, that way the tests are there from the start
- Totally re-write monolithic programs in a testable way, adding tests as you go, and taking a year (that is what I am doing BTW)
- Add tests to existing programs
The last one is the hardest, as might be imagined. Much has been written on this subject but all the literature starts off presuming your program is already constructed in an object oriented way, just without tests. In ABAP world you are 99.9% likely to be dealing with a huge procedural program.
For what it is worth, here is how I would go about such a task, and in fact what I have been doing for the last five years to one business critical program I generally look after, until I got permission to re-write it.
21 No One Ever Happy
I say when you have a change to a big program, write a test for the new behaviour, it fails, change the code, the test passes. Changing the code is easy to say, far harder to do as you have to replace (say) the database calls in the routine being changed with calls to a database class. However you are only doing this in one routine, not the whole million line program.
Then you have one unit test. Let it go at that, do not bother refactoring anything else. You know the routine you have changed works, you have no idea if you have broken anything else. You now have 0.01% of the program under test. You may well have broken something in the other 99.99% but this is no different to any previous changes you have made before the advent of unit tests.
When the next change comes along to the program, the next week, to a totally different part of the program, repeat the procedure. Then you have two tests, and furthermore you know the second change has not broken the first piece of functionality. At this point you have 0.02% of the program under test – you still may have broken something in the other 99.98% but you are ever so slightly better off than you were before.
If you rigidly enforce this then after a while you will benefit from what I call the law of “no-one is ever satisfied” which translates to “if a part of a program is being used by a human, sooner rather than later they will want it changed”. Therefore after a five year period every single part of the program that is in use will have been subject to a change request, and thus will have a unit test of some sort. This is very simplistic, but that is the general idea. New programs should have tests from the start.
The very first program I had to write using the “radical new” idea of using SOLID principles, the factory pattern and so forth, was the most simplistic program possible. The very sort of program where people say OO is an overkill.
It was chosen for that very reason I think – the logic in the program was so trivial, if all this SOLID OO stuff turned out to be stupid or fatal then nothing important would be damaged.
All the program did was, for a given country, compile a list of buttons and arrange those buttons on a screen. When the user pressed a button a transaction was called. Just like The Fiori Launchpad, but in the SAP GUI. As I said, a trivial exercise, usually you would have the thing written during the course of a morning.
I already had a twelve year old procedural version. Wrapping such a beast up in OO and using the factory pattern took a little bit of time, but nothing too arduous. Now I had to live up to my promise of putting unit tests in all new programs.
Things get even sillier, as initially we only have one country who will be using this, though of course the idea is to future proof this, so when other countries want to come on board with a different list of buttons, we only need to create a new subclass.
I make use of the ZCL_AF_FACTORY generic method to pick the correct subclasses. In this case we only need a country specific subclass for the model, but it is possible that one day a country specific subclass will be needed for the view, so the program caters for this.
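The internals of ZCL_AF_FACTORY are not shown here, so the following is not how that class works, just a minimal Python sketch of the general idea of a factory picking a country specific subclass. Every name in it (ButtonModel, ButtonModelGB, MODEL_BY_COUNTRY, create_model) is hypothetical, and the transaction codes are only there as sample data.

```python
class ButtonModel:
    """Default model, used when a country has no subclass of its own."""

    def buttons(self):
        return []


class ButtonModelGB(ButtonModel):
    """Country specific model for GB."""

    def buttons(self):
        return ["VA01", "VL01N"]  # illustrative transaction codes


# Hypothetical registry: country code -> model subclass.
MODEL_BY_COUNTRY = {"GB": ButtonModelGB}


def create_model(country):
    """Return the country specific subclass if there is one, else the default."""
    model_class = MODEL_BY_COUNTRY.get(country, ButtonModel)
    return model_class()
```

In a set up like this a new country only needs a new subclass and one more entry in the registry; the code that calls create_model stays as it is.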
The program works fine, there is not much to it – it just shows a big bunch of buttons on the screen, each one calling a transaction code.
However I could not let this go without adding unit tests. As such I have added a special INCLUDE to store the test class.
If I was doing this properly I would have started with the tests and built the rest of the application around it. As I was copying an existing program, albeit totally rewriting it, I did this at the end.
Now the program only does two things – shows the user the buttons on a screen, and then does a CALL TRANSACTION when the user presses a button. You can’t test either of those with a unit test. So the obvious conclusion is – do not bother with unit tests in this case.
Well I am going to do it anyway.
While I was Going up the Stair, I met some Behaviour that Wasn’t There
If the program was blank i.e. had no code at all, then you have to say “what is the most important thing that this program does NOT do?” and then write a test for this missing behaviour. A gentleman called Dan North came up with this approach.
If 60’s pop group the Zombies were writing a program with unit tests the most important behaviour for them would be the existence of their girl, but with no code in their program they would have to sing “Please don’t bother trying to find her, SHE’S NOT THERE”. So they would write a test in which the FIND_GIRL method returned a positive result. Naturally that test would fail (because she’s not there) and they would then have to add code until she was there, and then the test would pass. That would ruin the song though, but you can’t have everything.
Going back to my program, since the UK asked for this, the most important thing a blank program would NOT do without any code is present a GB user with a bunch of buttons.
So that will be the first unit test. If buttons were there the next most important thing the program would not do was react to one of those buttons being pressed. So that is the second test.
22 Test Class Definition
If someone read that, a BSA for example, they should get a general idea of what the application does, hence the unit tests should form what is called an “executable specification”. That is, the test says what the application should do, and then goes ahead and proves it actually does it. Since we are limited to 30 characters, it is often difficult to properly express the intent in the name however. In the above you cannot really tell what the “buttons” are for, so I use comments.
Each of the FOR TESTING methods is further split into three sections, which set up the starting position, invoke the production code to be tested, and then see if the test has passed or failed.
23 Given When Then
I then start with the tests. I cannot test if the screen actually appears, or if it looks OK; all I can test is if the table of buttons was passed into the view. Likewise, I cannot test the CALL TRANSACTION; I can only test that raising an event from inside the VIEW will cause the value to get transferred to the USER COMMAND method of the model.
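To make those two tests a bit more concrete, here is a minimal sketch in Python (not the ABAP from the slides, which are not reproduced here). FakeView, Model and Controller are hypothetical names; the fake view just records what it was given, and the model records the command instead of calling a transaction.

```python
class FakeView:
    """Test double for the view: records what it is given, draws nothing."""

    def __init__(self):
        self.buttons_shown = None
        self._handlers = []

    def display(self, buttons):
        self.buttons_shown = list(buttons)

    def on_button_pressed(self, handler):
        self._handlers.append(handler)

    def press(self, command):
        # Simulates the event the real view would raise on a button click.
        for handler in self._handlers:
            handler(command)


class Model:
    def __init__(self):
        self.last_command = None

    def buttons(self):
        return ["VA01", "VL01N"]  # illustrative transaction codes

    def user_command(self, command):
        # The real method would go on to call the transaction; this one only records.
        self.last_command = command


class Controller:
    def __init__(self, model, view):
        self.model = model
        self.view = view
        view.on_button_pressed(model.user_command)

    def start(self):
        self.view.display(self.model.buttons())


def test_buttons_are_passed_to_view():
    # GIVEN a model and a view wired together
    model, view = Model(), FakeView()
    controller = Controller(model, view)
    # WHEN the application starts
    controller.start()
    # THEN the view has been handed the table of buttons
    assert view.buttons_shown == ["VA01", "VL01N"]


def test_button_press_reaches_user_command():
    # GIVEN a model and a view wired together
    model, view = Model(), FakeView()
    Controller(model, view)
    # WHEN the view raises its "button pressed" event
    view.press("VA01")
    # THEN the value arrives at the model's user command method
    assert model.last_command == "VA01"
```

Neither test proves the screen looks right or that the transaction starts, but both would fail at once if the wiring between model and view got broken.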
Nonetheless, creating those tests ensures no really low level errors have occurred, and also forces me to structure the program correctly e.g. having a separate class for database access. The more business logic (conditional logic, calculations) an application actually has the bigger the benefit from unit tests but there is always some.
NB I still am learning here myself, I am sure there are better ways to structure the tests. This is partly why I write these blogs – in the hope someone out there has done this before, been there got the t-shirt, and will tell me a far better way to achieve my goals.
This Car is Monolithic, So Horrific – this Car is a Greased Dead Snail
I mentioned earlier that I (with colleagues) have a task to re-write one or two gigantic monolithic procedural monsters in a SOLID manner.
It is going to be difficult enough getting the structure right for this and moving the existing code to the correct place and so forth – this will be the subject of part three of this blog.
However since this is a “no pain, no gain” type of exercise why not make the task even more difficult by adding unit tests at the same time as re-writing it? Yes that sounds like the go. Possibly to make this even more of a challenge I should do this blindfold, wearing boxing gloves, whilst cycling over the Grand Canyon on a tightrope that is on fire.
Naturally the code has no tests, and is 18 years old, happy birthday to it, and moving code out into classes is a very different matter than working out what needs to be tested.
There are two strands to deciding what tests to add, each strand relating to part of the program’s lifecycle.
The first 5% of a program’s lifecycle is when you get the original specification – we need a program to do this and that. Usually there are a manageable number of features, and they are all obvious. Here you can add unit tests using the “most important thing the program does not do” technique, by imagining the program has no code. This way you are still sort of doing a “test first” approach, even if you are actually adding the tests last.
Then we come to the remaining 95% of the program’s life cycle, the part that never ends until the day you migrate off your SAP system and onto something else. SAP of course would tell you that the latest EHP would probably provide all the functionality you have spent twenty years developing, right out of the SAP box, so give up your Z code and return to standard, everything will be OK. Oh, and if it is not, change your business process so it agrees with the way SAP wrote their code. That argument did not work in the year 2000, and I am not convinced it is going to work this time around either no matter how many buzzwords like “digital transformation” and “paradigm shift” and “ten million tons of bananas” get wrapped around the argument.
What was I talking about? Oh yes – after the program has been written (5%) then you get a never ending stream of change requests – either bug fixes or enhancements. That is the 95% of the program lifecycle.
Now I have no doubt you have a really robust change control process with detailed documentation on every change to a program, documentation which is really easy to find. You don’t? Oh dear, who would have thought.
If you don’t have ANYTHING then you are sunk, in more ways than one, but hopefully the vast bulk of organisations would have something however trivial. In my organisation in Australia we have RevTrac, and each change has the transports bundled with the documentation and test script, and a log of who approved each step in the journey from DEV to PROD. It is therefore quite easy to get a list of changes and why they were made.
However, even a spreadsheet with the list of changes for the month, quarter or whatever, would be good enough if it said who was asking for the change and why, and hopefully described the nature of the change!
Lipstick on a Bug
Changes are either fixes to something broken (a bug), or a request for extra functionality (new feature) or a request for new functionality masquerading as a bug, as whoever is requesting it thinks it will get through easier that way.
If it is new functionality then you will be able to build a use case as described above i.e. as an XYZ I need the program to do ABC to give me business benefit 123.
Note you can write the test in the GIVEN / WHEN / THEN format without the slightest clue as to what routines in the program actually perform the desired logic. You can write that test first, it will fail, and THEN you go hunting for the routines. If you can’t find them easily then the naming of the routines is probably all over the place, and you need to rename the routine (method) so it says what it actually does, and thus the next person who goes looking for it can find it.
If it is a bug then you can still write the test – hopefully you will have been told how to reproduce the problem (i.e. the GIVEN and WHEN parts) and what the desired result is as opposed to what had been actually happening (the THEN part).
Mr. Coupling makes Exceedingly Bad Tests
One easy trap to fall into would be to write one unit test per method. Indeed that is exactly what some people tell you to do on the internet – moreover the SAP ABAP framework lets you press a button and automatically generate a test class for each of your methods.
That sounds intuitively right does it not? You want to test a small piece of logic, so a unit test would seem just the thing. Indeed when a routine is clearly broken, you most likely need to do just that.
However Robert Martin wrote a great article as to why this is bad as a general concept.
In essence you want to decouple your tests from the structure of your program, as otherwise when you change the implementation details of your program the test will go all haywire. He notes that while you want your production code to be as generic as possible, handling all possible situations, your tests should be as specific as possible.
The exact quote is:-
It is entirely impractical to specify absolutely everything. So what happens instead is that we gradually increase the generality of the production code until every test that we could possibly write will pass.
Woah Cowboy! We keep writing failing tests in order to drive the generality of the production code to a point where it becomes impossible to write another failing test. Woah Cowboy!
Leaving cowboys aside, he is saying that the more specific you make your tests, the more generic the real code is forced to become. This sounds like a paradox, and indeed it is a famous paradox in IT.
Anyway the theory (paradox) is that it is easier to write a program that solves a general problem, rather than the specific problem that you are trying to solve.
The third and last blog in this series will be in regard to the exact mechanics of re-writing a monolithic program in a SOLID manner.
The problem I have is that Casanova Frankenstein has captured me again, and this time made me drink poison, and then dangled me upside down above a live volcano, inside a straitjacket filled with dynamite, suspended by a burning rope, with flying monkeys firing machine guns at me
So … stay tuned for the next exciting episode in the series… will I escape from the evil villain’s clutches? What is this “program re-writing” of which I speak? How am I going to implement all of this in real life? (If I survive)
Cue theme music…
PS Many thanks to the Bar D’Aix En Provence, and O’Reillys Irish Pub in Heidelberg for letting me use their Wi-Fi (which was invented in Australia by the way) whilst writing this. Naturally Germany is full of French and Irish places.