Introduction

Usually when I blog on SCN I write about some specific development problem and the solution I found for it. In contrast, this blog is about a more abstract topic, namely how to debug code efficiently. While it is quite easy to debug SAP code (the business suite is open source after all, at least the applications written in ABAP), debugging a certain problem efficiently is sometimes quite complex. As a result I’ve seen even seasoned developers getting lost in the debugger, pondering over an issue for hours or days without getting close to a solution. In my opinion there are different reasons for this. One of them is that some special approaches or practices are necessary in order to find the root cause of complex bugs using debugging.

In this blog I try to describe the approaches that, in my experience, are successful. However, I’d also be interested to learn which approaches you use and what your experiences are. Therefore I’m looking forward to some interesting comments.

Setting the scene

First I’d like to define what I would classify as complex bugs. In my opinion there are basically two categories of bugs: the simple ones and the complex ones 😉 . Simple bugs are all the bugs that you would be able to find and fix with a single debugger run or even by simply looking at the code snippet. For example, copy and paste errors or missing checks of boundary conditions fall into this category. By simply executing the code once in the debugger every developer is usually able to immediately spot and correct these bugs.

The complex ones are the ones that occur in the interaction of complex frameworks or APIs. In the SAP context these frameworks or APIs are usually very sparsely documented (if documentation is available at all). Furthermore, in most cases the actual behaviour of the system is influenced not only by the program code but also by several customizing tables. In this context identifying the root cause of a bug can become quite complex. Everyone who has ever tried to debug e.g. the transaction BP and the underlying function modules (which I believe were the inspiration for the Geek & Poke comic below) or, even better, a contract replication from ERP to CRM knows what I’m talking about. The approaches I will be discussing in the remainder of this blog are the ones I use to debug in those complex scenarios.

http://geekandpoke.typepad.com/.a/6a00d8341d3df553ef016767875265970b-800wi

Know your tools

As said in the introduction I want to focus on the general approach for debugging in this blog. Nevertheless, an important prerequisite for successful debugging is knowing the available tools. In order to get to know the tools you need to do two things. First, it’s important to keep up to date with new features. In the context of ABAP development SCN is a great resource to do so. For example, Olga Dolinskaja wrote several excellent blogs regarding new features in the ABAP debugger (cf. New ABAP Debugger – Tips and Tricks, News in ABAP Debugger Breakpoints and Watchpoints, Statement Debugging or News in ABAP External Debugging – Request-based Debugging of HTTP and RFC requests). Stephen Pfeiffer’s blog on ABAP Debugger Scripting: Basics and Jerry Wang’s blog Six kinds of debugging tips to find the source code where the message is raised are also great resources to learn more about the different features of the tools. Besides the debugger, tools like checkpoint groups (Checkgroups – ABAP Development – SCN Wiki) or the ABAP Test Cockpit (Getting Started with the ABAP Test Cockpit for Developers by Christopher Kaestner) can also be very useful to identify the root cause of problems. Second, reading about new features and tools is not enough. In my opinion it is important to take some time once in a while to play with the new features you discovered. Only if you have tried a feature in a toy scenario and understood what it is able to do and what not will you be able to use the feature to track down a complex bug in a productive scenario.

Besides the development tools there are other important tools you should be able to use. When colleagues ask me whether I know what the cause of a certain bug could be, I recently adopted the habit of replying with the question whether they have already performed a search on SCN and in the SAP support portal. In a lot of cases the answer is no. However, in my opinion searching for hints in SCN and the SAP support portal should be the first step whenever you encounter a complex bug. Although SAP software is highly customizable and probably no two installations are the same, those searches usually result in valuable information. Even if you don’t find the complete solution you will at least get information about the areas in which the cause of the bug might be. And last, but not least, an internet search usually turns up some interesting links as well.

Thinking about the problem…

The starting point for each debugging session is usually an error ticket. Most likely these tickets were created by a tester or a user who encountered an unexpected behaviour. Alternatively the unexpected behaviour could also be encountered by the developer during developer testing (be it automated or manual). In the first case the next step is normally to reproduce the error in the QA system. Once a developer is able to reproduce the error it is usually quite easy to identify the code that causes an error message or an exception (using the tools described in the previous chapter). If no error message or exception but rather an unexpected result is produced, identifying the starting point for debugging can already become quite challenging.

In both cases I recently adopted the habit of not starting up the debugger immediately. Instead I start by reasoning about the problem. In general I start this process off by asking myself the following questions:

  • What business process triggers the error?
    The first question for me is always which business process triggers a certain error. Without a detailed understanding of the business process and the context in which the error occurs, identifying the root cause might be impossible.
  • What does the error message tell me?

In the case of a dump this is pretty easy. The details of the dump clearly show what happened and where it happened. However, in the case of an error message the first step should always be to check if a long text with detailed explanations is available. Most error messages don’t have a detailed description available. But if a detailed description is available it is usually quite helpful.

Even the error messages without detailed descriptions can be very helpful. For example, error messages following the pattern “…<some key value> not available.” or “….<some key value> is not valid.” usually point to missing customizing. In contrast to that, a message like “The standard address of a business partner can not be deleted” points to some problem in the process flow. Once one gets used to reading the error messages according to these kinds of patterns they are quite useful for narrowing down the root cause of an error.

  • Which system causes the error?

Even if it seems to be a trivial question it is in my opinion quite an important one. Basically all software systems in use today are connected to other software systems. So in order to identify the root cause of an error it is important to understand which system (or which process in which system) is responsible for triggering the error. While this might be easy to answer in most cases, there are some cases where answering this question is far from trivial. For example, consider an SAP Fiori application that is built using OData services from different back-end systems.

  • In which layer does the error occur?

Once the system causing an error is identified, it is important to understand in which layer of the software the error occurs. Usually each layer has different responsibilities (e.g. provide the UI, perform validation checks or access the database). For example, in an SAP CRM application the error could occur in the BSP component building the UI, the BOL layer, the GenIL layer or the underlying APIs. Understanding in which layer an error occurs helps to take shortcuts while debugging. If the error occurs in the database access layer it’s probably a good idea not to perform detailed debugging on the UI layer.

Usually I try to get a good initial answer to these questions. In my opinion it is important to come up with sensible assumptions as answers to these questions. If the first answers obtained by reasoning about the error are not correct, the iterative process described below will help to identify and correct them.
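The message patterns mentioned above can be turned into a small first-guess classifier. The following is only a minimal sketch in Python (used here as neutral notation, not ABAP), assuming that the two phrasings quoted in the text are the only known patterns. The function name `classify_message` and the returned labels are hypothetical and chosen for illustration.

```python
import re

# Assumption: only the two phrasings quoted in the text are treated as
# hints for missing customizing. Real systems will need more patterns.
MISSING_CUSTOMIZING_PATTERNS = [
    re.compile(r"\bnot available\.?$", re.IGNORECASE),
    re.compile(r"\bis not valid\.?$", re.IGNORECASE),
]


def classify_message(text: str) -> str:
    """Return a first guess at where to look, based on the message wording."""
    stripped = text.strip()
    for pattern in MISSING_CUSTOMIZING_PATTERNS:
        if pattern.search(stripped):
            return "check customizing"
    # Anything else is treated as a hint at the process flow. This default
    # is a crude simplification, not a rule from the text.
    return "check process flow"


if __name__ == "__main__":
    for message in (
        "Country key XX not available.",
        "The standard address of a business partner can not be deleted",
    ):
        print(message, "->", classify_message(message))
```

The result is only a starting assumption for the questions above. It does not replace reading the long text of the message when one exists.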

…and the code

The next step I take is looking at the code without using the debugger. After answering the questions mentioned in the previous section I usually have a first idea of the part of the software in which the error occurs. By navigating through the source code I try to come up with a first assumption about what the program code is supposed to do and which execution path leads to the error. This way I get a first assumption about what I would expect to see in the debugger and can also test the assumptions I have come up with so far.

Note that trying to understand the code might not be a sensible approach in all cases. Especially when dealing with very generic code it is usually far easier to understand what happens using the debugger. Nevertheless, I’ve had the experience that first trying to understand the code without the debugger allows me to debug much more efficiently afterwards.

Debugging as an experiment

After all the thinking it is time to get to work and start up the debugger. I try to think about debugging as performing an experiment. After understanding the scenario and context in which the error occurs (by thinking about the problem) and getting a first assumption about what the cause of the error might be (by thinking about the code) I use the debugger to test my assumptions. So basically I use the cycle depicted below to structure my debugging sessions.

/wp-content/uploads/2015/09/debugging_as_experiment_786006.png

First I try to think of an “experiment” to test my assumptions about the problem. Usually this is simply performing the business process that causes the error. However, if an error occurs in a complex business process it might be better to find a way to test the assumptions without performing the whole complex process. The next step is to execute the “experiment” in order to test the assumptions. This basically is the normal debugging everyone is used to. If the root cause of the problem is identified during debugging the cycle ends here. If not, the final step of the cycle is to refine the assumptions based on the insights gained during the debugging. On the basis of the new assumptions we can redesign the experiment and start the cycle over again. In this step it is important to move forward in small increments. If you change too many parameters between two debugging sessions it might be very difficult to identify the cause of a different system behaviour. For example, consider a situation where an error occurs during the address formatting for a business partner. In order to identify the root cause of the problem it might be sensible to first test the code for the address formatting with a BP of type person and after that with a BP of type organization with the same address. This makes it possible to check whether the BP type is part of the formatting problem or not.
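The idea of changing only one parameter between two debugging sessions can be sketched as follows. This is a minimal illustration in Python (not ABAP), assuming that a scenario can be described as a set of named parameters and that each debugging run can be reduced to a yes/no answer. The names `isolate_parameters` and `fake_address_formatting_fails` and the parameter values are hypothetical.

```python
from typing import Callable, Dict, List


def isolate_parameters(
    working: Dict[str, str],
    failing: Dict[str, str],
    reproduces_error: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Change one parameter at a time, starting from the working scenario.

    Returns the names of parameters whose change alone makes the error appear.
    """
    suspects = []
    for name, failing_value in failing.items():
        if working.get(name) == failing_value:
            continue  # identical in both scenarios, nothing to test
        experiment = dict(working)
        experiment[name] = failing_value  # exactly one change per run
        if reproduces_error(experiment):
            suspects.append(name)
    return suspects


def fake_address_formatting_fails(scenario: Dict[str, str]) -> bool:
    # Stand-in for a real debugging run. In this made-up example the
    # BP type alone triggers the error.
    return scenario["bp_type"] == "organization"


working_scenario = {"bp_type": "person", "country": "DE"}
failing_scenario = {"bp_type": "organization", "country": "FR"}
print(isolate_parameters(working_scenario, failing_scenario,
                         fake_address_formatting_fails))
# prints ['bp_type']
```

In practice each call to `reproduces_error` corresponds to one manual debugging session, so the loop is a way to plan the experiments rather than something to automate. If no single change reproduces the error, the cause is probably a combination of parameters and the assumptions need to be refined.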

<F5> vs. <F6> vs. <F7>

During the debug step of the cycle presented above the important question in each debugging step is whether to hit <F5>, <F6> or <F7> (step in, step over or step out respectively). Using <F5> it is easy to end up deep down in some code totally unrelated to the problem at hand. On the other hand, using <F6> at the wrong position might result in not seeing the part of the source code causing the problem.

In order to decide whether to step into a particular function or method or to step over it I use a simple heuristic that has proven very useful for me:

  • The more individual a function or method is, the more likely I am to use <F5>.
  • The more widely used a function or method is, the more likely I am to use <F6>.

Using this heuristic basically leads to the following results:

  1. I will almost always inspect custom code using <F5>. The only exception is when I’m sure the function or method is not the cause of the problem.
  2. I will only debug SAP standard code if I wasn’t able to identify the root cause of a problem in the custom code.
  3. I will basically never debug widely used standard function modules and methods and instead focus on new ones (e.g. those delivered recently with a new EhP).

As an example, consider an error in some SEPA (https://en.wikipedia.org/wiki/Single_Euro_Payments_Area) related functionality. When debugging this error I would first focus on the custom code around SEPA. If this doesn’t lead to the root cause of the error I would also start debugging SEPA related standard functions and methods. The reason is that this code has only been developed recently (compared to the general BP function modules). If I encountered function modules like BAPI_BUPA_ADDRESS_GETDETAIL or GUID_CREATE in the process I would always step over them using <F6>. These function modules are so common that it is highly unlikely they are the root cause of the problem.
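The heuristic and the three resulting rules can be written down as a small decision function. This is only a sketch in Python (not ABAP), assuming that the relevant properties of a function or method are reduced to a few flags. The class `Callee`, the function `choose_step`, and the names Z_SEPA_MANDATE_CHECK and SEPA_RELATED_STANDARD_FM are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Callee:
    name: str
    is_custom: bool          # custom code vs. SAP standard code
    widely_used: bool        # long-established and used all over the system
    ruled_out: bool = False  # already sure it is not the cause


def choose_step(callee: Callee, custom_code_exhausted: bool = False) -> str:
    """Return 'F5' (step in) or 'F6' (step over) for the given callee."""
    if callee.ruled_out:
        return "F6"
    if callee.is_custom:
        return "F5"  # rule 1: inspect custom code
    if callee.widely_used:
        return "F6"  # rule 3: skip widely used standard code
    # Rule 2: newer standard code only after the custom code is cleared.
    return "F5" if custom_code_exhausted else "F6"


if __name__ == "__main__":
    calls = [
        Callee("Z_SEPA_MANDATE_CHECK", is_custom=True, widely_used=False),
        Callee("SEPA_RELATED_STANDARD_FM", is_custom=False, widely_used=False),
        Callee("BAPI_BUPA_ADDRESS_GETDETAIL", is_custom=False, widely_used=True),
    ]
    for call in calls:
        print(call.name, choose_step(call, custom_code_exhausted=True))
```

In reality the decision is a matter of degree ("the more individual, the more likely") rather than a set of boolean flags, so the function only captures the order in which the rules are applied.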

Nevertheless, it might turn out in rare cases that everything points to a function module or method like BAPI_BUPA_ADDRESS_GETDETAIL as the root cause of an error. In this case I would always check the SAP support portal first before debugging these function modules or methods. As these have been widely used for quite some time it is highly unlikely that I’m the first one to encounter the given problem. Only if everything else fails would I start debugging those function modules or methods as a last resort.

The right mind set

For all the techniques described before it is important to be in the right mind set. I don’t know how often I have heard sentences like “How stupid are these guys at SAP?” or “Have you seen this crappy piece of code in XYZ”. I must admit I might have used sentences like these one or two times myself. However, I think this is the wrong mind set. The developers at SAP are neither stupid nor mean. Therefore, whenever I see something strange I try to think about what might have been the reason to build a particular piece of code a certain way. What was the business requirement they tried to solve with the code? This usually has the nice effect that with each debugging session I learn something new about some particular area of the system. This will help me to identify the root cause of new issues more quickly in the future.

And probably the most important technique of all is the ability to take a step back. It has happened to me numerous times already that I was working on a problem (be it a bug or trying to implement a new feature) for a while without any progress. For whatever reason I had to stop what I was doing (e.g. because the night guard walked in and asked me to finally leave the building). After coming back to the problem the next day I quickly found the solution. It then always seemed like I had been blind to the solution the day before. So whenever I get stuck working on a problem I now force myself to step back, do something else, and revisit the problem afresh a few hours later.

What do you think?

Finally I’d like to hear from you what your approaches to debugging are. Do you use similar practices? What are the ones you find useful in identifying the root cause of complex errors?

Christian

14 Comments

  1. Łukasz Pęgiel

    Christian,

    nice one. When I have a “complex” bug I also start by searching SCN, Google and OSS Notes first; if I cannot find anything I go for debugging. I’ve done a lot of debugging of standard SAP programs, classes and FMs and I do it like a binary search 🙂 That means I step over performs or method calls whose description says nothing to me and check at each step whether the error has occurred. If not, I continue; if yes, I start debugging again from this line, but this time I go inside the method/form, and I repeat this until I find the issue.

    Cheers

    Łukasz

  2. Arun Jacob

    Thank you Christian,

    you put the everyday struggles of a developer very nicely.

    Really loved the part about stepping back and revisiting (happened to me many times).

    Thanks,

    Arun

  3. Florian Henninger

    Hi Christian,

    To get my hands on the problem I always start by having a look at the call stack. Most of the time I find a modification (“Enhancement”) which is the root cause.

    Second step is to have a look here on SCN and SAP-OSS…

    and afterwards I use the debugger a lot and work with watchpoints.

    But to be fair, most of the time the Enhancement wins the match 😛

    ~Florian

  4. Bärbel Winkler

    Hi Christian,

    thanks for putting this together!

    Here’s a list of how I often approach debugging:

    • I see it as a kind of “whodunit?” where the more or less evasive error is the culprit which needs to be tracked down (can you tell that I like murder mysteries?). Alternatively, it can be likened to the proverbial “search for the needle in the haystack” where you in some cases also need to identify the correct haystack first. So it can be quite a challenge, but as there’s usually a pretty good chance of eventually finding the reason for the error these types of “treasure hunts” can also be a fun and satisfying experience.
    • One of the first questions I ask is “since when does the error occur?” as pinpointing that time usually helps a lot with finding the root cause. This year I was for example asked to help track down issues within SAP’s standard code which had – as it then turned out – been introduced with a big upgrade to ERP EHP6 last year in December. It was a kind of “sleeper issue” which only became visible once more items were being regularly processed in the affected area of the standard code and people started to notice something “not quite right”. Thankfully, it was at least consistent in its behaviour so that it could be easily reproduced. As an aside: This actually was quite the “needle in the haystack” as the issue was caused by one field either containing a space or an ‘A’ when an RFC FM was called in the APO system, and backtracking this to the SAP code where the ‘A’ was lost took about 2 weeks of off and on debugging.
    • Before diving into debugging code I’m not already at least somewhat familiar with, I first make liberal use of transaction SAT (SE30 for older SAP versions) to get a better handle on what all gets processed and which tables get accessed. As it’s possible to jump into the ABAP code from “promising looking” places in the code and to take a look around this can help to find suitable places to put breakpoints in.
    • As debugging very often is an iterative process and in order to better refine the places for breakpoints, I take lots of screenshots even though they at first glance may not look all that helpful. But they serve at least two purposes: being able to better place breakpoints for the next debugging iteration in order to skip sections where nothing related to the issue seems to be happening. And they may later also come in handy as “corroborating evidence” in case there’s a need to raise an OSS message.
    • Once a certain amount of information has been collected, I also like to discuss the issue and my findings with colleagues. This often provides new insights and hashing things out tends to avoid falling prey to confirmation biases where you just “know” that what you see is an ‘X’ when it’s in fact a ‘Y’ you are looking at.

    Hope these bullet-points give you some more food for thought!

    Cheers

    Baerbel

  5. Jerry Wang

    Hi Christian,

    Thanks a lot for your great job! Being an application developer in SAP for eight years with 2000+ internal & external issues resolved by me, I do have finally generated my own style of debugging and I do see we have lots of same thoughts here 🙂

    1. About your point “Know your tools”: Thousands of years ago there was a proverb in China: “工欲善其事必先利其器 (When a workman wishes to get his work well done, he must have his tools sharpened first)”. At SAP I have participated in ABAP, C#, Java and JavaScript development and I do spend lots of time exploring the various features of the debuggers for these languages. I firmly believe that the debugger is for a programmer what the gun is for a marine. A soldier should be an expert with the weapons at hand to survive in a war, and so should we developers to kill the bug.

    2. About your point “Especially if an error occurs in a complex business process it might be better to find a way to test the assumptions without performing the whole complex process”: Yes, in my blog http://scn.sap.com/community/abap/blog/2014/05/01/my-tips-about-how-to-handle-complex-and-tricky-issues I call this approach “error isolation”. It is useful when the business scenario involved in the debugging is too complex for you to concentrate on. In such a case I prefer to spend some time building a sandbox which can also reproduce the issue; then I only need to debug my much simplified sandbox and life becomes easier. Again, the ancients in China said “磨刀不误砍柴工 (Grinding a chopper will not hold up the work of cutting firewood)”.

    I successfully used this approach to take down one hardcore incident; details can be found in my blog.

    3. There is a very famous book written by guru Jon Bentley, <<Programming Pearls>>. In chapter 5.10 it gives several brief but useful debugging tips, and also a very interesting bug: “A programmer had recently installed a new workstation. All was fine when he was sitting down, but he couldn’t log in to the system when he was standing up. That behavior was one hundred percent repeatable: he could always log in when sitting and never when standing.” Would you like to know the root cause of this mysterious bug? Go read that book!

    4. Debugging is not just a way for us to resolve problems, but also a powerful approach to learning about components for which the documentation is poor or even missing. For me, debugging framework code helps me to understand it in depth and enables me to build better applications on top of it.

    5. Last but not least, I would like to emphasize another important point which can help efficient debugging: ourselves, our passion, our conviction that the root cause can finally be found by debugging or whatever other approach. Never give up.

    Enjoy debugging 🙂

    Best regards,

    Jerry

    1. Christian Dr. Drumm Post author

      Hi Jerry,

      thanks for your reply and the pointers to your blog. I somehow managed to miss this blog so far. It’s on my reading list now. Also Programming Pearls seems pretty interesting from the few excerpts I found online.

      Christian

    2. Christian Dr. Drumm Post author

      Hi Jerry,

      I really like the “write simulation report” mentioned in your blog. I use this approach myself quite often to try to check small assumptions (most of the time about some ABAP construct) or to understand APIs. However, I think it is also a quite powerful tool to identify the root cause of errors in complex business scenarios.

      For everyone reading the comments to this blog up to here I really recommend to also read Jerry’s blog: My Tips about how to handle complex and tricky issues

      Christian

  6. Christian Lechner

    Hi Christian,

    very interesting and useful blog. What really needs to be stressed is that it is essential to clarify the business process and how it should run instead of blindly debugging the problem and spending hours without a clear target of what exactly to look at. From my experience (a prerequisite is of course some knowledge of the business process and how it is implemented in the SAP module) this approach can save you a lot of time.

    Nevertheless I want to add some points:

    • “…In the case of a dump this is pretty easy. The details of the dump clearly show what happened and where it happened”
      Well, this can be the case, but there are also other cases where the dump is the “emergency exit” for the application to avoid data inconsistencies, but happens at a completely different location in the code than the root cause of the error. So from my point of view the points connected to an error message also apply to a dump: in both cases you do not know exactly where the root cause is located.
    • Knowledge of the module you debug is not essential, but it can shorten the process a lot. Especially if you are new to a module do not hesitate too long to ask an experienced developer to support you. Nevertheless, even if you are a newbie, first try to find the root cause yourself, but do not spend days on investigations just because you are too proud to ask. If you ask an experienced developer it helps him a lot to get a clear statement of what you have already looked at. Otherwise, you may unnecessarily do some things twice.
    • In case of an error message that does not clearly state what happened I prefer to first debug where it is raised in order to see what causes the error (the code never lies 🙂 ) and then switch to the iterative loop you described in “debugging as an experiment”.

    So once again thanks for this very helpful blog, which I will definitely promote in our company.

    Cheers

    Christian

    1. Christian Dr. Drumm Post author

      Hi Christian,

      thanks for your comments.

      You are right, there are some cases where dumps are emergency exits that aren’t the real root of the problem. But in most cases they provide a pretty good starting point for the error analysis.

      I really like your comment about newbies. This kind of approach is what I try to promote with my colleagues. If you don’t have detailed knowledge about a module try at least to do the basics to identify the cause of an error. This helps to learn a lot already. When you immediately ask an experienced colleague you usually learn a lot less.

      Christian

