Code Cities: Visualizing ABAP code metrics with ABAP2CodeCharta
This article describes how code visualization with code cities can help to identify maintenance hotspots in custom ABAP codebases.
Four decades of ABAP
We all know that ABAP has undergone rapid development since its early days. Coming from a COBOL-influenced, procedural approach, which allowed business people with little technical experience to enhance SAP standard coding, it has evolved into a full-fledged object-oriented language used by professional developer teams to build custom software products with tens or hundreds of thousands or even more LOC (Lines of Code).
Working with large codebases confronts us as developers or software architects with major challenges, especially when we’re dealing with “legacy” solutions that have grown for years or even decades and incorporate multiple development paradigms and architectural styles.
When you navigate through the code of such projects, sooner or later you come across one of those huge classes or function groups that consist of thousands of LOC and are overwhelmingly complex. Objects like this tend to be a source of bugs and cause high maintenance and testing efforts, which makes further development cumbersome and slow (and who wants that?).
When we find objects like this, we know that refactoring them would pay off in the long term. But as time is always limited, we need a way to identify the hotspots where an investment in software quality promises the greatest value. Typically, our hotspot candidates would be very large and complex modules that are changed frequently.
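As a rough illustration of this idea, the following Python sketch ranks a few modules by a simple hotspot score. The module names, the figures and the scoring formula (size × complexity × number of changes) are assumptions made up for this example, not an established metric.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    lines_of_code: int
    complexity: int
    number_of_changes: int


def hotspot_score(module: Module) -> int:
    # Hypothetical heuristic: a module only scores high if it is large,
    # complex AND frequently changed at the same time.
    return module.lines_of_code * module.complexity * module.number_of_changes


def rank_hotspots(modules: list[Module]) -> list[Module]:
    # Highest score first, so the best refactoring candidates come out on top.
    return sorted(modules, key=hotspot_score, reverse=True)


# Invented sample data for illustration only.
modules = [
    Module("ZCL_ORDER_MANAGER", 3900, 410, 193),
    Module("ZCL_LEGACY_REPORT", 5200, 600, 0),
    Module("ZCL_PRICE_HELPER", 120, 9, 45),
]

ranked = rank_hotspots(modules)
for module in ranked:
    print(module.name, hotspot_score(module))
```

Because the factors are multiplied, a huge module that is never changed ends up at the bottom of the list, which matches the intuition that untouched code is rarely worth refactoring.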
Most of the information required for such a hotspot analysis can be obtained with the SAP standard transaction /SDF/CD_CCA. It measures different aspects of custom code such as size, complexity, number of changes, SQL statements and other parameters for methods, function modules or includes.
But working with large, detailed and unaggregated lists is cumbersome, and presenting pure numbers to management and colleagues was not very effective. So, how about enriching the information from /SDF/CD_CCA a little and visualizing it in a way that speaks for itself?
ABAP Code Cities
In January, I found Enno Wulff's blog post on code visualization (I Have A Dream: Code Visualization | SAP Blogs) and was fascinated by the idea of code cities. There were several tools with extractors for languages like Java or C#, but I couldn’t find any for ABAP. After a while I found CodeCharta by MaibornWolff, which supports an open JSON format for custom extractors.
I wrote an ABAP extractor that converts data from /SDF/CD_CCA so that it can be loaded into CodeCharta. It’s named ABAP2CodeCharta.
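To give an idea of what such an extractor has to produce, here is a minimal Python sketch that builds a small tree of folders (packages) and files (modules) with metric attributes and serializes it to JSON. The field names (projectName, apiVersion, nodes, type, attributes, children) reflect my understanding of the CodeCharta format and should be checked against the CodeCharta documentation. The project, package and class names and all figures are invented.

```python
import json


def file_node(name, attributes):
    # A leaf of the tree: one module (class, function group, program).
    return {"name": name, "type": "File", "attributes": attributes}


def folder_node(name, children):
    # An inner node of the tree: here used for a package.
    return {"name": name, "type": "Folder", "attributes": {}, "children": children}


# Assumed layout of the file; names and values are hypothetical.
city = {
    "projectName": "demo-abap-project",
    "apiVersion": "1.0",
    "nodes": [
        folder_node("root", [
            folder_node("ZSALES", [
                file_node("ZCL_ORDER_MANAGER", {"Statements": 2100, "NumberOfChanges": 193}),
                file_node("ZCL_PRICE_HELPER", {"Statements": 60, "NumberOfChanges": 45}),
            ]),
        ]),
    ],
}

cc_json = json.dumps(city, indent=2)
print(cc_json)
```

Each attribute of a file node becomes a metric that can be mapped to the height, area or colour of a building.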
This is an ABAP Code City:
A code city from one of our ABAP projects, before we started to refactor the code. Each building stands for a class, function group or program. The height of the building stands for the size of the module and the ground area represents the average size of its submodules (methods, function modules, includes). Yellow and red coloured buildings indicate that a module was changed many times. The largest, red building (marked in orange in the picture above) in this project is a class with nearly 4000 LOC that had been changed 193 times. Doesn’t this smell like high maintenance cost?
This is the same project after a few months:
Some of the skyscrapers, which mixed up many business and technical functionalities, have given way to nice suburbs made of tiny classes that respect the clean code principles and are designed for automated testing. These suburbs are a lot more robust and follow a standardized structure so they can easily be maintained by everyone in the team.
Focusing only on module size may seem like an oversimplification, but combined with the change frequency it’s a surprisingly good indicator of possible maintenance issues. Nevertheless, there are other interesting metrics we can use to identify problematic areas. Currently the following metrics are available:
- Total lines of code (including comments and empty lines)
High values indicate that the SRP (Single Responsibility Principle) might have been violated
- Statements: Total number of ABAP Statements
Might be more precise than Lines of Code, in case you have modules with many comments, multiline statements and empty lines
- AvgStatementsPerMethod: Average number of statements per submodule
Analyse if your code follows the SRP on submodule level. According to the Clean ABAP guidelines values should be between 3 and 5.
- Complexity: Total number of conditional statements (if/elseif/else/case/when/do/while).
High Complexity means that there are many possible paths through your code, making it hard to test.
- AvgComplexityPerMethod: Average number of paths through the methods of a class (or module)
- ComplexityOfConditions: Number of operators (OR/AND/NOT) in conditions weighted by decision depth. High values hint at deeply nested conditional statements and/or very complex conditional logic, which makes code hard to maintain.
- DBAccesses: Number of SQL/DML statements
If a module has many SQL/DML statements, its design might be tightly coupled to the database model. You might want to consider moving database related logic to other classes to separate business and technical aspects
- NumberOfChanges: Total number of changes in a module
Parts of the system that are changed frequently should be easy to maintain (e.g. low values for complexity and size). On the other hand, normally you don’t need to invest in refactoring of code that is hard to maintain, but never changed.
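To make the size and complexity metrics more tangible, the following Python sketch counts lines, statements and conditional statements in a small ABAP snippet. It is a deliberately naive approximation of the definitions above and not the way /SDF/CD_CCA calculates its values: it splits statements at every period and would be confused by string literals that contain periods or double quotes. The metric name LinesOfCode and the ABAP snippet are assumptions for this example.

```python
DECISION_KEYWORDS = {"IF", "ELSEIF", "ELSE", "CASE", "WHEN", "DO", "WHILE"}


def strip_comments(source: str) -> str:
    lines = []
    for line in source.splitlines():
        if line.startswith("*"):  # full-line comment
            continue
        lines.append(line.split('"', 1)[0])  # cut off an inline comment
    return "\n".join(lines)


def count_metrics(source: str) -> dict:
    code = strip_comments(source)
    # Naive assumption: every period ends exactly one ABAP statement.
    statements = [s.strip() for s in code.split(".") if s.strip()]
    complexity = sum(
        1 for s in statements if s.split()[0].upper() in DECISION_KEYWORDS
    )
    return {
        "LinesOfCode": len(source.splitlines()),  # includes comments and empty lines
        "Statements": len(statements),
        "Complexity": complexity,
    }


abap_source = """\
* Determine discount
METHOD get_discount.
  IF iv_amount > 1000. " large order
    rv_discount = 10.
  ELSEIF iv_amount > 100.
    rv_discount = 5.
  ELSE.
    rv_discount = 0.
  ENDIF.
ENDMETHOD.
"""

metrics = count_metrics(abap_source)
print(metrics)
```

For this snippet the sketch counts 10 lines, 9 statements and a complexity of 3 (IF, ELSEIF and ELSE).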
ABAP2CodeCharta uses the ABAP where-used analysis to find dependencies between modules. The results are visualized as lines between the buildings.
Incoming and outgoing dependencies are displayed in different colours. Modules with many dependencies or connections between large modules are indications of tight coupling, which increases the risk of side effects when code is changed. Splitting these modules into small classes and representing dependencies through interfaces instead of referring to classes directly might help to evolve towards a more loosely coupled system that is easier to maintain and to test.
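The number of incoming and outgoing dependencies per module can be derived from a plain list of "A uses B" pairs, as the following Python sketch shows. The class names and the dependency list are invented, and the function is only an illustration, not part of ABAP2CodeCharta.

```python
from collections import Counter

# Hypothetical where-used result: (using module, used module).
dependencies = [
    ("ZCL_ORDER_MANAGER", "ZCL_PRICE_HELPER"),
    ("ZCL_ORDER_MANAGER", "ZCL_DB_ACCESS"),
    ("ZCL_INVOICE", "ZCL_PRICE_HELPER"),
    ("ZCL_INVOICE", "ZCL_DB_ACCESS"),
    ("ZCL_REPORT", "ZCL_DB_ACCESS"),
]


def coupling(dependencies):
    outgoing = Counter(source for source, _ in dependencies)
    incoming = Counter(target for _, target in dependencies)
    modules = set(outgoing) | set(incoming)
    # A Counter returns 0 for modules that never appear on one side.
    return {
        module: {"incoming": incoming[module], "outgoing": outgoing[module]}
        for module in modules
    }


result = coupling(dependencies)
for module, counts in sorted(result.items()):
    print(module, counts)
```

Modules with a high incoming count are the ones where a change is most likely to cause side effects elsewhere.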
Creating your own Code City
CodeCharta is open source and available on GitHub (MaibornWolff/codecharta: CodeCharta visualizes multiple code metrics using 3D tree maps. (github.com)) or as an online demo (CodeCharta (maibornwolff.github.io)).
You can also run it in a Docker container (codecharta/codecharta-visualization (docker.com)).
The ABAP extractor is available on GitHub (ABAP-2-CODE-CHARTA/src at main · BenjaminWeisheit/ABAP-2-CODE-CHARTA (github.com)) and can be installed via abapGit.
Extracting the data
Extracting the data is a two-stage procedure:
- Create a variant for transaction /SDF/CD_CCA
  - Start transaction /SDF/CD_CCA and choose Code Metric
  - Select the objects that should be analysed, make sure you check the additional analysis modes as shown below, and save your selection as a variant.
- Run the extractor report ZI_ABAP_TO_CODECHARTA with:
  - the /SDF/CD_CCA variant you created in the first step
  - the download path for the result
  - the aggregation level for the data (module or package)
  - the dependency analysis setting: you can analyse all dependencies or only cycles (A->B->A, which would be a strong indicator for design problems)
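What a cycle such as A->B->A means can be illustrated with a small depth-first search over a dependency graph. The following Python sketch is a generic cycle detection and not the algorithm used by the extractor. The class names are invented, and the recursive implementation would need to be made iterative for very large graphs.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of module names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, done
    state = {node: WHITE for node in graph}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for neighbour in graph.get(node, []):
            if state.get(neighbour, WHITE) == GREY:
                # The neighbour is already on the current path: cycle found.
                return path[path.index(neighbour):] + [neighbour]
            if state.get(neighbour, WHITE) == WHITE:
                cycle = visit(neighbour)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in list(graph):
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical dependency graph: module -> modules it uses.
cyclic = {"ZCL_A": ["ZCL_B"], "ZCL_B": ["ZCL_A", "ZCL_C"], "ZCL_C": []}
acyclic = {"ZCL_A": ["ZCL_B"], "ZCL_B": ["ZCL_C"]}

print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

For the first graph the search reports the cycle ZCL_A, ZCL_B, ZCL_A; for the second it finds none.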
I recommend running the extractor in the background. Especially when you use the dependency analysis for a larger amount of code, the extraction might take a few hours.
When the extraction process is finished, you can upload the JSON file to CodeCharta.
Trying different combinations of the metrics and of the CodeCharta features can give you, your colleagues and your management interesting insights and ideas about why and where you should invest to improve code quality. Don’t be shy. Rely fully on your play instincts and give it a try.
Please let me know if you were successful using this tool. I would be happy to hear your ideas on how the extractor can be improved.
Thank you for taking the time to read this article and best regards,
Finally, a few examples:
More than 2 million LOC. This extract ran for more than 24 hours. You can see which parts of this system are connected.
Same system but aggregated on package level
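Aggregating on package level essentially means summing up the metrics of all modules in a package. A minimal Python sketch of this idea follows. The package names, module names and figures are invented, and simple summation is an assumption (average-based metrics would need a different treatment).

```python
from collections import defaultdict

# Hypothetical module-level metrics.
modules = [
    {"package": "ZSALES", "name": "ZCL_ORDER_MANAGER", "Statements": 2100, "DBAccesses": 35},
    {"package": "ZSALES", "name": "ZCL_PRICE_HELPER", "Statements": 60, "DBAccesses": 0},
    {"package": "ZBILLING", "name": "ZCL_INVOICE", "Statements": 800, "DBAccesses": 12},
]


def aggregate_by_package(modules, metrics):
    # Start every package with 0 for each requested metric.
    totals = defaultdict(lambda: dict.fromkeys(metrics, 0))
    for module in modules:
        for metric in metrics:
            totals[module["package"]][metric] += module[metric]
    return dict(totals)


packages = aggregate_by_package(modules, ["Statements", "DBAccesses"])
print(packages)
```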
Analysis of database access statements. Surrounded by the red rectangle is a refactored part of the code where database operations were separated out into a few classes to isolate business logic from technical aspects. You can see how these “suburbs” contrast with the high-rise legacy buildings in red, which are tightly coupled to the data model.
CodeCharta can also display packages and subpackages as streets. The more fine-grained your package structure, the more branches your street map has.
Find objects that depend on one another recursively. The problem may be resolved by splitting them into smaller objects and/or using dependency inversion with interfaces.
Seems interesting. Superb post. Keep writing.
Thank you, Gourab
Thanks Benjamin Weisheit for bringing this topic up again and for the detailed documentation!
Code Visualization is really exciting!
That is true. Since our exchange in January, I've been working with this tool regularly and we are still gaining insights.
Hope that there will be more code visualization tools for ABAP in the future and that the medieval VR code city you described is not too far away. I still really like your idea that module interfaces could be visualized as harbours and periodic jobs as buses! Maybe we're not too far away 🙂
Nice summary. As mentioned, I am looking forward to minor code improvements like the "other dependencies" or the object types.
Looks fascinating. I've got everything downloaded, but for the life of me can't figure out how to install and run CodeCharta on Windows 10. The installation instructions are somewhat sparse.
I installed it via docker hub, but according to the instructions Windows 10 installation should work like this:
1. Install Java
2. Install Node.js
3. Install Git
4. Run the Git Bash command line and execute the following commands:
Let me know if this helped.
I missed out the Node.js installation step. The visualisation works now, thanks. Currently running the analysis of our codebase. Only thing is, I don't have the versioning available as it won't run on our old 7.31 system. (I'm looking to see how quick it will be to retrofit).
176 syntax errors... but some of those will be knock on effects. It's doable.
Great. But 176 syntax errors sound like a lot of work. Let me know if you made it run on 7.31 and if you got any insights out of the visualized data.
I think it's worth doing. Only 3 changes and it's already down to 67.
I've found a short cut.
If you change your program so that it optionally can read the code metrics from a file, before doing
Then I can write a program to get the code metrics into that file - I only have to modify a little of your code to 731 standards.
In this way, my program can be run on an older system, and then the output can be run in yours on a modern system.
I ran it on my X52 system, and I've found if objects with the same name appear, there's an issue.
For example, I have a
All with the same name //MATT/DIRECTORY//MATT/WB_OBJECT
I'm pretty sure this is causing an error in CodeCharta during import, which appears three times, of:
(Namespace, directory, object name obfuscated).
Maybe one way of fixing this would be to use the object type in the node name.
Thanks for finding this one. I am wondering why I didn't come across it, yet.
I opened an issue in GitHub and fixed it with release 0.5.1.
Adding the object type had a cool “side effect”. When you click on the object type in the upper part of your screen (marked with the red rectangle) you can highlight those parts that are made of function groups, classes or programs. This might be useful to see, which parts of the system are still written in a procedural style and where we already use OO elements.
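The underlying idea of making node names unique by including the object type can be sketched in a few lines of Python. This is only an illustration: the naming scheme "NAME (TYPE)" and the object name ZDEMO_OBJECT are assumptions and not necessarily what release 0.5.1 does.

```python
# Three repository objects of different types that share one name (hypothetical).
objects = [
    ("CLAS", "ZDEMO_OBJECT"),
    ("FUGR", "ZDEMO_OBJECT"),
    ("PROG", "ZDEMO_OBJECT"),
]


def node_name(object_type, object_name):
    # Assumed naming scheme: append the object type to the object name.
    return f"{object_name} ({object_type})"


def has_duplicates(names):
    return len(set(names)) != len(names)


plain_names = [name for _, name in objects]
typed_names = [node_name(object_type, name) for object_type, name in objects]

print(has_duplicates(plain_names))
print(has_duplicates(typed_names))
```

With plain names the three objects collide in the same folder, while the typed names stay distinct.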
Thanks for the quick fix.
Just one thing though - it'd be helpful if the filepath and variant could be validated in advance. I.e. fail early.
Quick question - on the first version, I got usage. When I did an analysis with 0.5.1, I didn't. What did I do wrong?
Seems like adding the object type causes problems during dependency analysis. I'll have a look at this.
Dependency analysis in combination with object name and type works now. Can you try on your system?
Sure. Just waiting for the overnight job to analyse everything to finish.
Just updated with the latest version and kicked off the job now. It takes just short of 24 hours to process.
Wow, seems to be a pretty large codebase
Or slow server!
I've created a program that will download the codemetrics into a tab delimited file. With a few modifications, your code will be able to read from this file, and generate the json file.
The reason for this, is that your bit can be on my 752 system, but I can get the data from the 731 system we develop on. I don't have to convert all your code to 731 antiquities.
If you don't fancy modifying your program in this way, I might find time to do it myself later.
Also - my program has validation on variant and filename.
I've written a frontend now that will load the file generated from the above program - either from appserver or presentation server - and then put it through your processing logic. Once I've tested it, I'll add it to the repository.
support for older ABAP versions would be a cool feature. I found the time to take over your variant and filename validations. If you want, you can support me and add the logic for file upload and the proper selection screen logic to the extractor. That would be great.
Repository updated. I've also added value help and better validation (it was using an obsolete FM).
Great. Thanks for your support. I'll try to integrate it next week.
Did you manage to create a code city for your codebase? Did it help you to gain any insights?
Insights? Well, it's just aesthetically pleasing. (sooo pretty)
But yes. At package level I have to ask - why is the utility class dependent on one of the applications?
At class level - why does a controller class (MVC) read the database?
What I'm most fascinated by, is that if I can come up with a metric on something, the JSON is very simple. And I've some ideas that my employers can sell.
This is very interesting..have to try. Thanks for your work.
Wow, this sounds awesome, thanks for sharing!