Hungarian beginner’s course: a polemic against Hungarian Notation
I love the atmosphere of constructive debate: lively and resolute, but never personal. The following blog post was written in that spirit.
There is no way around it when auditing, maintaining or trying to understand other people’s code: “lt_” stalks you! Everywhere! In SAP’s own code. In customers’ ABAP code and in their official development guidelines. In the SAP PRESS books of Rheinwerk Verlag, even though they also published SAP’s official development guidelines (which tell us that this is bad coding). WTH was their editorial office hired for? I also think it’s a bad idea, and I’m not alone with this opinion. In this blog post I’m gonna tell you why.
Hungarian Notation, as it’s called in most cases, was invented by Microsoft, which for most of the developers on the planet is reason enough for it to exist. That’s the tale.
The truth is: the founder of Hungarian Notation was Charles Simonyi, a Hungarian developer at Microsoft (that’s why it’s called “Hungarian Notation”), who wrote an article about it. But its epidemic spread, driven by the misunderstanding of masses of developers around the planet, was not his intention!
Following my main rule (“I don’t like metaphors, I prefer to speak in pictures”), I’ll illustrate the problems with an example:
Using three indicators to identify a data type
Let’s take a common name: lt_ekko. What does its name tell us? It tells us that it’s a local table whose line type equals the well-known line type EKKO. To make a long story short: it tells us masses of redundant information.
1. The local/global indicator
For an ambitious software developer, global data types don’t exist. We should not work with them; that’s what SAP has told us for years, and they were right and still are. But why are their own employees permanently breaking this rule?
Developers working with other programming languages cannot believe that ABAPers still work with methods from before OO was invented. In a well-encapsulated environment, global data has no reason to exist, because it fundamentally conflicts with the object-oriented development paradigm.
And, if the question may be allowed: what is the definition of global? All data types defined in a program or class are local by definition, because they do not exist outside this development object. The only data types that exist globally are defined in the Data Dictionary. The same goes for classes and interfaces (which are just data types with a higher grade of complexity): global classes and interfaces are defined in SE24/SE80 and not inside an ABAP program. A class defined in an ABAP program is a local class by definition.
As a consequence of these statements, all so-called global data types are also local by definition (program-wide local, to be exact). This doesn’t touch the rule that we should not use them, but for the purpose of this blog post it’s important that an ABAP program cannot define global data types, so the prefix “g” is never used correctly. This leads to the question: if everything is local by definition, why the hell do we need a prefix for that?
And, pals: don’t tell me a static class attribute is the same as a so-called global program variable because it’s valid in the whole class and accessible system-wide! An attribute is called an attribute because it has a context (the class’s context!), which is way different from what a variable is! And the accessibility of such an attribute depends on its visibility configuration. A private attribute is not accessible system-wide.
ABAP 7.50 brings the new feature of Global Temporary Tables, to be defined in the Data Dictionary. Because such a table is accessible system-wide, it’s called “global”. There you can see that a variable defined in a program is local by definition.
2. The data type dimension indicator
The next question is why I should use an indicator describing the dimension of a data type. A table is just a data type, same as a structure or a field. In most cases I simply don’t know what dimension the data type I work with has, e.g. while working with references and reference variables (which we should do most of the time). And what is (from a developer’s view) the difference between a CLEAR command on a field and the same command on an internal table? It simply does the same: the CLEAR command clears whatever stands to the right of it. It’s that simple. What kind of information does the “t” in lt_ekko give me in this context???
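To make the CLEAR argument tangible, here is a minimal sketch. The report name and the variable names are invented for this illustration, and it assumes a release that knows WITH EMPTY KEY (7.40 or later):

```abap
REPORT z_clear_sketch.

" Hypothetical names, chosen only for this illustration
DATA po_number TYPE ebeln VALUE '4500000001'.
DATA po_hdrs   TYPE STANDARD TABLE OF ekko WITH EMPTY KEY.

" Give the table some content, so that CLEAR has something to do
APPEND INITIAL LINE TO po_hdrs.

" The same statement, no matter whether the operand is a field or a table
CLEAR po_number.
CLEAR po_hdrs.

" Both operands are back at their initial value
ASSERT po_number IS INITIAL.
ASSERT po_hdrs IS INITIAL.
```

Neither CLEAR statement needs a “t” or a “v” in the name to do its job.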
What about nested tables? In
DATA: BEGIN OF ls_main,
        materials TYPE STANDARD TABLE OF mara WITH EMPTY KEY,
      END OF ls_main,
      lt_main LIKE STANDARD TABLE OF ls_main WITH EMPTY KEY.
the table materials should be named lt_materials. No? Why not? Why does such “important” information (that this is a table) suddenly become worthless, just because it’s a component? That this is a table is only important in relation to the access context. Which means: for a statement like
ASSIGN COMPONENT 'MATERIALS' OF STRUCTURE ls_main ….
materials is a component, nothing more and nothing less.
I’m not kidding: I really have read developer guidelines that strictly demand that a field symbol must have the prefix “fs_”, which is really dumb, because a field symbol already has its own syntax element, the angle brackets “<…>”! Is this the way a professional developer should work???
The next example is a guideline that says I am not supposed to use “lv_” for local variables, but instead “li_” for local integers, “ln_” for local numerics, “lc_” for local characters (which conflicts with local constants) and so on. A developer needs a list of “magic prefixes” on his desk to keep these dozens of prefixes in mind!
But this causes a problem: what if you have to change the data type definition during development or maintenance? You really have to rename the variable through the complete call hierarchy, all across the system, which means you may have to touch development objects only for the sake of renaming. And you have to test all these objects after changing the code! What a mess! Get some hobbies if you need to fill your time, but not this kind of evil work.
It’s a well-known rule: the more development objects you have to change, the more likely it is that you’ll hit objects that are locked by other developers.
A public example: the change of data type definitions from 32 to 64 bit in Windows. All the developers who used Hungarian Notation are now working with names that refer to a definition which has nothing to do with the actual type any more!
What about casting? I could find more questions like this, but that’s it for now, because it’s enough for you to get the key message.
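The renaming trap can be sketched in a few lines. Everything here is an assumption for illustration: the report name, the variable names and the “li_ = local integer” guideline itself are made up, not taken from a real project:

```abap
REPORT z_rename_sketch.

" Hypothetical guideline: li_ = local integer. Prefix and type still agree.
DATA li_quantity TYPE i.

" Later the quantity needs decimals. The type changes, the name does not,
" and from now on the prefix lies (unless every usage is renamed to ln_).
DATA li_changed_quantity TYPE p LENGTH 13 DECIMALS 3.

" A name that describes the content survives the same change untouched
DATA ordered_quantity TYPE p LENGTH 13 DECIMALS 3.

ordered_quantity = ordered_quantity + '1.5'.
ASSERT ordered_quantity > 1.
```

The content-based name would have been just as valid while the type was still i, so no caller has to be touched, tested or unlocked.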
3. The structure’s description
This is more surplus information, because the definition of the structure or basic data type is just a double-click (in SAP GUI) or a mouse-over (in Eclipse) away from the developer’s cursor.
Now that we know which redundant, surplus information we get, let’s have a look at the important information we don’t get from lt_ekko:
What kind of data will we find in lt_ekko? EKKO contains different kinds of documents: purchase order headers, contract headers, and so on. And on closer inspection, there are several different kinds of purchase orders. Standard PO? Cross-company? What exactly a cross-company purchase order is depends on the customer’s individual definition of the business process, so identifying it is not easy!
To find out what kind of documents are selected into table lt_ekko, we have to retrace the data selection and the processing after it, which is much more complex than a double-click. For this reason, this is the most important information, and it belongs in the table’s name!
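A minimal sketch of a selection whose result name says what was selected. The filter values are an assumption (BSTYP = 'F' for purchase orders, BSART = 'NB' for the standard order type, as in an SAP standard system); your customizing may differ, and the report and variable names are invented for this example:

```abap
REPORT z_po_naming_sketch.

" Assumption: BSTYP = 'F' marks purchase orders and BSART = 'NB' the
" standard order type. Check both values against your own customizing.
SELECT ebeln, bukrs, lifnr
  FROM ekko
  WHERE bstyp = 'F'
    AND bsart = 'NB'
  INTO TABLE @DATA(standard_po_hdrs).

" The name carries the result of the selection, not the table it came from
LOOP AT standard_po_hdrs INTO DATA(standard_po_hdr).
  WRITE: / standard_po_hdr-ebeln, standard_po_hdr-lifnr.
ENDLOOP.
```

Whoever reads the loop later knows what is inside the table without retracing the WHERE clause.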
If you select customers, what exactly do you select? Ship-to partners? Payers? Or the companies who will get the bill? Whatever you do, lt_kna1 won’t tell me that! ship_to_partners will!
Conclusion:
To get rid of all surplus information and replace it with relevant information, we should not name this table lt_ekko, but cc_po_hdrs, to express: these are multiple (hdrs = plural = table, if you really want to do that) cross-company purchase order headers. A loop could look like this:
LOOP AT cc_po_hdrs            " <-- plural = table
     INTO DATA(cc_po_hdr).    " <-- singular = single record
  " process cc_po_hdr here
ENDLOOP.
No surplus information, all relevant information included. Basta!
I am not alone
You may ask why this nameless, silly German developer is telling you how to do your job. I am not alone, as the following quotes prove:
- None other than Bjarne Stroustrup himself, the creator of C++, said in his C++ Style and Technique FAQ:
“No I don’t recommend ‘Hungarian’. I regard ‘Hungarian’ (embedding an abbreviated version of a type in a variable name) a technique that can be useful in untyped languages, but is completely unsuitable for a language that supports generic programming and object-oriented programming”
- Robert Martin, co-author of the Agile Manifesto, wrote in “Clean Code: A Handbook of Agile Software Craftsmanship”:
“…nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, member or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader.”
- Linus Torvalds likewise bans Hungarian Notation in “The Linux Kernel Coding Style” (Chapter 4, Naming):
“Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged—the compiler knows the types anyway and can check those, and it only confuses the programmer.”
“Brain damaged”, to repeat it. Is this the way we want people to talk about our work, the work we should be proud of?
Of course, I know that masses of developers will disagree, only because they have always worked like this (because they learned it from others, years or decades ago) and they don’t want to change it. Hey, we’re software developers! We are the ones who permanently have to question the things we do. Yesterday we did procedural software development, today our whole world is object-oriented, tomorrow we’re gonna work with unbelievable masses of “Big Data”, resulting in completely new work paradigms we don’t know yet. And those guys are too lazy to question their way of naming data types? Are you kidding?
We are well-paid IT professionals, permanently ahead in the latest technologies, working on the best ERP system ever (sic!), and SAP itself shows all of us that it can throw away the paradigms of 20 years to define new ones (just look at the changes that came with S/4HANA, which I never would have thought possible).
Let’s learn from our colleagues who also develop applications with a high grade of complexity. Let’s learn from the guys who invented the paradigms we work with. Let’s forget the rules of yesterday….
I’ve been asked lately whether I dislike prefixes altogether. The answer is: no. There are indeed prefixes that make sense:
- importing parameters are read-only, so they may have the prefix “i_”,
- exporting parameters have to be initialized, because their value is undefined if they are not filled with a valid value, so we should give them the prefix “e_”,
- changing parameters transport their value bidirectionally, so they should be marked with “c_”, and
- returning parameters are returned by value, so we should mark them with the prefix “r_”.
This is a naming rule I’d follow and support if requested, because these prefixes transport relevant, non-redundant information (things that are not obvious) that influences the way we handle these data types.
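How the four parameter prefixes could look in a local class, as a minimal sketch. The class, its methods and all parameter names are hypothetical and only serve to show the prefixes at work; the syntax assumes ABAP 7.40 SP08 or later:

```abap
REPORT z_param_prefix_sketch.

CLASS po_reader DEFINITION.
  PUBLIC SECTION.
    TYPES po_numbers TYPE STANDARD TABLE OF ebeln WITH EMPTY KEY.

    METHODS collect_numbers
      IMPORTING i_company_code TYPE bukrs
      EXPORTING e_po_numbers   TYPE po_numbers
      CHANGING  c_read_count   TYPE i.

    METHODS is_empty
      IMPORTING i_po_numbers    TYPE po_numbers
      RETURNING VALUE(r_result) TYPE abap_bool.
ENDCLASS.

CLASS po_reader IMPLEMENTATION.
  METHOD collect_numbers.
    " e_: exporting parameter, so initialize it before filling it
    CLEAR e_po_numbers.

    " i_: importing parameter, only read, never changed
    SELECT ebeln
      FROM ekko
      WHERE bukrs = @i_company_code
      INTO TABLE @e_po_numbers.

    " c_: changing parameter, arrives with a value and leaves with a new one
    c_read_count = c_read_count + lines( e_po_numbers ).
  ENDMETHOD.

  METHOD is_empty.
    " r_: returning parameter, handed back by value
    r_result = xsdbool( i_po_numbers IS INITIAL ).
  ENDMETHOD.
ENDCLASS.
```

Inside the method body, each prefix answers a question the declaration is too far away to answer: may I write to this, do I have to initialize it, does it already carry a value?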
Request for comments
Your opinion differs? Am I wrong, completely or in some details? You’d like to back me up? Feel free to leave a comment….
Disclaimer: English ain’t my mother tongue. Although I do my very best, some things may be unclear, mistakable or ambiguous by accident. In this case, I am open to improving my English through your suggestions 😉