Matthew Billingham

First real use of secondary indexes on an internal table

Introduction

Given the reluctance of the general ABAP community to use new-fangled (that is, over fifteen years old) concepts like SORTED and HASHED tables, I was hesitant to write about something a bit newer, but then I thought – what the heck, perhaps some people will find it an encouragement to use new stuff!

And I know this isn’t that new!

So, we have HASHED tables, where the key is unique and the lookup time is constant for each record, and SORTED tables, which mean we don’t need BINARY SEARCH any more (except if we need to sort descending…). For these tables, there’s an index already defined to speed things up – but it’s like a database table with just a primary index. Secondary keys are like additional indexes on database tables – but for internal tables.
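For instance, a minimal sketch with made-up names and values (nothing to do with the questionnaire example further down):

TYPES: BEGIN OF flight_ty,
         carrid TYPE c LENGTH 3,
         connid TYPE n LENGTH 4,
         price  TYPE p LENGTH 8 DECIMALS 2,
       END OF flight_ty.

DATA: flights_h TYPE HASHED TABLE OF flight_ty WITH UNIQUE KEY carrid connid,
      flights_s TYPE SORTED TABLE OF flight_ty WITH NON-UNIQUE KEY carrid,
      flight    TYPE flight_ty.

" Constant-time lookup via the hash key:
READ TABLE flights_h INTO flight WITH TABLE KEY carrid = 'LH' connid = '0400'.

" Implicit binary search via the sorted key, no BINARY SEARCH addition needed:
READ TABLE flights_s INTO flight WITH KEY carrid = 'LH'.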

If you don’t know how SORTED and HASHED tables work, then read the documentation. It’s all quite comprehensive, so no excuse!

I’ve heard it said that you should only use these if you’ve got tables with loads of information in them. Well, so long as the data isn’t being handled in a loop, I think it doesn’t matter. If the data volume being processed is small, a few extra nanoseconds won’t matter, and data volumes grow – so there’s some future-proofing in using the structures which are most efficient with large tables, right from the start.

Secondary keys

Here’s that syntax, to refresh your memory.

TYPES dtype { {TYPE tabkind OF [REF TO] type}
            | {LIKE tabkind OF dobj} }
            [tabkeys]
            [INITIAL SIZE n].

And then tabkeys looks like this:

[ WITH key ]
    [ WITH secondary_key1 ] [ WITH secondary_key2 ] …
    [ {WITH|WITHOUT} FURTHER SECONDARY KEYS ] … .

Additions

1. … WITH FURTHER SECONDARY KEYS

2. … WITHOUT FURTHER SECONDARY KEYS

Those additions we’ll forget about for now. They’re for use when you’re defining generic table types.
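Just for illustration (made-up names): a table type declared WITH FURTHER SECONDARY KEYS is generic with respect to its secondary keys, so it can only be used for typing field symbols and formal parameters, not for declaring data objects.

TYPES any_sorted_ints_ty TYPE SORTED TABLE OF i
      WITH UNIQUE KEY table_line
      WITH FURTHER SECONDARY KEYS.

" Matches any such sorted table, whatever extra secondary keys it was declared with:
FIELD-SYMBOLS <ints> TYPE any_sorted_ints_ty.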

Now, for my purposes, I’ve got a questionnaire, with pages on it, categories of questions and questions. And I need to access it in many ways. So here’s how I defined it:

TYPES:
  questionnaire_ty TYPE SORTED TABLE OF q_entry_ty WITH NON-UNIQUE KEY page_number cat_seq
                   WITH NON-UNIQUE SORTED KEY by_question COMPONENTS q_id
                   WITH NON-UNIQUE SORTED KEY by_cat_guid COMPONENTS cat_guid q_seq
                   WITH NON-UNIQUE SORTED KEY by_cat_text COMPONENTS cat_text
                   WITH NON-UNIQUE SORTED KEY by_cat_seq  COMPONENTS cat_seq.

The idea is that I can access an internal table of this type rapidly by page number, question id, category unique id (guid), category text and category sequence. It seems like quite a lot, but the alternatives were to have a standard table, sort it and use BINARY SEARCH for each read, or not bother at all and just put up with sequential reads.
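For example, reads like these all avoid a full table scan (i_page and i_q_id are assumed inputs):

FIELD-SYMBOLS <entry> TYPE q_entry_ty.

" Primary key: all entries of one page, in category-sequence order.
LOOP AT me->questionnaire ASSIGNING <entry> WHERE page_number = i_page.
  " ...
ENDLOOP.

" Secondary key: direct read by question id.
READ TABLE me->questionnaire ASSIGNING <entry>
     WITH KEY by_question COMPONENTS q_id = i_q_id.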

Some problems

I’ve got the categories in my questionnaire in sequence order. So, naturally, I want to renumber them. The obvious way of doing this is:

LOOP AT me->questionnaire ASSIGNING <entry> USING KEY by_cat_guid WHERE cat_guid EQ i_guid.
  ADD 1 TO index.
  <entry>-cat_seq = index.
ENDLOOP.

But there’s a problem there. It dumps. And it dumps because cat_seq is part of the primary key of the sorted table, and primary key fields can’t be changed in place!

So, I thought, I’ll delete the records, collect them, and then insert them afterwards:

LOOP AT me->questionnaire INTO entry USING KEY by_cat_guid WHERE cat_guid EQ i_guid.
  DELETE TABLE me->questionnaire FROM entry.
  ADD 1 TO index.
  entry-cat_seq = index.
  INSERT entry INTO TABLE renumbered.
ENDLOOP.

INSERT LINES OF renumbered INTO TABLE me->questionnaire.

But data was still going amiss. The problem was that DELETE TABLE … FROM deletes the first entry that matches the primary key. So it was reading one entry in the LOOP AT, and deleting an entirely different entry (one that happened to match the primary key) at the DELETE.

I tried the old DELETE… INDEX, but that got me nowhere. But a quick check of the syntax for DELETE gave me the hint.

LOOP AT me->questionnaire INTO entry USING KEY by_cat_guid WHERE cat_guid EQ i_guid.
  DELETE TABLE me->questionnaire FROM entry USING KEY by_cat_guid.
  ADD 1 TO index.
  entry-cat_seq = index.
  INSERT entry INTO TABLE renumbered.
ENDLOOP.

INSERT LINES OF renumbered USING KEY by_cat_guid INTO TABLE me->questionnaire.

What to be aware of

With an internal table with additional keys, there are a few things you really need to take care about.

1. You can’t change a field of an entry you’ve ASSIGNED to, if that field is part of the primary key, or of the secondary key you’re currently using.

2. If you access data using one key – you really need to change it using the same key.

3. All of the usual internal table handling statements have the addition USING KEY. Sometimes it’s vital – like with the DELETE example. Other times it’s a matter of performance. For the INSERT LINES I could have omitted the USING KEY, and it would still work – however it is not as efficient, since I know that all my renumbered entries have the same cat_guid.
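To make point 2 concrete, here’s the kind of read-then-delete pairing I mean (an illustrative sketch; i_text is an assumed input):

DATA entry TYPE q_entry_ty.

READ TABLE me->questionnaire INTO entry
     WITH KEY by_cat_text COMPONENTS cat_text = i_text.
IF sy-subrc = 0.
  " Delete via the same key. A plain DELETE TABLE ... FROM entry would
  " match on the (non-unique) primary key and could remove a different row.
  DELETE TABLE me->questionnaire FROM entry USING KEY by_cat_text.
ENDIF.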

Final words

When new ABAP commands become available, try to use them. In my application, it probably won’t make any difference. But what you don’t use, you forget. Surely there will come a time when you do need additional accesses to internal tables – if you’ve already practiced, the next time it won’t take as long.

25 Comments
Rainer Hübenthal

Well, the biggest problem I have is that I don't have these new ABAP commands available. As there are no business needs, the system isn't updated - not in this area.

The other problem: I'm using hashed tables where I can, but there are still people who don't know them, sorting tables back and forth all the time... So I'm a little bit exotic here 🙂

I'm always wondering why so many programmers aren't willing to learn new commands. For years we've had abap_bool, abap_true and abap_false; it's mandatory in the dev guide, but it's hard to catch the violations. Still programmers do their own stuff, and surprisingly, even their own stuff isn't consistent. Sometimes they store 0 and 1 as flags, sometimes Y and N, sometimes X and space.
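For anyone who hasn't seen it, the whole convention is just this (a minimal sketch; on older releases you need TYPE-POOLS abap first):

DATA processed TYPE abap_bool.

processed = abap_true.
IF processed = abap_false.
  " not yet processed
ENDIF.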

I don't think I will see secondary indexes in my environment before I'm retired 🙁

Matthew Billingham (Blog Post Author)

      I'm fortunate in that regard - I just saw the other day that one of the systems I work in is at 7.4 already. So I just need to get some admin rights and install abap eclipse!

      Most of the systems I work on are 7.3 or higher for this particular client.

Peter Inotai

      So I just need to get some admin rights and install abap eclipse!

For Eclipse you don't need admin rights. You just download the files, unzip them, install the ABAP add-ons and that's it. None of the steps needs admin rights on the desktop.

Matthew Billingham (Blog Post Author)

      I needed the admin rights to get the right level of Java runtime installed!

Peter Inotai

      You're right, Java VM needs admin rights. Eclipse+plugins don't.

Former Member

      Hello Matt,

Nice to see someone blogged about secondary keys. I have been itching to use them in productive code for some time, but the ABAP Keyword Documentation always stopped me 🙁

      Use secondary keys in a way that benefits the table.

      Use secondary keys sparingly and only in cases where the benefits outweigh the extra costs.

I wish I could use secondary keys with the BAPI tables (standard tables), but they give me a dump as well.

So although I'm aware of their existence, I have never used them in productive code.

      BR,

      Suhas

Matthew Billingham (Blog Post Author)

It's the same advice some people give with e.g. HASHED tables: unless there are thousands of rows, don't bother, as the overhead for initialisation is too high.

But the way I look at it: if there are only 1000 entries, it's going to be quick anyway (notwithstanding processing within loops).

Rainer Hübenthal

Well, I guess that's an urban legend with hashed tables. There is a small overhead when there are only a few entries, and not much gain with few entries, but who cares? If it is a table only being read from and the key is unique, just make it a hashed table. Thinking about or even discussing this all the time wastes more resources than a hashed table with only 50 entries.

Mario Tibollo

Since I learned of the existence of secondary indices, I often use them in BI. It avoids creating multiple itabs with basically the same data but different keys for quick access in routines.

      nice.

      M.

Sandra Rossi

      Hi Matt,

      First of all, thank you for your blog, the only one which exists on that topic.

I was digging in the official Secondary Keys documentation to find out what the "lazy and delayed updates" are (i.e. I want to understand how it works in the kernel), and went to read your blog, looking for help as the topic is a little complex. I think there are a few typos (maybe I'm wrong), and I also have one question about the DELETE:

      But there's a problem there. It dumps. And it dumps because cat_seq is part of the key by_cat_guid!

In your example, cat_seq is not part of the secondary key by_cat_guid, but part of the primary key. Thus, I think that's why it dumps: if I've understood the documentation correctly, you may update fields which are part of secondary keys.

      <entry>-cat_seq = index.

      INSERT entry INTO TABLE renumbered.

      I think it's a typo, it should be entry, instead of <entry>. These 2 lines appear twice, in the second and third code blocks.

      I tried the old DELETE... INDEX, but that got me nowhere

      According to the documentation, when you read the internal table using a secondary key, the SY-TABIX corresponds to the line in that secondary index, so I think using DELETE me->questionnaire INDEX sy-tabix USING KEY by_cat_guid should do the job. Moreover, I wonder whether we may still use DELETE me->questionnaire, as we are inside LOOP AT? (but maybe you did it on purpose to demonstrate secondary keys).
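Something like this, perhaps (just a sketch, I haven't tested it):

LOOP AT me->questionnaire INTO entry USING KEY by_cat_guid WHERE cat_guid EQ i_guid.
  " sy-tabix is the row number within the by_cat_guid index here.
  DELETE me->questionnaire INDEX sy-tabix USING KEY by_cat_guid.
ENDLOOP.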

      1. You can't change a field of an entry you've ASSIGNED to, if that field is part of one of the keys

Here, I think it's the same topic as my first remark. I think it should be: 1. You can't change a field in an existing line of the internal table if that field is part of the primary key. (I think the ASSIGN is simply the way you modify the line in the internal table; you might have used MODIFY, which would lead to the same restriction.) According to the ABAP documentation, you may update fields which are part of secondary keys (as far as they don't belong to the primary key).

      Regards.

      Sandra

Clemens Li

Thanks for everybody's comments here, and thank you Matthew for starting this. Yes, as ever, only a few developers really make use of the new syntax ABAP offers.

100% agree with Rainer about the urban legends warning against using anything useful. Sandra was the only one to notice that you must decide never ever to use a LOOP INTO again, just so as not to get confused between work areas and field symbols.

Whatever the overhead of using hashed tables, secondary indexes or ASSIGNING field-symbols may be:

when tables are small, you won't notice (can't measure) a difference anyway; when tables get larger, you are simply better and faster. Remove LOOP ... INTO from your portfolio, use LOOP ... ASSIGNING <field-symbol>, and use it always, regardless of the situation.

Why not start better in the first place?

BTW: I love the new && operator - no need for CONCATENATE any more.
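E.g. (made-up strings):

DATA msg TYPE string.

msg = `Hello` && ` ` && `world`.
" the old way: CONCATENATE `Hello` ` ` `world` INTO msg.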

And: not everything new is always better. REGEX is extremely slow, though powerful, compared to old-fashioned work with offsets and compare operators (CS, CA, ...). Here you may have good reasons to use old and outdated techniques.

      Just my 2 cents.

      Regards

      Clemens

      If ABAP in ECLIPSE is as slow and non-intuitive as the new debugger, then I'm happy not to have a system to try 🙂

Florian Henninger

      You got some good arguments and I agree with your loop-statement 🙂

      but what is that:

      If ABAP in ECLIPSE might be as slow as your "new" debugger?

      Are you on a run? 😕

      ~Florian

Joao Sousa

      If ABAP in ECLIPSE is as slow and non-intuitive as the new debugger, then I'm happy not to have a system to try 🙂

      It's not. And the new debugger may be "strange" but it's much more powerful (and I come from a time when the new debugger didn't exist).

Nowadays I mainly use AiE, and only use SE80 for some specific things that are missing.

Jörg Wulf

      Clemens Li wrote:

      ...

      And: Not everything new is always better: REGEX is extremely slow though powerful compared to old-fashioned word with offsets and compare operators (CS, CA, ..). Here you may have good reasons to use old and outdated techniques.

      ...

      If ABAP in ECLIPSE is as slow and non-intuitive as the new debugger, then I'm happy not to have a system to try 🙂

      Hi Clemens,

I don't think that too many ABAPers would use REGEX in a case where CS or CA does the trick. It's not that REGEX is the most convenient construct to use, so I hardly see the relevance of that comparison.

On the other hand, you gain a lot in those cases where CS or CA doesn't suffice.

And all of a sudden, you can do things that were plainly unthinkable before REGEX.

Besides, REGEX isn't that new either.

About the same holds true for the new debugger. It widens your range of possibilities.

Even if it were slower (which I have failed to notice as of yet), the limitations and bulkiness of the old one are reason enough for the change.

Old or new is hardly ever the question. I'll use whatever works best.

But using new techniques in order to hone your skills might pay off in the long run.

To keep in mind:

Those who are always trailing behind the herd will always feed on droppings, not on verdancy.

      Jörg

Eitan Rosenberg

      Hi all,

      IMHO a big hurdle in adopting the new ABAP enhancements is availability.

As far as I know, in order to use those enhancements you need to upgrade the whole system.

Is the business willing to do that? The cost of an upgrade is not something to sneer at...

I wish we could just upgrade the ABAP stuff... 🙁

As for myself, I am on 740 SP08 🙂


      Regards.

Simone Milesi

      Eitan Rosenberg wrote:

As far as I know, in order to use those enhancements you need to upgrade the whole system.

Is the business willing to do that? The cost of an upgrade is not something to sneer at...

I wish we could just upgrade the ABAP stuff... 🙁

Correct and right.

And the answer is obviously "No, the business is not willing at all to throw money at IT projects" 🙂

Usually the business has to be forced by some major factor to accept an upgrade.

Every time I worked on an upgrade, the business was the force to fight off (not the upgrade issues themselves), and the projects were mandatory due to a global HQ decision.

Clemens Li

Just by the way: LOOP AT ... ASSIGNING <field-symbol> has been available since release 4.5 (or earlier), more than 15 years. Still we are at 95% LOOP INTO / MODIFY.

      Regards Clemens

Jörg Wulf

There's nothing wrong with a healthy LOOP AT INTO, as long as you're not processing really biiig internal tables.

Especially when you need to manipulate all or some of the data for further processing, without risking compromising it.

We're always keen on protecting data from unintended modification, but willingly use field-symbols and just trust in the benevolence of other code-blocks?

Even when you're dealing with methods only, there's no guarantee that nobody messes with the data you provide. And with field-symbols it is "what's done is done and what's lost is lost".

I'd recommend the use of field-symbols wherever they can make your life easier.

You'll see a true performance gain where you need to update rows and rows of data.

But looping at an internal table of a few thousand entries, without the need or intention to change anything within the table - the risk isn't worth the gain!

The problem I see with field-symbols is that about half of those using them don't know them half as well as they should, and the other half can't be bothered.

I'm as fierce a promoter of well-used field-symbols as anyone - but I'm fighting a losing battle against all attempts to use field-symbols for every purpose and in every context.

First understand - then use - and use wisely.

      BR - Jörg

Former Member

      I must say I have never understood the "protecting data from unintended modification" argument.

Why is it different from any unintended modification of a standard variable?

Regarding methods: importing parameters cannot be changed (in contrast to input parameters in old subroutines).

Jörg Wulf

      QED.

The difference is: when you change a field-symbol assigned in a LOOP AT, you change the contents of the table, not just a work area.
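To spell it out (made-up names; assume itab has a flag field):

DATA wa LIKE LINE OF itab.
FIELD-SYMBOLS <row> LIKE LINE OF itab.

LOOP AT itab INTO wa.
  wa-flag = 'X'.      " only the local copy changes
ENDLOOP.

LOOP AT itab ASSIGNING <row>.
  <row>-flag = 'X'.   " the table line itself changes
ENDLOOP.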

      BR - Jörg

Richard Harper

And if you don't understand that, go to the back of the class!

      Rich

Former Member

This is correct, from a technical point of view.

However, you can also update "just a work area" which, one line later, is used to update a DB table.

      You should always understand the context of your code.

Clemens Li

      You should always understand the context of your code.


      Well said.

Matthew Billingham (Blog Post Author)

      You should always make your code so that subsequent developers can understand the context.

Mario Tibollo

(Image attachment: code psychopat.JPG)