Skip to Content
Author's profile photo Enno Wulff

Many Lines Spoil The Function

Current topic in my working life is refactoring. I try to find methods and patterns on how to design programs better just from the start. And I try to find ways to solve general refactoring issues.

I was searching for sample code that was not mine. But where to start?

The idea was that bad code will be long. Some people say that good functions only have some few lines of code but we all know that there are exceptions from this. If you have to fill a BAPI structure you will necessarily have a lot of code. However if there are hundreds of lines in a method or a function or a sub routine you can be sure that this code badly needs some refactoring…

To find these classes and reports I wrote a litte report.

Identify forms and methods with a large number of commands

And even within this little tool I faced a lot of situations where I thought: “do it the easy way or do it right…?”

I optimized the report in a way that it should be understandable. Although I see a lot of improvements.

  • I use two variants for getting the components. One for the class and one for reports. This could be optimized by using the strategy pattern
  • the output is quick and dirty. I just display the include name, the name of the function, the function type (METHOD, FORM, MODULE) and the numbers of statements.It would be nice to have the report/ class name and then the FORM or METHOD
  • The program could be adapted to clearly describe
    • classes
    • reports
    • function groups
  • The analysis could be optimized
    • by distinguishing between class methods and test methods
    • by showing local classes in a report or class
  • it is not possible to jump directly to the mehtod/ form/ module by double clicking a line

What does it do?

the report selects PROG and CLAS from TADIR. Object name and package must be given to prevent long run.

The report then analyzes the selected includes (+ embedded includes) and displays them in CL_SALV_TABLE.

I added a sort on the column “number of commands” to have the ones with largest number of lines on top. I also added a filter to only show the functions above selected threshold.

It’s late.

Here is the report:

https://github.com/tricktresor/blog

REPORT.

CLASS lcl_rinfo DEFINITION.
  PUBLIC SECTION.
    METHODS analyze_prog IMPORTING object TYPE clike.
    METHODS analyze_clas IMPORTING object TYPE clike.
    METHODS display IMPORTING max type i..
    METHODS constructor.

  PROTECTED SECTION.
    TYPES: BEGIN OF ts_data,
             repid TYPE syrepid,
             type  TYPE c LENGTH 10,
             name  TYPE c LENGTH 80,
             len   TYPE i,
           END OF ts_data,
           tt_data TYPE STANDARD TABLE OF ts_data WITH EMPTY KEY.
    METHODS get_prog_includes IMPORTING reportname TYPE clike RETURNING VALUE(includes) TYPE programt.
    METHODS get_clas_includes IMPORTING classname TYPE clike RETURNING VALUE(includes) TYPE seoincl_t.
    METHODS analyze IMPORTING object TYPE clike.

    DATA keywords   TYPE STANDARD TABLE OF char30.
    DATA mt_data    TYPE tt_data.

ENDCLASS.

DATA h_name TYPE trdir-name.
DATA h_devc TYPE devclass.
SELECT-OPTIONS s_name FOR h_name OBLIGATORY.
SELECT-OPTIONS s_devc FOR h_devc OBLIGATORY.
parameters     p_max  type i DEFAULT 50.

START-OF-SELECTION.

  DATA(rinfo) = NEW lcl_rinfo( ).
  SELECT object AS obj_type, obj_name
    FROM tadir
    INTO TABLE @data(objects)
   WHERE obj_name     IN @s_name
     AND devclass IN @s_devc
     AND pgmid     = 'R3TR'
     AND object   IN ( 'PROG','CLAS' ).

  LOOP AT objects INTO DATA(object).
    CASE object-obj_type.
      WHEN 'PROG'.
        rinfo->analyze_prog( object-obj_name ).
      WHEN 'CLAS'.
        rinfo->analyze_clas( object-obj_name ).
    ENDCASE.
  ENDLOOP.

  rinfo->display( p_max ).

CLASS lcl_rinfo IMPLEMENTATION.

  METHOD constructor.
    keywords = VALUE #(
                        ( 'FORM' )   ( 'ENDFORM' )
                        ( 'METHOD' ) ( 'ENDMETHOD' )
                        ( 'MODULE' ) ( 'ENDMODULE' )
                        ).
  ENDMETHOD.

  METHOD get_prog_includes.
    DATA include_names TYPE programt.
    DATA include_name  TYPE program.
    CALL FUNCTION 'RS_GET_ALL_INCLUDES'
      EXPORTING
        program    = CONV syrepid( reportname )
      TABLES
        includetab = include_names
      EXCEPTIONS
        OTHERS     = 3.
    IF sy-subrc = 0.
      LOOP AT include_names INTO include_name.
        analyze( include_name ).
      ENDLOOP.
    ENDIF.
  ENDMETHOD.

  METHOD get_clas_includes.
    includes = cl_oo_classname_service=>get_all_class_includes( CONV #( classname ) ).
  ENDMETHOD.

  METHOD analyze_prog.
    rinfo->analyze( object ).
    LOOP AT get_prog_includes( object ) INTO DATA(prog_include).
      analyze( prog_include ).
    ENDLOOP.
  ENDMETHOD.

  METHOD analyze_clas.
    rinfo->analyze( object ).
    LOOP AT get_clas_includes( object ) INTO DATA(clas_include).
      analyze( clas_include ).
    ENDLOOP.
  ENDMETHOD.

  METHOD analyze.

    DATA source     TYPE STANDARD TABLE OF string.
    DATA tokens     TYPE STANDARD TABLE OF stokes.
    DATA statements TYPE STANDARD TABLE OF sstmnt.
    DATA message    TYPE string.

    READ REPORT object INTO source.

    DATA ls_data TYPE ts_data.

    SCAN ABAP-SOURCE source
         TOKENS INTO tokens
         MESSAGE INTO message
         KEYWORDS FROM keywords
         STATEMENTS INTO statements.
*break-point.
    LOOP AT statements INTO DATA(statement).
      DATA(command) = tokens[ statement-from ]-str.

      CASE command.
        WHEN 'FORM'
          OR 'METHOD'
          OR 'MODULE'.
          CLEAR ls_data.
          ls_data-repid = object.
          ls_data-type  = command.
          ls_data-name  = tokens[ statement-from + 1 ]-str.
          ls_data-len   = statement-number.
        WHEN OTHERS.
          ls_data-len  = statement-number - ls_data-len.
          APPEND ls_data TO mt_data.
      ENDCASE.
    ENDLOOP.

  ENDMETHOD.

  METHOD display.
    TRY.
        cl_salv_table=>factory(
          IMPORTING
            r_salv_table   = DATA(o_table)
          CHANGING
            t_table        = mt_data
        ).
        o_table->get_functions( )->set_all( ).
        o_table->get_sorts( )->add_sort(
            columnname         = 'LEN'
            position           = 1
            sequence           = if_salv_c_sort=>sort_down ).

        o_table->get_filters( )->add_filter( columnname = 'LEN'
                                             sign       = 'I'
                                             option     = 'GE'
                                             low        = conv #( max ) ).
        o_table->display( ).
      CATCH cx_salv_error.
    ENDTRY.
  ENDMETHOD.

ENDCLASS.

Assigned Tags

      10 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Tapio Reisinger
      Tapio Reisinger

      Hi Enno, nice Blog and thanks for the tool. I think it is quite useful.

      In general I agree that long function or methods are an indicator of bad code. But I think Abap is some kind of special compared to other languages like C', Java, Scala, erc..., where Objects are inlcuded there more "naturally" into the language itself.

      Within the official programming guidelines there is a very good rule of thumb

      Modularize rather than atomize

      Best regards, Tapio

      Author's profile photo Enno Wulff
      Enno Wulff
      Blog Post Author

      Thanks for your comment, Tapio!

      I agree that ABAP is different here. Thanks for the link! I just read a blog where the author recommended that a unit should have at least FOUR lines of code. That's definitively not constructive in SAP environment.

      When testing the report I found units with 2000 lines of code and more.

      That could be a slight indicator for having not even modularized code... 😉

      Author's profile photo Enno Wulff
      Enno Wulff
      Blog Post Author

      SAP recommends to not have more than 150 commands per unit:

      https://help.sap.com/doc/abapdocu_752_index_htm/7.52/en-US/abenproc_volume_guidl.htm

       

      Author's profile photo Uwe Fetzer
      Uwe Fetzer

      Hi Enno,

      you may also use the ABAP Test Cockpit (ATC) / Code Inspector to check the lines of code within a module:

      Author's profile photo Enno Wulff
      Enno Wulff

      Thanks Uwe, Didn't know that...

      Author's profile photo Jacques Nomssi Nzali
      Jacques Nomssi Nzali

      Hello Enno,

      nice idea!

      I would propose/try to create a histogram from the data (x -> number of command y-> number of routines ).

      Note: get_proc_includes( ) does not return anything, the loop is processed in the method.

      best regards,

      JNN

      Author's profile photo Enno Wulff
      Enno Wulff
      Blog Post Author

      Thanks for your suggestion! I already said, there is a lot of potential... 😉

      The diagram is only useful if the program/ Class will be clearly extracted so you can see the units of an object. It's not that easy because a report can have several local classes and classes can also have local classes plus test classes.

      You're right: In case that there are includes in an include, the method get_prog_includes will not get the complete report!

      I should use analyze_prog again:

        METHOD analyze_prog.
          rinfo->analyze( object ).
          LOOP AT get_prog_includes( object ) INTO DATA(prog_include).
            analyze_prog( prog_include ).
          ENDLOOP.
        ENDMETHOD.
      Author's profile photo Jacques Nomssi Nzali
      Jacques Nomssi Nzali

      Hm, I think the observable behavior of the program is already correct, as you are doing the LOOP inside the analyze_prog( ) routine already. I think it should just be

        METHOD analyze_prog.
          rinfo->analyze( object ).
          get_prog_includes( object ).
        ENDMETHOD.

      I would use the histogram not so much to detect deviation as to recognize the frequency distribution.

      JNN

       

       

      Author's profile photo Mike Pokraka
      Mike Pokraka

      Hi Enno,

      You can also get all these stats from the custom code analyser (tx /SDF/CD_CCA), in particular the Code Metrics tool. I also like to use the coverage analyser, because it works at class/report level and gives you the ability to drill down into different components and provides an execution profile.

      Personally I find the branch count more useful than lines of code, as it is an indicator of the overall complexity of a component. As you already point out, pure size is not a reliable indicator. An If / else block counts as two branches regardless of the amount of statements each contain, to me this is much more relevant.

      Author's profile photo Enno Wulff
      Enno Wulff
      Blog Post Author

      Thanks for your input Mike!

      I will check that out.

      Branch count is an interesting indicator! I think you're right that this is more important than length.

      I like to do this code analyzing by myself from time to time. That's one reason why i wrote this program.