
An Open Source ABAP JSON Library – ZCL_MDP_JSON_*

Hi ABAP developers,

Update 2021-07-11: After 5 years, to my surprise, this library is still useful for some edge cases. I couldn't keep fixing the regex bugs (they are impossible to fix without introducing new ones). There is now a new library with the same methods which doesn't use regex for parsing JSON. It is just a bit slower, but it doesn't contain parsing bugs. You can find it here: https://github.com/fatihpense/abap-tasty-json

I would like to introduce a new open-source ABAP JSON library we have developed. Why does the world need a new JSON library? I will explain our rationale for developing this library along with its features. In the end, it is about having more choices and knowing the trade-offs. I would like to thank MDP IT Consulting for letting this work become open source with the MIT License.

Table of Contents:

  • Summary
  • Alternatives
  • Reasoning and features
  • Examples
  • Performance
  • Links
  • Warning
  • Conclusion

Summary

You can generate any custom JSON with this library, unlike the alternatives. Thus you can easily achieve API compatibility with a JSON server written in another language. Besides providing a serializer and a deserializer, this library defines an intermediate JSON class in ABAP. Further development may enable more JSON utilities based on this JSON representation.

Alternatives

CL_TREX_JSON_*

Standard transformation using JSON-XML:  https://scn.sap.com/community/abap/blog/2013/01/07/abap-and-json

Manual string manipulation: while it provides flexibility, it is tedious and error-prone work. Sometimes it is used together with CL_TREX_JSON_*.

These libraries provide automatic mapping between ABAP data and JSON:

https://github.com/se38/zJSON/wiki/Usage-zJSON

https://wiki.scn.sap.com/wiki/display/Snippets/One+more+ABAP+to+JSON+Serializer+and+Deserializer

Reasoning and features

It is intriguing to me that there was no JSON node representation in ABAP. Let me give examples from other languages:

Working with JSON in dynamic or loosely typed languages is easier, since easily modifiable representations for JSON objects and arrays already exist in the standard language (for example, plain objects and arrays in JavaScript, or dicts and lists in Python).

In strongly typed languages like ABAP, Java, and Go there are two approaches: mapping JSON automatically to and from typed data structures, or working with a generic intermediary representation of the JSON document.

Our library has chosen the intermediary representation approach, defining the class ZCL_MDP_JSON_NODE.

[Image: ZCL_MDP_JSON_NODE class diagram showing its attributes and methods]

Features:

  • It provides flexibility down to the JSON spec level. This is important because you get the same flexibility as manual string manipulation, without the errors. Compatibility of your ABAP service or client with any other JSON API becomes possible without string manipulation.
  • You can deserialize any JSON string.
  • You know exactly what the deserializer will produce when you see a JSON string.
  • You don't need to define intermediary data types just for JSON input/output.

Future ideas for development:

  • The intermediary ZCL_MDP_JSON_NODE class enables the development of methods like a JSON equality checker, beautification of JSON output, and validity checks for string and number values.
  • The library uses regexes for parsing. Most of the time regex can be a quick solution. However, I think finite-state machines are better suited for parsers in general.
  • We will work on this library based on our needs and your suggestions. For example, we can work towards 100% compliance with the JSON specification by running edge-case tests.

Examples

The examples here are in their shortest form to show how easy JSON manipulation can become. There will be more examples in the project repo using other features of the class. The JSON node class is easy to understand once you study its attributes and methods.
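Before the full examples, here is a minimal round trip (a sketch using only the methods shown in the examples below; the greeting payload is just an example):

DATA: lo_root     TYPE REF TO zcl_mdp_json_node,
      lo_greeting TYPE REF TO zcl_mdp_json_node,
      lv_json     TYPE string.

" Parse, read one value, change it, and render again: no intermediary data types needed.
lo_root     = zcl_mdp_json_node=>deserialize( json = '{"greeting":"hello"}' ).
lo_greeting = lo_root->object_get_child_node( key = 'greeting' ).
WRITE: / lo_greeting->value.        "prints: hello
lo_greeting->value = 'hi'.          "modify the node in place
lv_json = lo_root->serialize( ).    "renders: {"greeting":"hi"}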

Deserialization Example:

DATA: l_json_string TYPE string.

CONCATENATE
  '{'
  '  "books": ['
  '    {'
  '      "title_original": "Kürk Mantolu Madonna",'
  '      "title_english": "Madonna in a Fur Coat",'
  '      "author": "Sabahattin Ali",'
  '      "quote_english": "It is, perhaps, easier to dismiss a man whose face gives no indication of an inner life. And what a pity that is: a dash of curiosity is all it takes to stumble upon treasures we never expected.",'
  '      "original_language": "tr"'
  '    },'
  '    {'
  '      "title_original": "Записки из подполья",'
  '      "title_english": "Notes from Underground",'
  '      "author": "Fyodor Dostoyevsky",'
  '      "quote_english": "I am alone, I thought, and they are everybody.",'
  '      "original_language": "ru"'
  '    },'
  '    {'
  '      "title_original": "Die Leiden des jungen Werthers",'
  '      "title_english": "The Sorrows of Young Werther",'
  '      "author": "Johann Wolfgang von Goethe",'
  '      "quote_english": "The human race is a monotonous affair. Most people spend the greatest part of their time working in order to live, and what little freedom remains so fills them with fear that they seek out any and every means to be rid of it.",'
  '      "original_language": "de"'
  '    },'
  '    {'
  '      "title_original": "The Call of the Wild",'
  '      "title_english": "The Call of the Wild",'
  '      "author": "Jack London",'
  '      "quote_english": "A man with a club is a law-maker, a man to be obeyed, but not necessarily conciliated.",'
  '      "original_language": "en"'
  '    }'
  '  ]'
  '}'
  INTO l_json_string
  SEPARATED BY cl_abap_char_utilities=>cr_lf.

DATA: l_json_root_object TYPE REF TO zcl_mdp_json_node.
l_json_root_object = zcl_mdp_json_node=>deserialize( json = l_json_string ).

DATA: l_string TYPE string.
l_string = l_json_root_object->object_get_child_node( key = 'books'
  )->array_get_child_node( index = 1
  )->object_get_child_node( key = 'quote_english' )->value.

START-OF-SELECTION.
WRITE: 'Quote from the first book: ', l_string.

Serialization Example:

DATA: l_string_1 TYPE string.

DATA: l_root_object_node      TYPE REF TO zcl_mdp_json_node,
      l_books_array_node      TYPE REF TO zcl_mdp_json_node,
      l_book_object_node      TYPE REF TO zcl_mdp_json_node,
      l_book_attr_string_node TYPE REF TO zcl_mdp_json_node.

* Create root object
l_root_object_node = zcl_mdp_json_node=>create_object_node( ).

* Create books array
l_books_array_node = zcl_mdp_json_node=>create_array_node( ).

* Add books array to root object with key "books"
l_root_object_node->object_add_child_node( child_key = 'books' child_node = l_books_array_node ).

* You would probably want to do this in a loop.
* Create book object node
l_book_object_node = zcl_mdp_json_node=>create_object_node( ).

* Add book object to books array
l_books_array_node->array_add_child_node( l_book_object_node ).

l_book_attr_string_node = zcl_mdp_json_node=>create_string_node( ).
l_book_attr_string_node->value = 'Kürk Mantolu Madonna'.

* Add string to book object with key "title_original"
l_book_object_node->object_add_child_node( child_key = 'title_original' child_node = l_book_attr_string_node ).

l_string_1 = l_root_object_node->serialize( ).

* ALTERNATIVE:
DATA: l_string_2 TYPE string.
* DATA: l_root_object_node_2 TYPE REF TO zcl_mdp_json_node.

* Create the same JSON object in a single statement (one period) and without data definitions, using method chaining.
l_string_2 = zcl_mdp_json_node=>create_object_node(
  )->object_add_child_node( child_key = 'books' child_node = zcl_mdp_json_node=>create_array_node(
  )->array_add_child_node( child_node = zcl_mdp_json_node=>create_object_node(
  )->object_add_child_node( child_key = 'title_original' child_node = zcl_mdp_json_node=>create_string_node(
  )->string_set_value( value = 'Kürk Mantolu Madonna' )
  )
  )
  )->serialize( ).

START-OF-SELECTION.
WRITE: / 'string 1: ', l_string_1.
WRITE: / 'string 2: ', l_string_2.

Challenge: Try doing these examples with CL_TREX_JSON_*

For more examples, please visit the GitHub repo.

Performance

On a test machine, deserializing and serializing the JSON string example above (l_json_string) 10,000 times takes 2.1 seconds on average. It shouldn't cause any performance problems in general usage. The complete benchmark code will be in the project repo.

DO 10000 TIMES.
  zcl_mdp_json_deserializer=>deserialize(
    EXPORTING json = l_json_string
    IMPORTING node = l_jsonnode ).
  zcl_mdp_json_serializer=>serialize(
    EXPORTING node = l_jsonnode
    IMPORTING json = l_json_string ).
ENDDO.

Links

Here is a presentation about this JSON library:

Medepia ABAP JSON Library ZCL_MDP_JSON

Project code repository:

GitHub – fatihpense/zcl_mdp_json: MDP ABAP JSON library that can generate and parse any JSON string.

Warning

The library isn't extensively battle-tested yet. Testing your use case before using it in production is strongly advised. Please report any bugs you encounter.

Conclusion

If you are just exposing a table as JSON without much modification, it is easier and probably better to use CL_TREX_JSON_*

If you are developing an extensive application and if you want to design your API beautifully, this library is a pleasant option for you.

Thanks for reading.

Best wishes,

Fatih Pense

Comments
      Sandra Rossi

      Is there any advantage compared to using direct calls to the SXML library (in JSON mode)? Cf. ABAP and JSON in the ABAP documentation.

      By the way, CL_TREX_JSON* is not supported by SAP (cf https://service.sap.com/sap/support/notes/2141584).

      Fatih Pense
      Blog Post Author

      Thank you for the SAP note. So, CL_TREX_JSON* classes are intended for internal use only.

      As for the differences between this library and the SAP transformation solution: the features of the SAP transformation approach to JSON, along with my personal thoughts, are below:

      1. Standard and supported
        • ➕ This is certainly an advantage.
      2. Brings an extra layer of XML, which is necessary for utilizing the underlying XML transformation infrastructure.
        • ℹ If you like working with XML, this can be an advantage. Also, you may use the JSON-XML representation as a standard between two systems. However, the performance effect of this extra layer should be researched. If they had created a JSON solution from scratch, they probably wouldn't have added the XML layer.
      3. ID transformation (the predefined default transformation, ABAP to JSON and back). It has the same logic as the XML ID transformation.
        • ➕ If you are using this, then you want a standard JSON representation for your data, and it is an advantage.
      4. XSLT or Simple Transformation for manual mapping. You have to create a valid JSON-XML representation.
        • ➖ If you are using this, you want flexibility. Most of the time, using a class within your code to map fields should be much easier to develop and debug than using XSLT or ST (see the sketch below).

      In that solution they have chosen XML as an intermediary object. We have chosen an ABAP class. I think this class is better suited for complex and custom JSON scenarios.
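      To illustrate point 4, a rough sketch of mapping fields in code with the node class (it reuses l_json_string from the example in the post; the target structure here is just an example, not part of the library):

        DATA: BEGIN OF ls_book,
                title_original TYPE string,
                author         TYPE string,
              END OF ls_book.
        DATA: lo_root TYPE REF TO zcl_mdp_json_node,
              lo_book TYPE REF TO zcl_mdp_json_node.

        " Parse once, then pick exactly the fields you need, one assignment per field.
        lo_root = zcl_mdp_json_node=>deserialize( json = l_json_string ).
        lo_book = lo_root->object_get_child_node( key = 'books' )->array_get_child_node( index = 1 ).
        ls_book-title_original = lo_book->object_get_child_node( key = 'title_original' )->value.
        ls_book-author         = lo_book->object_get_child_node( key = 'author' )->value.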

      Best wishes,

      Fatih

      Sandra Rossi

      Thank you for this detailed feedback. For point 2, maybe Horst Keller can give us insights into whether the JSON support is based on the XML engine. Anyway, I doubt that an ABAP (interpreted byte code) solution is faster than a C (compiled) program, but you're right, that's just a guess; this requires a performance comparison.

      Horst Keller

      As described in ABAP News for Release 7.40 - ABAP and JSON | SCN and in the documentation ABAP and JSON, the native support of JSON in ABAP is indeed based on the XML engine used by the sXML library, where JSON-XML serves as an intermediate format. The idea was to reuse an existing performant and robust infrastructure instead of creating another one alongside it.

      That in fact means that you cannot deal with JSON alone; for parsing and rendering you have to know the JSON-XML format. For serializing and deserializing ABAP data you have to know the asJSON format (which can be mapped to asJSON-XML).

      Regarding functionality, I'd say that JSON writers and JSON readers based on the sXML library should allow you to perform all rendering and parsing tasks that you need.

      Regarding performance, one should really test. One could take the examples given above, transfer them to standard parsing and rendering, and compare the runtime.

      Since the sXML library is implemented by kernel modules, I tend to agree with Sandra Rossi that the native support should be faster than an ABAP-only implementation. The question is whether ZCL_MDP_JSON_NODE is doing it from scratch or whether it is a wrapper for ABAP's native JSON support that simply hides the JSON-XML format.

      Sandra Rossi

      Thanks Horst. When I posted, I didn't imagine that sXML JSON needed such "tricks" (JSON-XML) for generating JSON from scratch. ZCL_MDP_JSON_NODE does it from scratch (string concatenation). I did a little performance comparison; on my computer I get a ratio of 1 for JSON-sXML against 1.7 for ZCL_MDP_JSON*. Test code at https://wiki.scn.sap.com/wiki/display/Snippets/Performance+of+JSON+from+scratch+-+sXML+versus+ZCL_MDP_JSON

      Fatih Pense
      Blog Post Author

      Horst Keller Thank you for the clarification. I think utilizing an existing infrastructure is a good decision from both technical and business perspectives.

      Sandra Rossi Thank you for the benchmark. I was going to write it. Another result of the benchmark: you are faster than me 😀

      So, in the light of the new information, the advantages of this library can be:

      1. Manipulating JSON like an XML DOM:
        • Cutting a part out of one JSON document and pasting it into another is much easier with ZCL_MDP_JSON_NODE (a short sketch follows this list).
      2. In the scenario:
        • You want to render/encode a custom JSON or parse/decode from a custom JSON
        • And, you don't want to use Simple Transformation or XSLT
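        For point 1, the idea is roughly this (a sketch only, using the methods from the blog examples; lv_json_a and lv_json_b stand for any two JSON strings):

          DATA: lo_source TYPE REF TO zcl_mdp_json_node,
                lo_target TYPE REF TO zcl_mdp_json_node,
                lo_books  TYPE REF TO zcl_mdp_json_node.

          " Take the "books" subtree out of one parsed document and attach it to another one.
          lo_source = zcl_mdp_json_node=>deserialize( json = lv_json_a ).
          lo_target = zcl_mdp_json_node=>deserialize( json = lv_json_b ).
          lo_books  = lo_source->object_get_child_node( key = 'books' ).
          lo_target->object_add_child_node( child_key = 'books' child_node = lo_books ).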

        Sandra, you have used the library for the benchmark. I have learned a few nice tricks from your code. If you have any suggestions for improvement regarding ease of use, methods, etc., I will be glad to listen.

        In the future, I want to try a finite-state machine parser instead of regex. Also, maybe I will try adding an optional JSON-sXML serializer/renderer backend (maybe I will even write my own kernel module for fun 😳 ).

        Thank you both for your interest and fruitful conversation.

        Best wishes,

        Fatih

        Marco Dahms

        Hi Fatih,

        thanks for contributing to this blog community and sharing your JSON library.

        A while ago I tried out all the existing JSON libraries (all the ones that you mention above) and found that most of them perform relatively slowly when dealing with large volumes of data. I suspect this is due to the internal string handling (parsing and concatenating strings) in the ABAP layer. After some research I found that with simple transformations (in particular, identity transformations) the performance improved tremendously, even with large data volumes. I believe this is because such transformations are executed inside the XML engine.

        Uwe Fetzer

        This is true, because the transformation is part of the kernel (C/C++ is always faster than ABAP 😉 ).

        The additional libraries are only necessary if you have special needs (like lower-case names, special date formats, etc.).

        Marco Dahms

        Hi Uwe,

        thanks for the clarification.

        I always wondered why, when using transformations, we have to stick to upper-case JSON keys. What is the cause of this restriction?

        Thanks

        Marco

        Sandra Rossi

        I guess that's only an arbitrary choice, but I think it's best to say either everything lower case or everything upper case, and not to allow exceptions ("except in deserialization any case is allowed"), especially when it's about writing transformations for JSON, and especially for Simple Transformations, which are symmetrical.

        Horst Keller

        See CALL TRANSFORMATION, where the facts are described:

        Serialization

        "The case of the names in the XML or JSON data depends on how they are specified in the ABAP runtime environment. If specified statically (b1, b2, ...), uppercase is used; if specified dynamically in stab, the case used there is used."

        Deserialization

        "The case used in the XML or JSON data must match exactly the case specified in the ABAP runtime environment. If specified statically (b1, b2, ...), uppercase is used; if specified dynamically in rtab, the case used there is used."

        We already had a discussion about that in ABAP and JSON. Rüdiger Plantiko pointed it out: how should ABAP know whether to convert upper case to mixed or lower case?

        Therefore, use your own transformations to adjust your data.
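        For illustration, a minimal sketch of the dynamic variant described in the quoted documentation (ABAP 7.40 or later; the name 'myMixedCaseName' is just an example):

          DATA: lv_text   TYPE string VALUE 'example',
                lt_source TYPE abap_trans_srcbind_tab,
                lo_writer TYPE REF TO cl_sxml_string_writer,
                lv_json   TYPE string.

          lo_writer = cl_sxml_string_writer=>create( type = if_sxml=>co_xt_json ).

          " Dynamic binding: the name is used exactly as written here, so the mixed case survives.
          APPEND VALUE #( name = 'myMixedCaseName' value = REF #( lv_text ) ) TO lt_source.

          CALL TRANSFORMATION id SOURCE (lt_source) RESULT XML lo_writer.

          lv_json = cl_abap_codepage=>convert_from( lo_writer->get_output( ) ).
          " lv_json now contains {"myMixedCaseName":"example"}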



        Fatih Pense
        Blog Post Author

        Hi Marco,

        Thank you for your interest and support. I also think that the slowness is caused by string operations. I searched for alternative options, but ABAP has a limited set of functions on this subject. If there were a string or byte stream interface between the kernel and ABAP, any ABAP-only solution could be nearly as fast as a C/C++ solution. (Interpreted languages can also be very fast!) The fact that the slowness increases with size also suggests that the underlying operations for string concatenation and substrings are not suitable for this kind of algorithm.

        Performance by size could be an interesting benchmark subject. I wonder at which size (KB, length) the performance of ABAP-only libraries starts to become unacceptable.
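        Something along these lines could be a starting point (an untested sketch, just the shape of the measurement, reusing the deserializer call from the post):

          DATA: lv_items TYPE string,
                lv_json  TYPE string,
                lv_size  TYPE i,
                lv_t0    TYPE i,
                lv_t1    TYPE i,
                lv_usec  TYPE i,
                lo_node  TYPE REF TO zcl_mdp_json_node.

          DO 10 TIMES.
            " Add 1000 more array elements per round, then time one parse of the whole document.
            DO 1000 TIMES.
              IF lv_items IS INITIAL.
                lv_items = '{"title":"x","author":"y"}'.
              ELSE.
                CONCATENATE lv_items ',{"title":"x","author":"y"}' INTO lv_items.
              ENDIF.
            ENDDO.
            CONCATENATE '{"books":[' lv_items ']}' INTO lv_json.
            lv_size = strlen( lv_json ).

            GET RUN TIME FIELD lv_t0.
            zcl_mdp_json_deserializer=>deserialize( EXPORTING json = lv_json IMPORTING node = lo_node ).
            GET RUN TIME FIELD lv_t1.
            lv_usec = lv_t1 - lv_t0.

            WRITE: / 'length:', lv_size, 'microseconds:', lv_usec.
          ENDDO.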

        On a side note, I learned that kernel modules are for internal usage. Even for research, getting the source is difficult; otherwise I was eager to implement a module 😏

        Finally, I think that in many scenarios JSON data is not that big, if you are using the service as an API for a user interface (citation needed 😳 ). And if you want flexibility, this library provides another option with nice manipulation methods.

        If you have any questions about this library, I will be glad to answer.

        Best wishes,

        Fatih

        Uwe Fetzer

        Hi Fatih,

        I've added your project to the ABAP Open Source projects list. (you need a fancy name for your project 😆 )

        Fatih Pense
        Blog Post Author

        Hi Uwe, do you have a suggestion for a name? What about "ABAP Flexibel JSON"? 😳

        Uwe Fetzer

        AFJ, yes, sounds good. But wait a week or so. Maybe one day you wake up in the night with the new name in mind 🙂

        Attila Berencsi

        Hi Fatih,

         

        this is the second time I have used your library in non-performance-critical / non-mass-processing objects. I can say this is the most convenient library to use so far for my requirements. I do not need to fight with the pain, and I can implement injections really easily within 2 or 3 lines of code. I can easily access nodes where I have already obtained a reference, and nested loops are not needed. To manage the JSON, I only need to use the node keys that actually exist in the JSON document. Many thanks for your great job again.

        Attila

        Fatih Pense
        Blog Post Author

        Hi Attila, I'm glad this library made your work easier! I think you are perfectly in the audience I had in my mind when I was writing the library. Maybe I should use your comment as "user testimonial" 🙂

        Thanks for your kind words, support & bug reports!

        Best wishes,

        Fatih

        Former Member

        Hi Fatih,

        I think in the "manual mapping" part we can use the class cl_trex_json*; the model would then be:

        internal data object --> JSON --> JSON_NODE --> JSON.

        Fatih Pense
        Blog Post Author

        Hi Jack,

        I think cl_trex_json* converts directly to a JSON string. If you accept the overhead of encoding/decoding twice, it can be used as a clever technique. In that case, zcl_mdp_json* will only give you the ability to edit your JSON structure in a flexible way.

        I have used the term "manual mapping" because with zcl_mdp_json* you have to decide which field in the JSON string represents which field in your data object. There are alternative libraries that try to automate the mapping. However, I believe mandatory automatic mapping takes away expressiveness and power in the current solutions. There is a trade-off, and you can use different libraries for different tasks. I haven't found an elegant way to connect parsing and automatic mapping yet. I'm open to suggestions.

        Also, cl_trex_json* is not supported by SAP, as Sandra noted in the comment above. So it is for internal use and it shouldn't be a base for future work.

        Thanks for your comment,

        Fatih

        Yves Kemps

        Hello Fatih,

        I have implemented the solution that you have developed and it works fine. However, when I have a variable with the value null (null rather than ""), it gives a short dump as it jumps to the next ". The problem is located in method deserialize_node of class zcl_mdp_json_deserializer, where the statement

        FIND REGEX '\{|\[|"|\d|t|f' IN SECTION OFFSET l_offset OF json takes us to the next variable of the JSON string and ignores the null value. I have solved this by looking for 'n' in parallel. If the offset of the new regex is less than that of the original regex, I continue with the new offset.

        Here you can find an extract of the JSON file I have used.

        { "SI": [{"zzreference": "1/53201/2018/000001", "land1": null, "fwbas": "40.10", "wears": "EUR", "stcteg": null, "kursf": "1", "wmwst": "0.00", "stras": null, "zztrans": "INSTOR", "pstlz": null, "budat": "2018-02-06", "wrbtr": "40.10", "ort01": null, "vat_rate": "0.0", "name1": null, "bidat": "2018-02-06"},

         

        Here are the code lines I have added to solve the issue

        METHOD deserialize_node.
        
        DATA l_json TYPE string.
        l_json = json.
        
        DATA l_offset TYPE i.
        l_offset = offset_before.
        DATA(lv_offset_null) = offset_before.
        
        DATA l_len TYPE i.
        
        DATA : l_jsonnode TYPE REF TO zcl_mdp_json_node.
        
        FIND REGEX '\{|\[|"|\d|t|f' IN SECTION OFFSET l_offset OF json
        MATCH OFFSET l_offset.
        * correction >>>>>>>>
        FIND REGEX ‘\{|\[|n|\d|t|f’ IN SECTION OFFSET lv_offset_null OF json
        MATCH OFFSET lv_offset_null.
        IF lv_offset_null LT l_offset.
        l_offset = lv_offset_null.
        ENDIF.
        * correction <<<<<<<<  
        
        CASE l_json+l_offset(1).

         

        Hope this helps you to (further) improve this nice piece of work.

        Kind regards

        Yves Kemps

        Fatih Pense
        Blog Post Author

        Hi Yves,

        Thank you for your report and nice words!

        I have opened an issue here and I will add your fix: https://github.com/fatihpense/zcl_mdp_json/issues

        (I also edited your comment as a mod, because it contained CSS, and used the code sample feature instead.)

        Kind regards,

        Fatih

        Alejandro Bindi

        Hello Fatih,

        I'm trying the classes and, the two pending bugs aside (which are documented on GitHub and which I solved manually), I think they are working well. But it seems I found an error in this method (which doesn't seem to have an impact, though with large data it might affect performance):

         

          METHOD object_add_child_node.
        
            DATA : wa_array_children TYPE lcl_mdp_json_node=>typ_array_children .
        
            wa_array_children-node = child_node .
        
            APPEND wa_array_children TO me->array_children.
        
        
            DATA : wa_object_children TYPE  lcl_mdp_json_node=>typ_object_children .
            wa_object_children-key = child_key .
        
            wa_object_children-node = child_node .
        
            INSERT wa_object_children INTO TABLE me->object_children.
        
            object_node = me.
        
        
          ENDMETHOD.                    "object_add_child_node
        
        

         

        Shouldn't the method insert only into OBJECT_CHILDREN (and not also append to ARRAY_CHILDREN, which is what ARRAY_ADD_CHILD_NODE does)?

        Thank you!

         

        EDIT: Just found another bug which generates a dump: escaped double quotes. This JSON is valid; it can be checked online at https://jsonformatter.curiousconcept.com/#

        {"AreaFunc":"/BDL/BDL3","NumMsj":"050","Texto":"Las entradas en log de mensajes que no son del tipo \"E\" se pueden ignorar"}

         

        However, it causes a dump in DESERIALIZE_OBJECT (during deserialization).

        Also, when trying to serialize a string node which contains double quotes, they are not escaped.

        In both operations, the backslash escaping of double quotes should be considered.

         

        Fatih Pense could you please check these issues and maybe open a bug? (I'm not a GitHub user, at least not yet.)

         

        Thank you again!

        Fatih Pense
        Blog Post Author

        Hello Alejandro, thank you for your comment & findings! I will create issues for them so they will be more visible & documented.

        EDIT: I have created issue#10 and issue#11

        It has been 4 years since I shared this library, so I can write a recap of the situation.

        • There are good standard libraries, but this one is still getting some interest. I believe that is because it still fills a niche for arbitrary JSON data.
        • I want to improve the library or find a maintainer friend to help with it. As I don't do much ABAP daily, I would need to focus on it.
        • Another reason for the stalled development is that I wanted to make it perfect regarding JSON validity & errors, but ABAP lacks good low-level string operations. Maybe I need to find another way to operate on the strings. I'm open to suggestions here!
        • If I find another approach to string handling, I can give it another shot, but I can't promise anything at the moment.

        Thanks & best regards,
        Fatih

         

         

         

        Alejandro Bindi

        Hello Fatih Pense and thanks again for your reply!

        In my case the interest in this library is because A) I'm working at a customer with a really old release (7.0), so there's no support at all for JSON, neither with CALL TRANSFORMATION nor with recent classes such as /UI2/CL_JSON, and B) it's the only one I've found so far which allows me to control case-sensitive names in the JSON, both for serializing and deserializing. That is lost with automated mapping.

        For serializing I did a quick fix; although it doesn't consider other characters which might need to be escaped apart from double quotes and backslash, I'm not sure other libraries do either (see https://stackoverflow.com/questions/3020094/how-should-i-escape-strings-in-json):

        method serialize_node.
        ...
            case    jsonnode->json_type.
              when  zcl_mdp_json_node=>co_json_string.
                l_json = jsonnode->value.                           "FIX A.BINDI 2020.11.30
                replace all occurrences of '\' in l_json with '\\'. "FIX A.BINDI 2020.11.30
                replace all occurrences of '"' in l_json with '\"'. "FIX A.BINDI 2020.11.30
                concatenate '"' l_json '"' into l_json.             "FIX A.BINDI 2020.11.30
        ...

        (EDITED 2020.11.30)
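        If needed, the same REPLACE approach could probably be extended to the control characters that JSON also requires to be escaped (untested on my side; the backslash replacement above must stay first):

            " additional JSON escapes, same approach as the fix above
            replace all occurrences of cl_abap_char_utilities=>cr_lf          in l_json with '\r\n'.
            replace all occurrences of cl_abap_char_utilities=>newline        in l_json with '\n'.
            replace all occurrences of cl_abap_char_utilities=>horizontal_tab in l_json with '\t'.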

         

        The bigger problem is at deserializing (DESERIALIZE_NODE), since this RegEx interprets an escaped double quote as the string ending:

        find regex '"([^"]*)"' in section offset l_offset of json

        I have learning regex on my to-do list, so I'm not sure how to fix it. But I think the logic should be: consider a double quote as the string ending ONLY if:

        1. It doesn’t have a backslash escaping it (1 position back) OR
        2. The backslash has another backslash escaping it (2 positions back)

        So:

        "This has an escaped double quote: \"" (the string value is really: This has an escaped double quote: ")

        "This has an escaped backslash: \\" (the string value is really: This has an escaped backslash: \)

        "This has an escaped backslash AND an escaped double quote: \\\"" (the string value is really: This has an escaped backslash AND an escaped double quote: \")

        Then the unescaping of the backslash itself should be taken care of.

        https://jsonformatter.curiousconcept.com is great for checking these cases out…

         

        Maybe some RegEx expert can help us to fix it!

        Thanks again

        Sandra Rossi
        find regex '"((?:\\.|[^\\"])*)"' in section offset l_offset of json

        EDIT: syntax error corrected.

        Alejandro Bindi

        Thank you Sandra, but your proposal throws an error: Regular expression '"(?:\\.|[^"])*)"' is invalid in character position 15. I've checked it in program DEMO_REGEX_TOY and it throws an error as well. Could you please check it?

        Thanks!

        Sandra Rossi

        Thx! Corrected, please check. (Corrections: the opening parenthesis to capture the value between the quotes was missing, and there was an error with \.)

        Alejandro Bindi

        Thanks again Sandra, it seems to be working well now!

        I did the following modifications to method DESERIALIZE_NODE (change in the regex + REPLACE statements for escaped chars, the inverse of the SERIALIZE_NODE fix above):

        ...
            when '"'.
              data l_submatch type string.
        
        *     FIND REGEX '"([^"]*)"' IN SECTION OFFSET l_offset OF json             "FIX S.ROSSI 2020.11.30
              find regex '"((?:\\.|[^\\"])*)"' in section offset l_offset of json   "FIX S.ROSSI 2020.11.30
              match offset l_offset match length l_len
              submatches l_submatch.
              if co_debug_mode = 1.
                write: / 'string:' , l_submatch.
              endif.
        
              replace all occurrences of '\"' in l_submatch with '"'.               "FIX A.BINDI 2020.11.30
              replace all occurrences of '\\' in l_submatch with '\'.               "FIX A.BINDI 2020.11.30
        
              create object l_jsonnode
        ...

         

        Fatih Pense I think this also solves this issue which I didn’t relate before: https://github.com/fatihpense/zcl_mdp_json/issues/9

        I’ll let you know if I find something more.

        Thanks again to you both!

         

        EDIT: Updated the credits in the comments and also updated the SERIALIZE_NODE fix above (the older version made the replacements on the node instance itself, which accumulated replacements over repeated SERIALIZE calls).

        Another minor fix: methods SERIALIZE_OBJECT and SERIALIZE_ARRAY can be safely removed from class ZCL_MDP_JSON_SERIALIZER (they are empty and not used anywhere).

        Fatih Pense
        Blog Post Author

        Dear Sandra Rossi and Alejandro Bindi,

        Thank you for your interest in the library and for the code discussion.

        I thought about these regexes in the past, but I figured that when I fix something, something else breaks! It can be a case study in why regex is not the best option for parsing things 🙂 As a quick fix, you can find a regex that works for you, test it, and continue using it.

        Recently, I have run some performance tests for reading a JSON string character by character. Since everything is stored in memory, it seems pretty fast. I can also trade some performance for correctness. I will give it a try. The only problem is time. If there are new developments, I will keep you updated.
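        Roughly the shape of the idea (not the actual abap-tasty-json code, just a sketch of scanning without regex; the example string is made up):

          DATA: lv_json      TYPE string,
                lv_off       TYPE i,
                lv_len       TYPE i,
                lv_char      TYPE c LENGTH 1,
                lv_in_string TYPE abap_bool,
                lv_strings   TYPE i.

          lv_json = '{"a":"b \" c","d":[1,2]}'.
          lv_len  = strlen( lv_json ).

          " Walk the string one character at a time; a flag tracks whether we are inside a string
          " literal, and a backslash simply skips the next character, so \" and \\ need no regex.
          WHILE lv_off < lv_len.
            lv_char = lv_json+lv_off(1).
            IF lv_in_string = abap_true.
              IF lv_char = '\'.
                lv_off = lv_off + 1.         "skip the escaped character
              ELSEIF lv_char = '"'.
                lv_in_string = abap_false.   "string literal closed
              ENDIF.
            ELSEIF lv_char = '"'.
              lv_in_string = abap_true.
              lv_strings = lv_strings + 1.
            ENDIF.
            lv_off = lv_off + 1.
          ENDWHILE.

          WRITE: / 'string literals found:', lv_strings.   "prints 3 for the example above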

        Best regards,
        Fatih Pense

         

        Alejandro Bindi

        Hello again, Fatih Pense: another issue and a quick fix for method ZCL_MDP_JSON_DESERIALIZER=>DESERIALIZE (you could document it on GitHub):

        On trying to deserialize an empty JSON string, you get the following short dump:

        STRING_OFFSET_TOO_LARGE
        CX_SY_RANGE_OUT_OF_BOUNDS
        Illegal access to a string (offset too large)

        I solved it by adding the following correction to raise the same exception used for other errors:

        method deserialize.
        
        
            data : l_jsonnode type ref to zcl_mdp_json_node.
        
        * correction >>>>>>>>    (FIX A.BINDI 2020.12.11)
          if json is initial.
            raise exception type zcx_mdp_json_invalid.
          endif.
        * correction <<<<<<<<    (FIX A.BINDI 2020.12.11)
        
            deserialize_node(
          exporting
            json = json
            offset_before = 0
          importing
            jsonnode = l_jsonnode ) .
        
            node = l_jsonnode.
        
        endmethod.

         

        Regards

         

        Fatih Pense
        Blog Post Author

        Hello Alejandro,

        Thank you for the report and the fix! I have created issue#12. It will be useful for the documentation.

        Regards,
        Fatih Pense

        María Elena Ramírez Martínez

        How can I deserialize all nodes without an index? For example, I need all the information from the node "quote_english", not only one. Thank you.

        Fatih Pense
        Blog Post Author

        Hi Maria, good question. Maybe I should make `array_children` returnable by a public method.

        I will continue new development in another repo, because regex was causing too many bugs: https://github.com/fatihpense/abap-tasty-json

        I created an issue for this use case:
        https://github.com/fatihpense/abap-tasty-json/issues/1
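        In the meantime, a workaround along these lines should be possible with the current class (a sketch that reuses l_json_root_object from the blog example; it assumes the node's array_children table is readable from the outside, and if it is not, that is exactly what a public accessor would provide):

          DATA: lo_books TYPE REF TO zcl_mdp_json_node,
                lo_book  TYPE REF TO zcl_mdp_json_node,
                lv_quote TYPE string,
                lv_count TYPE i,
                lv_index TYPE i VALUE 1.

          lo_books = l_json_root_object->object_get_child_node( key = 'books' ).
          lv_count = lines( lo_books->array_children ).

          " Visit every element of the array instead of a single index.
          DO lv_count TIMES.
            lo_book  = lo_books->array_get_child_node( index = lv_index ).
            lv_quote = lo_book->object_get_child_node( key = 'quote_english' )->value.
            WRITE: / lv_quote.
            lv_index = lv_index + 1.
          ENDDO.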

        Regards,
        Fatih

        Alejandro Bindi

        Hello again Fatih Pense, Sandra Rossi and all. After a while I have found two other bugs in the deserializer code. For one of them I have a quick fix, but the other one involves a regex as before, so I'm not finding the solution so easily... maybe you can lend a hand.

         

        • First one: this JSON throws an exception in method DESERIALIZE_OBJECT:
        				{
        					"applicationDate": null,
        					"applicationQuantity": 0.0,
        					"lackPeriodDay": 0
        				}

        After using this REGEX:

        * require a key
              FIND REGEX '\A\s*"([^:]*)"\s*:' IN SECTION OFFSET l_offset OF json
                                MATCH OFFSET l_offset MATCH LENGTH l_len
                                SUBMATCHES l_submatch.

        It seems the decimal separator is not covered by the possible values. I've been trying for a while but I'm struggling to fix it.

         

        • Second one: this JSON also throws the exception, and I've narrowed it down to the spaces between colons and object values:
        {
        	"movBatchItem": [
        		{
        			"downgrade": false,
        			"movBatchItemBins": [
        				{
        					"downgrade": false
        				}
        			],
        			"movBatchItemQuality": [
        				{
        					"Downgrade": false
        				}
        			],
        			"movBatchItemTreatment": [
        				{
        					"Downgrade": null
        				}
        			]
        		}
        	]
        }

        So my quick fix is to do this simple replace on the whole JSON string before calling deserialize:

        REPLACE ALL OCCURRENCES OF `": ` IN g_json WITH `":`.

        ...but there's probably a better solution, and it also doesn't handle escaped double quotes, which would be the case for a text containing this sequence.

         

        Would appreciate any help.

        Thank you!

        Sandra Rossi

        You'd better report the issue to the GitHub project. I'd say that the regex

        \A\s*"([^:]*)"\s*:

        is strange because of ^: which I think should be ^" (just a feeling) :

        \A\s*"([^"]*)"\s*:

         

        Alejandro Bindi

        Hello Sandra, I'm not a GitHub user yet, and since we were already discussing all these kinds of bugs here, I thought it would be better to continue like this; then Fatih can upload it to GitHub as before.

        I've been trying to solve this on my own. I took a regex tutorial and reviewed the class code, and I'm still far from even a rookie on the subject, but I now think the problem is actually again in the DESERIALIZE_NODE method, where a number type node is determined according to the first character ("WHEN OTHERS. FIND REGEX '\d+' IN SECTION OFFSET l_offset OF json", near the end):

        METHOD deserialize_node.
        
          DATA l_json TYPE string.
          l_json = json.
        
          DATA l_offset TYPE i.
          l_offset = offset_before.
        
        *correction >>>>>>>>   https://github.com/fatihpense/zcl_mdp_json/issues/6
          DATA lv_offset_null TYPE i.
          lv_offset_null = offset_before.
        *correction <<<<<<<<   https://github.com/fatihpense/zcl_mdp_json/issues/6
        
          DATA l_len TYPE i.
        
          DATA : l_jsonnode TYPE REF TO zcl_mdp_json_node.
        
          FIND REGEX '\{|\[|"|\d|t|f' IN SECTION OFFSET l_offset OF json
          MATCH OFFSET l_offset.
        
        *correction >>>>>>>>   https://github.com/fatihpense/zcl_mdp_json/issues/6
          FIND REGEX '\{|\[|n|\d|t|f' IN SECTION OFFSET lv_offset_null OF json
          MATCH OFFSET lv_offset_null.
          IF lv_offset_null LT l_offset.
            l_offset = lv_offset_null.
          ENDIF.
        *correction <<<<<<<<   https://github.com/fatihpense/zcl_mdp_json/issues/6
        
          CASE l_json+l_offset(1).
            WHEN '{'.
              l_offset = l_offset + 1.
              deserialize_object( EXPORTING json = l_json offset_before = l_offset IMPORTING jsonnode = l_jsonnode offset_after = l_offset ).
              jsonnode = l_jsonnode.
              offset_after = l_offset.
            WHEN '['.
              l_offset = l_offset + 1.
              deserialize_array( EXPORTING json = l_json offset_before = l_offset IMPORTING jsonnode = l_jsonnode offset_after = l_offset ).
              jsonnode = l_jsonnode.
              offset_after = l_offset.
            WHEN '"'.
              DATA l_submatch TYPE string.
        
        * correction >>>>>>>>   https://github.com/fatihpense/zcl_mdp_json/issues/11 (FIX A.BINDI 2020.11.30)
        *     FIND REGEX '"([^"]*)"' IN SECTION OFFSET l_offset OF json             "FIX S.ROSSI 2020.11.30
              FIND REGEX '"((?:\\.|[^\\"])*)"' IN SECTION OFFSET l_offset OF json   "FIX S.ROSSI 2020.11.30
              MATCH OFFSET l_offset MATCH LENGTH l_len
              SUBMATCHES l_submatch.
              IF co_debug_mode = 1.
                WRITE: / 'string:' , l_submatch.
              ENDIF.
        
              REPLACE ALL OCCURRENCES OF '\"' IN l_submatch WITH '"'.               "FIX A.BINDI 2020.11.30
              REPLACE ALL OCCURRENCES OF '\\' IN l_submatch WITH '\'.               "FIX A.BINDI 2020.11.30
        * correction <<<<<<<<   https://github.com/fatihpense/zcl_mdp_json/issues/11 (FIX A.BINDI 2020.11.30)
        
              CREATE OBJECT l_jsonnode
                TYPE
                  zcl_mdp_json_node
                EXPORTING
                  json_type         = zcl_mdp_json_node=>co_json_string.
              l_jsonnode->value = l_submatch .
        
              offset_after = l_offset + l_len.
            WHEN 't'.
              IF l_json+l_offset(4) = 'true'.
                CREATE OBJECT l_jsonnode
                  TYPE
                    zcl_mdp_json_node
                  EXPORTING
                    json_type         = zcl_mdp_json_node=>co_json_true.
                l_jsonnode->value = l_json+l_offset(4).
                offset_after = l_offset + 4.
        
                IF co_debug_mode = 1.
                  WRITE: / 'true'  .
                ENDIF.
              ELSE.
                RAISE EXCEPTION TYPE zcx_mdp_json_invalid.
              ENDIF.
        
            WHEN 'n'.
              IF l_json+l_offset(4) = 'null'.
                CREATE OBJECT l_jsonnode
                  TYPE
                    zcl_mdp_json_node
                  EXPORTING
                    json_type         = zcl_mdp_json_node=>co_json_null.
                l_jsonnode->value = l_json+l_offset(4).
                offset_after = l_offset + 4.
        
                IF co_debug_mode = 1.
                  WRITE: / 'null'  .
                ENDIF.
              ELSE.
                RAISE EXCEPTION TYPE zcx_mdp_json_invalid.
              ENDIF.
            WHEN 'f'.
              IF l_json+l_offset(5) = 'false'.
                CREATE OBJECT l_jsonnode
                  TYPE
                    zcl_mdp_json_node
                  EXPORTING
                    json_type         = zcl_mdp_json_node=>co_json_false.
                l_jsonnode->value = l_json+l_offset(5).
                offset_after = l_offset + 5.
        
                IF co_debug_mode = 1.
                  WRITE: / 'false'  .
                ENDIF.
              ELSE.
                RAISE EXCEPTION TYPE zcx_mdp_json_invalid.
              ENDIF.
            WHEN OTHERS.
              FIND REGEX '\d+' IN SECTION OFFSET l_offset OF json
              MATCH OFFSET l_offset MATCH LENGTH l_len.
              IF co_debug_mode = 1.
                WRITE: / 'number:'  , l_json+l_offset(l_len).
              ENDIF.
        
              CREATE OBJECT l_jsonnode
                TYPE
                  zcl_mdp_json_node
                EXPORTING
                  json_type         = zcl_mdp_json_node=>co_json_number.
              l_jsonnode->value = l_json+l_offset(l_len).
              offset_after = l_offset + l_len.
          ENDCASE.
        
          jsonnode = l_jsonnode.
        ENDMETHOD.

         

        I think this FIND REGEX '\d+' should be replaced, considering that number nodes can have a dot as decimal separator (as in my case), an optional negative sign in front, and an "e" for exponential notation, as for example this link states: https://json-schema.org/understanding-json-schema/reference/numeric.html

        I've been trying to build something like '-?(\d+\.\d+|\d+)' but I'm not sure it's OK... and the e would still be missing...
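        Maybe something closer to the JSON number grammar itself would do, along these lines (untested on my side):

            " sign, integer part, optional fraction, optional exponent
            FIND REGEX '-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?' IN SECTION OFFSET l_offset OF json
              MATCH OFFSET l_offset MATCH LENGTH l_len.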

        Comments or corrections appreciated...

        Thank you!

        Sandra Rossi

        https://stackoverflow.com/questions/13340717/json-numbers-regular-expression

        Fatih Pense
        Blog Post Author

        Hello Sandra Rossi,

        Thanks for responding in the comments for years and being a valuable contributor to the community.

        I have updated the library under another name; the new version doesn't use regex, so at least bugs are solvable when they occur. I also plan to play with ABAP unit testing.

        If you wonder how it is done without regex, here is the deserializer class: https://github.com/fatihpense/abap-tasty-json/blob/main/src/zcl_tasty_json_deserializer.clas.abap

        Hopefully, we won't have to solve regex problems and can look into more interesting issues to tinker with.

        I might publish a blog post starting with the phrase "Use the standard if you can!"

        I'm open to any advice/ideas.

        Best regards,
        Fatih

        Alban Leong

        Just wanted to drop in a comment to say that I absolutely love this library... it makes traversing a JSON string so logical and so much easier!!! Thanks!