
In the first part of this blog I give an introduction to OpenXML in word processing. In the second part I provide ABAP code that reads and changes a Word file.

Starting with Microsoft Word 2007, when you create a new document in Word and save it, a new file is created with the extension “*.docx”. This file is a ZIP archive of XML files that describe the whole Word document: text, tables, font sizes, colors, comments, margin settings, section settings, and everything else the user placed and maintained in the document. The XML files are linked to one another through relationships in a specific structure and zipped into one file.

To explore this structure, create a test document with some content and save it. Rename the extension from “*.docx” to “*.zip” and unzip the file. After unzipping you see all the XML files in the specified structure. If you need to look at these XML files often, I recommend a more convenient way: install OOXML Tool, which is an add-on for the Chrome browser. With simple drag and drop you can see the whole Word document.

For example, I created Test.docx with the text “Hello world”. Note that a new Word file has a size of 0 until you enter some content. I drag the Word file into Chrome, using the add-on mentioned above, to see the XML structure of the Word document. I look for /word/document.xml to see the text tag that holds the value “Hello world”.

Each XML file describes properties of document parts or the relationship between parts. For example:

  • [Content_Types].xml describes the type of content used in each part of the whole document (package)
  • the _rels part describes the relationship between two parts
  • the docProps part describes general properties of the document in the app and core XML files (application, author, version…)
  • the custom XML part can hold customer-specific data; this will be described in more detail in another blog
  • the content of the document is in the /word/document.xml file
  • fontTable.xml contains information about the font types used
  • styles.xml describes the styles used
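
Because the package is an ordinary ZIP archive, the parts listed above can also be listed from ABAP. The following is only a minimal sketch, assuming the standard class CL_ABAP_ZIP is available and that the file content is already in an XSTRING. The class name lcl_docx_lister and the method get_part_names are hypothetical names chosen for illustration, and the inline declaration needs ABAP 7.40 or later.

```abap
CLASS lcl_docx_lister DEFINITION.
  PUBLIC SECTION.
    " iv_docx: binary content of a .docx file (for example after an upload)
    CLASS-METHODS get_part_names
      IMPORTING iv_docx         TYPE xstring
      RETURNING VALUE(rt_names) TYPE string_table.
ENDCLASS.

CLASS lcl_docx_lister IMPLEMENTATION.
  METHOD get_part_names.
    DATA lo_zip TYPE REF TO cl_abap_zip.

    " A .docx file is a ZIP archive, so CL_ABAP_ZIP can open it directly
    CREATE OBJECT lo_zip.
    lo_zip->load(
      EXPORTING
        zip             = iv_docx
      EXCEPTIONS
        zip_parse_error = 1
        OTHERS          = 2 ).
    IF sy-subrc <> 0.
      RETURN.  " not a ZIP package: return an empty list
    ENDIF.

    " Each entry in FILES is one file of the package, e.g. word/document.xml
    LOOP AT lo_zip->files INTO DATA(ls_file).
      APPEND ls_file-name TO rt_names.
    ENDLOOP.
  ENDMETHOD.
ENDCLASS.
```

For a document like Test.docx, the returned list would typically contain names such as [Content_Types].xml, _rels/.rels and word/document.xml.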

SAP provides the class CL_DOCX_DOCUMENT, which helps us read and modify a Word document and go through its structure. Here is simple code that does the job.

*&---------------------------------------------------------------------*
*& Report  ZDOCX_DOCUMENT
*&
*&---------------------------------------------------------------------*
*& Report demonstrates using CL_DOCX_DOCUMENT class to read and maintain
*& word document.
*& Pavol Olejar 23.4.2017
*&---------------------------------------------------------------------*
REPORT zdocx_document.

DATA: lv_length   TYPE i,
      lt_data_tab TYPE STANDARD TABLE OF x255,
      lv_docx     TYPE xstring,
      lv_string   TYPE string,
      lv_xml      TYPE xstring,
      lr_docx     TYPE REF TO cl_docx_document,
      lr_main     TYPE REF TO cl_docx_maindocumentpart.
* Upload file
CALL METHOD cl_gui_frontend_services=>gui_upload
  EXPORTING
    filename   = 'C:\Test.docx'
    filetype   = 'BIN'
  IMPORTING
    filelength = lv_length
  CHANGING
    data_tab   = lt_data_tab.
* Get XSTRING format from BIN table
CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
  EXPORTING
    input_length = lv_length
  IMPORTING
    buffer       = lv_docx
  TABLES
    binary_tab   = lt_data_tab.
* Instantiate word document in ABAP class CL_DOCX_DOCUMENT
CALL METHOD cl_docx_document=>load_document
  EXPORTING
    iv_data = lv_docx
  RECEIVING
    rr_doc  = lr_docx.
* Get main part where content of word document is stored
lr_main = lr_docx->get_maindocumentpart( ).
* Get data (XSTRING) of main part
lv_xml = lr_main->get_data( ).
* Convert to string for simple maintaining
CALL FUNCTION 'CRM_IC_XML_XSTRING2STRING'
  EXPORTING
    inxstring = lv_xml
  IMPORTING
    outstring = lv_string.
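* Change text. The search is case-sensitive and must match the text in
* document.xml exactly; Word may also split a sentence across several tags.
REPLACE FIRST OCCURRENCE OF 'Hello world.' IN lv_string
WITH 'Hello world. This is my Test_new.docx document.'.
* Convert back to XSTRING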
CALL FUNCTION 'SCMS_STRING_TO_XSTRING'
  EXPORTING
    text   = lv_string
  IMPORTING
    buffer = lv_xml.
* Replace main part with new data and save it
lr_main->feed_data( iv_data = lv_xml ).
lv_docx = lr_docx->get_package_data( ).
* Save new word document locally
lv_length  = xstrlen( lv_docx ).

CALL FUNCTION 'SCMS_XSTRING_TO_BINARY'
  EXPORTING
    buffer     = lv_docx
  TABLES
    binary_tab = lt_data_tab.

CALL METHOD cl_gui_frontend_services=>gui_download
  EXPORTING
    bin_filesize      = lv_length
    filename          = 'C:\Test_new.docx'
    filetype          = 'BIN'
    confirm_overwrite = 'X'
  CHANGING
    data_tab          = lt_data_tab.

The get*part methods of the class return the different parts of the document. Here we were interested in the main part.

The method get_data( ) returns the XML file of a part, and the method feed_data( ) stores XML in that part of the document. These methods belong to every class that represents a part of the document. In our case it is CL_DOCX_MAINDOCUMENTPART, which you can inspect in the debugger.
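
The get_data( ) / feed_data( ) round trip can be wrapped in a small helper. This is a sketch, not the only way to do it. It swaps the function modules used in the report for CL_ABAP_CODEPAGE, assumes the part is UTF-8 encoded, and escapes the texts with the built-in function escape( ) because text inside XML is stored escaped. The class lcl_docx_text and the method replace_text are hypothetical names.

```abap
CLASS lcl_docx_text DEFINITION.
  PUBLIC SECTION.
    " Replaces the first occurrence of IV_OLD in the XML of IO_PART.
    " Returns ABAP_TRUE when something was replaced.
    CLASS-METHODS replace_text
      IMPORTING io_part        TYPE REF TO cl_docx_maindocumentpart
                iv_old         TYPE string
                iv_new         TYPE string
      RETURNING VALUE(rv_done) TYPE abap_bool
      RAISING   cx_sy_conversion_codepage.
ENDCLASS.

CLASS lcl_docx_text IMPLEMENTATION.
  METHOD replace_text.
    DATA: lv_xml    TYPE xstring,
          lv_string TYPE string,
          lv_old    TYPE string,
          lv_new    TYPE string.

    IF iv_old IS INITIAL.
      RETURN.  " nothing to search for
    ENDIF.

    " Text in XML is stored escaped (& as &amp;, < as &lt;), so escape both
    lv_old = escape( val = iv_old format = cl_abap_format=>e_xml_text ).
    lv_new = escape( val = iv_new format = cl_abap_format=>e_xml_text ).

    " Read the part and decode it; UTF-8 is an assumption here
    lv_xml    = io_part->get_data( ).
    lv_string = cl_abap_codepage=>convert_from( source   = lv_xml
                                                codepage = `UTF-8` ).

    " The search is case-sensitive by default
    REPLACE FIRST OCCURRENCE OF lv_old IN lv_string WITH lv_new.
    IF sy-subrc <> 0.
      RETURN.  " text not found: leave the part untouched
    ENDIF.

    " Encode again and write the XML back into the part
    lv_xml = cl_abap_codepage=>convert_to( source   = lv_string
                                           codepage = `UTF-8` ).
    io_part->feed_data( iv_data = lv_xml ).
    rv_done = abap_true.
  ENDMETHOD.
ENDCLASS.
```

In the report this could be called as lcl_docx_text=>replace_text( io_part = lr_main iv_old = `Hello world.` iv_new = `Hello world. This is my Test_new.docx document.` ). Keep in mind that Word can split one sentence across several text tags, in which case a plain string search will not find it.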

The method get_package_data( ) of class CL_DOCX_DOCUMENT saves all current parts and packs them into a ZIP file.

You can check that in the debugger by looking at the variables lv_xml and lv_docx in the XML browser view. For the variable lv_xml you see the XML file of the main part.

For lv_docx you are prompted with a pop-up asking whether you want to save the ZIP file that results from the get_package_data( ) method.
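
The same check can be sketched outside the debugger. The example below assumes that CL_ABAP_ZIP and CL_ABAP_BROWSER are available and that SAP GUI is running, since the browser needs it. It takes the packed document, extracts one part by name and displays its XML. The class lcl_docx_viewer and the method show_part are hypothetical names.

```abap
CLASS lcl_docx_viewer DEFINITION.
  PUBLIC SECTION.
    " iv_docx: packed document, e.g. the result of get_package_data( )
    " iv_name: name of the part inside the ZIP file
    CLASS-METHODS show_part
      IMPORTING iv_docx TYPE xstring
                iv_name TYPE string DEFAULT `word/document.xml`.
ENDCLASS.

CLASS lcl_docx_viewer IMPLEMENTATION.
  METHOD show_part.
    DATA: lo_zip TYPE REF TO cl_abap_zip,
          lv_xml TYPE xstring.

    " Open the package as a ZIP archive
    CREATE OBJECT lo_zip.
    lo_zip->load(
      EXPORTING
        zip    = iv_docx
      EXCEPTIONS
        OTHERS = 1 ).
    IF sy-subrc <> 0.
      RETURN.  " not a valid ZIP package
    ENDIF.

    " Extract the requested part
    lo_zip->get(
      EXPORTING
        name    = iv_name
      IMPORTING
        content = lv_xml
      EXCEPTIONS
        OTHERS  = 1 ).
    IF sy-subrc <> 0.
      RETURN.  " part not found or could not be decompressed
    ENDIF.

    " Display the XML of the part in a browser window (needs SAP GUI)
    cl_abap_browser=>show_xml( xml_xstring = lv_xml ).
  ENDMETHOD.
ENDCLASS.
```

Calling lcl_docx_viewer=>show_part( iv_docx = lv_docx ) after get_package_data( ) should show the changed text if the replacement worked.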

In my next blog I will describe the custom XML part of a Word document and how an ABAP developer can use it.


11 Comments


  1. Athavan Raja Durairaj

    Excellent blog. A good ABAP tool for listing the package is available: check program ROPENXML_LISTER. Also, I have enhanced CL_DOCX_ALTERNATIVEFORMATPART to allow DOCX and MHT files as altChunks. I have written a few transformations to update chart data and replace content controls. I hope to write a blog about it sometime soon.

  2. Andre Adam

    Hello colleagues,

    as far as I understand, this is only possible if the GUI frontend services are available, right?

    Are there also solutions if we work with UI5, OData and HANA, that is, without any SAP GUI services?

    Regards

    André

    1. Pavol Olejar Post author

      Hi Andre,

      I can imagine that this might be possible if you do it as a web service that your application consumes. But take this only as an option to explore; I do not have much experience with it. Actually, Java and C offer many more possibilities, and more convenient ways, to work with altChunk. To be able to process this in ABAP, I first had to explore sample code in C and Java; you can find plenty of it on the web.

      Regards

      Pavol

      1. Andre Adam

         

        Hello Pavol,

        thanks for your answer, but I do not understand how Java and C come in here. I am searching for an ABAP solution without SAP GUI installed. We use NetWeaver 7.50 with OData (Gateway), and because of this we have no SAP GUI.

        Regards

        André

        1. Pavol Olejar Post author

          Hi Andre,

          I was probably not clear. I am not so familiar with SAP UI5, but as far as I know JavaScript is a big part of it, which is also not my cup of tea, yet :). So it came to my mind that, instead of relying only on ABAP, you could implement some kind of “service” that does the job and call it from your UI5 application. This service does not need to be limited to ABAP; you could maybe use Java or C to code it. Then make this service available. But I repeat, take this only as a suggestion; it is not something I have tried before.

          Regards

          Pavol

          1. Andre Adam

            Hello Pavol,

            you are right, JavaScript is used in UI5, but we can’t convert there. We are searching for a solution where we can convert an xstring, which is an Excel file, into ABAP internal tables in ABAP.

            We get the xstring using OData in our ABAP stack and need to convert it there into ABAP internal tables. We have no Java or C in our stack, so we need an ABAP solution. That is why I ask whether your solution is usable in ABAP, but without SAP GUI.

            We can’t use cl_gui_frontend_services, but I’m not sure if we can use cl_docx_document. Is it possible to use the class cl_docx_document without having cl_gui_frontend_services in place?

            Regards

            André

            1. Pavol Olejar Post author

               

              Hi Andre,

              Once you have the XSTRING, it is all you need as input for class CL_DOCX_DOCUMENT. If you have this class (standard or a Z one) in your ABAP stack, I think you can use it without SAP GUI. Just try it.

              Regards

              Pavol

            2. Durairaj Athavan Raja

               

              You can do that. Use the xstring with the cl_docx_document class to parse through and get the sheet part (CL_XLSX_WORKSHEETPART) and get the sheet data in XML format. You can then use a custom ST or XSLT to transform the XML to an internal table.
