OpenXML in word processing – how to merge multiple word documents into one using altChunk

paly_o · ‎04-25-2017

In my previous blogs, Custom XML part – mapping flat data and Custom XML part – mapping structured data, I discussed custom XML parts and how to use them to bind custom data in a document. Please read those blogs before going through this one, and if you are new to the topic, I recommend starting with Introduction to OpenXML in ABAP. Here I will show how to merge multiple Word documents into one final document. By merging I mean putting separately stored Word document files into one big Word file, one after another.

There are two ways to do this. You can write a program that reads the whole documents tag by tag and puts their content into the other document. During this process you will have to deal with plenty of issues such as comments, styling, margins, footers, headers and many other things. It is similar to manually copying the content of one Word document into another using Ctrl+C and Ctrl+V. The second, much easier way is to use a tag called altChunk.

AltChunk stands for alternative chunk. It does most of the job for you, which makes it a very useful and powerful technique in comparison to the first option. The altChunk tag tells the application to import content stored in an alternative part of the document into the main document part, at the position where the tag is placed. Microsoft Word is able to import several content types with the altChunk tag: HTML, RTF, XHTML, XML, plain text, macro-enabled documents, Word templates or another Word document. We are interested in the last option.

I investigated whether SAP offers any standard functionality that can be used to "altChunk" several documents into one final document, and I came to the conclusion that there is none. The standard ABAP class for working with Word documents in the OpenXML approach is CL_DOCX_DOCUMENT, but it does not offer the options needed for the altChunk technique.

In order to altChunk one document into another you have to:

create a new alternative part with a unique ID for the document to be merged

store the content of the document you want to merge/import in the newly created alternative part

create a new relationship in the final document between the main document part (final merged document) and the alternative part (document to be merged)

add the altChunk tag to the main document part
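
To make the last two steps more concrete, here is a minimal sketch that only builds the two XML fragments involved as strings and prints them. The ID rId10 and the target name afchunk1.docx are made-up values for illustration; in the real program both are generated by the document classes. The report name is hypothetical as well.

```abap
REPORT zaltchunk_fragments LINE-SIZE 255.

* Hypothetical values, for illustration only
DATA: lv_id       TYPE string VALUE 'rId10',
      lv_target   TYPE string VALUE 'afchunk1.docx',
      lv_relation TYPE string,
      lv_chunk    TYPE string.

* Relationship between the main document part and the alternative part,
* as it would appear in the relationships file of the main document part
CONCATENATE '<Relationship Id="' lv_id '"'
            ' Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/aFChunk"'
            ' Target="' lv_target '"/>'
       INTO lv_relation.

* altChunk tag in the main document part; r:id must match the relationship Id
CONCATENATE '<w:altChunk r:id="' lv_id '"/>' INTO lv_chunk.

WRITE: / lv_relation,
       / lv_chunk.
```

The important point is that the r:id attribute of the altChunk tag and the Id of the relationship carry the same value. This is how Word finds the alternative part to import.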

To be able to do this we will enhance, copy and change a couple of standard classes. Before you use any code from this blog, you have to go through the steps in altChunk - preparation of code, where I describe all the preparation steps needed to use altChunk.

Once you have finished the code preparation, let's prepare some test documents for merging. I created 3 documents to be merged and 1 empty document that serves as the final document.

merge1.docx with simple text "Hello World."

merge2.docx with simple text and section break - next page.

merge3.docx with simple text.

final.docx as an empty document - make sure the file size is not 0 (it must be a real Word document, just without content).

The documents are ready, so we can test the following code.

*&---------------------------------------------------------------------*
*& Report ZALTCHUNK
*&---------------------------------------------------------------------*
*& Report demonstrates altChunk usage in ABAP
*& Pavol Olejar 23.4.2017
*&---------------------------------------------------------------------*
REPORT zaltchunk.

DATA: lr_merge1        TYPE REF TO cl_docx_document,
      lr_merge2        TYPE REF TO cl_docx_document,
      lr_merge3        TYPE REF TO cl_docx_document,
      lr_final         TYPE REF TO zcl_docx_document,
      lr_main          TYPE REF TO zcl_docx_maindocumentpart,
      lr_altpart1      TYPE REF TO cl_docx_alternativeformatpart,
      lr_altpart2      TYPE REF TO cl_docx_alternativeformatpart,
      lr_altpart3      TYPE REF TO cl_docx_alternativeformatpart,
      docx             TYPE xstring,
      mainx            TYPE xstring,
      lv_id            TYPE string,
      s                TYPE string,
      lv_current_chunk TYPE string,
      lv_replace       TYPE string,
      lv_length        TYPE i,
      lt_data_tab      TYPE STANDARD TABLE OF x255.

* READ final document. Note we are using z-class.
PERFORM load_file USING 'C:\final.docx'
                  CHANGING docx.
lr_final = zcl_docx_document=>load_document( iv_data = docx ).
lr_main = lr_final->get_maindocumentpart( ).

* ADD alternative parts
lr_altpart1 = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).
lr_altpart2 = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).
lr_altpart3 = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).

* Read document to be merged/inserted
PERFORM load_file USING 'C:\merge1.docx'
                  CHANGING docx.
* Provide data to store in alternative part
lr_altpart1->feed_data( iv_data = docx ).

* REPEAT for 2nd and 3rd file
PERFORM load_file USING 'C:\merge2.docx'
                  CHANGING docx.
lr_altpart2->feed_data( iv_data = docx ).

PERFORM load_file USING 'C:\merge3.docx'
                  CHANGING docx.
lr_altpart3->feed_data( iv_data = docx ).

* Get xml of main part to insert altChunk tags using string operations
mainx = lr_main->get_data( ).
CALL FUNCTION 'CRM_IC_XML_XSTRING2STRING'
  EXPORTING
    inxstring = mainx
  IMPORTING
    outstring = s.

* Prepare alt chunk tags, one per alternative part, in merge order
lv_id = lr_main->get_id_for_part( lr_altpart1 ).
CONCATENATE '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.
lv_id = lr_main->get_id_for_part( lr_altpart2 ).
CONCATENATE lv_current_chunk '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.
lv_id = lr_main->get_id_for_part( lr_altpart3 ).
CONCATENATE lv_current_chunk '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.

* Wrap alt chunk tags in a new body tag
CONCATENATE '<w:body>'
            lv_current_chunk
            '</w:body>' INTO lv_replace.

* Replace body tag (the whole content of the body is replaced)
REPLACE FIRST OCCURRENCE OF REGEX '<w:body>.*</w:body>' IN s WITH lv_replace.
CALL FUNCTION 'CRM_IC_XML_STRING2XSTRING'
  EXPORTING
    instring   = s
  IMPORTING
    outxstring = mainx.

* Provide new main part with alt chunk tags and save document
lr_main->feed_data( iv_data = mainx ).
docx = lr_final->get_package_data( ).
lv_length  = xstrlen( docx ).

CALL FUNCTION 'SCMS_XSTRING_TO_BINARY'
  EXPORTING
    buffer     = docx
  TABLES
    binary_tab = lt_data_tab.

CALL METHOD cl_gui_frontend_services=>gui_download
  EXPORTING
    bin_filesize      = lv_length
    filename          = 'C:\final_new.docx'
    filetype          = 'BIN'
    confirm_overwrite = 'X'
  CHANGING
    data_tab          = lt_data_tab.

* Uploads a file from the frontend and returns its content as xstring.
* Uses the global variables lv_length and lt_data_tab as work areas.
FORM load_file  USING path TYPE string
                CHANGING docx TYPE xstring.

  CALL METHOD cl_gui_frontend_services=>gui_upload
    EXPORTING
      filename   = path
      filetype   = 'BIN'
    IMPORTING
      filelength = lv_length
    CHANGING
      data_tab   = lt_data_tab
    EXCEPTIONS
      OTHERS     = 19.
* Stop here if the file could not be read
  IF sy-subrc <> 0.
    MESSAGE 'File upload failed' TYPE 'E'.
  ENDIF.

  CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
    EXPORTING
      input_length = lv_length
    IMPORTING
      buffer       = docx
    TABLES
      binary_tab   = lt_data_tab
    EXCEPTIONS
      OTHERS       = 2.
  IF sy-subrc <> 0.
    MESSAGE 'Conversion to xstring failed' TYPE 'E'.
  ENDIF.
ENDFORM.

The resulting final_new.docx document should look like this:

1st page

2nd page

In this example you can see that altChunk imports the content of the alternative parts one by one, in the order of the altChunk tags. The content of the 1st and 2nd documents follows one after another on the first page. The section break - next page that I used moves the content of the 3rd document to a new page.

There is a lot of room to play with this, so take it as a very simple example of how to use altChunk. You can play with page orientation, footers and headers, page numbering or different section breaks to achieve what you need.
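
One possible variation, shown here only as a sketch with string operations: build the altChunk tags in a loop and insert them directly after the opening body tag instead of replacing the whole body. The relationship IDs and the sample XML are made up for illustration; in the report above the IDs come from get_id_for_part( ) and the XML from get_data( ). The report name is hypothetical.

```abap
REPORT zaltchunk_tags LINE-SIZE 255.

DATA: lt_ids    TYPE STANDARD TABLE OF string,
      lv_id     TYPE string,
      lv_chunks TYPE string,
      lv_new    TYPE string,
      lv_xml    TYPE string.

* Hypothetical relationship IDs in merge order
lv_id = 'rId10'. APPEND lv_id TO lt_ids.
lv_id = 'rId11'. APPEND lv_id TO lt_ids.
lv_id = 'rId12'. APPEND lv_id TO lt_ids.

* Simplified stand-in for the XML of the main document part
lv_xml = '<w:body><w:p/><w:sectPr/></w:body>'.

* One altChunk tag per ID
LOOP AT lt_ids INTO lv_id.
  CONCATENATE lv_chunks '<w:altChunk r:id="' lv_id '"/>' INTO lv_chunks.
ENDLOOP.

* Insert the tags right after the opening body tag, so that existing
* paragraphs and the section properties (w:sectPr) stay in place
CONCATENATE '<w:body>' lv_chunks INTO lv_new.
REPLACE FIRST OCCURRENCE OF '<w:body>' IN lv_xml WITH lv_new.

WRITE / lv_xml.
```

The REGEX replacement in the report replaces everything between the body tags, including the section properties of final.docx. This variation keeps them, which may matter once you start playing with page orientation or headers and footers of the final document.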

Also note that if you open the final document (after creating it using ABAP) in the Word application and save it again, the altChunk tags will be gone. All imported content will be saved under standard tags, and the relationships to the alternative parts will be gone as well.
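
If you want to check whether a document still contains altChunk tags, one simple approach is to count them in the XML of the main document part. The sketch below works on a made-up sample string; in a real program the XML would come from get_data( ) of the main document part, converted to a string as in the report above. The report name is hypothetical.

```abap
REPORT zaltchunk_check.

DATA: lv_xml   TYPE string,
      lv_count TYPE i.

* Made-up sample of a main document part with two altChunk tags
lv_xml = '<w:body><w:altChunk r:id="rId10"/><w:altChunk r:id="rId11"/></w:body>'.

* Count all altChunk tags in the XML
FIND ALL OCCURRENCES OF '<w:altChunk' IN lv_xml MATCH COUNT lv_count.

IF lv_count > 0.
  WRITE: / 'altChunk tags found:', lv_count.
ELSE.
  WRITE / 'No altChunk tags found, content is stored under standard tags.'.
ENDIF.
```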
