Pavol Olejar

OpenXML in word processing – how to merge multiple word documents into one using altChunk

In my previous blogs Custom XML part – mapping flat data and Custom XML part – mapping structured data I discussed custom XML parts and how to use them to bind custom data in a document. Please read those blogs before going through this one, and if you are new to this topic I recommend starting with the introduction to OpenXML in ABAP. Here I will show how to merge multiple Word documents into one final document. By merging I mean putting separately stored Word files into one big Word file, one after another.

There are two ways to do this. You can write a program which reads the whole documents tag by tag and puts the content into the other document. During this process you will have to deal with plenty of issues such as comments, styling, margins, footers, headers and many other things. It is similar to manually copying the content of one Word document into another using Ctrl+C and Ctrl+V. The second, much easier way is to use a tag called altChunk.

AltChunk stands for alternative chunk. It does most of the job for you and is a very useful and powerful technique in comparison to the first option. The altChunk tag tells the application to import content stored in an alternative part of the document into the main document part, at the position where the tag is used. Microsoft Word is able to import several content types with the altChunk tag: HTML, RTF, XHTML, XML, plain text, macro, Word template or another Word document. We are interested in the last option.

I investigated whether SAP offers any standard functionality that can be used to “altChunk” several documents into one final document, and I came to the conclusion that there is none. The standard ABAP class that works with Word documents in the OpenXML approach is CL_DOCX_DOCUMENT, but it does not offer the options needed to use the altChunk technique.

In order to altChunk one document into another you have to:

  • create a new alternative part with a unique ID for the document to be merged (merging document)
  • store the content of the document which you want to merge/import in the newly created alternative part
  • create a new relation in the final document between the main document part (final merged document) and the alternative part (merging document)
  • add the altChunk tag to the main document part

To be able to do this we will enhance, copy and change a couple of standard classes. Before you use any code from this blog you have to go through the steps in altChunk – preparation of code, where I describe all preparation steps needed to use altChunk.

Once finished with the code preparation, let’s prepare some test documents for merging. I created 3 documents for merging and 1 empty document into which they will be merged.

merge1.docx with simple text “Hello World.”

merge2.docx with simple text and section break – next page.

merge3.docx with simple text.

final.docx as an empty document. Make sure its file size is not 0: it has to be a valid Word document without content, not a zero-byte file.

The documents are ready, so we can test the following code.

*&---------------------------------------------------------------------*
*& Report ZALTCHUNK
*&---------------------------------------------------------------------*
*& Report demonstrates altChunk usage in ABAP
*& Pavol Olejar 23.4.2017
*&---------------------------------------------------------------------*
REPORT zaltchunk.

DATA: lr_merge1        TYPE REF TO cl_docx_document,
      lr_merge2        TYPE REF TO cl_docx_document,
      lr_merge3        TYPE REF TO cl_docx_document,
      lr_final         TYPE REF TO zcl_docx_document,
      lr_main          TYPE REF TO zcl_docx_maindocumentpart,
      lr_altpart1      TYPE REF TO cl_docx_alternativeformatpart,
      lr_altpart2      TYPE REF TO cl_docx_alternativeformatpart,
      lr_altpart3      TYPE REF TO cl_docx_alternativeformatpart,
      docx             TYPE xstring,
      mainx            TYPE xstring,
      lv_id            TYPE string,
      s                TYPE string,
      lv_current_chunk TYPE string,
      lv_replace       TYPE string,
      lv_length        TYPE i,
      lt_data_tab      TYPE STANDARD TABLE OF x255.

* READ final document. Note we are using z-class.
PERFORM load_file USING 'C:\final.docx'
                  CHANGING docx.
lr_final = zcl_docx_document=>load_document( iv_data = docx ).
lr_main = lr_final->get_maindocumentpart( ).
* ADD alternative parts
lr_altpart1 = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).
lr_altpart2 = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).
lr_altpart3 = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).
* Read document to be merged/inserted
PERFORM load_file USING 'C:\merge1.docx'
                  CHANGING docx.
* Provide data to store in alternative part
lr_altpart1->feed_data( iv_data = docx ).
* REPEAT for 2nd and 3rd file
PERFORM load_file USING 'C:\merge2.docx'
                  CHANGING docx.
lr_altpart2->feed_data( iv_data = docx ).

PERFORM load_file USING 'C:\merge3.docx'
                  CHANGING docx.
lr_altpart3->feed_data( iv_data = docx ).
* Get xml of main part to insert altChunk tags using string operations
mainx = lr_main->get_data( ).
CALL FUNCTION 'CRM_IC_XML_XSTRING2STRING'
  EXPORTING
    inxstring = mainx
  IMPORTING
    outstring = s.

lv_id = lr_main->get_id_for_part( lr_altpart1 ).
CONCATENATE '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.
lv_id = lr_main->get_id_for_part( lr_altpart2 ).
CONCATENATE lv_current_chunk '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.
lv_id = lr_main->get_id_for_part( lr_altpart3 ).
CONCATENATE lv_current_chunk '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.
* Prepare alt chunk tags
CONCATENATE '<w:body>'
            lv_current_chunk
            '</w:body>' INTO lv_replace.
* Replace body tag
REPLACE FIRST OCCURRENCE OF REGEX '<w:body>.*</w:body>' IN s WITH lv_replace.
CALL FUNCTION 'CRM_IC_XML_STRING2XSTRING'
  EXPORTING
    instring   = s
  IMPORTING
    outxstring = mainx.
* Provide new main part with alt chunk tags and save document
lr_main->feed_data( iv_data = mainx ).
docx = lr_final->get_package_data( ).
lv_length  = xstrlen( docx ).

CALL FUNCTION 'SCMS_XSTRING_TO_BINARY'
  EXPORTING
    buffer     = docx
  TABLES
    binary_tab = lt_data_tab.

CALL METHOD cl_gui_frontend_services=>gui_download
  EXPORTING
    bin_filesize      = lv_length
    filename          = 'C:\final_new.docx'
    filetype          = 'BIN'
    confirm_overwrite = 'X'
  CHANGING
    data_tab          = lt_data_tab.

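FORM load_file  USING path TYPE string
                CHANGING docx TYPE xstring.
* Uses the global lv_length and lt_data_tab declared at the top of the report
  CLEAR: docx, lt_data_tab.

* Upload the file from the frontend as a binary table
  CALL METHOD cl_gui_frontend_services=>gui_upload
    EXPORTING
      filename   = path
      filetype   = 'BIN'
    IMPORTING
      filelength = lv_length
    CHANGING
      data_tab   = lt_data_tab
    EXCEPTIONS
      OTHERS     = 19.
  IF sy-subrc <> 0.
    MESSAGE 'File upload failed' TYPE 'E'.
  ENDIF.

* Convert the binary table into one xstring
  CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
    EXPORTING
      input_length = lv_length
    IMPORTING
      buffer       = docx
    TABLES
      binary_tab   = lt_data_tab
    EXCEPTIONS
      OTHERS       = 2.
  IF sy-subrc <> 0.
    MESSAGE 'Conversion to xstring failed' TYPE 'E'.
  ENDIF.
ENDFORM.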
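
The report hard-codes one variable per merged document. For a variable number of files the same calls can be put in a loop. This is only a sketch and I have not run it: it is meant to replace the part of the report between loading final.docx and reading the XML of the main part. The names lt_paths, lv_path and lr_altpart are introduced here for illustration; all other variables, classes and methods are the ones used in the report above.

```abap
* Fragment for report ZALTCHUNK, not a standalone program.
* lr_main, docx, lv_id and lv_current_chunk are declared in the report;
* lt_paths, lv_path and lr_altpart are hypothetical names added here.
DATA: lt_paths   TYPE STANDARD TABLE OF string,
      lv_path    TYPE string,
      lr_altpart TYPE REF TO cl_docx_alternativeformatpart.

APPEND `C:\merge1.docx` TO lt_paths.
APPEND `C:\merge2.docx` TO lt_paths.
APPEND `C:\merge3.docx` TO lt_paths.

CLEAR lv_current_chunk.
LOOP AT lt_paths INTO lv_path.
* Create one alternative part per document and fill it with the file content
  PERFORM load_file USING lv_path
                    CHANGING docx.
  lr_altpart = lr_main->add_alternativeformatpart( iv_content_type = cl_docx_alternativeformatpart=>co_content_type_word ).
  lr_altpart->feed_data( iv_data = docx ).
* Append the altChunk tag that points to the new part
  lv_id = lr_main->get_id_for_part( lr_altpart ).
  CONCATENATE lv_current_chunk '<w:altChunk r:id="' lv_id '" />' INTO lv_current_chunk.
ENDLOOP.
```

After the loop lv_current_chunk holds one altChunk tag per file, in the order of lt_paths, and the rest of the report can stay as it is.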

The resulting final_new.docx document should look like this:

1st page

2nd page

In this example you can see that altChunk imports the content of the alternative parts one by one. The content of the 1st and 2nd documents follows directly one after the other. The section break (next page) then moves the content of the 3rd document to a new page.
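
If a merged document does not bring its own break, a page break paragraph can also be placed between the altChunk tags in the main part, because paragraphs and altChunk tags are both allowed directly inside w:body. The following is a minimal sketch under assumptions: it only builds the body string and does not touch any document. The report name and the relationship IDs rId10 to rId12 are made up; in the report above the IDs come from get_id_for_part. I have not checked how Word lays out the page-break paragraph itself, so it may show up as an extra empty line.

```abap
REPORT zaltchunk_pagebreak.

* Hypothetical relationship IDs, used here only for illustration
DATA: lt_ids   TYPE STANDARD TABLE OF string,
      lv_id    TYPE string,
      lv_chunk TYPE string,
      lv_body  TYPE string.

* Paragraph that contains nothing but a page break
CONSTANTS lc_page_break TYPE string
  VALUE `<w:p><w:r><w:br w:type="page"/></w:r></w:p>`.

APPEND `rId10` TO lt_ids.
APPEND `rId11` TO lt_ids.
APPEND `rId12` TO lt_ids.

LOOP AT lt_ids INTO lv_id.
* Separate the documents with a page break, but not before the first one
  IF sy-tabix > 1.
    CONCATENATE lv_body lc_page_break INTO lv_body.
  ENDIF.
  CONCATENATE '<w:altChunk r:id="' lv_id '" />' INTO lv_chunk.
  CONCATENATE lv_body lv_chunk INTO lv_body.
ENDLOOP.

* Wrap the tags into the body element, as done in the main report
CONCATENATE '<w:body>' lv_body '</w:body>' INTO lv_body.
WRITE / lv_body.
```

The resulting string can be used in place of lv_replace in the main report.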

There is a lot of room to play with this, so take it as a very simple example of how to use altChunk. You can play with page orientation, footers and headers, page numbering or different section breaks to achieve what you need.
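
One detail to keep in mind when experimenting with the layout: the REGEX replace in the report overwrites everything between the opening and closing w:body tags. In WordprocessingML the last child of w:body is normally a w:sectPr element, which holds page size, orientation and the header and footer references of the last section, so that element is dropped from final.docx as well. If those settings should be kept, one option is to insert the altChunk tags in front of w:sectPr instead of replacing the whole body. The sketch below is an assumption on my side and is not tested with Word: it works on a simplified stand-in for the main part XML, the relationship ID rId10 is made up, and it assumes the body contains exactly one w:sectPr, which should hold for an empty final.docx.

```abap
REPORT zaltchunk_keep_sectpr.

DATA: s        TYPE string,
      lv_chunk TYPE string,
      lv_head  TYPE string,
      lv_tail  TYPE string,
      lv_off   TYPE i.

* Simplified stand-in for the main part XML of an empty landscape final.docx
CONCATENATE '<w:body><w:p/>'
            '<w:sectPr><w:pgSz w:w="16838" w:h="11906" w:orient="landscape"/></w:sectPr>'
            '</w:body>' INTO s.

* Hypothetical relationship ID
lv_chunk = '<w:altChunk r:id="rId10" />'.

* Locate the section properties (assumes a single w:sectPr in the body)
FIND FIRST OCCURRENCE OF '<w:sectPr' IN s MATCH OFFSET lv_off.
IF sy-subrc = 0.
* Insert the chunk directly in front of the section properties
  lv_head = s(lv_off).
  lv_tail = s+lv_off.
  CONCATENATE lv_head lv_chunk lv_tail INTO s.
ELSE.
* No section properties found: put the chunk at the end of the body
  CONCATENATE lv_chunk '</w:body>' INTO lv_tail.
  REPLACE FIRST OCCURRENCE OF '</w:body>' IN s WITH lv_tail.
ENDIF.

WRITE / s.
```

With this variant the empty paragraph of final.docx stays in front of the imported content, which is the price for not touching the rest of the body.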

Also note that if you open the final document (after creating it with ABAP) in Word and save it again, the altChunk tags will be gone. All imported content will be saved under standard tags and the relations to the alternative parts will also be gone.

      14 Comments
      Marco Beier

      Hello  Mr. Olejar,

      awesome blog, very helpful, enjoyed reading it!

      However, I still have a question: is there any method using this type of code (or the cl_docx_document class) to include images or create tables in a single Word document?

      Kind regards,

      Marco

      Pavol Olejar
      Blog Post Author

      Hello Marco,

      I have not tried this. But I think you can store any image in the document as a part of it (an alternative part, for example) and then place the correct image tag within the document. For a table you can also store your own data in an alternative part and then use table tags to populate the table. But I have not tried this, so it can be challenging. With tables I have only used the technique described in this blog.

      Regards

      Pavol

       

      Joschka Rick

      Hi Pavol,

      thanks for the tutorial. I managed to merge two docx files with your solution.

      Unfortunately the merged file does not show page numbers anymore. Do you know how to fix that?

      Best regards
      Joschka

      Pavol Olejar
      Blog Post Author

      Hello Joschka,

      Try to set the numbering you need (and other formatting) in your final doc. Numbering is not taken over from the alternative part but should be taken from the file into which you insert these parts.

      Regards

      Pavol

      Joschka Rick

      Hello Pavol,

      thanks for the quick reply. Adding the page number (footer) to my final.docx didn't help. Any other suggestions?

      My merged docx contains three footer.xml's now, footer2.xml contains my original footer.

      The merged documents are both created with SAP Document Builder, maybe that's also an issue?

      Best regards
      Joschka

      Pavol Olejar
      Blog Post Author

      Hello Joschka,

      Not sure what might be the issue here. Did you manage to fix it?

      Regards

      Pavol

      Joschka Rick

      Hello Pavol,

      I didn't manage to fix it. We just left the page numbers out, which is OK for the moment. If I find the time one day, I'll look further into it :-).

      Thanks for your help.

      Best regards
      Joschka

      Joschka Rick

      Hello Pavol,

      I think I found out what was missing. In document.xml.rels my two to-be-merged word documents were using two different reference-ids to include the numbering/footers/etc. (rId13 and rId6). This way my second word file referenced into nothing and no page number was shown.

      After fixing it manually everything is working fine now.

      I did not find a solution to use the standard classes for this, so I've created my own methods.

      Best regards
      Joschka

      Guru Prasad

      Hi Pavol,

      Great job, it's really working.

      I have some issues: if final.docx has landscape page orientation, the layout of the final document is not the same after merging all documents.

      Please help with coding to get layout, page numbers, and most importantly page headers and footers.

      Thank you

      Regards

      Guru Prasad.

      Pavol Olejar
      Blog Post Author

      Hi Guru,

      I am not sure how you want to influence page numbers; they should be consistent through the whole merged document. Regarding the orientation of pages, I suggest using page breaks where the orientation changes. Headers and footers should behave like page numbers and be unified through the whole document, I guess.

      Regards

      Pavol

      Guru Prasad

      Thank you so much for your reply. I did not fully understand your inputs: how can we add page headers and footers dynamically, and how can we overcome the page orientation issues? Could you please ping me at s.guruprasad16@gmail.com? If you have sample code for layout, headers and footers, please do send it. I would really appreciate your help, as we have a roadblock in one of my projects.

      Pavol Olejar
      Blog Post Author

      Hi,

      The page orientation issue can be solved by using page breaks at the end of each document which will be merged. I do not know how to work with headers and footers dynamically.

      Regards

      Pavol

      Guru Prasad

      Hi Pavol,

      Appreciate your help if you can provide ABAP code to insert page breaks after the end of each document while merging in your code.

      Thanks

      Guru Prasad

      Pavol Olejar
      Blog Post Author

      Hello Guru,

      I put the page breaks into the documents manually using a word processor and not dynamically in code, so I cannot help you here. Try to study the page break tags in the document.

      Regards

      Pavol