SAP Data Services supports a number of file formats for data extraction and processing. XML is one of the key input formats for Data Services.
This document explains in detail how to import and process an XML file in Data Services. It also covers a key transform, XML_Pipeline, which is used to process the data.
We have an XML file that we need to process and import into Data Services. I have named the file Test_Demo.xml.
To see the contents of the file, right-click it and choose Open with Notepad.
The XML file looks like this.
Double-clicking the XML file opens it in HTML format, where it looks like this.
To start processing the XML file, we first have to create an XML schema (XSD) for it. This is because Data Services does not take the XML file directly as input. It reads the schema, which is then pointed at the XML file for processing (explained in detail below).
The next step is to convert the XML document into an XML schema in XSD format. There are many online tools available that generate an XSD schema from an XML document.
I have used the one available at https://www.liquid-technologies.com/online-xml-to-xsd-converter
Copy your XML from Notepad and paste it into the input area of the converter.
Once it is pasted, click Generate Schema. If the document is valid, the converter displays a confirmation message.
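If the converter rejects the document, it can help to check locally whether the XML is well formed before pasting it again. The following is only a minimal sketch using Python's standard library. The element names in the sample (Report, Header, Transactions) are hypothetical stand-ins, not the actual contents of Test_Demo.xml, and the check covers well-formedness only, which may be less strict than what the online converter verifies.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, root tag) if the text parses as XML, else (False, error message)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, root.tag


# Hypothetical sample; replace it with the text of your own XML file.
sample = """<Report>
  <Header><ReportName>Demo</ReportName></Header>
  <Transactions>
    <Transaction><Id>1</Id><Amount>10.50</Amount></Transaction>
  </Transactions>
</Report>"""

ok, detail = check_well_formed(sample)
print(ok, detail)

# A document with a mismatched closing tag is reported as not well formed.
bad_ok, bad_detail = check_well_formed("<Report><Header></Report>")
print(bad_ok, bad_detail)
```

When the check fails, the error message includes the line and column of the first problem, which is usually enough to find a missing or mismatched tag.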
The converter creates two schemas for the XML provided. This is because the XML was generated for a report that contains a header and transactions.
If you look closely at Schema 0, you will see that it references Schema 1 from within.
Copy both Schema 0 and Schema 1 into Notepad and save them as schema0.xsd and schema1.xsd. The two XSD files should look like this.
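To confirm which other schema files an XSD depends on, you can list its import and include elements. This is a rough sketch under stated assumptions: the inline schema text, the urn:example namespaces and the Report element are all hypothetical, and your generated schema0.xsd may reference schema1.xsd in a different way.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def referenced_schemas(xsd_text):
    """Return (kind, schemaLocation, namespace) for each xs:import / xs:include."""
    root = ET.fromstring(xsd_text)
    refs = []
    for kind in ("import", "include"):
        # Only direct children of xs:schema can be import/include elements.
        for node in root.findall(f"{{{XS}}}{kind}"):
            refs.append((kind, node.get("schemaLocation"), node.get("namespace")))
    return refs


# Hypothetical stand-in for schema0.xsd.
schema0 = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:report">
  <xs:import schemaLocation="schema1.xsd" namespace="urn:example:transactions"/>
  <xs:element name="Report"/>
</xs:schema>"""

for kind, location, namespace in referenced_schemas(schema0):
    print(kind, location, namespace)
```

If the schemaLocation is a plain file name such as schema1.xsd, keeping both XSD files in the same folder is usually the simplest way to make sure the reference can be resolved.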
The schema is what we use as input in Data Services. Before you log in to Data Services to start processing, make sure the XML and XSD files are placed in a location that Data Services can access. I have placed them on an FTP server that Data Services can access.
After you log in to your local repository in Data Services, click File Formats as shown below.
Right-click Nested Schemas and choose New, then XML Schemas.
The dialog for importing an XML schema looks like this.
Once you select schema0.xsd as the Filename/URL, the dialog automatically fills in the Namespace and Root Element name. These details are important for processing the XML schema. I have named the format TEST_DEMO.
The file format is now created and available for processing.
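If you want to cross-check the Namespace and Root Element name that the dialog picks up, both can be read from the XSD itself: the namespace is the targetNamespace attribute of the schema, and root element candidates are the elements declared at the top level. The sketch below is an assumption-laden illustration, not how Data Services reads the file; the schema text, the urn:example:report namespace and the Report element are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def namespace_and_roots(xsd_text):
    """Return the targetNamespace and the names of the top-level elements of an XSD."""
    schema = ET.fromstring(xsd_text)
    namespace = schema.get("targetNamespace", "")
    # Top-level xs:element declarations are the possible document roots.
    roots = [el.get("name") for el in schema.findall(f"{{{XS}}}element")]
    return namespace, roots


# Hypothetical stand-in for schema0.xsd.
schema0 = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:report">
  <xs:element name="Report">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Header" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

namespace, roots = namespace_and_roots(schema0)
print(namespace, roots)
```

Nested declarations such as Header are not listed, because only top-level elements can act as the root of a document.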
Create a new job to read and process the schema. For this demo I have created a job, JB_READ_XML_DATA, with one data flow, XML_DEMO.
Drag and drop the TEST_DEMO nested schema into the data flow and select Make File Source.
Once the XSD schema is showing in the project area, double-click it.
Double-clicking shows the details of the schema.
This is an important step, because you must now select the XML file that this schema was generated from. Click File and select Test_Demo.xml.
Once the file is selected, click Back to return to the work area.
To process the data and extract the required fields, we will use the Data Integrator transform XML_Pipeline.
- This transform is used to process large XML files one instance of a repeatable structure at a time.
- With this transform, Data Services does not need to read the entire XML input into memory then build an internal data structure before performing the transformation.
- An NRDM structure is not required to represent the entire XML data input. Instead, the XML_Pipeline transform uses a portion of memory to process each instance of a repeatable structure, then continually releases and reuses memory to steadily flow XML data through the transform.
- During execution, Data Services pushes operations of the XML_Pipeline transform to the XML source.
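The idea behind the points above, handling one instance of a repeatable structure at a time and releasing memory as you go, can be illustrated outside Data Services with Python's streaming XML parser. This is only an analogy and says nothing about how XML_Pipeline is implemented internally. The element names (Report, Transaction, Id, Amount) and the function name are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def stream_transactions(source, repeat_tag="Transaction"):
    """Yield one dict per repeating element without keeping its content in memory."""
    # iterparse reads the input incrementally and reports each element
    # as soon as its closing tag has been seen.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == repeat_tag:
            yield {child.tag: child.text for child in elem}
            # Drop this instance's children and text so the memory can be reused.
            elem.clear()


# Hypothetical sample data; a real source would be a large file on disk.
sample = b"""<Report>
  <Header><ReportName>Demo</ReportName></Header>
  <Transactions>
    <Transaction><Id>1</Id><Amount>10.50</Amount></Transaction>
    <Transaction><Id>2</Id><Amount>7.25</Amount></Transaction>
  </Transactions>
</Report>"""

rows = list(stream_transactions(io.BytesIO(sample)))
for row in rows:
    print(row)
```

Here the sample is small enough to collect into a list, but with a large file you would consume the generator row by row so that only the current instance is held at any time.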
Drag and drop XML_Pipeline into the work area and connect the schema to it.
Double-click XML_Pipeline, and the transform area looks as shown below.
The next step is to drag and drop the fields into the output schema on the right-hand side. Make sure you select only the fields and not the structures. Once all the required fields are mapped, the transform looks as shown below.
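Mapping only fields, not structures, means the output is a flat set of columns. As a rough illustration of what that flattening amounts to, the sketch below turns a nested document into flat rows, repeating the header values on every transaction row. It is an assumption about the general shape of the result, not a reproduction of the XML_Pipeline output; the element names and the Header_ column prefix are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Return one flat dict per Transaction, with the Header fields repeated on each."""
    root = ET.fromstring(xml_text)
    # Scalar fields of the parent structure, prefixed to avoid name clashes.
    header = {f"Header_{child.tag}": child.text for child in root.find("Header")}
    rows = []
    for txn in root.iter("Transaction"):
        row = dict(header)  # copy, so each row is independent
        row.update({child.tag: child.text for child in txn})
        rows.append(row)
    return rows


# Hypothetical sample data.
sample = """<Report>
  <Header><ReportName>Demo</ReportName></Header>
  <Transactions>
    <Transaction><Id>1</Id><Amount>10.50</Amount></Transaction>
    <Transaction><Id>2</Id><Amount>7.25</Amount></Transaction>
  </Transactions>
</Report>"""

flat_rows = flatten(sample)
for row in flat_rows:
    print(row)
```

Each resulting row contains only scalar values, which is the form a relational target table needs.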
The next step is to map the output to a target table that will hold the XML data being extracted.
Once you save the job, the job and project area look as shown below.
Execute the job and watch the processing steps.
Once the job has finished, click the table preview to confirm that the data from the XML file has been imported successfully.
With this post you will be able to read XML-based source data files into SAP Data Services for processing. XML can be very complex, and this document will help you generate the correct schema, which can then be consumed in Data Services. One schema can be used for multiple files, as long as they are all based on that same schema.