File Processing in BODI (Multiple Format File)
Problem Synopsis: To process a Flat file containing data with multiple formats.
Relevant Business Example: A Batch process is run to stream data from a database. Streaming is where multiple occurrences of the same program run in parallel with each instance accessing a sub-set of the database. In this way the overall time taken to extract data can be substantially reduced, depending on the number of streams used. The output file will contain data from multiple tables, each table format being identified by the first column: RECTYPE
Sample File: The below diagram shows a Flat File with 2 types of record formats. The format with RECTYPE= REC1 has 5 columns, whereas the column with RECTYPE=REC2 has 7 columns.
Problem with normal processing in this scenario: A Run time error will be thrown by BODS if such a file is processed normally because the row delimiter in such a file will be placed after 5 columns for RECTYPE= REC1 and after 7 columns for RECTYPE=REC2.
File Format Options to process a file with multiple Formats: Create a file format with the maximum number of columns in all the record formats present in the input file. In this example this number is 7. In the File Format, select “Yes” for Adaptable Schema option. This indicates that the schema is adaptable. If a row contains lesser number of columns, as is the case for RECTYPE=REC1, the software loads Null values into the columns missing data.
After using the aabove settings in the file format, data can be extracted from the input file without any errors. The below screenshot shows the extracted data.
The data for the 2 record formats can be easily separated using a Case transform. Two different cases are added to the case transform . Case_RECTYPE_1 is created for RECTYPE=’REC1’ and Case_RECTYPE_2 is created for RECTYPE=’REC2’. The data for the 2 different record formats are hence successfully separated into two tables with correct record structure.