Skip to Content

SAP Business Objects Data Services (DS) does not provide the functionality to concatenate multiple row entries into a single row, where the number of rows that might appear is undefined.

Description:

The Data Services have the functionality to concatenate multiple rows into a single row entry using the reverse pivot transform and then concatenating the columns provided that the number of rows that need to be concatenated is pre- defined or fixed.

If the number of rows that might appear which needs to be concatenated is not defined then the same functionality can be achieved at the database level through SQL using the XML functions. The XML functions XMLAGG and XMLELEMENT are used to concatenate the multiple entry rows into a single row. XMLELEMENT will take an element name for identifier and returns an instance of XML type. While XMLAGG takes a collection of XML fragments and returns an aggregated XML document. E.g.(xmlagg(xmlelement(e,<column_to_be_concat>,’,’).extract (‘//text()’)),'<delimiter>’)

Using the XML functions at the database level will affect the performance of the jobs. For millions of records to be processed the usage of XML function would consume lot of time and memory space which would reduce the performance of the jobs.

Solution:

A DS Custom Function can be used to concatenate the multiple rows into a single row even when the number of row sets that might appear for a set is not defined.

Advantages:

  1. 1.     The set of rows that must be concatenated need not be defined.
  2. 2.     Improved Job performance.
  3. 3.     Memory space used for cache will be low.
  4. 4.     Easy implementation of the function.

Input Data:

The source table has four columns namely unique_id, name, Text and Note_seq_num. The source information has the data set like

Unique_id

Name

Text

Note_Seq_num

1

John

Hello

1

1

John

World

1

2

Rob

How

2

2

Rob

Are

2

2

Rob

You

2

Requirement:

For each of the distinct Unique_id and name we should have only a single entry in the Target table and the Text column should be concatenated. The Note_seq_num column is used as reference column to identify the set of rows that needs to be concatenated based on the corresponding unique_id and Name.

The sample output should be like

Unique_id

Name

Text

1

John

Hello,World

2

Rob

How,Are,You

Concatenation Custom Function:

The Custom function uses the column Unique_id (the primary column based on which the row are to be concatenated) and the Note_seq_num.

Pre-Requisites:

  1. 1.     The Note_seq_num should be generated for every group of data(if it is not present in the data set), which identifies those records which need to be concatenated.
  2. 2.     For each unique_id, the note_seq_num should be sorted.
  3. 3.     The Global Variables should be created which would be used for the custom function.

$G_VAR_NOTE_SEQ_NUM and $G_VAR_TEXT – variable will hold the Note_seq_num value and concatenated Text column during the time of execution of the custom function

Create the Custom Function:

CF.jpg

After the execution of the custom function, the output would be as shown below:

Unique_id

Name

Text

1

John

Hello

1

John

Hello,World

2

Rob

How

2

Rob

How,Are

2

Rob

How,Are,You

Post Custom Function Execution:

The resultant output can be obtained by grouping the records based on unique_id and Name and by getting the Max of Text column.

Unique_id

Name

Text

1

John

Hello,World

2

Rob

How,Are,You

To report this post you need to login first.

18 Comments

You must be Logged on to comment or reply to a post.

  1. Joshua Blythe

    I want to clarify that in order for this to work, you need the input variables assigned as input/output.

     

    This feels like an exploit to me, but a much welcomed one.      good find.

    (0) 
    1. Bernatsha Panneerselvam Post author

      Hi Ian,

       

      Yes, this can be done in BODS 3.2.

      The global variable needs to be created at the job level as we do normally. The reason we opt for global variable is that, after every custom fuinction call the values might get refreshed if its a local variable, but in case of global variable this cant happen and the latest iteration value from the custom funtion will only be stored.

      Inititally the global variable will hold the value ‘null’ only.

       

      Thanks,

      Bernatsha P

      (0) 
  2. Shaheen Makandar

    When I re-created the same function, it gives me an error that the Global variable is undeclared. I created this Global Variable at the Job level. How do we resolve this issue?

    (0) 
    1. Shaheen Makandar

      Here is the error message:

       

      [Function:TEXT_CONCAT]

      The function <TEXT_CONCAT> contains an invalid expression. Additional information: <Referencing an undeclared variable <$G_VAR_NOTE_SEQ_NUM>. Declare the variable in the context that it is being used.>. (BODI-1111182)

       

      And I get this error when I validate the Custom Function.

       

      What I also did, was to use this in the Job, where I have declared those Global Variables, but no Luck. I get this message when I run the job using this function:

       

      Found erroneous expression <>. Check its syntax and fix this expression.


      Let me know if this information helps.


      Custom Function.jpg

      (0) 
  3. KRISHNA MOHAN

    Hi,

     

    I have a similar requirement, I just need small info about how u have taken note_seq_num in your source based on unique id and name. could you please help.

     

    Regards

    Krishna Mohan

    (0) 
    1. Bernatsha Panneerselvam Post author

      Hi,

       

      I had taken the note_seq number based on the combination of the unique_id and Name, as in my requirement we had the unique_id and name to be a primary key.

       

      Regards,

      Bernatsha P

      (0) 
  4. Michael Meusel

    That works.

     

    Caveats to keep in mind:

    – The global variables need to be created/defined in every job that uses the function.

    – You can only use the function once in a dataflow (so if you want to concatenate rows for two different columns, you will need to create a second function using a second set of global variables!)
    – You might be able to use the same function repeatedly in multiple dataflows in the same job, but only if you don’t run those dataflows in parallel, and then you must clear out the global variables (setting them to NULL in a script) before using the function again in a different dataflow of the same job!

    (0) 
    1. Cristina Talmaciu

      Hi Michael, how did you make it work? I get the following error: Invalid mapping expression for column. Cannot parse expression. How did you declare the global variable and how did you write the formula in the query transform?

      thank you in advance!

      (0) 
  5. sravankumar yammanuru

    Hi All,

    I am also having similar requirement like below but i dont have unique id in the source.

    Input:

    SKU COLOR

    123   red
    124   blue
    123   orange

    output:

    SKU color

    123  red,orange
    124  blue.

    Please provide some inputs.

    thanks,
    sravan
     

    (0) 
  6. chaithanya JAMMALAMADAKA

    Hello, I tried the function. I am getting the error alert as ”

    [Function:T_ZONE3_OAA_TEXT_CNCT]
    The function <T_ZONE3_OAA_TEXT_CNCT> contains an invalid expression. Additional information: <Referencing undeclared variable <$G_VAR_NOTE_SEQ_NUM>. Please declare the variable in the context it is being used in.>. (BODI-1111182)

    at the Custom function level.

    How and where to declare? Please explain with screenshtos if possible. Also some on how to implement in DataFlow would be useful.

     

    (0) 

Leave a Reply