TUTORIAL: How to duplicate a batch job in SAP Data Services Designer
This tutorial walks you through duplicating a batch job and its components, and explains how the duplication mechanism works.
Before duplicating anything, there is one thing to know. In Data Services Designer (DSD), there are two kinds of objects as far as duplication is concerned: reusable objects and non-reusable (single-use) objects.
- Non-reusable objects are duplicated when you copy a job and its components. This category includes scripts, conditionals, loops, global variables…
- Reusable objects are not duplicated. This category includes workflows and dataflows. Reusable objects are referenced, not copied.
For more information, see page 54 of the SAP Data Services Designer documentation.
The duplication logic illustrated
To illustrate how DSD treats these two kinds of objects, here is an example:
In my repository, I made a copy of my batch job “Job_Batch_A” and named it “Job_Batch_B”.
Here is what happens:
Diagram of duplication of a Job Batch
Job_Batch_B is created as a real copy, but the reusable objects it inherits are not copied: what gets created are references to the original objects, not duplicated objects.
To be more specific, if I open Job_Batch_B and change Workflow_A or one of its dataflows, those changes also appear in Job_Batch_A.
Only the reusable objects inherited by the duplicated object are references to the originals! The object we copied (here, a batch job) has genuinely been duplicated.
So be careful when you copy your jobs, workflows and dataflows.
To further explain:
Job_Batch_A ≠ Job_Batch_B
Job_Batch_A reusable objects = Job_Batch_B reusable objects.
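The two relations above can be modeled in a few lines of plain Python. This is only a sketch of the copy/reference behavior described in this tutorial, not the Data Services API: the classes `BatchJob`, `Workflow` and `Script`, the function `replicate`, and the names `Init_Script` and `DataFlow_1` are all hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical stand-in for a reusable object (shared by reference)."""
    name: str
    dataflows: list = field(default_factory=list)


@dataclass
class Script:
    """Hypothetical stand-in for a single-use object (copied with its job)."""
    name: str


@dataclass
class BatchJob:
    name: str
    workflows: list = field(default_factory=list)
    scripts: list = field(default_factory=list)


def replicate(job, new_name):
    """Model of duplicating a job: the job is new, its workflows are shared."""
    return BatchJob(
        name=new_name,
        workflows=list(job.workflows),       # new list, same Workflow objects
        scripts=copy.deepcopy(job.scripts),  # independent copies
    )


workflow_a = Workflow("Workflow_A", dataflows=["DataFlow_1"])
job_a = BatchJob("Job_Batch_A", [workflow_a], [Script("Init_Script")])
job_b = replicate(job_a, "Job_Batch_B")

print(job_b is job_a)                            # False: the job was duplicated
print(job_b.workflows[0] is job_a.workflows[0])  # True: same Workflow_A object
print(job_b.scripts[0] is job_a.scripts[0])      # False: the script was copied
```

The identity checks (`is`) are the point here: the two jobs are distinct objects, while both hold the very same `Workflow_A`.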
The logic may seem strange and complex, but it is simpler than it seems. To better illustrate this:
– If I remove Workflow_A from my Job_Batch_B, it is not removed from Job_Batch_A (and vice versa).
– But if I remove a dataflow from Workflow_A while working in Job_Batch_B, it is also removed from Job_Batch_A, because both jobs point to the same object, Workflow_A (and vice versa).
The same logic applies when adding components:
– If I add a workflow to my Job_Batch_B, it is not added to my Job_Batch_A (and vice versa).
– If I add a dataflow to Workflow_A while working in Job_Batch_B, it is also added in Job_Batch_A (and vice versa).
I used a batch job with an inherited workflow and its dataflows as the example, but the same copy/reference logic applies one level down, between a workflow and the dataflows it contains.
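The four add/remove cases above can be reproduced with ordinary Python lists. This is a rough analogy rather than anything DSD exposes: each job is modeled as a list of workflow references, each workflow as a list of dataflow names, and the names `DataFlow_1`, `DataFlow_2`, `DataFlow_3`, `DataFlow_9` and `workflow_c` are invented for the illustration.

```python
# Workflow_A is the reusable object: one shared list of dataflow names.
workflow_a = ["DataFlow_1", "DataFlow_2"]

# Each job owns its list of children, but both lists reference Workflow_A.
job_batch_a = [workflow_a]
job_batch_b = list(job_batch_a)  # duplicate the job: new list, same references

# Remove and add dataflows inside Workflow_A, reached through Job_Batch_B.
job_batch_b[0].remove("DataFlow_2")
job_batch_b[0].append("DataFlow_3")
print(job_batch_a[0])  # ['DataFlow_1', 'DataFlow_3'] -> Job_Batch_A sees both changes

# Add a workflow to Job_Batch_B only.
workflow_c = ["DataFlow_9"]
job_batch_b.append(workflow_c)
print(len(job_batch_a), len(job_batch_b))  # 1 2 -> Job_Batch_A is untouched

# Remove Workflow_A from Job_Batch_B only.
job_batch_b.remove(workflow_a)
print(job_batch_a[0] is workflow_a)  # True -> still present in Job_Batch_A
```

Changes made inside the shared workflow show up in both jobs, while changes to a job's own list of children stay local to that job.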
Hierarchy of reusable objects
The key point to remember:
Duplicating a reusable object creates a new object, but the reusable objects it inherits are not duplicated: they remain references to the original objects!
This inheritance of reusable objects stops at dataflows, because the objects contained in a dataflow are non-reusable (they cannot be reused and are therefore copied). So if I copy a dataflow, the changes I make in the duplicated dataflow are not reflected in the original, because I am at the lowest level of reusable objects. That answers the question “If I make a change in a copied dataflow, will it affect other objects?”: no, it will not. Note, however, that not all objects contained in batch jobs and workflows are reusable (for example, scripts).
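In Python terms, copying a dataflow behaves like a deep copy, since everything inside it is copied along with it. The snippet below is only an analogy under that assumption: the dictionary layout and the names `DataFlow_A` and `DataFlow_B` are hypothetical, and the transform names are used purely as labels.

```python
import copy

# A dataflow modeled as a plain dictionary holding its (non-reusable) contents.
dataflow_a = {"name": "DataFlow_A", "transforms": ["Query", "Map_Operation"]}

# Copying the dataflow copies its contents too.
dataflow_b = copy.deepcopy(dataflow_a)
dataflow_b["name"] = "DataFlow_B"

# A change in the copy does not reach the original.
dataflow_b["transforms"].append("Case")

print(dataflow_a["transforms"])  # ['Query', 'Map_Operation']
print(dataflow_b["transforms"])  # ['Query', 'Map_Operation', 'Case']
```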
That is how duplication works in DSD. The logic is surprising at first glance because it does not follow the traditional “copy/paste” behavior we are used to. With hindsight, though, it brings real benefits, including the reuse of elements and the ability to propagate a change everywhere at once.
Now that the logic is clear, we can answer the following question:
“How do I duplicate a job and make it independent?”
Let's return to the job illustrated above to explain the procedure.
I want to duplicate Job_Batch_A into an independent Job_Batch_B, so that I can do whatever I want in Job_Batch_B without impacting Job_Batch_A, and vice versa.
Step 1 :
Start by duplicating Job_Batch_A: select the job in the local object library, right-click it, and choose “Replicate”. Name the new job “Job_Batch_B”.
Then add the duplicated batch job to the project area of your choice.
Step 2 :
Remove all reusable objects (workflows) from Job_Batch_B so that it is completely independent. In this case, just remove Workflow_A. Non-reusable objects have been copied, so there is no need to remove them.
Note: if some reusable objects should stay linked to the original and you do not mind the dependency, you can keep them, but beware of the consequences!
Step 3 :
Once Job_Batch_B has been created, create a new workflow in it, which we will name Workflow_B:
With Workflow_B in place, we are at the level where our copies of the dataflows can be placed.
Step 4 :
Duplicate the dataflows contained in Workflow_A. As before, do this from the local object library.
Step 5 :
Place the duplicated dataflows in the new workflow: click and drag them from the local object library into the workflow.
We now have two identical but independent batch jobs.
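The whole procedure can be summarized as a small Python model that follows the steps one by one. It is a sketch of the logic only, not something you can run against a DSD repository: the classes, the two helper functions, and the dataflow names (`DataFlow_1`, `DataFlow_2` and their `_B` copies) are hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Dataflow:
    name: str
    transforms: list = field(default_factory=list)


@dataclass
class Workflow:
    name: str
    dataflows: list = field(default_factory=list)


@dataclass
class BatchJob:
    name: str
    workflows: list = field(default_factory=list)


def replicate_job(job, new_name):
    """New job object whose workflows are still references to the originals."""
    return BatchJob(new_name, list(job.workflows))


def replicate_dataflow(dataflow, new_name):
    """New dataflow whose contents are copied, not shared."""
    return Dataflow(new_name, copy.deepcopy(dataflow.transforms))


workflow_a = Workflow(
    "Workflow_A",
    [Dataflow("DataFlow_1", ["Query"]), Dataflow("DataFlow_2", ["Query"])],
)
job_a = BatchJob("Job_Batch_A", [workflow_a])

# Step 1: duplicate the job.
job_b = replicate_job(job_a, "Job_Batch_B")

# Step 2: remove the shared workflow references from the copy.
job_b.workflows.clear()

# Step 3: create a new workflow in the copy.
workflow_b = Workflow("Workflow_B")
job_b.workflows.append(workflow_b)

# Step 4: duplicate the dataflows of Workflow_A.
copies = [replicate_dataflow(df, df.name + "_B") for df in workflow_a.dataflows]

# Step 5: place the duplicated dataflows in the new workflow.
workflow_b.dataflows.extend(copies)

# Check: a change in Job_Batch_B no longer reaches Job_Batch_A.
workflow_b.dataflows[0].transforms.append("Case")
print(workflow_a.dataflows[0].transforms)  # ['Query']
print([w.name for w in job_a.workflows])   # ['Workflow_A']
print([w.name for w in job_b.workflows])   # ['Workflow_B']
```

After step 5, the two jobs share no object at any level, which is what makes Job_Batch_B independent.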