Introduction:

This document gives an overview of the standard recovery mechanism in Data Services.

Overview: Data Services provides a built-in feature for recovering a job from a failed state. When recovery is enabled, a rerun of the job starts from the point of failure instead of from the beginning.

DS provides two types of recovery:

Recovery: By default, recovery works at the Dataflow level, i.e. a recovery run always starts from the Dataflow that raised the exception.

Recovery Unit: If you want recovery to apply to a set of actions as a whole, you can achieve this with the recovery unit option. Define all your actions in a Workflow and enable Recovery Unit under the workflow properties. In recovery mode, this workflow then runs from its beginning instead of from the failed point.
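To make the difference between the two types easier to see, here is a small Python sketch. It is only a toy model of the behaviour described above, not Data Services code: the Step class, the run and simulate functions, and the names WF and DF1 to DF3 are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """A data flow (no children) or a work flow that groups child steps."""
    name: str
    children: list = field(default_factory=list)
    recovery_unit: bool = False  # only meaningful for work flows


def run(step, done, fail_at, log):
    """Run a step, skipping anything already recorded in `done`.

    `done` holds the names of steps completed in an earlier run.
    `fail_at` names the data flow that raises (None means no failure).
    Every data flow that actually executes is appended to `log`.
    Returns True if the step completed.
    """
    if step.name in done:
        return True  # completed in a previous run, so skip it
    if not step.children:  # a data flow
        log.append(step.name)
        if step.name == fail_at:
            return False
        done.add(step.name)
        return True
    # A work flow. Inside a recovery unit, child completions are kept in a
    # scratch set and only recorded once the whole unit has completed.
    scratch = set(done) if step.recovery_unit else done
    for child in step.children:
        if not run(child, scratch, fail_at, log):
            return False
    done.update(scratch)
    done.add(step.name)
    return True


def simulate(recovery_unit):
    """Fail at DF3 in the first run, then rerun in recovery mode."""
    wf = Step("WF", [Step("DF1"), Step("DF2"), Step("DF3")], recovery_unit)
    done = set()
    first, second = [], []
    run(wf, done, "DF3", first)   # first run: DF3 raises an exception
    run(wf, done, None, second)   # recovery run: the cause has been fixed
    return first, second


print(simulate(False))  # (['DF1', 'DF2', 'DF3'], ['DF3'])
print(simulate(True))   # (['DF1', 'DF2', 'DF3'], ['DF1', 'DF2', 'DF3'])
```

With the flag off, the recovery run executes only the failed data flow. With the flag on, the whole work flow is treated as one step, so every data flow inside it runs again.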


When recovery is enabled, the software stores results from the following types of steps:

  • Work flows
  • Batch data flows
  • Script statements
  • Custom functions (stateless type only)
  • SQL function
  • exec function
  • get_env function
  • rand function
  • sysdate function
  • systime function
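Functions such as rand and sysdate return a different value on every call, which is presumably why their results are stored: a recovery run can then reuse the value from the original run for steps that already completed. The Python sketch below illustrates that idea only. RecoveryStore, its call method and the step identifiers are hypothetical names, and the real storage format used by Data Services is not shown here.

```python
import random


class RecoveryStore:
    """Toy store for results of completed steps, keyed by a step identifier."""

    def __init__(self):
        self._results = {}

    def call(self, step_id, func, recovering):
        """Return the stored result in recovery mode, else call and store."""
        if recovering and step_id in self._results:
            return self._results[step_id]
        value = func()
        self._results[step_id] = value
        return value


store = RecoveryStore()
# Original run: rand is evaluated and its result is stored.
first = store.call("script_1:rand", random.random, recovering=False)
# Recovery run: the stored result is returned, rand is not called again.
replay = store.call("script_1:rand", random.random, recovering=True)
print(first == replay)  # True
```

Without the stored value, a step that runs after the failure point could see a different random number or timestamp than the steps that completed before it.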


Example:

This job loads data from a flat file into a temporary table. (I load the same data repeatedly in order to raise a primary key exception.)

Running the job:

To recover a job from the point of failure, the job must first be executed with recovery enabled. We can enable it under the execution properties.

Job3.png

The trace log below shows that recovery is enabled for this job.

Job4.png

The job failed at the 3rd DF in the 1st WF. Now I run the job in recovery mode.

Job5.png

The trace log shows that the job is running in recovery mode, using the recovery information from the previous run, and that it starts from Data Flow 3, where the exception was raised.

Job6.png

DS provides default recovery at the Dataflow level.

Recovery Unit:

With default recovery, a recovery run always starts at the failed DF, irrespective of any dependent actions.

Example: Workflow WF_RECOVERY_UNIT has two Dataflows that load data from a flat file. If either DF fails, both DFs have to run again.

To meet this kind of requirement, we can define all the activities in one workflow and make that workflow a recovery unit. When we run the job in recovery mode and any of the activities has failed, the workflow starts from the beginning.

To make a workflow a recovery unit, check the Recovery Unit option under the workflow properties.

Job2.png

Once this option is selected, the black “x” and green arrow symbol on the workspace diagram indicate that a work flow is a recovery unit.

Job1.png

Two Data Flows under WF_RECOVERY_UNIT

Running the job with recovery enabled, an exception is encountered at DF5.

Now I run the job in recovery mode. The job uses the recovery information from the previous run. As per my requirement, the job should run all the activities defined under Work Flow WF_RECOVERY_UNIT instead of only the failed Dataflow.

Job5.png

The job now starts from the beginning of WF_RECOVERY_UNIT, and all the activities defined inside the workflow run from the beginning instead of starting from the failed DF (DF_RECOVERY_5).

Exceptions:

When you specify that a work flow or a data flow should only execute once, a job will never re-execute that work flow or data flow after it completes successfully, except if that work flow or data flow is contained within a recovery unit work flow that re-executes and has not completed successfully elsewhere outside the recovery unit.

It is recommended that you not mark a work flow or data flow as Execute only once when the work flow or a parent work flow is a recovery unit.

1 Comment

  1. amjad ali

    Thanks Samatha, that was the clear information about the recovery mechanism, and could you let me know further about the table partitions and their implementation in data services.

    Thanks in advance.

    Regards,

    Amjad.
