API Data extraction patterns from Successfactors Employee Central for delta interfaces
Context
In this blog, I will try to explain the various data extraction mechanism that can be used to retrieve incremental/delta data from SuccessFactors Employee Central using the compound employee API. Based on the requirement of the downstream system, the EC SOAP API “CompoundEmployee” can be configured( with various parameters) to get the desired data.
Business Scenarios
Incremental or delta interfaces are the recommended approach ( compared to Full load interfaces) where only the “changed” data is sent down to interfaces. For organizations where there is a large set of employees, sending full data every time can have an adverse effect on the performance of the middleware and also for the target system.That’s why it is always best practice to send only the “changed” data to target systems.
I will try to explain with examples on how to extract the incremental and delta data.
It does NOT matter which middleware ( e.g SAP Cloud Platform Integration, Dell Boomi, SAP PO, Mulesoft etc.) one uses, the query and the underlying logic would remain same.
Delta w.r.t Employee Central Data model can categorised as 3 levels
- Empoyee Level
- Record Level
- Field Level
1) Employee Level Delta
This pattern can be used to extract employees whose data (one or more field – does not matter which portlet or field ) changed post last successful interface execution.
Business Scenario
Let us consider SF has 100 employees and only 10 employees have changed since the last run. This query will fetch only those 10 employees. In this pattern irrespective of any change ( in any record) all the downstream applicable fields will be sent
Pseudo Query
select person,
personal_information,
address_information,
email_information,
phone_information,
employment_information,
job_information,
compensation_information,
paycompensation_recurring,
paycompensation_non_recurring,
direct_deposit,
national_id_card,
payment_information
from CompoundEmployee where
last_modified_date>= lastSuccessfulRunDate
Here lastSuccessfulRunDate is variable(set in middleware) which stores the last interface run date.
2) Record Level Delta
This Pattern can be used to extract employees whose data (one or more field) changed post last successful interface execution and only changed records are needed.
Business Scenario
SF has 100 employees and only 10 employees have changed since the last run. This query will fetch only those 10 employees.Each employee has multiple record types( or Portlets) [ example – address, personal info, job, comp etc..). For each employee, the interface will map only such records which have changed post last run. Compare the value last_modified_field ( which exists in each portlet) with the lastSuccessfulRunDate in middleware.
Pseudo Query
Same as employee level delta query.
3) Field level Delta
This Pattern can be used to extract employees whose data (one or more field) changed post last successful interface execution and only changed fields(of any record) are needed.
Business Scenario
Considering the previous example(i.e the scenario described in Record Level Delta), if we go one level down i.e. individual fields for each record ( which have last modified date > = watermark) where the downstream system is interested to receive only fields which are changed then this extraction pattern should be used. Example – an employee’s personal information has been changed where the only field is changed ( marital status value changed from “Single” to “Married”).
Pseudo Query
select person,
personal_information,
address_information,
email_information,
phone_information,
employment_information,
job_information,
compensation_information,
paycompensation_recurring,
paycompensation_non_recurring,
direct_deposit,
national_id_card,
payment_information
from CompoundEmployee where
last_modified_on > lastSuccessfulRunDate
P.S- Parameter fromDate and toDate can also be used along with last_modified_on if period based delta is required.
In the middleware,
1) Compare the value last_modified_field ( which exists in each portlet) with the lastSuccessfulRunDate.
2) As a second step, in the mapping, map only those fields where there is a “previous” tag.
If you want to know on “Current and future dated data extraction using CompoundEmployee API” check here
Just be very wary how you get and store that "lastSuccessfulRunDate". You really need it to come from SAPSF, not your middleware. Think on what happens if the datetime in your middleware is a minute ahead of the datetime of the SAPSF server. Consider thinking about potentially re-replicating some deltas just to be sure that you capture all changes.
Delta replication is a damn tricky business. Beware!
Hi Chris,
Thanks for your comments.You are right it's tricky :).
Few points:
I've not seen any problem till now, in any of my several SF implementations, however happy to discuss if you think otherwise.
Regards
Biplab Das
Biplab,
Great blog here.
But, for Field Level Delta, I feel just looking for <previous> tags does not accurately present all changes to an integration.
In fact, anytime a new effective dated entry is added, the action is “INSERT”, and the “previous” tags do not exist.
Take for example, Job and Compensation objects. Most changes for Job and Compensation are made as new effective dated entries, and show as <action>INSERT</action> and not <action>CHANGE</action>
To ultimately find a “true” field level delta, you must also then do a comparison between “INSERT”-ed records to “last known state” prior to the integration’s lastSuccessfulRunDate.
This can be done by running separate queries, either back to CE, or to OData, and pulling the information that was effective prior and doing a comparison to obtain what has changed.
This is only true for effective dated objects where a new, INSERTED, record is added. Non-effective dated objects like Employment, or Person, are un-affected by this.
Simply relying on the <previous> tags will not be solely effective
Hi Brian,
Thanks for your comments. Previous tag works in combination with "CHANGE" action only.
if any new record is inserted for any effective dated portlet and if the action is "INSERT" - that needs to be accounted anyway. But for existing records if there are changes in the fields - those are captured under "previous" tag ( and action is "CHANGE").
Hi Biplab,
I had one scenario where i need to get only the person ids changed in delta model for a given offset.
I have a java service which would run every day .In the run of service want to detect changes happening
So the offset is relative based on today's date.
I only want those changes which are fitting in the offset and not the ones out of the range
I tried using the periodDelta query as follows:
WHERE ( last_modified_on > to_datetime(Initial Run of service)
AND fromDate = to_date(last run of service ,'YYYY-MM-DD')
AND toDate = to_date(last run of service + 10 ,'YYYY-MM-DD') )
Initial Run of service ==>First time when the service code ran. This is static and cannot be changed once set.
last run of service ==> Everyday run time after the service finishes delta.
last run of service + 10 ==> The offset.
This query works fine for all future dated changes/events which would happen in future.
If i do some changes which needs to be effective today itself ,it comes for today's service run ,but we have noticed it also comes for all the other days going ahead which should not happen .
Also in case of (today -10) its not working.
Am i doing something wrong here.
Hi Manoj,
Did you try:
changes for today
WHERE ( last_modified_on > today -1 )
AND fromDate = to_date(today ,’YYYY-MM-DD’)
AND toDate = to_date(today ,’YYYY-MM-DD’) )
10 days offset in past from today (today -10 )
WHERE ( last_modified_on > to_datetime(lastRun of service)
AND fromDate = to_date(today-10 ,’YYYY-MM-DD’)
AND toDate = to_date(today ,’YYYY-MM-DD’) )
Hi Biplab,
Thanks for the wonderful blog. I am facing an issue while executing the below query. My requirement is to get the delta along with previous tags. I should not get future effective records today but i need to get those when the date falls on that day. Here i am using fromdate and todate. i am able to pick the future dated when the current date falls on that day. The issue here is, i am getting duplicates as my interface schedule is for every 15 mins. How can i stop getting duplicates everytime? Can you please suggest me.
Hi Biplab,
Very useful and relevant blog!
Have a question. Am trying to pull SFSF data using the OData interface into DWC. How do I do Delta load/CDC via OData?
Thanks
-ravi
I need to get all employees data. last_modified_date only gives records for those who were updated maximum upto last 3 months. How can I get all employee details. Need urgent help