Data Decommissioning and Data Aging for SAP S/4HANA
The topics discussed here may touch on legal or regulatory matters. Neither SAP nor I am allowed to provide legal advice. The information provided here considers only technical aspects; it is not legal advice and may not be used as such.
Examples are for illustration only and are meant to highlight the effort that might be necessary.
In recent years I have been involved in many customer projects related to GDPR, system decommissioning, and landscape optimization. Many issues and topics were discussed. Some are very similar across all customers, some are relevant only to regulated industries such as finance, telecommunications, and pharmaceuticals, and some are very individual.
The intention of this blog post is to document the most common topics and the solutions SAP offers to address them. A solution, however, is more than installing a piece of software. There is still project work to do, and perhaps some painful steps: cleaning up and modifying business processes and questioning cherished habits.
Let's look back over the past years:
A couple of years ago, long before the EU GDPR came into effect in 2018, I was already confronted with typical questions on how to handle “old” data in transactional systems.
Personal data protection was not a topic, unless certain data had to be kept from leaking to 3rd parties to avoid scandals and loss of reputation.
Still, data had to be stored to satisfy fiscal law and, in some cases, to document the final destination of controlled substances, medical devices, aircraft spare parts, or defence-related technology.
Data became the new gold. But what to do with it? A customer told me: we get so many sales slips a day. They need to be stored, but we cannot afford to keep them in the transactional database, let alone run analytics or data mining on sales slip data.
A typical answer at that time was to put it in a data warehouse, or simply to archive it early, because low-cost consumer products or food are mostly consumed or returned within a short time. Only expensive household goods may cause issues at a later stage, and then there is usually a customer registration, a maintenance contract, or another continuous relationship. Such data is also less frequent than sales slips from a restaurant or a motorway fuel station. However: what is the purpose of data that is used only in very few cases, or in most cases never?
But even these sales slips are tax relevant and hence must be kept, in particular if they relate to business travel that is reimbursed and may later come under suspicion of not having been a genuine business trip.
With increasing online business come more regulations and more possibilities in analytics. Operations in different countries, services on B2B platforms, cloud, microservices: ever more data is involved.
With multiple applications in place and an SAP S/4HANA migration coming up soon, maybe on premise, or with your company about to become an all-cloud company, more challenges arise:
- What do we do with our legacy SAP systems?
- What do we do with our legacy systems with custom or 3rd party code?
- What about systems whose data content falls under different governmental jurisdictions?
- What about data and records that are incomplete because of payment outstanding or legal issues?
- What about analytics and access to this data for revision, audit, risk analysis, and perhaps other uses?
There are many other topics that I could mention here.
What we will discuss in this blog post includes:
- Connecting active and ceased systems
- Archive Link and WEBDAV 3.x
- Rules / Stores
- Effects / Side Effects
- Access / Reporting / Analytics
This post may not cover “exotic” topics, but it should give you some answers and help you develop a data aging strategy.
Typical Problems with Aged Data
Before we get technical, I will highlight some “typical” issues that have an effect and need to be considered in a data aging concept. There are several questions that I usually ask.
GDPR or not GDPR, that is the question
This is the famous quote from Shakespeare's Hamlet, slightly modified: does this record contain personal data?
Which attributes of a business object (or of the corresponding data object) count as personal data has to be decided by the data protection office and the legal department of your company.
If the data object or any of its attributes has to be considered personal data, GDPR applies.
GDPR-critical cases include:
- B2B master data that contains personal data of contact persons, including email and office information
- Any B2C personal data, and minors as business partners
- Patient data, e.g. in medical approval studies
- Employee data, which is particularly difficult if work accidents occurred or if inheritable benefits, share programmes, or executive roles are involved
- Apprentices, students, externals / temporary employees, and contractors: different laws may be applicable (is this a supplier or an employee?)
In some cases (primarily B2B relations) GDPR is not applicable because no personal data is present.
What is typical mandatory non-GDPR-relevant data and record keeping?
In many industries this might not be obvious. However, many products and industries are regulated. For example, companies involved in plant construction or in aircraft / automotive supply and spare parts may need to archive non-personal data for the long term.
Governments and regulating bodies may require tracking and archiving of:
- Pharmaceutical products / prescription tranquilizers and narcotics
- Biological and chemical substances, semi-finished products such as certain steel alloys
- Export-restricted products / dual use
- Financial data, e.g. stock exchange data like ticks and quotes
The lists are not comprehensive.
Determining Retention and Residence
No doubt the amount of data to be stored will grow. However, “hunter-gatherers” who collect data without an obvious purpose, “just in case we need it sometime in the future”, should consider the cost of storage and the fact that data becomes obsolete or unusable. An example is still storing data about the consumer CRT display market although this market has de facto disappeared.
So data should at least be tagged with an end of life, which might be reviewed later.
Purpose of Data
Fulfil legal requirements
The purpose of data is the second most important question to ask. The obvious reason is any fiscal or other government obligation to keep the data for a certain time. If this is the only reason and the data is never used for any other purpose, the obvious solution is the cheapest one that fulfils the legal minimum. However, if demand changes, what effort is needed to make the data easier to consume?
In some cases and some countries the law is quite vague, so there is no need for preemptive obedience in the form of a comfortable GUI and analytics. Some governments, however, require specific data formats, interfaces, and access methods. An archiving and retention system that has to be implemented anyway may then also serve your own purposes: run powerful analytics, detect and investigate anomalies at the earliest possible moment, prevent violations of law, and improve compliance.
Corporate Process support
Fraud / default prevention
In many cases certain patterns are hidden only in detail data. Sometimes old-school methods and their variants are reused: fraudsters hope the victim is naive enough to assume that such old-style attempts will not be tried anymore. Detecting patterns in historic transactional data may prevent such attempts, in particular in financial and online trading business.
Legal case support is the most popular requirement: blocking destruction and searching records in case of a lawsuit or a conflict with any party. It is a must-have capability of any archiving tool.
Warranty & Liability Obligations
Certain contracts imply long-term warranties or service plans, e.g. for plant construction, elevators, or implants. It is sometimes necessary to retrieve the history of works, upgrades, or unexpected failures of a spare part that is expected to last decades.
In certain cases past data needs to be verified and numbers recounted, for example to recalculate insurance premiums based on historic data, and the data needed is not always found in an analytical data warehouse.
The most obvious business case: Destruction / Reactivation
Destruction of data became a business case when the EU GDPR and similar laws came into effect. Data that is obviously not GDPR relevant is usually not considered. But as mentioned, even this data eventually turns into a museum piece with no business value anymore. So why waste space?
Reactivation is unusual for transactional data, except for effects like receiving a long-outstanding payment. In most cases this will trigger a separate process. Reactivating master data of business partners that did not do any business for a while may be relevant, as may reactivating materials that were out of production when demand recurs. An archiving tool should be able to retrieve or restore the data to the main system and unblock / activate the records.
All of the issues mentioned have to be resolved, and appropriate correction processes need to be established, before any data aging process is enabled.
Data aging can only be performed if:
- The transactional status of the record is completed.
- Related master data matches minimum quality of correctness and completeness.
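The two preconditions above can be expressed as a simple pre-check that runs before a record is handed over to any aging process. The following is only a minimal sketch in Python: the record model, the status value `COMPLETED`, and the list of required master data fields are hypothetical illustrations, not SAP tables or APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical record model; the field names are illustrative, not SAP table fields.
@dataclass
class BusinessRecord:
    record_id: str
    status: str                                   # e.g. "OPEN" or "COMPLETED"
    master_data: Dict[str, str] = field(default_factory=dict)

# Assumed minimum set of master data attributes that must be filled.
REQUIRED_MASTER_FIELDS = ("partner_id", "country", "company_code")

def aging_blockers(record: BusinessRecord) -> List[str]:
    """Return the reasons why a record is not yet ready for data aging."""
    blockers = []
    if record.status != "COMPLETED":
        blockers.append(f"transactional status is {record.status}, not COMPLETED")
    for name in REQUIRED_MASTER_FIELDS:
        if not record.master_data.get(name):
            blockers.append(f"master data field '{name}' is missing or empty")
    return blockers

def is_eligible_for_aging(record: BusinessRecord) -> bool:
    """A record is eligible only if no blocker was found."""
    return not aging_blockers(record)
```

Returning the list of blockers instead of a plain yes/no makes it easier to feed the findings into a correction process before archiving starts.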
Once data is archived, it is read only. Hence there is no way to correct or delete single data fields or single records containing personal data in an archive. Restoring data into an active system may require huge effort, and may be entirely impossible for an already ceased SAP or non-SAP system.
Defining retention rules
In many customer meetings I was asked why SAP has no predefined rules for data aging in ILM. There is one simple reason: a predefined rule may be taken to be legally correct, even if it is utter nonsense, and be put “as is” and unverified into production use. The rule might then be considered “legal advice”, which is not permitted by law.
Data Aging Options in a Transactional System
Now, let's assume data quality is sufficient and all the issues above are addressed and resolved. What are the options to implement, and what are the effects of each?
In this chapter I will highlight some SAP recommended ways to implement an archiving system that serves these needs, including data migration topics.
Classic archiving in SAP Business Suite is, as is well known, executed with transaction SARA, which writes data out to files in the filesystem with the extension “adk”. It can of course also run as a regularly scheduled job using transaction SM36.
The process is pretty simple:
- Write the data out to the filesystem.
- Compare the content of the files with the data in the database.
- If correct, delete the data from the database; otherwise report an error.
- Optionally move the archive files to a certified archive system.
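The write, verify, delete sequence above can be illustrated with a small Python sketch. It is not how SARA or the ADK are implemented: the in-memory dictionary standing in for the database, the JSON archive file, and the function name `archive_rows` are all assumptions made for illustration. The optional fourth step (moving the file to a certified archive system) is left out.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Dict, List

def _digest(rows: Dict[str, dict]) -> str:
    """Checksum over a canonical JSON form of the rows."""
    canonical = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def archive_rows(database: Dict[str, dict], keys: List[str], archive_file: Path) -> bool:
    """Write rows to a file, verify the file, then delete the rows.

    `database` is a plain dict with string keys standing in for a table.
    Returns False (and deletes nothing) if the verification fails.
    """
    # Step 1: write the selected data out to the filesystem.
    rows = {key: database[key] for key in keys}
    archive_file.write_text(json.dumps(rows, sort_keys=True), encoding="utf-8")

    # Step 2: read the file back and compare it with the database content.
    written = json.loads(archive_file.read_text(encoding="utf-8"))
    current = {key: database[key] for key in keys}
    if _digest(written) != _digest(current):
        return False  # report the error, keep the data in the database

    # Step 3: only now delete the verified rows from the database.
    for key in keys:
        del database[key]
    return True

if __name__ == "__main__":
    db = {"4711": {"amount": 10}, "4712": {"amount": 20}}
    with tempfile.TemporaryDirectory() as tmp:
        print(archive_rows(db, ["4711"], Path(tmp) / "sales_slips.json"))
    print(db)
```

The important design point is the order: the delete step runs only after the written file has been read back and matched against the database.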
Access to archived data
Archived data is usually displayed with SAP transaction SARI (Archive Information System) or FB03. The transactions access ADK files either in the filesystem or on archive storage.
Remark to classic archiving
This is still the best practice for moving aged data away if the data may still be needed by the business and is not affected by any legal regulation. However, SARA is not intended to provide consistent destruction of transactional or master data. As a consequence, the archived data may grow continuously. Simply destroying or deleting ADK files may leave unresolved references to master data or other data, resulting in inconsistent data or artefacts pointing to null objects.
This means that if consistent destruction of content and blocking capabilities are needed, the archiving system needs a rules engine that determines, based on creation, access, and a predefined lifetime, when to block the data, how long it resides in the database, when to archive it, and finally when to execute a consistent, irreversible removal.
SAP Information Lifecycle Management (SAP ILM)
SAP ILM was introduced to overcome this challenge: it applies a lifetime to data while still keeping the data consistent.
SAP ILM is a rules engine. It complements SAP archiving with the following capabilities:
- Blocking of objects against modification and visibility, depending on user role.
- Residence time: how long blocked data remains in the database before it is archived out.
- Data lifetime: when data has reached the end of its life span and permitted storage time and will finally be erased.
- Legal hold: if applied, the data is protected from destruction for an indefinite time span, as long as the legal or business reason is applicable and the legal case is open. Destruction occurs once the legal hold is lifted.
- A single archive access interface, no matter whether one or many systems are ceased and whether the ceased data originates from SAP or from a non-SAP legacy system.
It was originally created to overcome the challenges of maintaining legacy systems that should be ceased in order to reduce the number of systems to be serviced and the cost of maintenance and licenses. The capabilities were then enhanced to support the GDPR requirements as well.
SAP ILM is not a separate application but a business function in SAP NetWeaver. This means there is no need to install SAP ILM; it is enough to enable the ILM business functions in SAP NetWeaver:
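To make the interplay of blocking, residence time, data lifetime, and legal hold more tangible, here is a minimal Python sketch of such a rule evaluation. It is not the SAP ILM rule engine or its API. The names `LifecycleRule` and `next_action` are hypothetical, and the sketch assumes for simplicity that both residence and retention periods are counted in days from the end of purpose.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional

class Action(Enum):
    KEEP_ACTIVE = "keep active"
    BLOCK = "block in database"
    ARCHIVE = "archive out"
    DESTROY = "destroy"
    HOLD = "keep (legal hold)"

@dataclass(frozen=True)
class LifecycleRule:
    residence_days: int   # assumed: time blocked data stays in the database
    retention_days: int   # assumed: total permitted storage time after end of purpose

def next_action(end_of_purpose: Optional[date], rule: LifecycleRule,
                today: date, legal_hold: bool = False) -> Action:
    """Decide what should happen to a record on a given day."""
    # Data that still serves its purpose stays active and visible.
    if end_of_purpose is None or today < end_of_purpose:
        return Action.KEEP_ACTIVE
    # Lifetime exceeded: destroy, unless a legal hold prevents destruction.
    if today >= end_of_purpose + timedelta(days=rule.retention_days):
        return Action.HOLD if legal_hold else Action.DESTROY
    # Residence time exceeded: move the blocked data out to the archive.
    if today >= end_of_purpose + timedelta(days=rule.residence_days):
        return Action.ARCHIVE
    # Purpose ended, residence time still running: block in the database.
    return Action.BLOCK
```

In this sketch the legal hold only changes the outcome at the point of destruction, which mirrors the statement above that destruction occurs once the hold is lifted.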
|Business function|Description|
|ILM|General ILM functions|
|ILM_STOR|Enable external ILM storage (not required if 3rd party storage is used)|
|ILM_BLOCKING|Enable ILM blocking capability|
|BUPA_ILM_BF|Enable blocking of business partner|
|ERP_CVP_ILM1|Enable blocking of customer / supplier|
|Module-specific BF|Needs to be determined; depends on the application module used.|
The business functions in the table are included in the GDPR license.
The enablement of Data Controller Rules Framework is NOT included in the GDPR license.
SAP ILM Architecture Data Lifecycle Management for GDPR
This capability is included in the SAP NetWeaver GDPR license of SAP Business Suite 6.0 or SAP S/4HANA, as shown above, and hence does not need to be licensed separately.
The function adds data aging capability to SARA to meet legal compliance as needed.
If multiple SAP applications run in a business landscape, e.g. SAP Master Data Governance Hub, SAP HCM, etc., the functions have to be enabled and configured separately on each application server.
The data can be accessed directly from within the application. Access depends on privileges: blocked data cannot be seen without an authorized role.
See the blog post by Johannes Gilbert in the References for details on certain functional capabilities.
- Destroy data after end of purpose if no matching retention rule exists.
- Block data after end of purpose if a residence rule is applicable.
- Archive out blocked data after the residence time is exceeded, using a matching retention rule.
- Destroy data once the retention rule is no longer applicable and no legal hold is set.
- The rules are legal hold aware: if there is a legal reason not to destroy the data, destruction is prevented even when no rule matches anymore.
- Searching on legal topics
- Restricted to the SAP Netweaver hosting the application maintaining the data to be archived.
- The permitted storage is either filesystem, HANA REAB, Hadoop, Azure Blob, SAP IQ (separate license required), or any 3rd party ILM-certified archive system with at least a WEBDAV 3.0 / BC-ILM 3.0 interface.
- Data objects restricted to a GDPR related business case.
- Data Controller Rules Framework not included.
To use data already archived with classic archiving together with ILM functions, the existing files need to be converted and tagged with a timestamp (this is a background process). The data will be repackaged according to the retention rules defined.
The NW ILM service is available on premise for on premise systems and on cloud for cloud systems.
To use ILM for archiving objects not related to GDPR, a purchased license is required. However, non-GDPR objects can still be archived in the classic SARA style without lifecycle management.
SAP ILM Retention Warehouse
The purpose of ILM Retention Warehouse is to support system decommissioning of SAP and NON-SAP applications.
Differences between SAP ILM Retention Warehouse and SAP ILM GDPR Retention Management
- The SAP ILM Retention Warehouse runs as a separate Netweaver service and system – independent of any live application.
- The data of each legacy system will be replicated into the SAP ILM Retention Warehouse and then the legacy system will be shut down and decommissioned once all data has been taken off.
- Rules apply to residence, archive and destroy.
- The ILM retention warehouse is not intended to support active systems – see previous topic instead.
- Provides one GUI and one reporting system for all systems maintained.
- To be licensed per system maintained
- May run on premise or as private cloud edition
The retention warehouse hosts data from all types of decommissioned systems and provides a common interface for information retrieval. It is intended to serve audit, occasional access, and verification. Archived data that has to remain in a regular process (e.g. long-running contracts, maintenance of constructed plants, or long-term assets like buildings with lifts and escalators that are retrofitted every couple of years) should probably not be moved into a retention warehouse during a migration. Such data should rather be migrated and archived with direct access in an active system, managed by either ILM Retention Management or classic archiving.
The Retention Warehouse (on a server) is technically also capable of archiving data from continuously active systems. There are, however, restrictions, and effort is required to implement GDPR compliance, e.g. ILM notification services to block master data.
SAP ILM Retention Warehouse runs on an SAP NetWeaver service of its own.
It hosts all ceased systems, SAP and non-SAP, as archives.
Data can be stored in the included SAP IQ database. As with ILM Retention Management, BC-ILM 3.x / WEBDAV 3.x compliant stores and Azure Blob storage are also supported.
Included licenses (*): SAP IQ database, SLT Replication Server, and the Data Controller Rules Framework for rules development.
(*) See the SAP Trust Center Product Use and Support Terms for the most recent information on this topic.
System Cease, Reporting and Retrieval of data from SAP ILM Retention Warehouse
During the cease process the SAP SLT replication server will replicate all data to SAP ILM RWH and store it in the archive.
Once complete the SAP SLT Service will cease and the legacy system can be shut down and decommissioned entirely.
Reporting may be executed using classic SAP NetWeaver transactions.
Optionally, there is an SAP Services DMLT offering that covers all services needed to move data into the SAP ILM controlled archive. This includes all data, customizing information, attachments, and extensions. In addition, there is a reporting framework that is used to recreate reports on the SAP retention warehouse. For this topic I refer to a separate blog on data decommissioning services by Sumanth Hedge.
A flat file export to SAP BW is also possible. This, however, needs some work to align the data if it is merged with data from active systems.
Private Cloud Edition
SAP Information Lifecycle Management Retention Warehouse is also available as a cloud service. Its functional capabilities are like those of the on-premise service, but it is charged as a subscription fee based on connections.
See also Cloud Services Documents in References [P70]
In contrast to on premise, no storage software is included as a supplemental license.
Users may decide whether archived data shall also be stored in a cloud store or remain on premise in a special archive store.
Data aging in SAP S/4HANA is supported on premise and in the cloud. The major challenge is to identify which option matches the purpose best. The major questions in any business process change and move to S/4HANA remain:
- Is there any personal data affected?
- Is there any retention period to be applied or access to data to be restricted?
- Is the data needed in a business process but accessed only very infrequently?
- Is the older data needed in combination with most recent data?
If question 1 applies, then SAP ILM GDPR is most likely the option to provide the blocking and deletion functionality required by GDPR.
If question 2 applies, SAP ILM GDPR is again relevant, as is SAP ILM Retention Warehouse for data archived from a ceased system.
If question 3 or 4 applies, classic archiving might be the answer, or SAP ILM Retention Management (which requires a license) if non-GDPR-relevant data needs to be blocked. During the migration to SAP S/4HANA it should be decided which data can be archived and which needs to be taken over as active data.
If data remains in ceased systems for audit only, it might be an economic decision to archive the data in SAP ILM Retention Warehouse, shut down the legacy systems, and save license and maintenance cost by providing one archive for all legacy systems instead.
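The mapping from the four questions to the options can be summarized as a small decision helper. This Python sketch only restates the guidance above and is in no way an official selection tool. The function name, the parameter names, and the two extra flags `ceased_system` and `block_non_gdpr_data` are assumptions introduced for illustration.

```python
from typing import List

ILM_GDPR = "SAP ILM GDPR (blocking and deletion)"
ILM_RW = "SAP ILM Retention Warehouse"
CLASSIC = "Classic archiving (SARA)"
ILM_RM = "SAP ILM Retention Management (licensed)"

def suggest_options(personal_data: bool,
                    retention_or_restricted_access: bool,
                    rarely_accessed: bool,
                    needed_with_recent_data: bool,
                    ceased_system: bool = False,
                    block_non_gdpr_data: bool = False) -> List[str]:
    """Return candidate options for the answers to questions 1 to 4."""
    options: List[str] = []

    def add(option: str) -> None:
        if option not in options:
            options.append(option)

    # Questions 1 and 2 both point to SAP ILM GDPR.
    if personal_data or retention_or_restricted_access:
        add(ILM_GDPR)
    # Questions 3 and 4: classic archiving, or licensed ILM Retention
    # Management if non-GDPR data also needs to be blocked.
    if rarely_accessed or needed_with_recent_data:
        add(ILM_RM if block_non_gdpr_data else CLASSIC)
    # Data from ceased systems goes to the Retention Warehouse.
    if ceased_system:
        add(ILM_RW)
    return options
```

The result is a list of candidates, not a decision: licensing, legal review, and migration scope still have to be weighed for each system.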
There is a choice of solutions to meet each requirement.
End Of Purpose Check by Johannes Gilbert
Data Decommissioning by Sumanth Hedge
SAP Information Lifecycle Management Private Cloud Edition [Page 70]