As a Solution Architect, I want to share a strategy for approaching test data management for SAP applications. This blog offers guidance on formulating an enterprise Test Data Management (TDM) strategy and the key areas to consider for an SAP database.
In today’s IT industry, companies are looking for ways to reduce project cost, such as lowering infrastructure cost, shortening testing cycles and introducing life cycle automation. One of the key factors in improving quality and reducing cost is providing the right set of data to the right environment at the right time. Testing SAP applications is critical and challenging because they run core business operations, are highly customized to suit business needs, and change frequently through upgrades, patches etc. Incomplete or erroneous data lowers the quality of testing, which can lead to application failure.
To deliver a tested application, you need the right set of test data, and it must be easy to create on demand, multiple times. We need a robust process that streamlines test data processes, templates, checklists and guidelines. This includes sourcing the data, consolidating it, de-sensitizing it and provisioning it to lower environments. With a Test Data Management approach and solution, we can enable the team to be self-sufficient and address the following key areas to support a CI/CD or DevOps model of software delivery.
- On-time servicing of Test Data
- Securing Sensitive Data in non-production environments
- Data Integrity/Synchronization across all interfaces
- Reusability of Test Data Sets
- Longer Test Environment refresh cycles
- Bloated Test environments with redundant data
- Automating Synthetic Data Creation process
- Data not available for all test cases
- Retention of Test Data for audit purpose
- Availability of time sensitive data
- End to End Traceability of test data
- Invalid defects due to test data anomalies
Implementing TDM for an SAP landscape is different from the traditional implementation for DWH or other legacy applications. The following are the key challenges of preparing and managing test data:
- Data Integration
- The data being tested needs to be integrated from different sources. Due to the lack of a standardized process and inconsistent approaches, we see huge gaps in data quality and integrity that leave the data unusable. There are also challenges with the different data formats coming from different sources, and with accommodating the complexity and customization of SAP applications.
- Data availability
- Data, whether consumed from production or created synthetically, should be made available to the testing team as and when needed. We need a mechanism that keeps test data refresh activities in sync with other activities such as environment, release and the overall project delivery schedule.
- Data Compliance
- Data moving out of production needs to comply with the regulatory and compliance laws specific to the country or domain of operation. Based on various studies and research, 30% to 40% of testing teams in IT organizations access sensitive production data for their testing needs, either by obtaining exception approval or by consuming data without being aware of the sensitive details attached to it.
- Data Generation
- Synthetic data generation to validate test cases is often cost effective and does not require any masking. It is a faster means of creating data, but it requires a good understanding of the existing business and data models to maintain data integrity within the database. Synthetic data can also be created in huge volumes using various techniques and market tools.
- Data Selection
- Consuming the entire data set from production can cause production downtime, longer data refresh times and huge storage costs. Studies reveal that maintaining storage costs about 7 times as much as buying it, which drains a significant portion of IT budgets. It is essential to have an appropriate subset of data that gives optimal test coverage, for example a time-based or business-process-based subset.
- Data refresh (including subsetting and masking) and provisioning are required for every testing cycle (iteration or agile sprint). Human error is common when IT staff handle large quantities of data manually without a structured automation solution. The automated solution should be self-service for the testing team: they request the service online and the data is deployed automatically.
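To make the idea of a time-based subset concrete, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for production. The table names (`customers`, `orders`), the columns and the cutoff date are all hypothetical and are not SAP structures. The sketch only illustrates the principle: select the transactional rows inside the time window first, then pull only the master data those rows reference, so the subset stays referentially intact.

```python
import sqlite3


def build_source() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for a production database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            created_on TEXT
        );
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
        INSERT INTO orders VALUES
            (10, 1, '2023-01-15'),
            (11, 2, '2024-03-02'),
            (12, 2, '2024-06-20'),
            (13, 3, '2022-11-30');
        """
    )
    return conn


def extract_subset(conn: sqlite3.Connection, cutoff: str) -> dict:
    """Return orders created on or after `cutoff` plus only the customers they reference."""
    # Dates are stored as ISO strings, so string comparison matches date order.
    orders = conn.execute(
        "SELECT id, customer_id, created_on FROM orders "
        "WHERE created_on >= ? ORDER BY id",
        (cutoff,),
    ).fetchall()

    # Collect the master-data keys that the selected transactions depend on.
    customer_ids = sorted({row[1] for row in orders})
    if not customer_ids:
        return {"customers": [], "orders": []}

    placeholders = ",".join("?" * len(customer_ids))
    customers = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({placeholders}) ORDER BY id",
        customer_ids,
    ).fetchall()
    return {"customers": customers, "orders": orders}


if __name__ == "__main__":
    subset = extract_subset(build_source(), "2024-01-01")
    print(subset["orders"])
    print(subset["customers"])
```

Selecting transactions first and deriving master data from them keeps the extract small while avoiding orphaned records. A real SAP landscape has far deeper dependency chains, which is where dedicated TDM tooling comes in.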
Adopting an effective solution approach – 5 key drivers
- Key # 1 – Test data refresh – This includes a full production copy, a specific schema copy, or a refresh of a specific set of master / transactional data.
- Key # 2 – Identify the right subset of data – This includes data snaps, intelligent data slices and the right data segments to define a specific set of data for extraction, which reduces refresh time and the network bandwidth consumed on the production database.
- Key # 3 – Discover and mask sensitive data – This includes identifying sensitive data, masking it and maintaining the integrity of data across the database. Masked data must not be reversible to its original values and should comply with regulatory and federal laws wherever applicable.
- Key # 4 – Automate the entire process – From data selection and filtering through masking, refresh and provisioning, the entire process needs to be automated, either through a TDM COTS tool or a customized automated solution. The solution should let end users work independently in self-service mode.
- Key # 5 – Any time availability of test data – Testers should have the liberty and flexibility to request data at any time based on their testing needs. This supports agile testing and a CI/CD or DevOps delivery model, where test data is needed on a constant basis.
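Key # 3 asks for masking that keeps data consistent across the database and cannot be undone. One common way to get both properties is deterministic keyed hashing. The sketch below is a minimal Python illustration of that idea, not the method used by any particular TDM tool. The function names, the field name `tax_id` and the demo key are assumptions made for the example.

```python
import hashlib
import hmac


def mask_value(value: str, secret: bytes, prefix: str = "MASK") -> str:
    """Replace a sensitive value with a keyed, one-way token.

    The same input and secret always yield the same token, so joins between
    tables still line up after masking. The token is truncated for readability,
    which makes collisions possible in very large data sets.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


def mask_rows(rows: list, sensitive_fields: set, secret: bytes) -> list:
    """Return copies of `rows` with every field in `sensitive_fields` masked."""
    masked = []
    for row in rows:
        masked.append(
            {
                key: mask_value(str(val), secret, prefix=key.upper())
                if key in sensitive_fields
                else val
                for key, val in row.items()
            }
        )
    return masked


if __name__ == "__main__":
    secret = b"demo-key"  # assumption: in practice the key is stored outside the test environment
    customers = [{"id": 1, "tax_id": "123-45-6789"}]
    invoices = [{"invoice": 9, "tax_id": "123-45-6789"}]

    masked_customers = mask_rows(customers, {"tax_id"}, secret)
    masked_invoices = mask_rows(invoices, {"tax_id"}, secret)

    # The same source value maps to the same token in both tables.
    print(masked_customers[0]["tax_id"] == masked_invoices[0]["tax_id"])
```

Because the token depends on a secret key that never leaves the masking process, someone with access only to the lower environment cannot recompute or reverse it. Format-preserving masking (for example, keeping a valid-looking tax number) needs a different technique and is outside this sketch.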
Some of the most popular test data management tools are Attunity, CA Test Data Manager, IBM Optim, EPI-Use, Intellicorp, SAP TDMS, SNP, Datavard etc. Informatica has recently withdrawn its support for SAP TDM.
Managing test data is crucial to program success, particularly for SAP applications. TDM is a driving factor for both today’s continuous delivery model and the traditional waterfall model. A successful TDM solution combines a TDM tool, automated processes, self-service delivery and industry-proven best practices. Tools can only help you move data from one database to another; the process, implementation road map, guidelines and checklists around them drive the success of the implementation.