A data warehouse is a centralized store of all the data relevant to carrying out a business’s functions. Having one at your disposal significantly improves the quality and speed of the company’s decision-making. Without a well-arranged data warehouse, any kind of cross-domain analysis turns into a nightmare: bringing together information from multiple platforms, combining it and carrying out the analysis takes an enormous amount of time and resources. In most cases, by the time you get meaningful results, the chance to make an effective decision is already lost. A data warehouse provides a single repository of data and a place to get the answers you need quickly and efficiently. However, building and maintaining one is not a trivial task, and it is easy to make mistakes that will undermine its efficiency. In this article, we will cover the worst blunders you have to avoid.
1. Not considering long-term maintenance
One of the worst mistakes you can make is to think that an enterprise data warehouse (EDW) is something you set up once and that then keeps performing its functions indefinitely. In reality, things are much more complicated. Yes, data warehouse services help companies implement, improve or migrate their DWH solution to consolidate disparate data sources under one roof and enhance their decision-making. But if you consider replatforming your on-premises data warehouse to the cloud, ongoing customer support will be needed to:
- Provide data administration services: create rules to ensure that the data is clean and accurate, add new data sources and load new data, adjust ETL processes.
- Resolve the identified issues.
- Monitor the performance and capacity of your data warehouse, such as query running times.
- Monitor the correctness of data transformations and data backups.
This means you have to account for gradually changing data formats, an inevitable increase in the amount of data you have to process, the need to add new data connections, and changes in the general situation that will, over time, require the introduction of new features. While you can control and pace some of these costs, others come from outside. For example, if you collect data from an API-based service, you will have to keep up with any changes in the reporting API or lose the use of this data source.
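As a rough illustration of keeping up with changes in an external reporting API, the sketch below compares an incoming record with the fields an ETL job expects and reports anything missing, unexpected, or of the wrong type. The field names, the sample record, and the `find_schema_drift` helper are all hypothetical and not tied to any real service.

```python
from typing import Any

# Hypothetical field list for an imaginary reporting API (an assumption for illustration).
EXPECTED_FIELDS = {"campaign_id": str, "clicks": int, "cost": float}


def find_schema_drift(record: dict[str, Any], expected: dict[str, type]) -> dict[str, list[str]]:
    """Compare one API record against the fields the ETL job expects."""
    # Fields the ETL job relies on that the record no longer contains.
    missing = sorted(name for name in expected if name not in record)
    # Fields the record contains that the ETL job does not know about.
    unexpected = sorted(name for name in record if name not in expected)
    # Fields that are present but whose value has a different type than expected.
    wrong_type = sorted(
        name
        for name, expected_type in expected.items()
        if name in record and not isinstance(record[name], expected_type)
    )
    return {"missing": missing, "unexpected": unexpected, "wrong_type": wrong_type}


if __name__ == "__main__":
    # Simulated response after the provider renamed "cost" and began sending clicks as text.
    changed_record = {"campaign_id": "c-17", "clicks": "42", "spend": 9.5}
    print(find_schema_drift(changed_record, EXPECTED_FIELDS))
```

Running a check like this before each load would let the job fail loudly when the source changes, instead of silently loading incomplete data.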
2. Building a data warehouse based on current needs
Data warehouse projects rarely, if ever, bring useful results in the short term. They rely heavily on huge amounts of data being collected and analyzed over a significant period. Usually, the earliest you can expect them to pay for themselves is three to five years, which means that you have to factor in your company’s development roadmap for at least this amount of time. Accounting for the company’s growth expectations is just as important as considering the technical issues that are likely to arise within this period.
To accomplish your data migration smoothly and efficiently, you need to avoid the following mistakes:
- Not having a defined data model, database reference architecture or Data Mapping rules.
- Not having a list of ‘source’ and ‘target’ applications prepared for migration
- Not having your data migration procedures defined (manual vs. automated, static vs. real-time)
- Not establishing the data migration environment early enough in the project
- No strategy for extracting the source data for each of the types, or for the rules to be applied for ‘cleansing’ the data
- Not classifying data as ‘historical’, ‘master’, or ‘transaction’ data (or possibly all three)
- No clearly defined process for each of the ETL stages
- Not defining the project’s data migration approach. It should be one or a combination of the following:
  - Big bang
  - Parallel run
  - Incremental migration
  - Zero downtime migration
- Not having a defined target solution architecture
- No strategy for ‘reconciling’, that is, verifying that all the data has moved to its respective target without any loss of either quality or quantity.
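The reconciliation step mentioned in the list can be sketched in a few lines. This is a minimal example under stated assumptions: rows are available as in-memory tuples, and the `row_fingerprint` and `reconcile` helpers are hypothetical names introduced here. It compares row counts (quantity) and per-row hashes (quality) between source and target; a real migration would more likely run equivalent checks inside the databases themselves.

```python
import hashlib
from collections import Counter
from typing import Iterable


def row_fingerprint(row: tuple) -> str:
    """Hash a row's values so rows can be compared regardless of their order."""
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(source_rows: Iterable[tuple], target_rows: Iterable[tuple]) -> dict[str, int]:
    """Compare row counts and row contents between source and target."""
    source = Counter(row_fingerprint(row) for row in source_rows)
    target = Counter(row_fingerprint(row) for row in target_rows)
    return {
        "source_count": sum(source.values()),
        "target_count": sum(target.values()),
        # Rows present in the source with no identical counterpart in the target.
        "missing_in_target": sum((source - target).values()),
        # Rows in the target that do not match any source row (e.g. altered values).
        "unexpected_in_target": sum((target - source).values()),
    }


if __name__ == "__main__":
    # Made-up sample data: one row was lost and one value was changed during migration.
    source_rows = [(1, "Alice", 10.0), (2, "Bob", 20.5), (3, "Carol", 7.25)]
    target_rows = [(1, "Alice", 10.0), (2, "Bob", 20.0)]
    print(reconcile(source_rows, target_rows))
```

A migration would be considered reconciled only when both counts match and the two difference figures are zero.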
It is also highly recommended to use a template in such projects, for example when creating an SAP Data Warehousing Foundation project. The template contains all the modules needed for your project (SAP HANA Data Warehousing Foundation and SAP HANA database), including an example NDSO, flowgraph, and task chain. To get a general understanding and visualization of the new functionalities in SAP Data Warehouse Cloud, you can refer to the step-by-step Data Modeling in SAP Data Warehouse Cloud guide, but keep in mind that experimental features may be changed by SAP.
3. Failing to account for the rapid evolution of data technology
All areas of IT currently develop at breakneck speed, often completely changing the active paradigms in the course of a year or two, and data analytics is even hotter than the rest of the field. Not just the warehouse layer per se, but all the supporting technologies change constantly. A data warehouse project you start today may require significant updates by the time you finish it to stay ahead of the curve, and in ten years’ time, you may have to rebuild it from scratch. While a data warehouse can and probably will bring enormous advantages to the table, when establishing one you have to be ready to continually invest in it to keep up with the ever-mutable tech landscape.
4. Overestimating the skills of your IT specialists
Your engineers may be excellent at the tasks they currently carry out. Yet even if your company is primarily IT-oriented and is no stranger to data technologies, the kind of tech your specialists work with most likely has very little in common with what is required to build the infrastructure necessary for a data warehouse. And it is not something they can quickly pick up in their spare time: big data tech typically has an extremely steep learning curve.
While this position requires a plethora of skills, here are four of the most important IT support skills that a specialist should possess:
- Attention to detail: they have to ask the right questions to guide the interaction so that a customer or business employee thoroughly explains the issue, and they must listen carefully and never overlook any details.
- Ability to diagnose problems: they must be familiar with major operating systems and stay abreast of recent updates.
- Analytical thinking: it’s beneficial to understand the principles of geometry, calculus, and
- Communication skills: these help an IT specialist better extract the information needed to diagnose a problem.
Most likely, your experts will be able to master the necessary skills, but it will take time during which they will not be able to work on your company’s core projects. Hiring specialists from outside is only a partial solution because you will still require somebody to work the system on a continuous basis.
5. Failing to work with the customers
In the context of data warehousing, the customers are the individual units and departments of your business that are going to use the newly acquired analytical capabilities to solve their problems. One of the surprisingly common mistakes of businesses introducing this kind of infrastructure is forgetting to consider their needs when establishing it. This may result in not providing the kind of data the end-users want and need most while offering a host of information nobody has any use for, failing to account for existing workflows, creating inconveniences for individual users and the organization as a whole, and so on.
Integrations in such projects, like the SAP Analytics Cloud presentation layer in SAP Data Warehouse Cloud, give access to many of the functionalities of a product tested and approved for the creation of dashboards.
The modeling part is complete and provides all the functionalities required to filter data, add formulas, create joins, or visualize data at any stage.
But as you can see, creating a data warehouse is an involved and complicated process that brings about not just opportunities and advantages, but also challenges and costs. Before deciding to set one up, a company should carefully weigh all the pros and cons.
I am eager to see how this solution will evolve over time, given the upcoming functionalities such as the live creation of cubes or KPIs!