Data modeling considerations of ArcGIS Enterprise on the HANA Platform
Some of our asset-intensive customers, such as utilities, who are implementing ArcGIS on HANA with SAP ERP have asked me: how do the pieces fit together, and where can I read about this in one place?
While the value that the HANA platform brings to ArcGIS Enterprise is based on the same capabilities that HANA has provided to our customers since it was released, integrating ArcGIS, HANA and S/4HANA is different enough that our customers are asking for a roadmap of some kind. To help fill that gap, this is the first in a series of blogs explaining why integrating these two technology stacks delivers substantial value to customers who use both platforms.
Let’s take a look at some of the challenges any analyst faces in obtaining insight from data, whether or not ArcGIS Enterprise and S/4HANA are involved. Traditionally, data from different systems is copied to one place and put into a form from which insight can be extracted. Techniques for doing this include pre-calculating aggregates in result tables and copying data from remote systems into a data warehouse of some kind. ArcGIS admins are especially familiar with this because medium and large asset-intensive businesses use ArcGIS publication geodatabases, which are ArcGIS-specific data warehouses that underpin an ArcGIS system of engagement. The ArcGIS utility network changes this; that will be covered in another blog.
HANA has a number of capabilities that eliminate or mitigate these challenges. For example, HANA can aggregate large volumes of data on the fly at high speed because it has at its core a modern, columnar, in-memory database. HANA was also designed to avoid the need to replicate data, although you can still replicate data using built-in HANA capabilities if needed. The HANA platform makes it easy to connect to data rather than collect it, because data federation capabilities are built in. The key point is that these capabilities reduce or eliminate the complexity and technical debt of ETL. Remember that ETL was created in part to work around shortcomings of disk-based DBMSs, such as I/O bottlenecks, but it also introduces data latency, complexity and a larger data footprint.
By eliminating the need for ETL, the analyst goes from a schema-on-write scenario (where the results and data available to them are fixed) to a schema-on-read scenario. Because HANA holds data at the finest granularity, the analyst can aggregate on the fly and has access to all of the attributes. Note that in Figure 2, the Extract, Transform and Load steps are replaced by a Replicate step. Simply replicating data into the HANA instance involves no filtering (selection criteria) or other processing, and there’s no need to map fields or define the target table(s). It’s a straight one-to-one copy from the source into the HANA instance, which is much simpler than extraction, transformation and loading/mapping.
Having a schema-on-read scenario makes the analyst much more agile because they can simply ask a different question without having to modify, test and maintain ETL. It also provides insight based on the latest state of the business, not on a snapshot that is hours, days or weeks old. For a utility, for example, this means understanding the state of the network at that moment: where the demand is and who is producing power (e.g. from rooftop solar).
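To make the difference concrete, here is a minimal Python sketch. It is not HANA code, and the meter readings, field names and both helper functions are made up for illustration. It contrasts a result table that an ETL job fixed in advance with aggregating the detail rows at query time.

```python
from collections import defaultdict

# Hypothetical meter readings at the finest granularity: one row per reading.
# Energy is stored in whole watt-hours to keep the arithmetic exact.
READINGS = [
    {"meter": "M1", "feeder": "F1", "hour": 8, "wh": 1200},
    {"meter": "M2", "feeder": "F1", "hour": 8, "wh": 800},
    {"meter": "M3", "feeder": "F2", "hour": 8, "wh": 2000},
    {"meter": "M1", "feeder": "F1", "hour": 9, "wh": 1500},
    {"meter": "M3", "feeder": "F2", "hour": 9, "wh": 2500},
]


def etl_totals_by_feeder(rows):
    """Schema on write: precompute one fixed aggregate and keep only that."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["feeder"]] += row["wh"]
    return dict(totals)


def aggregate(rows, key):
    """Schema on read: group the detail rows by any attribute at query time."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["wh"]
    return dict(totals)


# The ETL result table answers exactly one question.
result_table = etl_totals_by_feeder(READINGS)

# With the detail rows available, a new question needs no new pipeline.
by_feeder = aggregate(READINGS, "feeder")
by_hour = aggregate(READINGS, "hour")
by_meter = aggregate(READINGS, "meter")
```

In this sketch the result table can only ever report totals per feeder. Asking for totals per hour or per meter would mean changing and re-running the ETL job, whereas the on-the-fly version just takes a different grouping key.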
You can also use HANA’s data federation capability, called Smart Data Access, to query data where it resides, as shown in Figure 3 below. In this case, you don’t even need to load that data into HANA. In either case, the analyst can change their analytical lens simply by modifying the information view (called a calculation view), and they can use the built-in analytic engines in HANA, as shown in Figures 2 and 3.
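The idea of querying data in place through a view can be illustrated with a small, runnable analogy. The sketch below uses Python’s built-in sqlite3 module, not HANA: a second attached database stands in for a remote source, and a temporary view stands in for a calculation view. The table names, columns and values are all hypothetical, and Smart Data Access itself is configured quite differently.

```python
import sqlite3

# The main in-memory database stands in for the HANA instance.
conn = sqlite3.connect(":memory:")
# A second, separate in-memory database stands in for a remote system.
conn.execute("ATTACH DATABASE ':memory:' AS remote")

# Asset data held locally (hypothetical schema).
conn.execute("CREATE TABLE main.assets (asset_id TEXT PRIMARY KEY, feeder TEXT)")
conn.executemany(
    "INSERT INTO main.assets VALUES (?, ?)",
    [("T1", "F1"), ("T2", "F1"), ("T3", "F2")],
)

# Work orders that stay in the 'remote' database (hypothetical schema).
conn.execute(
    "CREATE TABLE remote.work_orders (order_id INTEGER, asset_id TEXT, cost INTEGER)"
)
conn.executemany(
    "INSERT INTO remote.work_orders VALUES (?, ?, ?)",
    [(1, "T1", 100), (2, "T2", 250), (3, "T3", 400), (4, "T1", 50)],
)

# The view is the analyst's "lens". It joins local and remote rows at query
# time; nothing is copied. SQLite requires a TEMP view to span databases.
conn.execute(
    """
    CREATE TEMP VIEW cost_by_feeder AS
    SELECT a.feeder AS feeder, SUM(w.cost) AS total_cost
    FROM main.assets AS a
    JOIN remote.work_orders AS w ON w.asset_id = a.asset_id
    GROUP BY a.feeder
    """
)


def cost_by_feeder():
    """Read through the view; results reflect the remote table's current rows."""
    query = "SELECT feeder, total_cost FROM cost_by_feeder ORDER BY feeder"
    return conn.execute(query).fetchall()


before = cost_by_feeder()
# A new row in the remote table shows up on the next read, with no reload step.
conn.execute("INSERT INTO remote.work_orders VALUES (5, 'T3', 100)")
after = cost_by_feeder()
```

Changing the question means redefining the view, not rebuilding a pipeline, which is the point the calculation view makes in the figures.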
These capabilities (and others) in the HANA platform provide the same value for the spatial data and analytics of the ArcGIS Enterprise platform. Spatial data types in HANA are first-class citizens of the HANA platform: they are not an add-on and are treated like any other type of data that HANA can store and process.
Since you’re reading this blog, you are most likely either implementing SAP S/4HANA as your ERP system or on the path to it. You may also be considering a move to the cloud with RISE, SAP’s white-glove move to the cloud. Many of our utility customers on this path are looking at the ArcGIS utility network as the as-built digital twin of their transmission and/or distribution networks. Mashing these two digital twins together provides key insight into your networks and customer usage. Having an up-to-the-minute understanding of the network state is essential to a modern utility.
In addition to the HANA capabilities discussed in this blog, Esri and SAP have done extensive work to simplify and streamline out-of-the-box integration between your organization’s two most critical systems: your ERP and EGIS systems.
The integration of ArcGIS Enterprise and the HANA platform works out of the box using ODBC and JDBC. The extensive work done by Esri and SAP includes automatically pushing certain ArcGIS operations (i.e. geoprocessing tools, or gptools), such as binning and single-layer operations, down into the HANA instance. This results in significant performance gains simply by putting a geodatabase on the HANA platform. For example, in tests done with grid binning, the automatic pushdown into HANA resulted in a 16x gain in performance for 10 million points.
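For readers less familiar with binning, here is a minimal Python sketch of what grid binning computes: each point is assigned to a square cell and the points in each cell are counted. This is only an illustration of the operation, not the implementation used by ArcGIS or HANA, and the function name, cell size and sample points are made up.

```python
import math
from collections import Counter


def grid_bin(points, cell_size):
    """Count points per square grid cell.

    Returns a dict mapping (column, row) cell indices to point counts.
    """
    counts = Counter()
    for x, y in points:
        # floor() keeps negative coordinates in the correct cell.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return dict(counts)


# Hypothetical sample points in arbitrary map units.
POINTS = [(0.5, 0.5), (0.9, 0.1), (1.5, 0.2), (2.1, 2.9), (2.8, 2.2), (-0.1, 0.3)]

bins = grid_bin(POINTS, cell_size=1.0)
```

The benefit of pushdown follows from the shape of the result: when the counting happens inside the database, only one row per occupied cell has to travel back to the client instead of every individual point.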
On a related note, my colleague Shabana Samsudheen just wrote a blog on the new availability of ArcGIS Platform as a provider for HANA Spatial Services. This means that, from within BTP, you can perform geocoding, POI, routing and more using ArcGIS Platform. Take a read here.
In the next blog, we will look at how these capabilities and the resulting elimination of ETL between data in ArcGIS Enterprise and in S/4HANA provide substantial value to both operational and analytical requirements faced by our customers.