James Giffin

Data Intelligence – Considerations for Cloud or On-premises/BYOL installation

A common question I’ve been addressing with customers lately on Data Intelligence is “which version should we choose – DI Cloud or on-premises?”

My goal is to provide some guidance on which deployment to choose. Data Intelligence is available as either a cloud subscription model or an on-premises/bring your own license (BYOL) installation. In this post, I will outline the considerations in several areas:

  • Features
  • Sizing/Licensing
  • Installation/Maintenance
  • Deployment options
  • Other influencing factors

Features

The most common question is “what differences are there between the two options?” This question is probably rooted in the conversion of Data Hub to Data Intelligence earlier this year. When Data Intelligence was released, it included the ML Scenario Manager and ML Data Manager, which were not available in Data Hub. Data Intelligence BYOL and Data Intelligence Cloud have full feature parity. There is no difference between the deployment options!

Sizing/Licensing

Sizing and licensing follow similar concepts for Data Intelligence Cloud and on-premises. This is a much deeper topic, but I suggest you start by defining a set of use cases, then review the sizing calculator options, and finally expand the recommendation by 15-20% to allow for growth. The sizing calculators have detailed definitions that help you narrow down sizing requirements.

In general, you need to size on-premises systems in blocks of 64 GB of memory in the Kubernetes environment. The sizing calculator can help you determine how many blocks are needed. The minimum configuration is 3 nodes in the Kubernetes cluster – see the help documentation for more details (https://help.sap.com/viewer/a8d90a56d61a49718ebcb5f65014bbe7/3.0.latest/en-US/7e2a9bf62ec94e9694648e2b5d2ce882.html).
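To make the arithmetic concrete, here is a minimal sketch of how a memory estimate might be rounded up to 64 GB blocks with growth headroom. It is not a replacement for the sizing calculator: the function name `memory_blocks`, the idea of starting from a single memory figure in GB, and the default 20% headroom are my own assumptions for illustration.

```python
import math

BLOCK_GB = 64  # block size mentioned in this post


def memory_blocks(required_gb: float, growth: float = 0.20) -> int:
    """Return how many 64 GB blocks cover required_gb plus growth headroom.

    required_gb is a hypothetical estimate taken from your own use cases;
    growth is the 15-20% headroom suggested above, given as a fraction.
    """
    if required_gb <= 0:
        raise ValueError("required_gb must be positive")
    if growth < 0:
        raise ValueError("growth must not be negative")
    padded_gb = required_gb * (1 + growth)
    # Round up, because a partially used block still has to be provisioned.
    return math.ceil(padded_gb / BLOCK_GB)


if __name__ == "__main__":
    print(memory_blocks(200))        # 200 GB + 20% = 240 GB -> 4 blocks
    print(memory_blocks(100, 0.15))  # 100 GB + 15% = 115 GB -> 2 blocks
```

The sizing calculator remains the source of truth; a helper like this is only useful for quick what-if comparisons between use-case sets.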

For Data Intelligence Cloud, the minimum configuration is 4300 capacity units.

Here are direct links to the sizing calculators:

DI On-premises Sizing Calculator

https://sapdipricingcalculatoronprem-ynlzz2pm02.dispatcher.hana.ondemand.com/webapp/index.html

DI Cloud Sizing Calculator

https://sapdisizingcalculator-ynlzz2pm02.dispatcher.hana.ondemand.com/webapp/index.html?hc_reset

Installation/Maintenance

Data Intelligence on-premises installation has been simplified with the Software Lifecycle Bridge (SLCB) and Maintenance Planner. There are a lot of blogs (including one I wrote) on building an installation host or jumpbox for Data Hub. With Data Intelligence the installation host/jumpbox is no longer needed.  There are other blogs that detail installation of Data Intelligence 3.0 such as this one from Dimitri Vorobiev.

For on-premises maintenance, patches are installed via the same SLCB and Maintenance Planner. Updating your installation is a very easy process (but be sure to follow the prerequisites on help.sap.com).

Data Intelligence Cloud is provisioned on your SCP account (it is available as a subscription model or under the Cloud Platform Enterprise Agreement (CPEA)). Simply provision it in your account and wait for it to be ready.

For the cloud instances, there are monthly scheduled maintenance windows where updates are applied.

Deployment Options

Data Intelligence on-premises can be installed in any certified environment. This includes the hyperscalers and Kubernetes on-premises. As of this blog post, the minimum Kubernetes version is 1.14.x (but that is a bit outdated now, and I would recommend moving to at least 1.16.x). For the current list of supported versions, bookmark this OSS Note: https://launchpad.support.sap.com/#/notes/2871970
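As a quick pre-check, the version guidance above can be sketched as a small comparison. This is only an illustration: the function names and the three result labels are my own, and the only numbers encoded are the 1.14 minimum and the 1.16 recommendation from this post. The OSS Note remains the authoritative list and may differ by platform.

```python
import re
from typing import Tuple

MINIMUM = (1, 14)      # minimum Kubernetes version stated in this post
RECOMMENDED = (1, 16)  # version I recommend moving to in this post


def parse_k8s_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.16.8-eks-e16311'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def classify(version: str) -> str:
    """Label a cluster version against the two thresholds above."""
    parsed = parse_k8s_version(version)
    # Tuples compare numerically, so (1, 9) correctly sorts below (1, 14).
    if parsed < MINIMUM:
        return "below-minimum"
    if parsed < RECOMMENDED:
        return "meets-minimum"
    return "recommended"


if __name__ == "__main__":
    for v in ("v1.13.12", "1.14.10", "v1.16.8-eks-e16311"):
        print(v, "->", classify(v))
```

Comparing parsed integer tuples rather than raw strings avoids the classic mistake where "1.9" sorts above "1.14" alphabetically.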

Data Intelligence Cloud is deployed only on AWS and Azure as of the time of this blog, and not in every region. This may influence your decision to install it locally in your current region/availability zone. For the current list of options see here – be sure to filter for Data Intelligence in the search box below the map view: https://help.sap.com/doc/aa1ccd10da6c4337aa737df2ead1855b/Cloud/en-US/3b642f68227b4b1398d2ce1a5351389a.html

These are the deployment options as of today (Oct 14, 2020), but more locations will be available in the future:

 

For Data Intelligence Cloud, the Cloud Connector option lets you reach into your on-premises environment securely.

Other Influencing Factors

Factors that should be considered for Cloud vs. On-premises include:

  • The overall corporate strategy – if part of your digital transformation is moving to a cloud-based platform, it makes sense for Data Intelligence to be the backbone of your Business Transformation Platform in the cloud.
  • Where is the bulk of the data? If you have a large ECC or S/4HANA system on-premises where most of the data will be accessed, you should consider the cloud ingress/egress costs and potential data movement. If you’re using HANA Cloud as part of the overall future strategy, you may want to deploy a cloud option, as your future state will be in the cloud.
  • Deep learning and machine learning with GPUs – Certified GPU nodes for Kubernetes are part of the Cloud deployment options. If you want GPUs on-premises, you have to refer to the certified GPU node types and include them in your Kubernetes environment. https://launchpad.support.sap.com/#/notes/2900587

 

Conclusion

There are many factors in determining which option to select for your Data Intelligence application. The simplest is the cloud option, as it takes all the Kubernetes care and maintenance out of the equation, as well as patching and upgrades.

Reach out or add a comment below if you have any other ideas on what should be considered!

      5 Comments
      Mihail Sevdiev

      Hello James,

      Thank you for your post, it summarizes the situation very well. Before DI 3.0, it was a bit of a problem for customers to understand the difference, since the two products had different names, the same functionality and only one, but significant, difference: the ML. Since 3.0 it is really much easier to explain and harder to decide. However, keep in mind that the resources for an on-premises installation should also include the repository / Docker image catalog. There are a lot of use cases, but many customers should perhaps go with the on-premises solution because of sensitive data and its flow to AWS or Azure. As mentioned, the cluster of 3 nodes is the minimum. I believe that this is for really small / test environments. P.S. SAP Note 2900587 (for the DL and GPU) is not accessible for non-SAP employees 😉

      Great post, keep up the good work! Will follow for future updates / insights!

      James Giffin
      Blog Post Author

      Thank you for the comment, Mihail. Good point on the repository: depending on the amount of custom images, it can grow, but even in my internal DI cluster it only costs $20 a month (and I have all the images from 2.3 through 3.0.5). (You remind me that I should write a script to clean up the older images.)

      3 nodes is the minimum on-prem configuration. It is quite small, but it is enough to get you started. My cluster is 4 nodes of instance type r5.2xlarge on my EKS on AWS, and we were able to achieve a lot of valuable testing.

      I just checked the release status of the SAP Note and it is in the process of being released, so please check back. At a high level, it states that a CUDA 10 device driver must be installed on the nodes containing GPUs. I'll wait until the note is released in case the actual GPUs supported might change.

      Mihail Sevdiev

      Hi James,

      An image repository for an enterprise scenario is much different. As per regulations (compliance / auditing), it should be secured, should have backups, and should be local (no internet access). This was the reason I mentioned taking it into account. It is a service (not, as everyone presumes, just XXX TB of space), and as we know, services in an enterprise environment should be regulated and put in a lifecycle.

      Thank you for the insights on the CUDA 10 driver and GPU requirements. This is something that I would really love to test out. I have no experience with it, and if you have something to share, it would be interesting to have a blog on it.

      Thank you again and stay safe!

      Michael Eaton

      Hello

      Can you expand on this statement please?

      Data Intelligence BYOL and Data Intelligence Cloud have full feature parity. There is no difference between the deployment options!

      At the time of writing, DI Cloud has 3 APIs at version 1.1, and DI has 2 APIs at version 1.0. Will DI catch up with DI Cloud, if so, how far behind is the release?

      Thanks

      Michael

      James Giffin
      Blog Post Author

      Fair point - the DI Cloud was just updated in May to the 2103 release, so there are some slight deltas until the on-premises patch is fully tested across all the supported environments. The on-premises version should be released in the coming months and realign the features again.