Skip to Content
Technical Articles
Author's profile photo Johann Schallenberger

Connect ABAP Cloud with Databricks via SQL Service

Have you ever considered consuming data directly from ABAP Cloud tables using SQL?
That was a challenge for me and I want to help you.

Prerequisites

First of all, I will request you to follow the tutorial provided by SAP so you can get acquainted a little with the SQL Service and ODBC Driver – as a plus, you will be able to connect to your SQL service via Excel
https://developers.sap.com/tutorials/abap-environment-abap-sql.html

If you do not want to do the SAP tutorial make sure you already have ODBC for ABAP Driver installed and a configured SQL Service.

For the following steps I strongly recommend creating a new folder and downloading everything inside it:

  1. You will need a Linux Cluster in Databricks with unixODBC and make sure you download the following files compatible with your Linux Cluster version;
  2. ODBC driver for ABAP, you need to download it from the SAP Support Portal https://support.sap.com/en/index.html
  3. Search for SAPCRYPTOLIB in the same link above, open COMMONCRYPTOLIB 8, and download the latest version for your Linux Version;
  4. You will also need SAPCAR to be able to unpack the SAP files downloaded
  5. Unpack the files downloaded with SAPCAR.exe with the following syntax:

C:\Users\D-number\Downloads\sapcar.exe -xvf C:\Users\D-number\Downloads\Your downloaded ODBCforABAP.SAR

Following the steps above you should have a folder with the following files:

Files%20from%20pre-requisite%20steps

Files after unpacking

Installing ODBC Driver in your Databricks Linux Cluster

*Create a notebook in Databricks to proceed.

  • Move the files you have unpacked to a directory inside Databricks where your cluster can read them – I stored mine in Databricks DBFS.
  • The main two files you need are ODBC_driver_for_ABAP.so and libsapcrypto.so -> Now let’s check if you have any dependencies missing in your Linux Cluster, run the following commands:

ldd <path to your stored files>/ODBC_driver_for_ABAP.so
ldd <path to your stored files>/libsapcrypto.so

  • From the image above you can see that I had one dependency missing in my Linux Cluster – fortunately, we already have this file since it comes together with the ODBC Driver.
    To resolve this issue I simply copied all .so” files to the exact location where the other files are -> /lib/x86_64-linux-gnu/
cp <path to your stored files>*.so /lib/x86_64-linux-gnu/
  • Now run the following shell command and note down the path of the User Data Sources:
%sh
odbcinst -j
  • Let’s now create the DSN using the ODBC Driver for ABAP – REMEMBER to change the highlighted pieces of the following command to where your files were stored in Databricks
%sh
cd /root
echo -e “[ABAPCLOUD]\n
Driver=/lib/x86_64-linux-gnu/ODBC_driver_for_ABAP.so\n
HOST=<yourSQLServicehost>.com\n
PORT=443\n
CLIENT=100\n
LANGUAGE=EN\n
SERVICEPATH=/sap/bc/sql/sql1/sap/S_PRIVILEGED\n
TrustAll=true\n
CryptoLibrary=/lib/x86_64-linux-gnu/libsapcrypto.so\n
UidType=alias\n
TypeMap=semantic” | sudo tee /root/.odbc.ini <- This is the path from step above
  • Set the Environment variable LD_LIBRARY_PATH to where your “.so” files are:

%sh
export LD_LIBRARY_PATH=/lib/x86_64-linux-gnu/

  • Test if the connection is working with the following command – the DSN ABAPCLOUD was created two steps above.
iusql ABAPCLOUD <user> <pass> -v

Connecting to SQL Service through pyODBC

Now that we have set up unixODBC, we can use pyODBC to connect with the SQL service.

  • Install pyODBC into your notebook and run the following command to check if you can query the SQL Service
conn_str = f’DRIVER={driver};CRYPTOLIBRARY={CryptoLibrary};HOST={host};PORT={port};UID={username};PWD={password};CLIENT={client};LANGUAGE={language};SERVICEPATH={SERVICEPATH};TrustAll=true;UidType=alias;Typemap=native’
conn = pyodbc.connect(conn_str)

  • We won’t have any output if it works and then we can start querying our SQL Service like the following example:

In the reference section, there are some SAP docs to have a better understanding of how to query the ODBC Driver for ABAP.

Conclusion

Due to OData limitation of 100 records per request, I believe that using the SQL service can be the right approach when you want to retrieve a large amount of data from ABAP Cloud – I have not found any intuitive tutorial on how to achieve that and after some hours diving deep in ODBC concepts and how to use drivers to connect with different Databases I came up with this solution, it may not be the optimal one but that’s what I have come up after reading all over the internet!

Reference

Assigned Tags

      2 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Mustafa Bensan
      Mustafa Bensan

      Hi Johann Schallenberger,

      I am curious to understand the business use case that led to this solution.  Can you elaborate a little on that?

      Thanks,

      Mustafa.

      Author's profile photo Johann Schallenberger
      Johann Schallenberger
      Blog Post Author

      Hello Mustafa Bensan

      Inside Databricks, you have a lot of functionalities to transform your data and deal with big data since you can use Python, Spark, SQL, etc.

      But I would like to also make it available to other people in a Data lake so that they can handle it as needed and are more acquainted with Python/Databricks than ABAP.

      Databricks also has its own Auto ML integrated with MLFlow so that's a plus as well.