Connect ABAP Cloud with Databricks via SQL Service
Have you ever considered consuming data directly from ABAP Cloud tables using SQL?
That was a challenge I faced recently, and in this post I want to walk you through the solution.
Prerequisites
First of all, I recommend following the tutorial provided by SAP so you can get acquainted with the SQL service and the ODBC driver for ABAP – as a bonus, you will be able to connect to your SQL service via Excel:
https://developers.sap.com/tutorials/abap-environment-abap-sql.html
If you prefer to skip the SAP tutorial, make sure you already have the ODBC driver for ABAP installed and a configured SQL service.
For the following steps I strongly recommend creating a new folder and downloading everything into it:
- You will need a Linux cluster in Databricks with unixODBC installed; make sure the files you download below are compatible with your cluster's Linux version.
- ODBC driver for ABAP – download it from the SAP Support Portal: https://support.sap.com/en/index.html
- Search for SAPCRYPTOLIB on the same portal, open COMMONCRYPTOLIB 8, and download the latest version for your Linux version.
- You will also need SAPCAR to be able to unpack the downloaded SAP archives.
- Unpack the downloaded files with SAPCAR using the following syntax (shown on Windows; a Linux variant follows this list):
C:\Users\<user>\Downloads\sapcar.exe -xvf C:\Users\<user>\Downloads\<your downloaded ODBC for ABAP archive>.SAR
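If you unpack directly on Linux instead, SAPCAR uses the same -xvf flags (this assumes you downloaded the Linux SAPCAR binary and made it executable):
./SAPCAR -xvf <your downloaded ODBC for ABAP archive>.SAR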
Following the steps above you should have a folder with the following files:
(Screenshot: files after unpacking)
Installing ODBC Driver in your Databricks Linux Cluster
Note: create a notebook in Databricks to proceed with the following steps.
- Move the files you have unpacked to a directory inside Databricks where your cluster can read them – I stored mine in Databricks DBFS (for example, with dbutils, as sketched below).
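A minimal sketch, assuming you uploaded the unpacked files to a hypothetical DBFS folder (adjust both paths to your setup):
# Copy the unpacked driver files from DBFS to the cluster's local filesystem
# so the operating system's loader can read them (paths are hypothetical)
dbutils.fs.cp("dbfs:/FileStore/odbc_abap/", "file:/tmp/odbc_abap/", recurse=True)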
- The two main files you need are ODBC_driver_for_ABAP.so and libsapcrypto.so. Now let's check whether any dependencies are missing on your Linux cluster – run the following commands:
ldd <path to your stored files>/ODBC_driver_for_ABAP.so
ldd <path to your stored files>/libsapcrypto.so
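A missing dependency shows up as "not found" in the ldd output, for example (library name illustrative only):
<some_dependency>.so => not found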
- In my case, the ldd output showed one dependency missing on the Linux cluster – fortunately, we already have this file, since it comes together with the ODBC driver.
To resolve this issue I simply copied all ".so" files to the exact location where the other system libraries are -> /lib/x86_64-linux-gnu/
cp <path to your stored files>/*.so /lib/x86_64-linux-gnu/
- Now run the following shell command and note down the path of the user data sources (USER DATA SOURCES):
%sh
odbcinst -j
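The output should look something like this (versions and paths vary by cluster; the line you need is USER DATA SOURCES):
unixODBC 2.3.x
DRIVERS............: /etc/odbcinst.ini
SYSTEM DATA SOURCES: /etc/odbc.ini
FILE DATA SOURCES..: /etc/ODBCDataSources
USER DATA SOURCES..: /root/.odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8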

- Let's now create the DSN using the ODBC driver for ABAP – REMEMBER to change the placeholders (host, client, service path, and file locations) to match your system and where your files are stored in Databricks, and to write the file to the user data sources path you noted in the step above:
%sh
cd /root
echo -e "[ABAPCLOUD]\nDriver=/lib/x86_64-linux-gnu/ODBC_driver_for_ABAP.so\nHOST=<yourSQLServicehost>.com\nPORT=443\nCLIENT=100\nLANGUAGE=EN\nSERVICEPATH=/sap/bc/sql/sql1/sap/S_PRIVILEGED\nTrustAll=true\nCryptoLibrary=/lib/x86_64-linux-gnu/libsapcrypto.so\nUidType=alias\nTypeMap=semantic" | sudo tee /root/.odbc.ini
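For readability, this is the .odbc.ini content that the command above writes:
[ABAPCLOUD]
Driver=/lib/x86_64-linux-gnu/ODBC_driver_for_ABAP.so
HOST=<yourSQLServicehost>.com
PORT=443
CLIENT=100
LANGUAGE=EN
SERVICEPATH=/sap/bc/sql/sql1/sap/S_PRIVILEGED
TrustAll=true
CryptoLibrary=/lib/x86_64-linux-gnu/libsapcrypto.so
UidType=alias
TypeMap=semantic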
- Set the environment variable LD_LIBRARY_PATH to the directory where your ".so" files are (note that a variable exported in one %sh cell does not persist to later cells, so export it in the same cell as any command that needs it):
%sh
export LD_LIBRARY_PATH=/lib/x86_64-linux-gnu/
- Test whether the connection is working with the following command – the DSN ABAPCLOUD is the one defined in the .odbc.ini two steps above.
iusql ABAPCLOUD <user> <pass> -v
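If everything is wired up correctly, iusql should answer with the standard unixODBC banner, something like:
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>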

Connecting to SQL Service through pyODBC
Now that we have set up unixODBC, we can use pyODBC to connect with the SQL service.
- Install pyODBC in your notebook (see the command below) and then run the following snippet to check whether you can query the SQL service:
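Installation can be done with the notebook pip magic:
%pip install pyodbc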
import pyodbc

# The placeholder variables (driver, host, username, etc.) are assumed to be
# defined earlier in the notebook with your system's values
conn_str = (
    f"DRIVER={driver};CRYPTOLIBRARY={CryptoLibrary};HOST={host};PORT={port};"
    f"UID={username};PWD={password};CLIENT={client};LANGUAGE={language};"
    f"SERVICEPATH={SERVICEPATH};TrustAll=true;UidType=alias;Typemap=native"
)
conn = pyodbc.connect(conn_str)
- We won't have any output if it works, and then we can start querying our SQL service, as in the example below:
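A minimal sketch of a query – the entity name here is hypothetical, so use one that your SQL service actually exposes:
cursor = conn.cursor()
# Hypothetical entity exposed through the SQL service's service path
cursor.execute('SELECT * FROM ZORDERS.ZI_SALESORDER')
for row in cursor.fetchmany(10):  # fetch a small sample
    print(row)
cursor.close()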
In the references section, there are some SAP docs that provide a better understanding of how to query through the ODBC driver for ABAP.
Conclusion
Due to the OData pagination limit (100 records per request in our environment), I believe the SQL service can be the right approach when you want to retrieve a large amount of data from ABAP Cloud. I had not found any intuitive tutorial on how to achieve this, and after some hours diving deep into ODBC concepts and how to use drivers to connect to different databases, I came up with this solution. It may not be the optimal one, but it is what I arrived at after reading all over the internet!
Hi Johann Schallenberger,
I am curious to understand the business use case that led to this solution. Can you elaborate a little on that?
Thanks,
Mustafa.
Hello Mustafa Bensan,
Inside Databricks you have a lot of functionality to transform your data and deal with big data, since you can use Python, Spark, SQL, etc.
But I also wanted to make the data available in a data lake to other people who are more acquainted with Python/Databricks than with ABAP, so they can handle it as needed.
Databricks also has its own AutoML integrated with MLflow, so that's a plus as well.