Step-by-step: Setup SDI for your HCP account (part 3 – testing things out)
In this last step of this series, we’ll show you how to test out SDI to load data from a file into HANA on HCP. Make sure you have finished the setup of the SDI agent in the previous steps. Once you’ve tested out this simple scenario, you can go back to the parent blog for links to resources to learn more about SDI: Smart Data Integration available for HCP
For this demo we will use a very simple file (data.csv) which looks like this:
FirstName,LastName,Country
Eden,Hazard,Belgium
Lionel,Messi,Argentina
Cristiano,Ronaldo,Portugal
In order for the SDI data provisioning agent to read this file, you will need a configuration file that describes this file. You can check the documentation on the SDI file adapter ( File – Administration Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality – SAP Library ) for more details on how to (automatically) create such a file, but a simple configuration file (data.cfg) for the data file above would be:
Fileformat to read sample data
FORMAT=CSV
FORCE_FILENAME_PATTERN=data%.csv
COLUMN_DELIMITER=,
ROW_DELIMITER=\r\n
SKIP_HEADER_LINES=1
COLUMN=FIRSTNAME;VARCHAR(50);First Name
COLUMN=LASTNAME;VARCHAR(50);Last name
COLUMN=COUNTRY;VARCHAR(50);Country
Note: the value (filename) you use for FORCE_FILENAME_PATTERN is case sensitive. So make sure your file name matches exactly. A wildcard (%) can be used to select multiple files that match the pattern.
We assume you have the configuration file (data.cfg) in a directory D:\Data\Config and the data file (data.csv) in D:\Data\Data.
In order to read these files, you need to configure the file adapter in the data provisioning agent. This is done through the configuration tool which we used to configure and connect the agent. First we need to set some properties for the file adapter:
- Go to the menu “File” – “Preferences”
- Navigate to the FileAdapter
- Set the values for the 3 properties:
- Root directory : directory where the data files are stored. We will use directory D:\Data\Data where the data.csv file is stored.
- File format root directory : directory where the configuration files are stored. We use a directory D:\Data\Config for this, with the data.cfg file above.
- Access token : a “password” that you define here in the preferences and will be needed later in the HANA Web IDE when you create the remote source to read the files via this agent.
- Click OK to store the values.
Back on the main screen in the configuration tool, you can select the FileAdapter in the list of adapters and click “Register Adapter”. Once this is completed, the file adapter on this agent will be visible in the HANA Web IDE to create your remote source.
So log in to the HANA WebIDE now (use the SDI_USER to make sure you have the required authorizations as set up in step 4) and open the Catalog editor (https://<your HANA server>.hana.ondemand.com/sap/hana/ide/catalog). Go the the “Provisioning” folder and create a “New Remote Source”.
Complete the required fields to define the remote source:
- Source Name: any name you like as name for the remote source, e.g. “Files”
- Adapter name: select the “FileAdapter” you registered in the previous step. Note: in the list of available adapters, you will see system adapters for database sources used by SDA (Smart Data Access), however these cannot be used on HCP. At the bottom of the list you will see adapters registered from the agents (these names will be in Title case as apposed to the system adapters which are all in UPPER case).
- Location and Agent Name: no change needed, when FileAdapter is selected, location will switch to “agent” (as apposed to indexserver), in case you would have multiple agents with the File Adapter registered, you can select the correct agent in the Agent Name dropdown box.
- Connection info properties: no change needed. Root directory and directory for the fileformat definitions (cfg files) will be picked up from the agent.
- Credentials – Access Token: Here you provide the same access token as what you used in the preferences on the agent. This is a security measure to make sure you have permission to access the files available through the selected agent.
Once you save the remote source, the connection will be tested, you should see a “Remote Source saved successfully” message.
Now you can browse the remote source by expanding the Files remote source in the browser and should see several tables. There are some default tables with technical metadata like CODEPAGES, FILECONTENT, … etc. But you will also see the “Data” table which corresponds to the Data.cfg file format we created earlier. Each cfg file format will result in an additional entry in this list.
In order to access the data in the file, we can now create a virtual table by right clicking on the “Data” entry and select “New Virtual Table”. Provide a name for the virtual table and the schema to create it in.
Finally you can go to the Catalog, browse to your schema and should see the newly created virtual table. You can run SQL against this table now, or just open the content in the Web IDE. Once you see the data you have successfully proofed the agent is able to connect to onPremise sources and send the data to HANA on HCP.
This concludes this blog series. Next you can continue your learning by really replicating the data via a replication task, so that you can schedule regular data loads, you can explore flowgrpahs to add additional transformation steps before storing the data and of course look into more exiting adapters like the (real-time) database adapters, Twitter, Odata, … etc. You will find links to more resources to continue your learning on the main page of this blog series: Smart Data Integration available for HCP.
Have fun !
Not able to create virtual table. Getting the below error.
(Catalog) (dberror) 476 - invalid remote object name: Passed integer value cannot be parsed from the virtual table metadata: line 0 col -1 (at pos 0)
Did you use the same .cfg file as in the example in the blog ? That one works fine for me...
The error message seems refers to the "virtual table metadata" which is the information we get from the cfg file.
I am getting the below error when using the file as is...
Fileformat to read sample data
FORMAT=CSV
FORCE_FILENAME_PATTERN=Data%.csv
COLUMN_DELIMITER=,
ROW_DELIMITER=\r\n
SKIP_HEADER_LINES=1
COLUMN=FIRSTNAME;VARCHAR(50);First Name
COLUMN=LASTNAME;VARCHAR(50);Last name
COLUMN=COUNTRY;VARCHAR(50);Country
9:10:47 AM (Catalog) (dberror) 476 - invalid remote object name: Unknown format string (CSV ): line 0 col -1 (at pos 0)
Looking at the error, it seems there are some trailing white spaces after "CSV" in the cfg file. Can you remove the white spaces at the end of the lines ? Looks like in my sample white spaces where added in the HTML page due to this text editor control I'm using. I'll try to find a solution to make sure a copy/paste from the blog does not include the additional white space... For now, please remove them manually in your cfg fle.
Please let me know if removing the white spaces manually fixes the issue. It's not only on the FORMAT line, but could be on other lines as well.
Exactly what you said I did the same and was able to create the VT.
FORMAT=CSV
FORCE_FILENAME_PATTERN=Data.csv
COLUMN_DELIMITER=,
ROW_DELIMITER=\r\n
SKIP_HEADER_LINES=1
COLUMN=FIRSTNAME;VARCHAR(50);FirstName
COLUMN=LASTNAME;VARCHAR(50);LastName
COLUMN=COUNTRY;VARCHAR(50);Country
But now when I try to open the content of the table I get :
(dberror) 403 - internal error: Error opening the cursor for the remote database Can't Find Data File : C:\Users\mypc\Documents\SDI\data\Data.csv, Please Make Sure Data File is exist and Configuration File Point to the Correct Data File
Am I missing something?
I assume the file C:\Users\mypc\Documents\SDI\data\Data.csv exisits on your system ? If the file exists and SDI can still not access it, it must be a permissions issue. C:\Users folder has special default permissions so that only the user who owns this folder "mypc" will be able to see he files. The SDI agent ("SAP HANA Data Provisioning Agent") runs as a windows service with a user you have defined during the install. Is this is a different user then the one who has access to C:\Users\mypc you will not be able to access the file.
A simple solution would be to use a folder like "C:\Data" where all users would have access.
Ben.
I did as you said. I have full access to my local folders. Not sure if there a level of trace further to figure out.
Error: (dberror) 403 - internal error: Error opening the cursor for the remote database Can't Find Data File : C:\SDI\data\Data.csv, Please Make Sure Data File is exist and Configuration File Point to the Correct Data File!
The framework.trc file in the log folder on the agent machine has more details. But I'm not sure it will be of much help in this case. The error message is pretty clear: the agent is not able to access the file at the path specified.
I'm afraid I'm out of ideas...
Thanks Anyway. I tried a whole lot but not sure what is wrong. I will try out other scenarios too.
Thanks for the amazing blog series to make the concept clear.
Hello Sridhar,
I had the same issue on Windows PC and was able to resolve: The filename is forced case sensitive (see the config file), so make sure you name your csv file exactly "Data.csv", not "data.csv" etc.
HTH...
Great Article and clear steps even though sap dpa agent version is higher than the article mentions ,Dated Oct 25th 2017 I have followed the steps and everything went smooth infact I used windows desktop with sap hana studio(eclipse) on windows Host system while my sap local install is actually on Suse Linux leap 42.1 (Guest) ,I could have done all the steps on guest VM ,but preferred windows desktop anyway
I would give 5 stars to the author for his great part 1 ,part 2 ,part 3 articles and SAP for providing the SAP HANA trial on Cloud is really helping people like me who are trying to learn SAP hana and acquire skills
Also ,I had been a consistent advocate of DB based stored procs ,functions etc with SAP Hana this is a reality with performance
Delighted to see Nodejs adopted into hana xs engine along with Apache Tomcat based Java ,C++ fast CGI etc
while it is a full circle of technical paths
Rama Anne
Hi Ben,
one question regarding automatic creation of config file:
Somehow nothing happens when I try to use the tool - there is no error message but also no generated config-file.
For simplification I have tested with the following csv-content:
c1;c2;c3
1;2;3
Lines ending with \r\n
And this is my call:
createfileformat.bat -file c:\Temp\SDITest\data.txt -cfgdir c:\Temp\SDITest -colDelimiter ;
Tried it both in standard and admin-mode.
Do you hava a clue what can be wrong here?
Thanks in advance,
Patrick
Patrick, I have observed the same when I was playing with the createfileformat.bat. I opened a bug for this and it's being worked on. No timeline on a fix yet, but once available, we will upload that version to tools.hana.ondemand.com (as well as Service Market Place obviously).
Thanks,
Ben.
Thanks Ben,
keep up the great work!
Cheers,
Patrick
Hi
I am getting the error while saving the remote source. I already given all the roles and privileges to SDI_USER. Please suggest
23:50:52 (Remote Source Editor) (dberror) 258 - insufficient privilege: Not authorized
Ok, Now the issue is resolved . After adding CREATE REMOTE SOURCE to system privilege it is working fine 🙂
Good point ! I have added this to the blog so that others don't run into the same authorization issue while testing out things.
Hi Ben,
I am getting the below error when i am trying to register the file adapter. Can you please let me known what could be the possible issue.
Thanks in advance
Vignesh J
Hi Vignesh,
I had the same error. I think it has to do with the port 8080 in the connection settings.
Uncheck the 'Use HTTP proxy' and specify the HANA port as 443. This fixed the issue for me.
-Benedict
Hi Ben,
I am facing following issue, could you please suggest me any fix for this.
Once file adapter configuration is done,while creating remote source the configured root directory path does't get reflected on initial screen of new remote source. Only by default root directory path gets reflected because of which i am getting below error.Tried changing it manually but it does't help.
FYR-Please find below attached screen shot.
Pankaj
In the agent configuration tool, you have set a root directory. The directory you specify in the remote source should be the same, or a sub-directory of that directory.
This is a security measure: in the agent onPremise you set the maximum scope you can access (e.g. C:\Data), when you define the remote source later from the cloud, you can get files from subdirectories (e.g. C:\Data\myData), but not from other locations (e.g. C:\ConfidentialData).
Thanks,
Ben.
Hi Ben,
Thanks for your input. I have done the same. But I am facing below issue while trying to create remote source.Any input would be a great help.
Pankaj
Pankaj, the root directory should be just the path (:\data) , not including the file name itself (data.csv).
Thanks,
Ben.
Hi Ben,
yeah sorry i changed it now, i was just trying all the possibilities.
I read your second last comment again and worked on it. Now it works perfectly fine. Thanks a lot for your input. I am able to load files from my local system to cloud account . Thanks
Pankaj
Hi Ben,
I'm abel to add remote source successfully.
However system is not fetching any object under the remote source .
attached is the error message.
Remote source and FileAdapter Access token are same. Still not able to connect .
Please advise if i'm missing anything here.
Thanks, Brijesh
Hi Ben,
especially during setup of realt time replication there is also a triggering part in the source Hana which is established. Once you drop a table from this Scenario or would like to disable SDI completely again what steps are required to clean up the Source Hana? Would SDI recognize the last table dropped and remove the triggering Tables/Components in the source Hana?
Hello Ben,
I don't see the "data" entry in de Remote Services. Any idea what I might be doing wrong?
Thanks, Bart Claesen
It worked. Thanks Ben.
I'm getting an error while configuring the file adapter in the data provisioning agent. Can you please help?
Thanks & Regards,
Sekhar
Make sure you launch the configuration tool as Administrator (right click and select "Run as Administrator"), you might have some file permissions issues on your Windows system when writing/reading from the properties file.
Also keep in mind that on Windows you cannot use mapped drives (G:/...) for a technical user. You need to use UNC paths like \\server\share\ in order for a technical service user to access your shared drives.
Can we access data from multiple on-premise systems simultaneously?
The dpagentconfigtool hangs if I go to Configure->Preferences->->Adapters. In the log I only see “Number of adpaters 21”. How can I fix that? (this concerns HANA_DP_AGENT_20_LIN_X86_64)
Hello, I'm trying to replicate a table in Hana by SDI, I did a replication task but at the moment of executing the procedure I get the following error:
I set Abap adapter.
Hi Ben,
When I open the configuration agent , and connect to HANA and then when I goto preferences and click on adapters, my configuration agent stops responding. Please see the screen shot below. Please help.
Regards,
Kapil Pokharna
Hi Kapil,
I am facing the same issue , were you able to get it working?
Regards,
Ramya
Hello Kapil,
I am too facing the same issue:
dpagentconfigtool is hanged when I open Configuration->preferences->adapters.
Is there any solution for this ?
Thanks.
Hi,
I am also facing the same issue. Could anyone please help us?
Regard,
Sudheer
I receive the following error when I try to create a new datasource based on a remote source of sftpadapter.