HANA Connect to Hive by EIM Option
Note: all the below steps are done by HANA SYSTEM user. If you use other HANA user, you may grant the roles/privilege if needed.
- Enable Data Provisioning Server
- Import Data Provisioning Delivery Unit
Download HANA SDI 1.0 and install Data Provisioning Delivery Unit. You could use the UI in this Delivery Unit to monitor the replication task status later.
- Grant Data Provisioning Monitor privilege to SYSTEM
Log on HANA using SYSTEM and execute:
CALL GRANT_ACTIVATED_ROLE('sap.hana.uis.db::SITE_DESIGNER','SYSTEM'); CALL GRANT_ACTIVATED_ROLE('sap.hana.im.dp.monitor.roles::Monitoring','SYSTEM');
- Test DP monitor cockpit
Then you should access the UI of the Data Provisioning Delivery Unit:
- Install Data Provisioning Agent in your Hadoop server
Note: you may install the agent in a separate Linux server.
Log on Linux server using root user. Add one new OS user name with dpagent.
Extract the installation file in /home/dpagent/installation:
./SAPCAR_617-80000935.EXE -xvf ./IMDB_DPAGENT100_01_0.sar
Install Data Provisioning agent:
cd /home/dpagent/installation/HANA_DP_AGENT_20_LIN_X86_64 ./hdbinst --silent --batch --path="/usr/sap/dpagent" --user_id=dpagent
–path: specify the directory of dpagent installation.
–user_id: specify use which user to run the dpagent service
- Start and Connect DPAgent to SAP HANA
Use root user:
cd /usr/sap/dpagent vi ./dpagentconfig.ini
Modify the following configurations:
agent.name=mydpagent hana.port=30015 hana.server=vhhan110
nohup ./dpagent &
- Register DPAgent
Start Xming on your terminal server.
Log in to RedHat running Data Provisioning Agent use root user:
cd /usr/sap/dpagent/configTool ./dpagentconfigtool
Now you should see below dialog to configure Data Provisioning Agent:
Click ‘Connect to HANA’.
HANA database user: SYSTEM
HANA database user password: password_of_SYSTEM
Then you should connect to HANA:
Click Register Agent, and input:
Agent Name: mydpagent (you could use your own name instead of this)
Agent Hostname: vcRHL660 (you should use your server name to instead of this)
Click Register, then your agent should be registered successfully:
Now we need to register the HiveAdapter distributed by SAP.
Select HiveAdapter and click ‘Register Adapter’:
The ‘Registered with HANA’ should be ‘Yes’:
- Copy Hive JDBC to /usr/sap/dpagent/lib/
Now let’s configure the HiveAdapter.
Download Hive 0.13.1 from the below link:
Note: please download one version of 0.12.0, 0.13.0 or 0.13.1.
Use root user to log on the Linux Server running Data Provisioning Agent, copy the jar files into /usr/sap/dpagent/lib/
- Create Remote Source using ‘SYSTEM’
Use SYSTEM log on HANA in HANA studio and create one new remote source to point to the Hive through your Data Provisioning Agent:
Please input the information according to your own environment:
Source Name: HIVE_EIM (you could input your own source name)
Adapter Name: HiveAdapter
Source Location: mydpagent (your agent configured in the above)
Hive Version: 0.13.1
Host: your own Hive server name
Port: 10000 (Port of Hive)
Database Name: test (your own Hive database name)
Use SSL: false
Credentials Mode: Technical User
Then please save and activate it. Now you should see your HIVE_EIM under Remote Sources:
You can create virtual table based on your remote table:
Data Preview from the virtual table:
- Grant Role and Privilege to ‘SYSTEM’
If you would like to create Replication Task in HANA, your should grant role and privilege as below:
Use SYSTEM log in SAP HANA:
Grant ‘sap.hana.xs.ide.roles::EditorDeveloper’ to SYSTEM user:
Grant SELECT,CREATE on MYHIVETEST(owner is SYSTEM) to _SYS_REPO:
GRANT SELECT,CREATE ANY ON SCHEMA MYHIVETEST TO _SYS_REPO WITH GRANT OPTION;
- Create Replication Task
log in using SYSTEM and create Replication Task:
Input your Replication Task Name, such like ‘HIVE2HANA’. Then please fill the information as below:
When saved and activated successfully, please log on HANA in HANA studio using ‘SYSTEM’ user. You should find the below objects created by HANA automantically:
HANA will create one virtual table points to your remote Hive table, you could review the data from this virtual table. One destination table is also created by HANA, now it’s empty. You could execute the procedure created by your replication task to load remote Hive data into your local HANA table. Double click the procedure to review its code:
Then open one SQL console and execute the procedure:
After it’s executed successfully, then you could preview the data in your local table.
- Monitoring the replication tasks
You could also monitor the Data Provisioning Agent and replication tasks in the UI of Data Provisioning Delivery Unit you imported in the above.
Please open this link to monitor the Data Provisioning Agent status:
Open this link to monitor the replication tasks status: