Skip to Content
Author's profile photo Sean Holland

How smart is Smart Data Access on HANA SP7?

Background

We feel SDA has great potential in a number of use cases, however we wanted to see how it would perform when combining local and remote data sources, which is key for any practical use. For this particular blog, I’m using both HANA for Local(BW) & Remote(Sidecar) sandboxes that are both on revision 72, both sitting on the same physical rack but on different servers. I may follow on with a Local HANA & a remote Oracle db if I folks are interested and we can get the drivers to work for connecting to an 11g db.

Setting up SDA

Not going to go over the actual setup in this blog, as it’s been well documented on saphana and in other blogs; some I’ve noted below..

Smart Data Access: Connecting multiple SAP HANA… | SAP HANA

Smart Data Access with HADOOP  HIVE  & IMPALA

To Confirm:

(Local & Remote) are both HANA Rev72


select database_name as remote_db, substr(version,0,7) as remote_db_version from sys.m_database



local_db_revision.PNG || db_revision.PNG

Demo Objective:

Find the Equipment Status from the “CV_EQUIP_GEN_NEW_STATUS” calculation view on HANA Sidecar HT2 (contains SLT replicated tables from ECC) for a small range of Equipment Numbers from the Equipment Master table “/BI0/QEQUIPMENT” in BS1 (BW on HANA).

Step 1: Add Virtual Table

Navigate to _SYS_BIC (where all activated content resides)

Provisioning.PNG

Scroll to find the desired view, right click and Add as Virtual Table.

Virtual_Table.PNG

Create Virtual Table in a local Schema we created for SDA Testing.

SaveVirtualTable.PNG

Let’s execute some queries..

Query 1

Query on BS1 to remote calc view through virtual table with some filters added.


select * from "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" r_cv
where r_cv.equnr like '6XR39%' -- equip range
and r_cv.datbi = '99991231' -- active records only



What’s being Passed to the Remote db? You can check using..

  1. Highlighting the Select statement and doing a Explain Plan or Visualize Plan, I’d recommend to a Explain Plan minimum just in case you don’t get the remote query you expect, e.g. In this case we want to avoid materializing the entire calc view on the remote db and bringing back the entire result set.
  2. For this demo and what I believe is the most accurate way is to query SQL Cache on the remote database to find the actual sql statement being executed.

There’s a lot more fields of interest in m_sql_plan_cache, just selecting a couple of fields for illustration.


select STATEMENT_STRING,TOTAL_CURSOR_DURATION,TOTAL_EXECUTION_TIME,TOTAL_EXECUTION_FETCH_TIME,
TOTAL_PREPARATION_TIME,TOTAL_RESULT_RECORD_COUNT
from m_sql_plan_cache
where upper(statement_string) like '%CV_EQUIP_GEN_NEW_STATUS%'
and user_name = 'SMART_BS1' -- User name from the SDA connection details
order by last_execution_timestamp desc




I’ll execute each statement 3 times so we can average these by dividing by 3.

(Note the equipment range record count is 10,224/3 = 3408)

Qry1_results.PNG

This is confirmed by BS1 results window


Statement ‘select * from “SDA_TEST”.”Zhana_CV_EQUIP_GEN_NEW_STATUS” r_cv where r_cv.equnr like ‘6XR39%’ — …’

successfully executed in 481 ms 263 µs  (server processing time: 352 ms 300 µs)

Fetched 3408 row(s) in 2.048 seconds (server processing time: 9 ms 802 µs)

So we can see that the filtering is being successfully passed to the remote query as expected. 🙂

Qry1_SQL_String.PNG

Query 2

Query same range, but this time apply filters to local BW equipment master table and join local to remote table in query.


select le.equipment, re.stat
from sapbs1."/BI0/QEQUIPMENT" le, "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" re
where le.equipment like '6XR39%'
and le.equipment = re.equnr -- inner join on equipment
and le.dateto = '99991231' -- active records only
and re.datbi = '99991231' -- active records only




SQL Plan cache results

Qry2_results.PNG

Statement String,

Qry2_SQL_String.PNG

SDA was smart enough to apply the equipment range filter to the remote calc view, even though I had applied that filter to the local db table. 🙂

Query 3

(2 local tables & remote join to Calc View)

Create a local column table and insert the same equipment numbers range as previously used.


create column table sda_test.temp_equip_list (EQUIPMENT NVARCHAR(18) );
insert into sda_test.temp_equip_list
select equipment from sapbs1."/BI0/QEQUIPMENT" where equipment like '6XR39%';





Now include the new column table in the same query.


select lt.equipment, re.stat
from sda_test.temp_equip_list lt, sapbs1."/BI0/QEQUIPMENT" le, "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" re
where lt.equipment = le.equipment
and le.equipment = re.equnr
and le.dateto = '99991231'
and re.datbi = '99991231'





I was a bit weary of this one, so let’s check the explain plan ahead of executing.

Qry3_ExplainPlan.PNG

That’s not good, the remote query has lost the crucial equipment filter and is now only filtering out the in-active records. I would have expected some remote cache or remote join type in this scenario. Let’s let it roll to confirm..

Query hangs and eventually after a couple of cancel attempts, gives a rollback error on the local db.

Checking the remote to check the damage..

Qry3_results.PNG

Ugghh, 🙁 not good at all. it bombed after 56million rows, the remote calculation view can return up to 80 million active records. The sql statement string reflects the same as the explain plan above.

Query 4

Let’s simplify this a bit, let’s just see if having one local table and one remote table helps the remote cache and remote query join kick in. So created a new virtual table called Zhana_EQBS (4 million rows x 12 columns) and join to local sda_test.temp_equip_list (50K rows x 1 column)


select lt.equipment, re.b_werk plant, re.b_lager sloc
from sda_test.temp_equip_list lt, "SDA_TEST"."Zhana_EQBS" re
where lt.equipment = re.equnr


While the response time is quite impressive, all 4 million rows are still fetched from the remote query

(Note, again executed same query 3 times)

Qry4_SQL.PNG

Qry4_SQL_String.PNG

Observations in Summary:

  • So based on the results above, we have to be quite selective on how we use SDA as it comes out of the box in rev 72. It has potential and there may be improvements on the way.
  • I’ve been able to create a calculation view based on a virtual table and expose it to BW through an external view, but I haven’t been able to create a Attribute or Analytical view based on virtual tables as all virtual tables get IS_COLUMN attribute set to false.
  • There’s a bug in Studio (again rev 72) where you can’t filter on the remote schema/table. It only brings up the local tables, which is quite painful if you have a lot of activated content in _SYS_BIC for example.
  • There are parameters available under configuration that may be able to help, so I’m going to open a message and engage SAP to see if we can improve the efficiency on how smart data access is suppose to work. I’m not sure if statistics have a part to play in the query optimization, but again hoping SAP folks can help out here.

Assigned Tags

      9 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Former Member
      Former Member

      Sean - excellent information.  Thanks for sharing!

      Author's profile photo Sean Holland
      Sean Holland
      Blog Post Author

      Thanks Ryan, glad you found it the information useful.

      Author's profile photo John Appleby
      John Appleby

      Nice first blog. The SDA team are charged with improving this so I'm sure they will be responsive.

      However you can work around this right, by avoiding to use Calc Views in the remote system and consuming the tables correctly? Although for BW that's not ideal.

      Author's profile photo Sean Holland
      Sean Holland
      Blog Post Author

      Thanks John,

      Query 4 is just that, one small local table and one remote table, apologies if this wasn't very clearly illustrated.

      What I had hoped for and I have played around with the smart data access config parameters post the analysis above, was some intelligence on when and where it was going to be most efficient to join the local & remote data sets.

      For Query 4, I would have hoped the local 50K rows would have been transferred to the remote db, stored in some type of temporary table and then joined with the remote much larger table EQBS table. Of course this could be done through manual coding, but the label does say "Smart Data Access"...

      Author's profile photo Balaji Krishna
      Balaji Krishna

      Thanks for a great blog Sean, interesting to see that you have selected the BWoH and HANA native as one of the first scenario to test SDA. As you mentioned in the comment above within SDA we have techniques like 'join relocation' and 'semi-join reduction' which are supposed to do exactly this, if the local dataset is much smaller compared to the remote dataset, then move the local data to remote DB perform the join there and then bring back the result dataset to HANA. this is supported on most Remote Sources with some exceptions. we will review the behavior internally to see if these are known issues and then work towards fixing them.

      Author's profile photo Sean Holland
      Sean Holland
      Blog Post Author

      Thanks Balaji,

      We're working on setting up a Oracle remote source also, however the documentation suggests on 12c being supported, so we need to see if we can work around this for an 11g db.

      The 'join relocation' for example is a hidden parameter, so it's a little confusing if this remote join/cache is really enabled out of the box for rev 72 or is there some additional setup required? Any guidance you can provide would be most appreciated, should I go ahead and open up a message in the meantime for this?

      Thanks,

      Sean

      Author's profile photo Balaji Krishna
      Balaji Krishna

      remote join is enabled by default on most sources, please go ahead and log a message so that we can have a Developer look into the issue.

      Oracle11g should work since most of the DB connectivity is downward compatible.

      Author's profile photo Sean Holland
      Sean Holland
      Blog Post Author

      hi Balaji, we did open a new incident for help with connecting to an Oracle 11g db. After a couple of weeks back and forth, the last update from SAP was to indicate only Oracle 12c was supported. Have you seen other customers set-up remote data sources for Oracle dbs with a lower db rev to Oracle 12c?

      10.07.2014 - 18:22:44 GMTUK - Reply by SAP

      Oracle 12c is the only supported version of Oracle.  Please refer to

      the following note

      1868209 - SAP HANA Smart Data Access: Central Note


      09.07.2014 - 01:23:08 GMTUK - Info for SAP by

      We did try to us the .odbc.ini and specify DSN, however this did not

      work

      for 11.2.0.3 DB. Can you confirm if its possible, and if so can you

      provide

      instructions?

      Author's profile photo John Appleby
      John Appleby

      Yes Oracle 11g is not supported, which is ironic because Oracle 12c is not supported for SAP applications.

      I don't see why you couldn't use the generic ODBC connector for Oracle 11g though. Did you try that? We found configuring SDA for Oracle was fiddly, because of Oracle. Are you sure this is an Oracle 11g problem and not just a SDA misconfiguration?