Understanding Referential Joins..
In SAP HANA there are 2 new type of db joins in addition to 4 standard db joins, they are “Referential Join” and “Text Join”. Knowing how and when to use Referential Join will be useful when there are complex data models in works.
This post explains Referential Join topics in the below order:
- Referential Integrity with an Example.
- Referential Join.
- Examples with different query scenarios.
- When to use Referential Join?
Referential Integrity with an Example.
With the relationship of “Primary Key” and “Foreign Key” between multiple tables, the meaning of the data is complete with all dependencies established by the relationship. If primary key record is deleted or edited then the meaning might no more be same as data in secondary tables might be separated without any dependency.
Any foreign key is expected as a primary key in the referenced table this is referential integrity.
B_ID column in Books table is primary key and foreign key in Publisher and Author table. With this established dependency the Book details will also get Author and Publisher names (data completeness ensured).
In a scenario where a record from Books table is deleted, the data in Publisher and Author will not make any sense on their own as they are linked with B_ID. By ensuring referential integrity the record in Books table cannot be deleted without first deleting the related records in Publisher and Author tables (this can be done with cascade delete also). In a similar way any primary key record is expected to have a foreign key record to ensure the completeness of the meaning of the data.
Referential Join works like Inner Join with the Referential Integrity of the data ensured and also behaves differently in different query scenarios
Examples with different query scenarios.
Note: The scenarios mentioned below are with respect to SAP HANA Calc Views
Join: Join Books.B_ID = Publisher.B_ID;
Scenario 1: Select * from Join;
In this case the referential join behaves like an inner join and gives out the common records
Result: B_ID: 101,102,103 with TITLE, P_NAME
Scenario 2: Select B_ID, TITLE from Join;
In this case data from right table is not queried for, therefore the join itself will not be executed (in calc views) which improves the performance of calc engine. We can say that in this scenario the referential join behaves like join from the Books table.
When to use Referential Join?
Use Referential join when the referential integrity of data is ensured. If the referential integrity is not ensured then the Join might not return expected result.
If a Star join is being built, by default the join choice is Referential. This is because the star join will contain the data from different dimensional tables and fact table. Even though it is not mandatory to have this join, it is preferred.
Referential Join is built on Inner Join, and ensures Referential Integrity. It is optimised to use the calc engine in best way possible like in scenario 2 above.
Nice blog explaining the basics of Referential Join.
One comment from my side,
For Scenario2 : the Referential Join will not be executed if the cardinality is n:1 but if the cardinality on right table is not :1, the join will always be executed even if no column from right table is requested.