Creating Graph in SAP HANA

kayur_goyal · 04-03-2020

Today we will learn the fundamentals of the graph data structure and how the graph option in the latest SAP HANA lets us solve network-related problems.

What is a Graph?

A graph is a non-linear data structure. It consists of nodes, which hold the information and properties of the objects we are trying to model, and edges, each of which is a connection between two nodes.

For example, a family hierarchy can be visualised as a graph in which the family members are the nodes and the relationships between them are the edges.
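To make the family example concrete, here is a minimal sketch in plain Python (outside SAP HANA) of a graph stored as a set of nodes with properties and a list of edges. The names, properties and relationship labels are hypothetical and invented purely for illustration.

```python
# Hypothetical family data, invented for illustration only.
nodes = {
    "Asha": {"kind": "person", "born": 1960},
    "Ravi": {"kind": "person", "born": 1985},
    "Meera": {"kind": "person", "born": 1988},
}

# Each edge is a (source, target, relationship) triple.
edges = [
    ("Asha", "Ravi", "parent_of"),
    ("Asha", "Meera", "parent_of"),
    ("Ravi", "Meera", "sibling_of"),
]


def outgoing(node):
    """Return (target, relationship) pairs for all edges that start at node."""
    return [(target, rel) for source, target, rel in edges if source == node]


print(outgoing("Asha"))  # [('Ravi', 'parent_of'), ('Meera', 'parent_of')]
```

The same split, one collection for nodes and one for edges, is what the vertex table and edge table express in SAP HANA.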

Graphs are also used in social networks such as LinkedIn and Facebook.

Graph Data Definition

A graph contains vertices (also called nodes) and connections between those vertices, called edges.

Below we will see a sample of how to create the edge table, the vertex table and the graph workspace in SAP HANA.

Visualising SAP HANA, graph

Once we have defined the edges, the vertices and the graph workspace, we can visualise the data as a graph in SAP HANA.

Navigate to your database container.

Expand your database container.

Select graph workspace.

Once it is selected, the graph workspace created by the script in this blog appears in the object list below.

Right-click the graph workspace and select "View Graph".

Once you open the Graph you can see all the vertices and edges between them.

The SAP HANA graph option supports GraphScript, a programming model for developing custom graph algorithms.

GraphScript provides some built-in functions and algorithms:

Neighbors

Strongly Connected Components

Breadth First Search

Shortest Path One-to-One

Shortest Path One-to-All

Below is a sample implementation of the Neighbors function.

/*************************************/

-- GraphScript functions and algorithms - Neighbors function

-- running in version SAP HANA 2.0 SPS04

/*************************************/

/*************************************/

-- tables and workspace

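-- Drop any previous copy of the demo schema (this statement fails on the first run, when the schema does not exist yet; skip it then)

DROP SCHEMA "GRAPHSCRIPT" CASCADE;

-- Temporary schema

CREATE SCHEMA "GRAPHSCRIPT";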

--Vertex/Node table definition containing list of all nodes in the Graph

CREATE COLUMN TABLE "GRAPHSCRIPT"."NODES" (

"ID" BIGINT PRIMARY KEY

);

--Edge table definition containing the connections between vertices/nodes

CREATE COLUMN TABLE "GRAPHSCRIPT"."EDGES" (

"ID" BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,

"SOURCE" BIGINT REFERENCES "GRAPHSCRIPT"."NODES"("ID") ON DELETE CASCADE NOT NULL,

"TARGET" BIGINT REFERENCES "GRAPHSCRIPT"."NODES"("ID") ON DELETE CASCADE NOT NULL

);

--Sample data for Nodes and Edges Table 

INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (1);

INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (2);

INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (3);

INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (4);

INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (5);

INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (1, 2);

INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (1, 3);

INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (2, 3);

INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (2, 4);

INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (3, 4);

INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (5, 4);

--Graph data structure definition specifying the vertex table and the edges table with Source and Target for every edge

CREATE GRAPH WORKSPACE "GRAPHSCRIPT"."GRAPHWS"

EDGE TABLE "GRAPHSCRIPT"."EDGES"

SOURCE COLUMN "SOURCE"

TARGET COLUMN "TARGET"

KEY COLUMN "ID"

VERTEX TABLE "GRAPHSCRIPT"."NODES"

KEY COLUMN "ID";

/*************************************/

-- NEIGHBORS procedure

--Type definition to be used in our Neighbors Procedure

CREATE TYPE "GRAPHSCRIPT"."TT_NODES_NEI" AS TABLE ("ID" BIGINT);

CREATE TYPE "GRAPHSCRIPT"."TT_EDGES_NEI" AS TABLE ("ID" BIGINT, "SOURCE" BIGINT, "TARGET" BIGINT);

CREATE OR REPLACE PROCEDURE "GRAPHSCRIPT"."GS_NEIGHBORS"(

IN i_startNode BIGINT, -- the ID of the start node

IN i_min BIGINT, -- the minimum hop distance

IN i_max BIGINT, -- the maximum hop distance

OUT o_nodes "GRAPHSCRIPT"."TT_NODES_NEI",

OUT o_nodesCount BIGINT,

OUT o_edges "GRAPHSCRIPT"."TT_EDGES_NEI"

)

LANGUAGE GRAPH READS SQL DATA AS

BEGIN

-- create an instance of the graph, referring to the graph workspace object

GRAPH g = Graph("GRAPHSCRIPT", "GRAPHWS");

-- create an instance of the start node

VERTEX v_start = Vertex(:g, :i_startNode);

-- create a multiset of all neighbor nodes of the start node

MULTISET<Vertex> m_neighbors = Neighbors(:g, :v_start, :i_min, :i_max);

-- project the result from the multiset

o_nodes = SELECT :v."ID" FOREACH v IN :m_neighbors;

o_nodesCount = COUNT(:m_neighbors);

-- create a vertex induced subgraph to get all edges between the nodes in the neighbors multiset

GRAPH g_sub = SubGraph(:g, :m_neighbors);

o_edges = SELECT :e."ID", :e."SOURCE", :e."TARGET" FOREACH e IN Edges(:g_sub);

END;



CALL "GRAPHSCRIPT"."GS_NEIGHBORS"(i_startNode => 1, i_min => 0, i_max => 1000, o_nodes => ?, o_nodesCount => ?, o_edges => ?);
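To reason about what a call like this asks for, the following plain Python sketch (run outside SAP HANA) mirrors the logic of the procedure on the same sample data: collect the vertices within a hop-distance range of the start vertex, then keep the edges whose two endpoints are both in that set. It is not the implementation of the built-in function. Two things are assumptions of this sketch rather than confirmed behaviour: that edges are followed only from SOURCE to TARGET, and that the start vertex itself counts as hop distance 0.

```python
from collections import deque

# Same sample data as the NODES and EDGES tables in the script.
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (5, 4)]  # (SOURCE, TARGET)


def neighbors(start, min_hops, max_hops):
    """Vertices whose shortest hop distance from start is in [min_hops, max_hops].

    Assumption: edges are followed from SOURCE to TARGET only.
    """
    adjacency = {n: [] for n in NODES}
    for source, target in EDGES:
        adjacency[source].append(target)

    distance = {start: 0}  # breadth-first search records shortest hop counts
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the upper bound
        for nxt in adjacency[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return sorted(n for n, d in distance.items() if min_hops <= d <= max_hops)


def induced_edges(vertices):
    """Edges whose source and target both belong to the given vertex set."""
    keep = set(vertices)
    return [(s, t) for s, t in EDGES if s in keep and t in keep]


found = neighbors(1, 0, 1000)
print(found)                 # [1, 2, 3, 4]
print(len(found))            # 4
print(induced_edges(found))  # [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
```

Under these assumptions node 5 is not returned for start node 1, because its only edge points from 5 to 4 and cannot be reached by following edges forward from 1. Narrowing the range, for example `neighbors(1, 1, 1)`, keeps only the direct successors 2 and 3.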

Using the script above, we have learned how to define the node and edge tables, define a graph workspace with the SAP HANA graph option, and create procedures that perform graph-specific operations.

Note that when defining the vertex and edge tables of a graph, the vertex ID must be the only primary key of the vertex table. Only this vertex ID may be referenced by the source and target columns of the edge table, and likewise in the graph workspace definition.
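As an illustration of this rule, here is a small Python sketch (not part of SAP HANA) that checks the same conditions on in-memory data before it is loaded: vertex IDs are unique, and every edge has a non-null source and target that refer to an existing vertex ID. The function name `check_graph_tables` and the shape of the input are hypothetical.

```python
def check_graph_tables(vertex_ids, edges):
    """Return a list of problems; an empty list means the data is consistent.

    vertex_ids: list of vertex keys
    edges: list of (edge_id, source, target) tuples
    """
    problems = []
    if len(set(vertex_ids)) != len(vertex_ids):
        problems.append("vertex IDs are not unique")
    known = set(vertex_ids)
    for edge_id, source, target in edges:
        for role, value in (("SOURCE", source), ("TARGET", target)):
            if value is None:
                problems.append(f"edge {edge_id}: {role} is NULL")
            elif value not in known:
                problems.append(f"edge {edge_id}: {role} {value} is not a vertex ID")
    return problems


sample_edges = [(1, 1, 2), (2, 1, 3), (3, 2, 3), (4, 2, 4), (5, 3, 4), (6, 5, 4)]
print(check_graph_tables([1, 2, 3, 4, 5], sample_edges))  # []
print(check_graph_tables([1, 2, 3, 4, 5], [(7, 5, 9)]))
# ['edge 7: TARGET 9 is not a vertex ID']
```

In the SQL script these checks are enforced by the database itself, through the PRIMARY KEY, REFERENCES and NOT NULL constraints on the two tables.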

I hope this blog helped you learn how SAP HANA lets us use one of the most complex and useful data structures to produce cutting-edge analysis and break down problems that exhibit network-like behaviour.

Thanks for reading! I will be publishing further blogs on the SAP HANA graph option, with details about more complex operations that we can perform in SAP HANA.
