###### Technical Articles

# SAP HANA, graph option

# Creating Graph in SAP HANA

Today we will learn about the fundamentals of Graph data structure and how with the latest SAP HANA Advanced, graph option we can solve network related problems.

## What is a Graph?

A Graph is a non-linear data structure. A Graph has nodes containing information and properties of object we are trying to abstract and edges which are essentially connection between two nodes.

For example, a family hierarchy can be visualised as a Graph with persons and the family as nodes and relationships with other family members as Edges between different nodes.

Graphs are also used in social networks like LinkedIn, Facebook.

# Graph Data Definition

A graph contains vertices/ nodes and connection between nodes called Edges.

Below we will see a sample of how we create Edges, vertices and Graph workspace in SAP HANA.

## Visualising SAP HANA, graph

Once we have defined Edges, vertices and Graph, we can visualise the data as a Graph in SAP HANA.

Navigate to your database container.

Expand your database container.

Select graph workspace.

On selection you can see the above created graph workspace in the object list below.

Right click on graph workspace and select “View Graph”.

Once you open the Graph you can see all the vertices and edges between them.

The SAP HANA, graph options supports graph script – a programming model for developing custom graph algorithms.

Graph script provides some built-in functions and algorithms.

- Neighbors
- Strongly Connected Components
- Breadth First Search
- Shortest Path One-to-One
- Shortest Path One-to-All

Below is a sample implementation for *Neighbors* function.

```
*************************************/
-- GraphScript functions and algorithms - Neighbors function
-- running in version SAP HANA 2.0 SPS04
/*************************************/
/*************************************/
-- tables and workspace
DROP SCHEMA "GRAPHSCRIPT" CASCADE;
-- Temporary schema
CREATE SCHEMA "GRAPHSCRIPT";
--Vertex/Node table definition containing list of all nodes in the Graph
CREATE COLUMN TABLE "GRAPHSCRIPT"."NODES" (
"ID"BIGINTPRIMARYKEY
);
--Edges definition containing connections between vertex/ nodes.
CREATE COLUMN TABLE "GRAPHSCRIPT"."EDGES" (
"ID"BIGINTGENERATEDALWAYSASIDENTITYPRIMARYKEY,
"SOURCE"BIGINTREFERENCES"GRAPHSCRIPT"."NODES"("ID") ON DELETE CASCADENOTNULL,
"TARGET"BIGINTREFERENCES"GRAPHSCRIPT"."NODES"("ID") ON DELETE CASCADENOTNULL
);
--Sample data for Nodes and Edges Table
INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (1);
INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (2);
INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (3);
INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (4);
INSERT INTO "GRAPHSCRIPT"."NODES" VALUES (5);
INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (1, 2);
INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (1, 3);
INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (2, 3);
INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (2, 4);
INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (3, 4);
INSERT INTO "GRAPHSCRIPT"."EDGES"("SOURCE", "TARGET") VALUES (5, 4);
--Graph data structure definition specifying the vertex table and the edges table with Source and Target for every edge
CREATE GRAPH WORKSPACE "GRAPHSCRIPT"."GRAPHWS"
EDGE TABLE "GRAPHSCRIPT"."EDGES"
SOURCE COLUMN "SOURCE"
TARGET COLUMN "TARGET"
KEY COLUMN "ID"
VERTEX TABLE"GRAPHSCRIPT"."NODES"
KEY COLUMN "ID";
/*************************************/
-- NEIGHBORS procedure
--Type definition to be used in our Neighbors Procedure
CREATE TYPE "GRAPHSCRIPT"."TT_NODES_NEI" AS TABLE ("ID" BIGINT);
CREATE TYPE "GRAPHSCRIPT"."TT_EDGES_NEI" AS TABLE ("ID" BIGINT, "SOURCE" BIGINT, "TARGET" BIGINT);
CREATE OR REPLACE PROCEDURE "GRAPHSCRIPT"."GS_NEIGHBORS"(
IN i_startNode BIGINT, -- the ID of the start node
IN i_min BIGINT, -- the minimum hop distance
IN i_max BIGINT, -- the maximum hop distance
OUT o_nodes "GRAPHSCRIPT"."TT_NODES_NEI",
OUT o_nodesCount BIGINT,
OUT o_edges "GRAPHSCRIPT"."TT_EDGES_NEI"
)
LANGUAGE GRAPH READS SQL DATA AS
BEGIN
-- create an instance of the graph, refering to the graph workspace object
GRAPH g = Graph("GRAPHSCRIPT", "GRAPHWS");
-- create an instance of the start node
VERTEX v_start = Vertex(:g, :i_startNode);
-- create a multiset of all neighbor nodes of the start node
MULTISET<Vertex> m_neighbors = Neighbors(:g, :v_start, :i_min, :i_max);
-- project the result from the multiset
o_nodes = SELECT :v."ID" FOREACH v IN :m_neighbors;
o_nodesCount = COUNT(:m_neighbors);
-- create a vertex induced subgraph to get all edges between the nodes in the neighbors multiset
GRAPH g_sub = SubGraph(:g, :m_neighbors);
o_edges = SELECT :e."ID", :e."SOURCE", :e."TARGET" FOREACH e IN Edges(:g_sub);
END;
CALL "GRAPHSCRIPT"."GS_NEIGHBORS"(i_startNode => 1, i_min => 0, i_max => 1000, o_nodes => ?, o_nodesCount => ?, o_edges => ?);
```

Using the script above we have learned how to define the nodes and Edges table, define Graph using SAP HANA, graph options and create procedures to perform graph specific operations.

Do note, that while defining a graph edges and vertex table, the vertex Id has to be only primary key of the Vertex table and only the vertex Id can be referenced in edges table for source and target columns and also while defining graphs.

Hope this blog helped you learn how SAP HANA enables us to use one of the most complex and useful data structure and come up with cutting Edge analysis and breakdown of problems exhibiting network like behaviour.

Thanks for reading! For more information on SAP HANA, graph options, I will be publishing further blogs containing details about more complex operations that we can perform in SAP HANA.