Skip to Content

SAP HANA Application Example : Personal Spending Analysis – HANA Models

All the HANA modeling details are included in the free online eBook, which can be downloaded from this link.
About the project
Some credit card companies provide customers with tools to download and analyze their transaction history. Discover credit card provides the customers a detailed analysis tool in its website (www.discovercard.com). It allows users to review the transaction history and analyze the customers’ history spending.  SAP HANA provides you a data computing platform.   We want to use the personal sample data This example is building on the similar data structure as discover credit card web site to show how to build a similar personal spending analysis tool. By following the steps, you will build your own personal spend analysis tool using SAP HANA and Java open source solutions. With your own Amazon Web Service account, you can download your transaction history data and follow this example to analyze it and you can expand it to do your own analysis.
Application Architecture
This application has seven components. (We have multiple implementations, such as BI, Java, or SAP HANA XS.  The UI components won’t be necessary exactly same, but will be similar.) To get the data and show the values in UI, each component will call a query against a HANA model, which can be the analytic view or calculation view.  In this section, we categorize these components into two groups: query during loading and event driven query. The first category will call queries when this report is loading. The second type UI components will call queries when some events are triggered.
Based on the HANA models, we have multiple way to show the result.  The following diagram shows the UI implemented by HTML5 + Ajax + REST Web Service going through JDBC connector.  More details can be found from this blog.  You can also have a similar UI by using BusinessObject Dashboard.  More details can be found in this blog. 
/wp-content/uploads/2012/10/diagram_150479.png
When to call query? Notes
Query Call During Loading

Component 1

(Transaction List)

SELECT ID, DATE_SQL AS TRAN_DATE, POST_DATE, SUM(AMOUNT) AS AMOUNT, CATEGORY_TEXT AS
CATEGORY, DESCRIPTION FROM \”_SYS_BIC\”.\”psa/ANA_TRANSACTION\” GROUP BY ID, DATE_SQL,
POST_DATE, CATEGORY_TEXT, DESCRIPTION ORDER BY ID;

Component 2

(Pie Chart)

SELECT SUM(AMOUNT) AS AMOUNT, CATEGORY_TEXT AS CATEGORY

FROM \”_SYS_BIC\”.\”psa/ANA_TRANSACTION\” GROUP BY CATEGORY_TEXT ORDER BY CATEGORY_TEXT

Component 3

(Static Transaction Report – 3, 6, and 12 months)

SELECT SUM(AMOUNT) AS AMOUNT, RANGE, OUTPUT_TYPE

FROM \”_SYS_BIC\”.\”psa/CAL_GET_AMOUNT_REPORT\” GROUP BY RANGE, OUTPUT_TYPE

Component 5

(Classification by Transactions)

SELECT COUNT(ID) AS ID, AVG(AMOUNT) AS AMOUNT, CLUSTER_ID

FROM \”_SYS_BIC\”.\”psa/CAL_GROUP_TRANSACTION\” GROUP BY CLUSTER_ID ORDER BY CLUSTER_ID

Event Driven Query Call

Component 4

(Line Chart)

Radio button is selected

Three radio buttons are configured to link with different queries.  Each radio button is linking to one query.
Query 1: Get Monthly SUM
SELECT SUM(AMOUNT) AS AMOUNT, MONTH_INT AS MONTH FROM \”_SYS_BIC\”.\”psa/ANA_TRANSACTION\” GROUP BY MONTH_INT ORDER BY MONTH_INT

Query 2: Get Monthly Average
SELECT AVG(AMOUNT) AS AMOUNT, MONTH_INT AS MONTH FROM \”_SYS_BIC\”.\”psa/ANA_TRANSACTION\” GROUP BY MONTH_INT ORDER BY MONTH_INT

Query 3: Get Monthly Difference
SELECT SUM(AMOUNT) AS AMOUNT, MONTH_INT AS MONTH FROM \”_SYS_BIC\”.\”psa/CAL_AVG_DIFF_TREND\” GROUP BY MONTH_INT ORDER BY MONTH_INT

Component 6

(Search by ID)

Transaction
is selected
SELECT ID, DATE_SQL AS TRAN_DATE, POST_DATE, SUM(AMOUNT) AS AMOUNT, CATEGORY_TEXT AS CATEGORY, DESCRIPTION FROM \”_SYS_BIC\”.\”psa/ANA_TRANSACTION\” WHERE id = ? GROUP BY ID, DATE_SQL, POST_DATE, CATEGORY_TEXT, DESCRIPTION ORDER BY ID

Component 7

(Search by Description)

Search button is clicked SELECT ID, DATE_SQL AS TRAN_DATE, POST_DATE, SUM(AMOUNT) AS AMOUNT, CATEGORY_TEXT AS CATEGORY, DESCRIPTION FROM \”_SYS_BIC\”.\”psa/ANA_TRANSACTION\” WHERE UPPER(description) LIKE ‘%” + description + “%’ GROUP BY ID, DATE_SQL, POST_DATE, CATEGORY_TEXT, DESCRIPTION ORDER BY ID
HANA Modeling (Calculation View SQLScript and R Script)
You can find all detailed steps from this free online ebook. The following section will include those SQLs defined for this example.  .

Average Difference in two Consecutive Months

var_out =

SELECT

SUM(T2.AMOUNT-T1.AMOUNT) AS AMOUNT,

     T1.MONTH_INT AS MONTH_INT

FROM

(SELECT AVG(AMOUNT) AS AMOUNT, MONTH_INT

  FROM

  “_SYS_BIC”.”psa/ANA_TRANSACTION”

  GROUP BY MONTH_INT) AS T1,

(SELECT AVG(AMOUNT) AS AMOUNT, MONTH_INT

  FROM

  “_SYS_BIC”.”psa/ANA_TRANSACTION”

  GROUP BY MONTH_INT) AS T2

WHERE T1.MONTH_INT = T2.MONTH_INT – 1

GROUP BY T1.MONTH_INT;

CREATE COLUMN TABLE DATE_RANGE(RANGE INT PRIMARY KEY);

Prepare the time configuration for report

INSERT INTO DATE_RANGE VALUES(1);

INSERT INTO DATE_RANGE VALUES(3);

INSERT INTO DATE_RANGE VALUES(6);

INSERT INTO DATE_RANGE VALUES(12);

Generate transaction report
TIME_RANGE_OUT = SELECT
L.DATE_SQL AS DATE_FROM,
D.LAST_DATE AS DATE_TO,
R.RANGE AS RANGE
FROM “_SYS_BIC”.”psa/ATT_TIME_VIEW” AS L,
(SELECT MAX(TRAN_DATE) AS LAST_DATE
FROM “SYSTEM”.”PSA_TRANSACTION”) AS D,
“SYSTEM”.”DATE_RANGE” AS R
WHERE DAYS_BETWEEN(L.DATE_SQL, D.LAST_DATE) = R.RANGE * 30;
CALL “_SYS_BIC”.”psa/PRO_GET_TIME_RANGE”(TIME_RANGE);

AVG_OUTPUT = SELECT R.RANGE AS RANGE, AVG(AMOUNT) AS AMOUNT, ‘AVG’ AS OUTPUT_TYPE
FROM “_SYS_BIC”.”psa/ANA_TRANSACTION” AS D, :TIME_RANGE AS R
WHERE D.DATE_SQL >= R.DATE_FROM AND D.DATE_SQL <= R.DATE_TO
GROUP BY R.RANGE;

SUM_OUTPUT = SELECT R.RANGE AS RANGE, SUM(AMOUNT) AS AMOUNT, ‘SUM’ AS OUTPUT_TYPE
FROM “_SYS_BIC”.”psa/ANA_TRANSACTION” AS D, :TIME_RANGE AS R
WHERE D.DATE_SQL >= R.DATE_FROM AND D.DATE_SQL <= R.DATE_TO
GROUP BY R.RANGE;

var_out = SELECT * FROM :AVG_OUTPUT union SELECT * FROM:SUM_OUTPUT;

Create the R script procedure
DROP PROCEDURE “SYSTEM”.”GROUP_TRAN”;
DROP TYPE “SYSTEM”.”DATA_TYPE”;
CREATE TYPE “SYSTEM”.”DATA_TYPE” AS TABLE(
“ID” INTEGER not null,
“TRAN_DATE” DATE null,
“POST_DATE” DATE null,
“DESCRIPTION” NVARCHAR (60) null,
“AMOUNT” DOUBLE null,
“CATEGORY_TEXT” NVARCHAR(20) null);
DROP TYPE “SYSTEM”.”DATA_OUTPUT_TYPE”;
CREATE TYPE “SYSTEM”.”DATA_OUTPUT_TYPE” AS TABLE (
“ID” INTEGER not null,
“TRAN_DATE” DATE null,
“POST_DATE” DATE null,
“DESCRIPTION” NVARCHAR (60) null,
“AMOUNT” DOUBLE null,
“CATEGORY_TEXT” NVARCHAR(20) null,
“CLUSTER_ID” INTEGER null);
CREATE PROCEDURE “SYSTEM”.”GROUP_TRAN”(IN data_input “SYSTEM”.”DATA_TYPE”, OUT result “SYSTEM”.”DATA_OUTPUT_TYPE” )
LANGUAGE RLANG reads sql data AS
BEGIN
library(kernlab)
model<-kmeans(data_input$AMOUNT, 3)
result<-data.frame(data_input, CLUSTER_ID=model$cluster)
END;
temp_var = SELECT “ID”,”TRAN_DATE”,”POST_DATE”,”DESCRIPTION”,”AMOUNT”,”CATEGORY_TEXT” FROM “SYSTEM”.”PSA_TRANSACTION”;
CALL “SYSTEM”.”GROUP_TRAN”( :temp_var, var_out);
8 Comments
You must be Logged on to comment or reply to a post.