Data, data, data, I cannot make bricks without clay: Smart Data Quality: Using Variables
Being able to filter data quickly according to our own criteria is taken very much for granted. However, the power to filter data quickly, and I use the word power carefully, really is extremely important. In this video Tahir Hussain Babar, aka Bob, of the SAP HANA Academy demonstrates how to use variables for smart data quality and integration in SPS09.
Rapid filtering of data changes the game totally. It quickens decisions and makes mini-detectives out of many of us. The game makes me think of the nineteenth century. At this festive time, my mind always drifts back to the Victorian days, when many Xmas traditions started. We laugh at some of the theories on which Victorian science was based. However, in Victorian fiction, as in science fiction today, authors could explore controversial ideas that were ahead of their time and would have shocked the average person. In today’s world, as our data collection and analysis have improved, we have questioned many theories that were once held in high regard. In the social sphere, books like Freakonomics have used data to build theories that surprise many of us. Science fiction series like BSG have posited political questions that would be unthinkable in any documentary. This tradition goes back further than Star Trek’s inter-racial kiss.
In this video, Bob demonstrates an incredibly useful feature that is new in SPS09: how to make flowgraphs. He explains that using a variable lets a task prompt you for an input parameter at the front end when you execute it.
Bob starts from the SAP HANA Studio and uses the EMPLOYEES table below as an example.
This video shows you how to make a task that will prompt you for the ID.
The first step is to create a new flowgraph from Projects. Note that the flowgraph must be activated as a Task Plan.
Once he has created the flowgraph, Bob adds the EMPLOYEE table as a data source and makes the Target Schema DEV01. He explains the differences between variable types including global variables. He defines a variable at the top level in the Container Node with the settings below.
To use the variable, Bob adds it to a filter object as below.
He then sets the filter expression so that the input is equal to the value of the variable, using dollar signs on either side of VAR as below.
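To make the idea concrete, here is a rough sketch in plain Python of what a variable in a filter expression amounts to: a placeholder that is swapped for a value when the task runs. This is not SAP HANA's flowgraph API. The function name, the `$$VAR$$` placeholder form (the video only shows dollar signs on either side of VAR), the column name and the default value of 1 are all assumptions made for illustration.

```python
# Illustrative sketch only: mimics a task variable in plain Python.
# It is not how SAP HANA resolves flowgraph variables internally.

def resolve_expression(expression: str, variables: dict) -> str:
    """Replace each $$NAME$$ placeholder with the value supplied for NAME."""
    for name, value in variables.items():
        expression = expression.replace(f"$${name}$$", str(value))
    return expression

# Hypothetical filter expression and hypothetical default value.
filter_expression = '"ID" = $$VAR$$'
resolved = resolve_expression(filter_expression, {"VAR": 1})
print(resolved)
```

Supplying a different value for VAR at execution time changes the resolved filter without touching the flowgraph itself, which is the point of using a variable.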
To output the results of the task, Bob goes to his Data Sink, places it on the canvas and joins it to the output of the filter. He also fills in the Authoring Schema and the Catalog Object as below.
Bob then recaps each step to create the task before Saving and Activating. No response back means everything has worked, so you are ready to execute.
Once the list of tables has been refreshed, open up the variables object. You can see that the default value is displayed.
You can change the statement to allow for more values of your choice by changing the expression as below.
After this has been executed you will notice that ID 2 has been added to the list.
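The effect of widening the expression can be pictured with a small Python sketch. The rows, names and the default value of 1 below are made up for illustration and are not taken from the video; the sketch only shows the logic of "ID equals the variable, or ID equals another value you add".

```python
# Sketch only: hypothetical rows standing in for the EMPLOYEES table.
employees = [
    {"ID": 1, "NAME": "Employee A"},
    {"ID": 2, "NAME": "Employee B"},
    {"ID": 3, "NAME": "Employee C"},
]

def run_filter(rows, var, extra_ids=()):
    """Keep rows whose ID equals the variable or one of the extra IDs."""
    wanted = {var, *extra_ids}
    return [row for row in rows if row["ID"] in wanted]

# Original expression: ID equals the variable (assumed default 1).
original = run_filter(employees, var=1)
# Widened expression: ID equals the variable OR ID equals 2.
widened = run_filter(employees, var=1, extra_ids=(2,))

print([row["ID"] for row in original])
print([row["ID"] for row in widened])
```

With these made-up rows, the widened filter returns the row for ID 2 alongside the row matched by the variable.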
How does this video on data link to Xmas? Well, for me, as I said at the beginning, Xmas started with the Victorians. Data enables us to theorise better, especially if we are detectives. In The Adventure of the Blue Carbuncle, the address of the owner of the goose is traced using a primary key. You may have guessed that my favourite fictional character, who typifies the Victorian era, is Sherlock Holmes. One of his famous sayings was “Data, data, data, I cannot make bricks without clay”. I remember Sherlock Holmes as a Bohemian who seemed to put logic before all else. “Detection is, or ought to be, an exact science, and should be treated in the same cold and unemotional manner.” We should do as he did and follow the data. “It is a capital mistake to theorize before one has data. Insensibly, one begins to twist facts to suit theories, instead of theories to suit facts.” This idea, that better data collation and analysis lead to better theorising and ultimately to progress, started on the bohemian edge. It is now part of the mainstream. Being able to understand data better means we don’t have to guess, which “is a shocking habit, destructive to the logical faculty”. It allows us to see AND observe. In a competitive business environment it is useful to remember that “There is nothing more deceptive than an obvious fact.” The facts that matter are more likely to leap to our attention if we can filter out the sense from the nonsense, which is what this video helps us to do.