When You Actually Think About It: Text Analysis and Search is Awfully Useful
When you think about the amazing text analysis capabilities available in SPS08, and reflect that these have recently been improved further, you will agree that checking out this video is an absolute must. In this video by the SAP HANA Academy, Tahir Hussain Babar, aka Bob, takes us through Text Analysis for the Public Sector, a brand new feature in SPS09.
Today I have been working out of a library close to where I live in the Northern English countryside. This place is a bastion of old England with some of the most expensive real estate for hundreds of miles. The technologies I have covered over the last few blogs have made for amazing conversation starters. I have noticed that people just take for granted what software can do. The old thumb scanner monitor trick that we used to play in the IT classrooms on April 1, when I was a teacher, would catch out most people here. However, they are all sceptical about text analysis and search as a practical concept. By this I mean they cannot see how it works, though they acknowledge it would be useful, especially to governments.
In this video you will learn about a new configuration type in SPS09 called extraction core for the public sector. This includes a set of entity types and rules for extracting information about events, persons or organisations and their relationships, specifically orientated towards security.
Bob does this in SAP HANA Studio with an SPS08 server and an SPS09 server. He starts by opening a SQL console on SPS08 so that he can explain the differences in syntax. The SPS08 server will contain a table with two columns, as shown in the video.
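The table Bob creates is not reproduced in this post, so the following is only a minimal sketch of what a two-column table for text analysis might look like in SAP HANA SQL. The table name "TA_DEMO", the column names and the column length are hypothetical. The primary key matters because the text analysis results refer back to it.

```sql
-- Hypothetical two-column table: a key plus the text to analyse.
-- Table name, column names and NVARCHAR length are assumptions, not Bob's.
CREATE COLUMN TABLE "TA_DEMO" (
    "ID"      INTEGER PRIMARY KEY,  -- key that the text analysis results refer back to
    "CONTENT" NVARCHAR(200)         -- free text to be analysed
);
```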
After he has executed the code to create the table, Bob demonstrates how to load data into it by inserting four rows. He then creates a full text index on this data. He uses the voice of customer configuration type because the public sector configuration did not exist in SPS08; you will notice that the line referring to the public sector has been commented out. Once the tables have been refreshed you will see another table, generated above the first one, which contains the results of the text analysis.
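As a rough sketch of the load-and-index step, assuming a hypothetical two-column table "TA_DEMO" with an integer key "ID" and a text column "CONTENT": the row text below is made up for illustration and is not Bob's data, and the index name is an assumption. The configuration names are the ones I understand SAP HANA to ship with, so check them against your own system.

```sql
-- Illustrative rows only; the real sample sentences are in the video.
INSERT INTO "TA_DEMO" VALUES (1, 'Placeholder sentence about a person and an organisation.');
INSERT INTO "TA_DEMO" VALUES (2, 'Placeholder sentence about a place.');

-- Full text index with text analysis switched on.
-- The index name "TA_DEMO_IDX" is hypothetical.
CREATE FULLTEXT INDEX "TA_DEMO_IDX" ON "TA_DEMO" ("CONTENT")
    CONFIGURATION 'EXTRACTION_CORE_VOICEOFCUSTOMER'
    -- CONFIGURATION 'EXTRACTION_CORE_PUBLIC_SECTOR'  -- commented out: not available in SPS08
    TEXT ANALYSIS ON;
```

Creating the index with text analysis on is what generates the additional results table, whose name is the index name prefixed with $TA_.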
Bob then runs a SQL statement that selects four columns from the first table, which uses the primary key as a reference. The four columns are the primary key, the individual counter for each object extracted, the type of object extracted and its value. Bob then executes the statement and discusses the results.
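A query along these lines would return those four columns. This is a sketch that assumes a hypothetical full text index named "TA_DEMO_IDX" on a table whose primary key column is "ID"; the TA_ columns are the standard ones in the generated $TA_ table.

```sql
-- One row per extracted entity, grouped by the source row it came from.
SELECT "ID",          -- primary key of the source row
       "TA_COUNTER",  -- counter for each entity extracted from that row
       "TA_TYPE",     -- entity type, for example a person or an organisation
       "TA_TOKEN"     -- the extracted text itself
FROM "$TA_TA_DEMO_IDX"
ORDER BY "ID", "TA_COUNTER";
```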
Bob then states that this type of analysis has been improved in SPS09 and goes on to demonstrate how to perform the same task in SPS09 using the second schema. He uses the same statements in the same order, re-explaining what he is doing. He creates a table with two columns but this time loads the data one row at a time. For the index, instead of the voice of customer configuration, he uses the extraction core public sector configuration.
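On the SPS09 side the only change to the index statement should be the configuration name. The sketch below again assumes a hypothetical table "TA_DEMO" with a text column "CONTENT" and a hypothetical index name. The configuration name is the one I believe SPS09 uses for the public sector extraction, so verify it on your own server.

```sql
-- Same index as before, but with the public sector configuration (SPS09 onwards).
CREATE FULLTEXT INDEX "TA_DEMO_IDX" ON "TA_DEMO" ("CONTENT")
    CONFIGURATION 'EXTRACTION_CORE_PUBLIC_SECTOR'
    TEXT ANALYSIS ON;
```

Because the index already exists when each row is inserted, the results table can be queried after every insert, which is how the differences can be discussed row by row.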
Bob then reruns the SQL statement he used before and discusses the differences in the results.
The person, organisation and alias types are broken down to the extent that SPS09 recognises that the person Tahir Hussain Babar is also known as Bob.
After executing the SQL to insert the second row, Bob highlights that 6 ft is recognised as a measure of height. In SPS09 you can analyse by sentence as well.
Whilst demonstrating the execution of the third row, Bob shows that Berlin has been identified as a locality. This is not unique to SPS09, but the Travel_cameTo From is a new feature.
Also new to SPS09 are the exact spatial reference, direction and distance.
Bob then states that there are lots of other new features to do with spatial references on distances, cardinal directions and exact locations. There are also lots of other features customised for the public sector to do with military units, teams that you’ve been in and squadrons you could be a member of.
When I think of the public sector I think of hidebound convention, an unwieldy bureaucratic obsession with administrative correctness and an inherent desire to standardise everything. You can guess that I have had enough of its starchiness and relentless crushing of ambition over the last twenty years. This technology will enable the public sector to see and take account of diversity in its deliberations. Instead of paying lip service to recognising difference, Text Analysis and Search will enable them to leverage huge volumes of structured and unstructured data in real time, focus it in a way that makes sense and is easy to digest, and use it to make better decisions. Data can be captured from a multiplicity of sources and analysed in real time to avoid hazardous situations and enhance operations. Now that’s what I call making a difference.