Arun Varadarajan

Very Large Databases – Interesting Times and Interesting Questions

I was reading through The Database Column and found this post very interesting. It talks about the kinds of questions that data warehouses / information stores are constantly asked and need to handle. To answer them you need a database that is nimble enough, along with the relevant hardware and software. I am not recommending the product advertised here, but I wanted to highlight the complex questions that are part of everyday life for someone working in EDW.

http://databasecolumn.vertica.com/use-cases/tales-from-a-cocktail-party-how-customers-use-vertica/

Other interesting scenarios that data warehouses across the world are answering:
1. Promotion efficiency
How effective is a particular promotion? Was the sales uptick from the promotion more than the amount of money spent on it? This may be one reason manufacturers prefer coupons: they help track promotions better. Then throw in some historical analysis and some what-if scenarios, something like: “Would we have had comparable sales if our coupon was for a 10% discount and not 12%? And in future, should we go for a 10% discount rather than 12%, because a 10% promotion saves us money, and do the sales outweigh the cost of the promotion?”
Depending on the manufacturer, the data can be in the millions or billions or even more, but the question still needs to be answered!
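To make the what-if question concrete, here is a minimal Python sketch of the comparison. Everything in it is an assumption for illustration: the function name `promotion_net_gain`, the revenue-only view (a real analysis would likely work with margins), and all of the numbers, which are made up and not taken from any real promotion.

```python
def promotion_net_gain(baseline_units, promo_units, unit_price,
                       discount_rate, campaign_cost=0.0):
    """Revenue gained (or lost) by running the promotion instead of not running it.

    baseline_units: units we expect to sell with no promotion (hypothetical input)
    promo_units:    units sold, or forecast to sell, with the coupon
    discount_rate:  coupon discount as a fraction, e.g. 0.12 for 12%
    campaign_cost:  fixed cost of running the promotion
    """
    baseline_revenue = baseline_units * unit_price
    promo_revenue = promo_units * unit_price * (1.0 - discount_rate)
    return promo_revenue - baseline_revenue - campaign_cost


# Made-up what-if scenario: the 12% coupon sells a few more units than the 10% one.
gain_12 = promotion_net_gain(1000, 1300, 10.0, 0.12, campaign_cost=500.0)
gain_10 = promotion_net_gain(1000, 1250, 10.0, 0.10, campaign_cost=500.0)
print(f"12% coupon: {gain_12:.2f}, 10% coupon: {gain_10:.2f}")
```

The arithmetic itself is trivial. The hard part in a real EDW is producing `baseline_units` and `promo_units` from millions or billions of sales rows fast enough to try several scenarios.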

2. Flight Ticket Pricing
How do you price tickets based on load factors? The load factor is the percentage to which a plane gets filled. Prices have to be calculated in real time and reflected on the websites too. There are some complex SAS and SPSS algorithms running in the background, and nowadays this is being implemented even by some large bus and train operators (with questions like “How many buses or how many carriages do we need?”).
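As a rough illustration of the idea (and nothing like the SAS or SPSS models actually used), the sketch below computes a load factor and maps it to a fare through price tiers. The tier thresholds, the multipliers and the function names `quote_fare` and `vehicles_needed` are all hypothetical, chosen only to show the shape of the calculation.

```python
import math


def load_factor(seats_sold, capacity):
    """Fraction of the plane (or bus, or train) that is already filled."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return seats_sold / capacity


def quote_fare(base_fare, seats_sold, capacity):
    """Pick a fare from hypothetical tiers: the fuller the vehicle, the higher the price."""
    lf = load_factor(seats_sold, capacity)
    if lf < 0.5:
        multiplier = 1.0    # assumed tier: plenty of seats left
    elif lf < 0.8:
        multiplier = 1.25   # assumed tier: filling up
    else:
        multiplier = 1.6    # assumed tier: nearly full
    return round(base_fare * multiplier, 2)


def vehicles_needed(expected_passengers, seats_per_vehicle):
    """Answer 'how many buses or carriages do we need?' by rounding up."""
    return math.ceil(expected_passengers / seats_per_vehicle)


print(quote_fare(100.0, 120, 180))
print(vehicles_needed(95, 40))
```

A real system would re-estimate demand continuously rather than use fixed tiers, which is where the real-time requirement comes from.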

3. Tiered bandwidth plans for mobile phones
Take the example of AT&T: how do you know 200MB is an inflection point, and why not 300MB? (This refers to the data plans AT&T offers, where you can have 200MB of data for 15 USD or 2GB for 30 USD, for those who have chosen to go with them.)
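One plausible way to look at the 200MB versus 300MB question is to ask what share of customers would fit under each cap, and what each plan costs per MB. The Python sketch below does just that. The usage sample is invented, the helper names are hypothetical, and treating 2GB as 2048MB is an assumption. Only the plan prices (15 USD and 30 USD) come from the example above, and this is not a description of how AT&T actually chose its tiers.

```python
def price_per_mb(plan_price_usd, plan_mb):
    """Effective price of one MB if the customer uses the whole allowance."""
    return plan_price_usd / plan_mb


def share_within_cap(monthly_usage_mb, cap_mb):
    """Fraction of customers whose monthly usage fits under the given cap."""
    if not monthly_usage_mb:
        raise ValueError("need at least one usage value")
    within = sum(1 for usage in monthly_usage_mb if usage <= cap_mb)
    return within / len(monthly_usage_mb)


# Invented monthly usage figures in MB, for illustration only.
sample_usage_mb = [50, 120, 180, 190, 250, 400, 900, 1500]

for cap in (200, 300):
    share = share_within_cap(sample_usage_mb, cap)
    print(f"cap {cap}MB covers {share:.1%} of sampled customers")

print(f"200MB plan: {price_per_mb(15, 200):.4f} USD/MB")
print(f"2GB plan:   {price_per_mb(30, 2048):.4f} USD/MB")
```

On real data the sample would be every subscriber's usage history, which is exactly the kind of volume that makes this an EDW question.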

These are all questions that are being addressed, and questions many times more complicated are being addressed too. The point of this blog is that management is no longer satisfied with answers to questions like:
“What are my outstanding Sales Orders for the day?” or
“How much did we bill a particular vendor?”

Questions like these last two are now mostly classified as “Operational BI”, as opposed to the rich set of dashboards, metrics and delivery methods now expected from almost every reporting system. One way to deliver this is with appliances like HANA, BWA etc., and another is to have the technology integrated into the database, as in the examples referenced above.

Earlier, one way to do this was to mirror the databases into a separate system and analyze the data there using SAS or SPSS, tools built specifically for statistical analysis. But this was not a real-time process, and questions like these require real-time analysis. Hardware has also grown to support these requirements: instead of waiting days for the SQL query or SAS program to finish and then rerunning it to validate the results, the need is to provide these answers on the fly, or maybe within hours…

In these timeframes we cannot build new databases or new models to answer these questions, which leaves the job mostly to database accelerators like the BWA. There is a cost to this as well, by way of hardware, but questions like these, if answered correctly, can lead to profits in the millions or billions for the enterprise through increased sales and customer focus, which makes these accelerators affordable.

New vistas for reporting…..

 

Do you have similar experiences to share, with interesting questions like these being answered on an EDW and not on specialized databases meant purely for analytics?


      1 Comment
      Former Member
      Arun,
      This is a great blog that addresses the real-time intelligence needed by today's businesses. I think SAP HANA might have a built-in engine for performing statistical computations for data mining and predictive analytics. We need someone from SAP to shed light on this.

      Regards,
      Ali Q.