As a technology advisor in the SAP Startup Focus Program, I have the opportunity to work with innovative startups from many different areas. SAP Startup Focus is a 12-month global program for startups with big data, predictive analytics and/or real-time data decision solutions. We make SAP HANA available to the startup community and help eligible startups accelerate the development of their solutions. We also help startups with a validated HANA solution gain market traction.


I’m from the second phase of the program, which we call Development Accelerator. In this phase we help startups build a Minimum Viable Product (MVP) within a one-year period of free technical support. As a hands-on member of the team, my job is to help startups solve all kinds of technical problems and to advise on architecture design. I also own the creation of technical content for startup education. My team has run many prototyping workshops (one-day classroom trainings) to train startup engineers, and we have trained many startups all over the world.


I studied some of the existing educational content. Thanks go to the SAP product and development teams for creating the amazing SAP HANA Interactive Education (SHINE); it is a solid start for developers who are new to SAP HANA. Click the link to learn more about SHINE.


We found that we cannot simply reuse SHINE because of the diversity of the startups. Many of them are not from the enterprise world, and it is hard for them to understand the data model of the SAP EPM system. Besides, startups want something fun, so I decided to create something interesting and closer to the startup mindset. We have used this content to train many startups, and the feedback was that they enjoyed it very much, so I decided to share it with you through a series of blogs. This is the first blog, in which I cover the overview.


OK, let’s get started. I evaluated many open datasets, including Twitter (which you may have seen in another blog of mine), LinkedIn data and some others. Eventually CrunchBase data stood out because it is so close to what I wanted. For those who don’t know CrunchBase yet, it is a dataset about startups, investors, competitors, fundings and acquisitions, so you can imagine how close it is to the daily life of a startup.


Data Model


CrunchBase is a free database of technology companies and start-ups operated by TechCrunch. It comprises around 500,000 data points profiling companies, people, investors, fundings and acquisitions. Below is the number of data points for each entity type in CrunchBase:

/wp-content/uploads/2015/04/1_686376.jpg

CrunchBase itself doesn’t compare companies, and it offers no option to aggregate, calculate or even discover the relationships between the various datasets. By loading the data into an in-memory database like SAP HANA and using its data modeling tools or embedded analysis algorithms, some very interesting questions like the ones below can be answered in real time:

  • What kinds of companies are more likely to be invested in or acquired?
  • Who are the likely competitors of a company?
  • What is the location distribution of companies that have received more than 3 rounds of investment?
  • What is the shortest or average time to IPO?
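As a rough illustration of the location-distribution question above, here is a minimal sketch that uses Python’s built-in `sqlite3` module as a stand-in for SAP HANA. The table names (`company`, `funding_round`), the column names and the sample rows are all hypothetical; they are not the CrunchBase schema or the model used in the workshops.

```python
import sqlite3

# In-memory SQLite stands in for SAP HANA here; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE funding_round (company_id INTEGER, round_code TEXT);
""")
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?)",
    [(1, "Alpha", "USA"), (2, "Beta", "USA"), (3, "Gamma", "DEU")],
)
conn.executemany(
    "INSERT INTO funding_round VALUES (?, ?)",
    [(1, "seed"), (1, "a"), (1, "b"), (1, "c"),   # Alpha: 4 rounds
     (2, "seed"), (2, "a"),                       # Beta: 2 rounds
     (3, "seed"), (3, "a"), (3, "b"), (3, "c")],  # Gamma: 4 rounds
)


def location_distribution(connection, min_rounds=3):
    """Count companies per country that raised more than `min_rounds` rounds."""
    query = """
        SELECT c.country, COUNT(*) AS companies
        FROM company AS c
        JOIN (SELECT company_id
              FROM funding_round
              GROUP BY company_id
              HAVING COUNT(*) > ?) AS f ON f.company_id = c.id
        GROUP BY c.country
        ORDER BY c.country
    """
    return connection.execute(query, (min_rounds,)).fetchall()


distribution = location_distribution(conn)
```

With the sample rows, only Alpha and Gamma have more than three rounds, so each of their countries is counted once. The same join-and-aggregate idea carries over to a real database, although the actual SQL dialect and model would differ.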


The diagram below shows the entity relationships. Each company can have zero to many funding rounds, acquisitions, IPOs, people who work or have worked for the company, competitors and offices. The financial organizations are usually venture capitalists.


/wp-content/uploads/2015/04/2_686377.jpg
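To make the relationships concrete, the entities could be sketched as plain Python data classes. This is only an illustration of the one-to-many structure described above: every class name, field name and value here is an assumption, not the actual CrunchBase or HANA table layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FundingRound:
    round_code: str                 # e.g. "seed", "a", "b" (hypothetical codes)
    raised_usd: float
    # Names of the financial organizations (usually VCs) that took part.
    investors: List[str] = field(default_factory=list)


@dataclass
class Office:
    city: str
    country: str


@dataclass
class Company:
    name: str
    # Each list may be empty: a company has zero to many of each entity.
    funding_rounds: List[FundingRound] = field(default_factory=list)
    acquisitions: List[str] = field(default_factory=list)   # acquired company names
    ipos: List[str] = field(default_factory=list)           # stock symbols
    people: List[str] = field(default_factory=list)         # current or former staff
    competitors: List[str] = field(default_factory=list)
    offices: List[Office] = field(default_factory=list)

    def total_raised(self) -> float:
        """Sum the amounts of all funding rounds."""
        return sum(r.raised_usd for r in self.funding_rounds)


example = Company(name="ExampleCo")
example.funding_rounds.append(FundingRound("seed", 500000.0, ["Example Ventures"]))
example.funding_rounds.append(FundingRound("a", 7000000.0, ["Example Ventures"]))
example.offices.append(Office("Palo Alto", "USA"))
```

In a relational store each of these lists would become its own table keyed by the company, which is what makes the joins and aggregations in the later examples possible.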


You can think of many ways to use the data to find insights into the startup and investor community. But don’t forget that our mission here is to use it to demonstrate HANA capabilities. Here are some examples:

  • Modeling: Investment history model to aggregate all the funding records of each financial organization
  • SQLScript Procedures: Define proprietary algorithms to calculate startup ranks based on fundings and to analyze the competition landscape
  • Text Analysis: Extract sentiment results of company related information
  • Predictive Analysis: Investor clustering
  • Geospatial Analysis: Funding and acquisition location distributions
  • Visualization: Using SAPUI5 for Mobile and CVOM charts to show funding and acquisition records
  • XS Engine (OData & XSJS): Declare OData or XSJS services to expose data to the UI layer
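To give a feel for the first item, the sketch below shows the kind of aggregation an investment history model performs, written in plain Python rather than with the HANA modeling tools. The record layout, the organization names and the amounts are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical funding records: (financial_org, company, raised_usd).
funding_records = [
    ("Example Ventures", "Alpha", 2000000.0),
    ("Example Ventures", "Beta", 500000.0),
    ("Example Ventures", "Alpha", 8000000.0),
    ("Sample Capital", "Gamma", 1000000.0),
]


def investment_history(records):
    """Aggregate deal count, distinct companies and total amount per organization."""
    history = defaultdict(lambda: {"deals": 0, "companies": set(), "total_usd": 0.0})
    for org, company, amount in records:
        entry = history[org]
        entry["deals"] += 1
        entry["companies"].add(company)
        entry["total_usd"] += amount
    return dict(history)


history = investment_history(funding_records)
```

Here `history["Example Ventures"]` ends up with three deals across two distinct companies. In HANA the same result would come from a modeled view over the funding tables instead of application code.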


Applications


OK, now let’s take a look at the applications I have created.


1. Startup Profile, Ranking, Funding Visualization. Is Twitter still a startup? Maybe I should use another company as an example 🙂


/wp-content/uploads/2015/04/3_686378.png


2. Competition Analysis, with algorithms implemented in SQLScript to find the competitors


/wp-content/uploads/2015/04/4_686379.png


3. Global Startup Funding Heat-map, using the SAP HANA Geospatial Engine with Google Maps as the client


/wp-content/uploads/2015/04/6_686380.png


4. Investor Clustering by K-means, using the SAP HANA Predictive Analysis Library


/wp-content/uploads/2015/04/5_686381.png
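For readers who have not met K-means before, here is a minimal pure-Python sketch of the idea. It is not the Predictive Analysis Library and does not call any HANA API; the two features per investor (number of deals, average round size in millions of USD) and the sample values are assumptions made for the example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical investors: (number of deals, average round size in $M).
investors = [(2, 1.0), (3, 1.5), (40, 30.0), (45, 28.0)]
centroids, assignments = kmeans(investors, k=2)
```

On this toy data the two small investors end up in one cluster and the two large ones in the other. A real run would normally scale the features first and use a better seeding strategy than taking the first k points.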


5. Company Sentiment Ratings, using SAP HANA Text Analysis


/wp-content/uploads/2015/04/7_686382.png


6. Discover Startups and Investors, using most of the SAP HANA Platform features


/wp-content/uploads/2015/04/8_686383.png


Investor Profile Page


/wp-content/uploads/2015/04/9_686384.png



1 Comment


  1. Ashraf Shirani

    Hello Eric – Thanks very much for this interesting and potentially very useful set of examples/use cases for HANA.

    I plan to teach SAP HANA this summer to MBA students and was wondering if you could please share links to the data file and the HANA applications that you used for the material in this post.

    Thank you!
