SAP HANA Enabling Genome Analysis – a Big Data Use Case – ASUG Webcast
I have been interested in genome analysis since reading that Steve Jobs had his own genome analyzed and seeing how much less time this now takes.
“SAP HANA Enabling Genome Analysis” was an ASUG webcast given last month. Speakers included Dr. Joanna Kelley of Stanford University and Enakshi Singh of SAP HANA Product Management. They reviewed the use cases for HANA with genome analysis.
Figure 1: Source: SAP
The first use case is for the clinician, as shown in Figure 1. Genetic variants may cause tumor formation in cancer.
The need is for a more personalized genetic profile.
A doctor or nurse can view HANA data on an iPad to enable personalized treatment.
Figure 2: Source: SAP
Figure 2 shows the use case for the researcher, who wants to see what patterns exist in people with autism.
Figure 3: Source: SAP
Figure 3 explains genetic variants. The speaker said that a mistake in genetic data can cause colossal effects.
The entire set of genetic information is called a genome.
Figure 4: Source: SAP
Figure 4 shows that the cost of analyzing the human genome is going down. It used to take 13 years to sequence one genome. In the past, sequencing, processing and computation were bottlenecks.
The cost per base is declining steadily.
The green line in Figure 4 shows the quality of the bases.
Today there are devices, not yet on the market, that will allow you to sequence a genome at your fingertips.
Figure 5: Source: SAP
Figure 5 shows how big the data is: 800 MB for one genome. This explains why it is “big data”.
Figure 6: Source: SAP
Figure 6 shows that SAP HANA was about 17 times faster at genome analysis than BWA: the analysis took 84 hours with BWA and 5 hours with SAP HANA.
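As a quick check of the figures quoted above, the speedup is simply the ratio of the two run times. The short Python sketch below uses only the 84-hour and 5-hour numbers from the webcast; the function name is my own, for illustration.

```python
def speedup(baseline_hours: float, new_hours: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    return baseline_hours / new_hours

bwa_hours = 84   # run time quoted for BWA
hana_hours = 5   # run time quoted for SAP HANA

ratio = speedup(bwa_hours, hana_hours)
print(f"Speedup: {ratio:.1f}x")  # prints "Speedup: 16.8x"
```

The result, 16.8, rounds to the 17 times quoted in the figure.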
Figure 7: Source: SAP
Figure 7 shows the 1000 Genomes Project; the speaker said it is now up to 2,500 individuals.
Now that the genome data is available to researchers, they can run queries and get new insights.
They have variant data for 629 individuals.
The data model table has 12 billion entries totaling 293 GB, which SAP HANA compresses at a 4x ratio.
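To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted in the webcast and assumes that a 4x ratio means the compressed size is one quarter of the original, and that 1 GB is 10^9 bytes; neither assumption was confirmed by the speakers.

```python
entries = 12_000_000_000   # 12 billion entries quoted in the webcast
raw_gb = 293               # uncompressed size quoted in the webcast
compression_ratio = 4      # 4x ratio quoted in the webcast

# Assumption: "4x" means compressed size = raw size / 4.
compressed_gb = raw_gb / compression_ratio

# Assumption: 1 GB = 10^9 bytes.
bytes_per_entry = raw_gb * 1e9 / entries

print(f"Compressed size: {compressed_gb:.2f} GB")
print(f"Average raw size per entry: {bytes_per_entry:.1f} bytes")
```

Under these assumptions the table shrinks to roughly 73 GB, and each entry averages about 24 bytes before compression.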
Figure 8: Source: SAP
Figure 8 shows that they use R, which researchers use for statistics and for working with query results.
Figure 9: Source: SAP
Figure 9 shows the future: they want to make the best decisions possible.
Researchers want to look at thousands of genomes and analyze results in real time.
Question and Answer:
Q: Other than in-memory processing and R support, what else does HANA do?
A: HANA compresses data in memory, and you can write SQL and SQLScript.
Its column store architecture runs queries faster.
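To illustrate why a column store can help analytical queries, here is a toy Python sketch. It is not HANA code, and the table, column names and values are made up for illustration. The idea is that an aggregate over one column only has to read that column's values in a columnar layout, instead of walking every full row.

```python
# Toy variant table; names and values are hypothetical, for illustration only.
rows = [
    {"sample": "S1", "chrom": "1", "pos": 1000, "quality": 30},
    {"sample": "S2", "chrom": "1", "pos": 1000, "quality": 45},
    {"sample": "S3", "chrom": "2", "pos": 5000, "quality": 60},
]

# Row layout: each record is stored together, so an aggregate over one
# field still visits every full record.
row_avg = sum(r["quality"] for r in rows) / len(rows)

# Column layout: each field is stored as its own list, so the same
# aggregate reads only the "quality" list.
columns = {name: [r[name] for r in rows] for name in rows[0]}
col_avg = sum(columns["quality"]) / len(columns["quality"])

print(row_avg, col_avg)  # both layouts give the same average
```

Both layouts return the same answer; the difference is how much data has to be touched to get it, which matters when the table has billions of entries.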
Q: Who else can use this solution?
A: It can expand to clinicians and doctors, beyond Stanford.
Pharmaceutical companies could use it for clinical trials; chemotherapy doesn’t work for certain genetic profiles.
Insurance companies are interested in genetic data.