Fast analysis of massive amounts of patient data with SAP Connected Health and DELL EMC and Intel technology
Ever wonder how real-time analytics can lead to better patient outcomes? A few months ago I reported on the joint activities by Intel, DELL EMC and SAP in the area of Connected Health, also known as Precision Medicine. This cooperation is progressing rapidly, with the three partners providing optimized solutions for joint customers. SAP HANA, and more specifically the SAP Connected Health platform, allows integration and fast analysis of Big Data of biomedical origin. This makes it possible to use real-world evidence, for instance in research, and ultimately to improve patient outcomes.
The combination of this SAP software with high performance compute technology and platforms makes it possible to bring these types of analysis to the next level. SAP HANA and SAP Connected Health platform have been optimized for and are powered by Intel® Xeon® processors, specifically tuned for SAP HANA workloads. In addition, DELL EMC has established a Reference Architecture for these software solutions, located on servers in the DELL EMC Center of Excellence. This system is built to handle the variety, velocity, and volumes of big data analytics, and to deliver outstanding scalability, performance, and reliability for high-impact health data analytics and enterprise data centers.
Highlights of this Reference Architecture include:
- The Dell® PowerEdge™ R930, Dell’s most powerful enterprise server platform, is built for speed and scalability while offering value-added features that enhance management and reliability. With 96 DIMM slots and 24 hard drives, the system scales to handle the most demanding workloads.
- The Intel® Xeon® processor E7 family, Intel’s high-end server processors, combines large memory capacities with leading performance and reliability capabilities to provide a responsive experience with SAP’s in-memory database analytics and large, complex data sets.
But how fast are such analyses in reality?
To investigate this in more detail, a joint team across the three companies has carried out benchmarking to gain deeper insights.
Disclaimer: naturally these results are highly dependent on the actual set-up of the system, as well as the complexity of the data and specific configurations on site. For that reason these results are illustrative only: they are by no means exhaustive, they are not guaranteed to be fully reproducible, and they are not commitments by the three companies.
For system performance benchmarking, we used the SAP Connected Health platform and SAP Medical Research Insights to analyze and visualize patient cohorts with regard to cardiovascular risk factors, based on a data set of 2 million (mock) patients, each associated with information such as smoking habits and blood pressure. On average, 400 data points were associated with each individual, which led to a total of 800 million data points across the entire cohort. Results were visualized with SAP Medical Research Insights. Analytic scenarios included:
- Identification of individuals with elevated blood pressure (pre-hypertension, hypertension grade I, hypertension grade II)
- Stratification of individuals by their smoking habit (smokers, non-smokers)
- Stratification into populations by age at the time of blood pressure measurement (e.g. ages 50-59, 60-69, 70-80 years old)
- Determination of a yearly blood pressure average, created by at least ten individual blood pressure measurements per year (with data quality requirement/check)
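To make the scenarios above more concrete, here is a minimal Python sketch of how a single patient could be assigned to a cohort stratum. It is not how SAP Medical Research Insights implements these queries. The record fields (`measurements`, `birth_year`, `smoker`) are hypothetical, the systolic thresholds are illustrative assumptions rather than the values used in the benchmark, and age is approximated as the measurement year minus the birth year.

```python
from collections import defaultdict
from statistics import mean

# Data quality requirement from the text: at least ten readings per year.
MIN_MEASUREMENTS_PER_YEAR = 10


def yearly_systolic_average(measurements):
    """Map year -> mean systolic pressure, keeping only years with enough readings.

    `measurements` is an iterable of (year, systolic_mmHg) pairs.
    """
    by_year = defaultdict(list)
    for year, systolic in measurements:
        by_year[year].append(systolic)
    return {
        year: mean(values)
        for year, values in by_year.items()
        if len(values) >= MIN_MEASUREMENTS_PER_YEAR
    }


def bp_category(systolic):
    """Classify a systolic value. Thresholds are ASSUMED for illustration only."""
    if systolic >= 160:
        return "hypertension grade II"
    if systolic >= 140:
        return "hypertension grade I"
    if systolic >= 120:
        return "pre-hypertension"
    return "normal"


def age_band(age):
    """Return the age band named in the text, or None if the age is outside them."""
    if 50 <= age <= 59:
        return "50-59"
    if 60 <= age <= 69:
        return "60-69"
    if 70 <= age <= 80:
        return "70-80"
    return None


def stratum(patient, year):
    """Return (blood pressure category, smoking status, age band) or None.

    `patient` is a hypothetical dict with keys `measurements`, `birth_year`
    and `smoker`. None means the patient fails the data quality check for
    that year or falls outside the age bands.
    """
    averages = yearly_systolic_average(patient["measurements"])
    if year not in averages:
        return None
    band = age_band(year - patient["birth_year"])
    if band is None:
        return None
    smoking = "smoker" if patient["smoker"] else "non-smoker"
    return (bp_category(averages[year]), smoking, band)
```

A patient born in 1950 with ten readings of 150 mmHg in 2015 would fall into the grade I, 60-69 stratum under these assumed thresholds, while the same patient with only nine readings would be excluded by the data quality check.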
These different factors were used to create patient cohorts, which were subsequently computed and visualized in Kaplan-Meier plots. The end-to-end response times of these (and other) analytic service requests were measured. The average response time was 1.68 seconds (range: 0.83 to 2.30 seconds).
This illustrates the ability of this system to deliver high-speed analyses on large amounts of clinical data. Analytical speeds like these greatly facilitate the process of “defining & discarding” working hypotheses, ultimately driving fast interpretation of data and quicker insights. This can greatly support the improvement of patient outcomes.
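For readers unfamiliar with the Kaplan-Meier plots mentioned above, the following is a minimal sketch of the standard Kaplan-Meier (product-limit) estimator in plain Python. It only illustrates the kind of computation behind such a plot; it is not the code used in the benchmark, and the function name and input layout are assumptions made for this example.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient.
    events:    True if the event was observed at that time,
               False if the patient was censored (left the study event-free).
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed events and number of patients leaving at each time.
    event_counts = Counter(t for t, observed in zip(durations, events) if observed)
    exit_counts = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= exit_counts[t]
    return curve


# Toy example: five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Computing one such curve per stratum (for example smokers versus non-smokers) and plotting the curves as step functions gives the familiar cohort comparison.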
In summary, here we showcase the power of the combination of these SAP products and Dell and Intel hardware. This Reference Architecture, or variations of it, can be delivered to customers as one optimized bundle.
Many thanks to Jens Rannacher (SAP), Marten Neubauer (DELL EMC) and Stefan Englet (Intel).
Circos plot to compare genomic data of patient cohorts across a large number of patient characteristics.
Comparison of genomes (e.g. from biopsy material) at the gene variant level across genomes, allowing real-time “zooming-in” to the nucleotide level.
Kaplan-Meier plot of cohort patients across a subset of characteristics.