SAP Data Geek Challenge!! – Now SAP Lumira Helps NASA!!
I am one of those people with a great passion for space, and especially for everything beyond Earth. Since childhood, space has filled my dreams at night: visiting space (always without a spacecraft or space suit), meeting aliens (very beautiful purple women), seeing big stars (shaped like stars) and shooting stars. It is all about exploring the unknown and the unlimited. Yet I never had a chance to learn the technology behind this research, even though I sometimes wished to become a space scientist. My involvement was limited to watching space movies and following the latest news about newly discovered planets.
SAP Lumira has lit up my dreams in a very practical way. This is going to be my “dream” project.
Through my contacts I came to know about the NASA open data challenge: a large amount of NASA research data is available to the public, so that people can develop better products that help with NASA's big data challenges. With the help of two friends (non-SAP), I searched for the best and simplest data source for our analysis. My friends pushed me to choose climate data, but my passion for space did not let me think of anything other than space science data, even though we knew nothing about space-related terms. We kept thinking about the data for three months without taking any action, because the open NASA data is huge and comes in all kinds of formats. I am sure that even very experienced data geeks cannot work on this data without the help of field experts who can interpret it.
But the US government shutdown helped a lot. During the shutdown, I received a reply from NASA scientists whom I had emailed three months earlier, and they helped me choose the right track of data. They even encouraged me to participate in their NASA big data challenge 🙂. Below is my data source.
The true challenge is not the size of the data; it is getting predictive and analytic results out of it. To get results, we must dedicate some time to studying the data: go through the data sets and become familiar with the terms before starting the analysis. Here is a very short description of what my data is about and how it is used for analysis.
Kepler is a space telescope launched by NASA to discover Earth-like planets orbiting other stars. It finds planets mainly with the transit method, watching for the small dip in a star's brightness when a planet passes in front of it. The size, distance, velocity and other properties of a candidate can be derived from the observations. Not every object detected by the instrument is a real planet; sometimes it turns out to be a false positive, such as an eclipsing binary. For more information on Kepler data please visit http://en.wikipedia.org/wiki/Kepler_spacecraft
We created our own Excel data set from the metadata on the above site, using features such as merge, number conversion, and date and time conversion.
These are a few of the ways I used to get some very meaningful analytic results; with the same methods, we could get 100+ visualisations out of the data.
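For readers who want to try a similar preparation outside Lumira, the merge and conversion steps can be sketched in plain Python. This is only an illustration and not the way our Excel sheet was actually built: the two CSV extracts, their column names and their values are all invented.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw extracts; column names and values are invented for illustration.
CANDIDATES_CSV = """name,radius_earth,discovered
Candidate-A,"1,204.5",2011-02-01
Candidate-B,2.3,2012-12-05
"""
STARS_CSV = """name,star_temp_k
Candidate-A,5627
Candidate-B,4402
"""


def to_number(text):
    """Number conversion: turn '1,204.5' into 1204.5; empty text becomes None."""
    text = text.replace(",", "").strip()
    return float(text) if text else None


def load(csv_text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def prepare(candidates_text, stars_text):
    """Merge the two extracts on 'name' and convert numbers and dates."""
    stars = {row["name"]: row for row in load(stars_text)}
    merged = []
    for row in load(candidates_text):
        star = stars.get(row["name"], {})
        merged.append({
            "name": row["name"],
            "radius_earth": to_number(row["radius_earth"]),
            "star_temp_k": to_number(star.get("star_temp_k", "")),
            # Date conversion: ISO text to a real date object.
            "discovered": datetime.strptime(row["discovered"], "%Y-%m-%d").date(),
        })
    return merged


rows = prepare(CANDIDATES_CSV, STARS_CSV)
for row in rows:
    print(row)
```

Once the text is converted to real numbers and dates, every later step (ranking, filtering, counting per year) becomes a simple operation on the rows.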
1. Impact of distance on Density and Mass (using Scatter Graph)
This is very interesting: the chart clearly shows that distance has a big impact on mass.
2. Year of Discovery
It is very clear that after the Kepler launch, the number of newly discovered planets increased dramatically.
In the last couple of years Kepler has also helped to discover even more planets. There might be many reasons behind that, such as technical improvements in the discovery methods used on Kepler, or we could try to relate it to other constraints like gravity. This is what data scientists work for: trying to get the various possibilities out of the data that is already available.
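The count behind such a chart is simple to reproduce. Here is a minimal sketch in Python, assuming the discovery year has already been extracted for each planet; the years in the list are made up for illustration and are not real Kepler figures.

```python
from collections import Counter

# Made-up discovery years, for illustration only (not real Kepler figures).
discovery_years = [2008, 2009, 2009, 2011, 2011, 2011, 2012, 2012, 2012, 2012]


def discoveries_per_year(years):
    """Count how many discoveries fall in each year, ordered by year."""
    return dict(sorted(Counter(years).items()))


per_year = discoveries_per_year(discovery_years)
for year, count in per_year.items():
    # A crude text bar chart: one '#' per discovery.
    print(year, "#" * count)
```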
3. Top 5 Photometric measurements
I used SAP Lumira’s ranking functionality to get the top 5 out of 10,000+ entries.
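The idea of a ranking is the same in any tool: keep only the N largest values of a measure. A small sketch in Python, using the standard `heapq.nlargest` function; the candidate names and values are invented, and this is not how Lumira implements its ranking.

```python
import heapq

# Hypothetical (candidate, measurement) pairs; names and values are invented.
measurements = [
    ("Candidate-A", 11.2), ("Candidate-B", 15.9), ("Candidate-C", 9.4),
    ("Candidate-D", 14.1), ("Candidate-E", 12.7), ("Candidate-F", 16.3),
    ("Candidate-G", 10.8),
]


def top_n(records, n=5):
    """Return the n records with the largest measurement, largest first."""
    return heapq.nlargest(n, records, key=lambda record: record[1])


top_five = top_n(measurements)
for name, value in top_five:
    print(name, value)
```

`heapq.nlargest` avoids sorting the whole list, which helps when only a handful of rows are wanted out of many thousands.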
4. Number of False Positives
Not every Kepler detection is counted as a planet; sometimes it turns out to be a false positive, that is, another kind of object such as an eclipsing binary. In less than a second, I could get the list of confirmed planets out of all the measurements.
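Separating confirmed planets from false positives is a plain filter on one column. The sketch below assumes each record carries a `disposition` field with values such as `CONFIRMED`, `CANDIDATE` and `FALSE POSITIVE`; the field name and the sample records are assumptions made for this example.

```python
from collections import Counter

# Invented sample records; the 'disposition' field name is an assumption.
records = [
    {"name": "Object-1", "disposition": "CONFIRMED"},
    {"name": "Object-2", "disposition": "FALSE POSITIVE"},
    {"name": "Object-3", "disposition": "CANDIDATE"},
    {"name": "Object-4", "disposition": "CONFIRMED"},
    {"name": "Object-5", "disposition": "FALSE POSITIVE"},
]

# How many objects fall into each category.
counts = Counter(record["disposition"] for record in records)

# Keep only the confirmed planets.
confirmed = [r["name"] for r in records if r["disposition"] == "CONFIRMED"]

print(dict(counts))
print(confirmed)
```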
5. Highest Temperature
Using a tag cloud, we can see the candidate with the highest temperature.
6. Density (Close look!!!)
This chart gives a very detailed comparison of candidate density with Earth's. If we look closely, our Earth is only 3.72%.
7. Mass and Radius variation on Earth and Jupiter using Story Board
This is four-dimensional data, so I used a combined line and column chart with Storyboard to build a meaningful visual composition.
8. Discovery Methods and Proper Motion comparison
This comparison could help the Kepler team optimize their identification methods.
9. Number of General, Light, Transit Curves
This is one of the most sensitive analyses, and it is mostly used in the identification of false positives.
10. Mass and Radius on Earth Detail Scatter Matrix
I thank everyone, including the NASA people, who helped me with this wonderful work. Without all your support, this would not have been possible. The above is just the beginning. I am very much looking forward to a predictive analysis that estimates the next planet discoveries, and I might share that information in the future.
Superb blog with amazing analysis!!! I am going to learn more about the term "Kepler".
Thank you and your friends for this brilliant work. 🙂
Thanks a lot for your nice comments. Like Kepler, space research has a lot more interesting things. Explore and enjoy.
Very nice blog. Excellent analysis! I much appreciate your effort. Looking forward to the predictive analysis of planet discoveries 🙂
Thanks a lot. Yes, we are trying a predictive estimation analysis, but it is very risky.
I am also looking for a way to interpret the original image data, which I have no idea how to do.
But it is all very interesting!
Hope I meet you @teched.
Wow what an impressive analysis. You did find an interesting dataset! 🙂
Hi Martin Grob
Thanks a lot for your support too; it was with your guidance that I converted the metadata to Excel.
Jansi, thank you for sharing your data analysis results.
It's a good idea to show the different delivered charts of Lumira in the context of some non-business facts.
By the way .. NASA and SAP Products .. that's anyhow a good combination:
Thanks. I chose NASA because that was the real challenge for everyone in the big data sector. The data is really huge and complicated, and we can do many things with it to show the capability of SAP HANA. Will try soon.
This is very good. Excellent analysis, I have lots of things to learn.
Thanks to you and your team for this analysis.
You have really taken a good step: an out-of-the-box scenario, rather than just trying to use the tool within the SAP modules. Appreciation and best wishes for you to continue with this.
Fantabulous analysis. I understand how much time you spent on this. I appreciate it.
Keep sharing many things like this and motivate others to generate many reports using the SAP Lumira application.
Interesting post. I appreciate you taking your valuable time to post this document.
Well written, fantastic presentation, overall superb. Once again thanks for sharing..!!!
Great entry, Jansi! I've posted this on our Analytics Facebook page: https://www.facebook.com/photo.php?fbid=623833317677829&set=a.114001285327704.15097.109110495816783&type=1
Very innovative and educational. I really appreciate your choice of data. By the way, have you got any feedback on this from NASA? Eager to know if Lumira can help ISRO?
It was my good fortune that during the US shutdown, the NASA person helped me a lot to get the data set and encouraged me greatly.
After work resumed, there has been no update from them 🙁. I even sent this info to the Kepler team at NASA. I am looking forward to their feedback. Will update if I get any.
I am not sure whether ISRO has shared any open data set.
Great work Jansi,
Thanks for sharing