How to get into Machine Learning, my networking session at SAPTechEd Barcelona
At SAPTechEd Barcelona 2016 I was fortunate enough to be a community speaker. I did a lecture on machine learning, something I’ve shared here before.
Meanwhile I’ve published the slides on Slideshare, in case you’re interested.
I also did a follow up networking session that focused more on how to get into machine learning yourself. Because the slides of this short session were rather boring (being just collections of links to books, MOOCs etc) I thought it would be a better idea to write another blog post here to cover the details.
So how to get into machine learning? Of course, there are lots of options, from simply dabbling with code libraries to enrolling in a PhD program. I wouldn’t really recommend either extreme, though.
In my opinion it’s important, especially with machine learning, to understand the basics before you really dive in. I know that many developers like to get to know a topic by just starting to write software and by learning from there. The ‘problem’ with machine learning is that the code is often fairly easy to write. Technically it’s not really a challenge, since there are numerous libraries providing easy access to different algorithms. However, using those libraries will teach you very little about machine learning itself. You will essentially be using the library as a black box. So, even if you understand what comes out of an algorithm, or an evaluation metric, you will still not have learned much about machine learning.
At the other extreme you could start with some education program and first learn a lot of mathematics (a prerequisite for the more theoretical part of machine learning), which you could then follow up with a deep dive into the different algorithms and their variations. This could easily take you two, maybe three years. And at the end you would hardly have learned how to do machine learning in practice.
So, what is the middle road here? My suggestion is to read some introductory materials first, to give you an appreciation for what machine learning is. You’ll get familiar with some central topics that will always come back, no matter what algorithms you use, or what kind of problems you want to solve. My primary recommendation for this phase: the Data Science for Business book.
Then, you might want to do a Massive Open Online Course (aka MOOC) on Machine Learning, and there are many nowadays. You could start with the very popular Coursera Machine Learning MOOC by Andrew Ng. Another MOOC that contains more mathematics but also goes more in depth is the Stanford one on Statistical Learning, by Trevor Hastie and Rob Tibshirani.
And there are lots of others. One site that I’ve found useful to get a feeling for what works and what doesn’t is Bill Kymler’s blog. He has been making the transition to data scientist over the past year, and has rated everything he’s done. Certainly worth a read!
Once you have gained an understanding of what machine learning is, how it works, and what kind of problems it solves, it’s important to get your hands dirty.
Here I don’t yet have any recommendations as it’s exactly where I am at the moment ;-).
What I have done in the past, in order to get at least some hands-on experience, is take more MOOCs, especially a couple that went into Machine Learning & Big Data. Using Databricks Community Edition (it’s free!) you get to work on Spark and do some machine learning at the same time. Interesting MOOCs are:
Big Data Analysis with Apache Spark (edX, Berkeley CS110x)
Distributed Machine Learning with Apache Spark (edX, Berkeley CS120x)
I also recommend Jason Brownlee’s blog, with lots of practical advice. I’ve bought a few of his very hands-on books, and those are going to be my next steps into machine learning. But no official recommendations yet…
To finish off, here are some more links that I’ve found interesting.
Neural Networks and Deep Learning: an online book written by Michael Nielsen, who has a real gift for explaining things clearly. Recommended, even if I’ve not yet finished reading it.
Statistics with R Specialization, Coursera by Mine Çetinkaya-Rundel
Machine Learning Specialization, Coursera by Emily Fox and Carlos Guestrin
Data Sci Guide
Data Science Masters
Path to Data Science
Free books on Machine Learning
Although there are tons of other resources, these should be enough to keep you busy for the next couple of years ;-).
Happy Machine Learning!
I can personally recommend Andrew Ng's Coursera course as a starter for any ML field... He is a great lecturer who has a knack for explaining the complex, plus it covers a lot of the different ML techniques and algorithms and doesn't focus purely on deep neural nets.
This course doesn't require any heavy programming and uses MATLAB (free version available) for the exercises, so you can concentrate purely on the algorithm and not on how to get things to work in a programming language.
Only basic math is required - like simple derivatives and matrix multiplication - that's pretty much it.
Beware though - you will feel sad when it finishes and want more!
Fred, thanks for sharing. Curious to know, though, how it is related to your day job as a Big Data consultant? Cheers!
Just noticed your comment (which shows how often I visit SAP Community these days). Well, I wish I were a Big Data Consultant in my day job (I'm not there yet) because at least it would then be related somehow. Now, in my real life (and not a parallel universe) I'm still doing ABAP development, and this is all just hobby. A fun hobby, but still...
Thank you for this blog.
I am fascinated as well by the whole data science spectrum and have been completing courses on Python, R, statistics and machine learning on Coursera and edX. Doing a lot of reading as well.
Bill Kymler's blog is brilliant and has been a good guide. I am looking at doing a course on Business Analytics locally and hoping to meet some experienced professionals for guidance on applying my basic and scattered knowledge to real-life scenarios.
Good luck to you.