Machine Learning in a Box (week 4) : Get your environment up and running
If you are jumping on a moving train, here is the link to the introduction blog of the Machine Learning in a Box series, which lets you follow the series from the start. At the end of that introduction blog you will find links to each element of the series.
Before we get started, a quick recap from last week
Last week, we looked at the algorithm learning styles. I know that for many of you this is a lot of theory, and I can already feel how impatient some of you are to start writing some code.
So I promise that from now on there will be less theory and more hands-on!
Welcome to week 4 of Machine Learning in a Box!
Get your environment up and running
There are 2 things that I learned by working with developers:
- they all have their own set of tools and ways to use and customize them
- they will try to convince you that their choice is the best
So I won’t try to convince you to use A or B as a tool, except for one thing: SAP HANA, express edition!
SAP HANA, express edition will be at the core of this blog series (at least for now, but we will start looking at other Machine Learning languages and technologies later too).
Hardware Specifications (sounds like a section from a 90’s PC Magazine, doesn’t it?)
So here is my Machine Learning Box:
Not the big one (which is my SAP machine), but the small one: the Intel NUC!
This little box (the 10 by 10 by 2.5 cm box) contains:
- an i5-5250U Processor (2 cores)
- 16 GB of DDR3 RAM
- 60 GB SSD drive
- SUSE Linux Enterprise for SAP Applications
The big one (which is really bigger) is a Lenovo P51 which contains:
- an i7-7820HQ Processor (4 cores)
- 64 GB of DDR4 RAM
- 500 GB + 1 TB SSD drives
- Windows 10 Pro
I use the big one to spin up multiple virtual machines to play with larger datasets or long-running processes.
But don’t be scared: I don’t expect you to have something like the big one, and I will always share the minimum and maximum requirements with you if a bigger box is required.
So, the Intel NUC will be my “go to” choice.
If you really don’t have that kind of hardware (even the small one), you will still be able to leverage some of the cloud options.
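If you want to see how your own machine compares with the small box, here is a minimal Python sketch using only the standard library. The reference values are simply the Intel NUC specs listed above; treating them as a baseline, the 10% tolerance and the function names are assumptions made for illustration, not official SAP HANA, express edition sizing requirements.

```python
import os
import shutil

# Specs of the Intel NUC from this post. Using them as a baseline is an
# assumption for illustration, not an official SAP sizing requirement.
REFERENCE = {"cores": 2, "ram_gb": 16, "disk_gb": 60}

# The OS usually reports a bit less than the marketed size (a "16 GB"
# machine shows roughly 15.5), so allow a 10% margin. Also an assumption.
TOLERANCE = 0.9

GIB = 1024 ** 3


def total_ram_gb():
    """Return physical RAM in GiB, or None if the platform cannot report it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / GIB
    except (AttributeError, ValueError, OSError):
        # os.sysconf does not exist on Windows; some names are missing elsewhere.
        return None


def measure(path="/"):
    """Collect the logical CPU count, RAM and total disk size for `path`."""
    return {
        "cores": os.cpu_count(),  # logical CPUs, so hyper-threads count too
        "ram_gb": total_ram_gb(),
        "disk_gb": shutil.disk_usage(path).total / GIB,
    }


def compare(measured, reference=REFERENCE):
    """Return True/False per metric, or None when the value is unknown."""
    result = {}
    for key, wanted in reference.items():
        value = measured.get(key)
        result[key] = None if value is None else value >= wanted * TOLERANCE
    return result


if __name__ == "__main__":
    measured = measure()
    for key, ok in compare(measured).items():
        status = "unknown" if ok is None else ("ok" if ok else "below the NUC")
        print(key, measured[key], status)
```

A "below the NUC" result does not mean you are stuck: as mentioned above, the cloud options remain available.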
Which SAP HANA, express edition?
What you will need is a SAP HANA, express edition Server-only instance. As simple as that (for the moment)!
If you have a Server + Apps instance, that’s fine, but we won’t use the application services for now.
The content I’ll produce will be based on version 2.0 SPS02.
Many of you may already have SAP HANA, express edition running locally, as a virtual machine, in the cloud or as a container. That’s great, and we will see whether you can use that instance.
For those who don’t have an instance running, I invite you to visit the SAP HANA, express edition product page on the SAP Developer Center. There, you will find all the information you need to decide where to run it and to get your instance up and running in no time.
Once you have your instance running, you can run the following tutorial: Prepare your SAP HANA, express edition instance for Machine Learning.
You will get to choose which SQL query tool you plan to work with. I have created content that addresses more or less every connectivity option (feel free to prove me wrong).
I have a personal preference for the SAP HANA Tools for Eclipse as I also use Eclipse for my Java development projects.
If you plan on using Eclipse, make sure you use either Neon or Oxygen, especially if you want to use the Docker image.
Of course, you will need a text editor, and maybe Excel. But that will be it for the moment.
Later, we might start playing with SAP Predictive Analytics or RStudio. In the meantime, let’s keep it simple.
Now, you should have your HXE tenant ready to run algorithms. Next week, we will start uploading some datasets.
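Before opening your SQL query tool, you may want a quick check that the tenant's SQL port is reachable from your workstation. The sketch below is a plain TCP test using Python's standard library. It does not log in or run any SQL, and the function name port_is_open is made up for this example.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects; the
        # timeout applies to the connection attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unknown host names.
        return False
```

For example, port_is_open("hxehost", 39015) would test the host name and HXE tenant SQL port that are commonly used with a default install (instance number 90). Both values are assumptions, so replace them with the ones from your own setup. A False result only tells you the port is unreachable from where you ran the check, which may just as well be a firewall or a hosts file entry as a stopped instance.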
(Remember sharing && giving feedback is caring!)
UPDATE: Here are the links to all the Machine Learning in a Box weekly blogs:
- Introducing “Project: Machine Learning in a Box”
- Machine Learning in a Box (part 2) : Project Methodologies
- Recap Machine Learning in a Box (part 2) : Project Methodologies
- Machine Learning in a Box (part 3) : Algorithms Learning Styles
- Machine Learning in a Box (part 4) : Get your environment up and running
- Machine Learning in a Box (part 5) : Upload Machine Learning Datasets
- Machine Learning in a Box (part 6) : SAP HANA R Integration
- Machine Learning in a Box (part 7) : Jupyter Notebook
- Machine Learning in a Box (part 8) : SAP HANA EML and TensorFlow Integration
- Machine Learning in a Box (part 9) : Build your first Machine Learning application
- Machine Learning in a Box (part 10) : JupyterLab