Vitaliy-R · ‎03-10-2020

Today (March 9th) is the Day of Polish Statistics, celebrated here in Poland. What does it have in common with Docker containers? Well, to demonstrate today's topic I will use Jupyter -- software very popular among data scientists these days (formerly known as statisticians ;-))

So far I have used Jupyter primarily as an integral part of SAP Data Intelligence. But after the recent post Updates for the Data Scientist, building SAP HANA embedded Machine Learning scenarios from Python or... from christoph.morgen, I wanted to run hana_ml 1.0.8 for Python on my own local Jupyter instance.

What's more, I wanted:

to experiment with Jupyter and different Python libraries and their versions; containers are ideal for that, because they avoid messing up the software and configuration on my laptop,

to be able to keep the so-called notebooks created with Jupyter on my laptop when I delete and recreate containers.

Let’s have a look at how to share files between containers and the host computer. In Docker this is done using bind mounts. Docker primarily recommends using volumes, a different mount type, but for today's exercise bind mounts do the job: sharing artifacts between a development environment on the Docker host and a container.

Jupyter delivers several ready-to-use images...

...on Docker Hub: https://hub.docker.com/u/jupyter/, but for our needs even the one with minimal software preinstalled should be sufficient: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-minimal-notebook.

Run the container with a bind-mount

To save time, here is what I already know:

Jupyter exposes its web notebook UI on port 8888,

these ready-to-use containers have a work folder /home/jovyan/work/.

The local folder I have already created on my macOS laptop is ~/Projects/Notebooks, and I want to mount it as /home/jovyan/work/myprojects in the container.

Let me create a file helloContianer.txt in that source folder first...

touch ~/Projects/Notebooks/helloContianer.txt

...and then run the new container.

docker run -p 8888:8888 --mount type=bind,source=$HOME/Projects/Notebooks,target=/home/jovyan/work/myprojects --name myjupyter01 jupyter/minimal-notebook

As you can see, I am using the --mount option of the docker run command. The --mount syntax is self-descriptive; here it takes three attributes: type (with the value bind for today's needs), source, and target.

The Docker documentation says: "When you use a bind mount, a file or directory on the host machine is mounted into a container." In our case the content of the directory ~/Projects/Notebooks (just helloContianer.txt for now) is visible inside the container in the folder /home/jovyan/work/myprojects.

With --mount, the source directory on the host must exist before it is used. If it does not, Docker does not create it for you and reports an error instead.
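One way to guard against that error is to create the source directory before starting the container. The following is a minimal sketch using the paths from this post; the variable names SRC, TARGET and MOUNT_SPEC are only illustrative, and the script prints the docker run command rather than executing it.

```shell
# Illustrative variable names; the paths are the ones used in this post.
# SRC can be overridden from the environment.
SRC="${SRC:-$HOME/Projects/Notebooks}"
TARGET="/home/jovyan/work/myprojects"

# --mount does not create a missing source directory, so create it up front.
# mkdir -p is a no-op when the directory already exists.
mkdir -p "$SRC"

# Build the bind-mount specification once, so it is easy to inspect.
MOUNT_SPEC="type=bind,source=$SRC,target=$TARGET"

# Print the command to run; copy it or replace echo with the real call.
echo "docker run -p 8888:8888 --mount $MOUNT_SPEC --name myjupyter01 jupyter/minimal-notebook"
```

Printing the command first makes it easy to check that $HOME expanded to the path you expect before the container is created.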

Open Jupyter

When starting, the container with Jupyter printed the following text:

...

[I 23:34:22.656 NotebookApp] The Jupyter Notebook is running at:

[I 23:34:22.656 NotebookApp] http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37

[I 23:34:22.656 NotebookApp]  or http://127.0.0.1:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37

[I 23:34:22.656 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

[C 23:34:22.661 NotebookApp]



    To access the notebook, open this file in a browser:

        file:///home/jovyan/.local/share/jupyter/runtime/nbserver-7-open.html

    Or copy and paste one of these URLs:

        http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37

     or http://127.0.0.1:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37

Because this URL is generated inside the container, we need to replace two things in http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37:

the hostname b092c37f0b79 with localhost, and

the port 8888 with the port published on the host (which in my case is 8888 as well).

Let me open http://localhost:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37 (in your case the token will be different) in the web browser. If the URL is correct, I will be redirected to the Jupyter file browser.
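The hostname and port replacement can also be scripted. This is only a sketch: CONTAINER_URL holds the URL copied from the log above, and HOST_PORT is an illustrative variable for whatever host port you published with -p.

```shell
# URL exactly as printed in the container log (your token will differ).
CONTAINER_URL='http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37'

# Host side of the -p HOST_PORT:8888 mapping (illustrative variable).
HOST_PORT=8888

# Swap "container-hostname:port" for "localhost:HOST_PORT", keep path and token.
HOST_URL=$(printf '%s\n' "$CONTAINER_URL" \
    | sed -E "s#^http://[^:/]+:[0-9]+#http://localhost:${HOST_PORT}#")

echo "$HOST_URL"
# prints http://localhost:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
```

If you published a different host port, for example -p 9999:8888, setting HOST_PORT=9999 gives the URL to open on the host.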

Then, following the path, I can get to the work/myprojects folder and click on helloContianer.txt to open it for editing.

Let's type Hello from the container! and save the file in Jupyter UI...

...then check the content of the file on the host computer with cat ~/Projects/Notebooks/helloContianer.txt.
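If you prefer to verify the sharing from the command line as well, something like the following should work from a second terminal while the container is running. It is a sketch under the assumptions of this post (container name myjupyter01, the same paths); the variable names are illustrative.

```shell
CONTAINER="myjupyter01"
HOST_FILE="$HOME/Projects/Notebooks/helloContianer.txt"
CONTAINER_FILE="/home/jovyan/work/myprojects/helloContianer.txt"

# docker exec only works against a running container, so probe for that first.
if command -v docker >/dev/null 2>&1 && docker exec "$CONTAINER" true 2>/dev/null; then
    # Read the file from inside the container and compare it byte by byte
    # with the copy on the host; cmp reads the container side from stdin.
    docker exec "$CONTAINER" cat "$CONTAINER_FILE" | cmp - "$HOST_FILE" \
        && echo "Host and container see the same file."

    # Show how Docker itself describes the mounts of this container.
    docker inspect --format '{{ json .Mounts }}' "$CONTAINER"
else
    echo "Container $CONTAINER is not running; start it first."
fi
```

The JSON printed by docker inspect should list an entry of type bind with the host folder as Source and /home/jovyan/work/myprojects as Destination.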

To stop the container...

...you can either press Ctrl+C in the terminal where you executed docker run... command, or click [Quit] in the Jupyter UI and close the web browser window with it.

And the next time you start it...

...I would suggest starting it in the attached and interactive modes.

docker start -ai myjupyter01

This way you see all the messages from Jupyter in the terminal and can stop it with Ctrl+C when needed.
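A stopped container keeps its name, so running docker run --name myjupyter01 a second time fails with a name conflict; that is why docker start is used here. A small sketch that picks the right command for you could look like this. NEXT_CMD is an illustrative variable, and the script only prints the command instead of running it.

```shell
CONTAINER="myjupyter01"

# List the names of all containers (running or stopped) and look for an
# exact match. Errors (no docker CLI, daemon not running) count as "no match".
if docker ps -a --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
    # The container already exists: restart it attached and interactive.
    NEXT_CMD="docker start -ai $CONTAINER"
else
    # No such container yet: it has to be created with docker run first.
    NEXT_CMD="docker run -p 8888:8888 --mount type=bind,source=$HOME/Projects/Notebooks,target=/home/jovyan/work/myprojects --name $CONTAINER jupyter/minimal-notebook"
fi

echo "$NEXT_CMD"
```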

I know...

...we just scratched the surface this time, without doing more useful exercises with Jupyter (or, even better, with Jupyter Lab), like connecting it to OrientDB or SAP HANA. Nor did we dig into the details of the different types of mounts in Docker.

2020-03-11 Update: I did publish a new post, where I use this container with Jupyter to Quickly load Covid-19 data with hana_ml and see with DBeaver.

Hopefully, we will do all of this before World Statistics Day, which is celebrated on October 20th ... every five years. So far it was celebrated in 2010 and 2015, and the next one will be -- you guessed it! -- this year! We still have 7 more months to practice our data science skills in general and statistics in particular!

Stay hungry. Stay foolish.

-Vitaliy (aka @Sygyzmundovych)

Understanding containers (part 05): shared files between the host and containers
