Today (March 9th) is the Day of Polish Statistics celebrated here in Poland. What does it have in common with Docker containers? Well, to demonstrate today’s topic I will use Jupyter, a piece of software very popular among today’s data scientists (formerly known as statisticians ;-)).
So far I have used Jupyter primarily as an integral part of SAP Data Intelligence. But after the recent “Updates for the Data Scientist, building SAP HANA embedded Machine Learning scenarios from Python or R” from Christoph Morgen, I wanted to run hana_ml 1.0.8 for Python on my own local Jupyter instance.
What’s more, I wanted:
- to experiment with Jupyter and with different Python libraries and their versions; containers are ideal for that, because they avoid messing up the software and configuration on my laptop,
- to be able to keep the so-called notebooks created with Jupyter on my laptop when I delete and recreate containers.
Let’s have a look at how to share files between containers and the host computer. In Docker this is done with bind mounts. Docker primarily recommends using volumes, a different mount type, but for today’s exercise bind mounts do the job: sharing artifacts between a development environment on the Docker host and a container.
Jupyter delivers several ready-to-use images…
…on Docker Hub: https://hub.docker.com/u/jupyter/, but for our needs even the one with minimal software preinstalled should be sufficient: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-minimal-notebook.
Run the container with a bind-mount
To save time, I already know that:
- Jupyter exposes its web notebook UI on port 8888
- These ready-to-use containers have a work folder
The local folder I already created on my laptop running macOS is ~/Projects/Notebooks, and I want to mount it as /home/jovyan/work/myprojects in the container.
Let me create a file helloContianer.txt in that source folder first…
…and then run the new container.
docker run -p 8888:8888 --mount type=bind,source=$HOME/Projects/Notebooks,target=/home/jovyan/work/myprojects --name myjupyter01 jupyter/minimal-notebook
As you can see, I am using the --mount option of the docker run command.
The --mount syntax is self-descriptive, with three mandatory attributes: a type (with the value bind for today’s needs) plus a source and a target.
The Docker documentation says: “When you use a bind mount, a file or directory on the host machine is mounted into a container.” In our case the content of the directory ~/Projects/Notebooks (empty for now) is visible inside the container in the folder /home/jovyan/work/myprojects.
The source directory on the host must exist before it is used. With --mount, Docker does not create a missing directory for you; it reports an error instead.
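Because a missing source directory stops the container from starting, a small pre-flight step can help. Below is a minimal sketch, not something I ran in this post: the helper names ensure_bind_source and bind_mount_arg are made up for illustration, and the script only prints the resulting docker run command instead of executing it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not Docker features): prepare the bind-mount source
# directory and compose the value passed to --mount.

# Create the host directory if it is missing, so --mount does not fail.
ensure_bind_source() {
  mkdir -p "$1"
}

# Print a --mount value with the three attributes: type, source, target.
bind_mount_arg() {
  printf 'type=bind,source=%s,target=%s\n' "$1" "$2"
}

SRC="${SRC:-$HOME/Projects/Notebooks}"
TARGET="/home/jovyan/work/myprojects"

ensure_bind_source "$SRC"

# Only print the command here; remove "echo" to actually start the container.
echo docker run -p 8888:8888 --mount "$(bind_mount_arg "$SRC" "$TARGET")" \
  --name myjupyter01 jupyter/minimal-notebook
```

Creating the folder up front with mkdir -p is harmless when it already exists, which is why I would prefer it over checking for the folder by hand.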
As you can see the container with Jupyter returned the following text, when starting:
...
[I 23:34:22.656 NotebookApp] The Jupyter Notebook is running at:
[I 23:34:22.656 NotebookApp] http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
[I 23:34:22.656 NotebookApp] or http://127.0.0.1:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
[I 23:34:22.656 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 23:34:22.661 NotebookApp]
    To access the notebook, open this file in a browser:
        file:///home/jovyan/.local/share/jupyter/runtime/nbserver-7-open.html
    Or copy and paste one of these URLs:
        http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
     or http://127.0.0.1:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
Because this is inside the container, we need to replace in the URL:
- the hostname b092c37f0b79 with localhost,
- the port 8888 with the port used on the host (which in my case is 8888 as well).
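The URL rewrite described above can also be scripted. This is only a sketch under my own assumptions: to_host_url is a hypothetical helper name, and it expects a plain http URL of the form printed in the Jupyter log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: swap the container's hostname and port in a Jupyter
# URL for localhost and the port published on the host.
to_host_url() {
  local url="$1" host_port="$2"
  printf '%s\n' "$url" |
    sed -E "s#^http://[^/:]+:[0-9]+/#http://localhost:${host_port}/#"
}

# The URL from the container log above, with host port 8888.
to_host_url "http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37" 8888
```

If I were not sure which host port was published, docker port myjupyter01 8888 should show the mapping for the container port.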
Let me open http://localhost:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37 (in your case the token should be different) in the web browser. I will be redirected to the Jupyter file browser if the URL is correct.
And then, following the path, I can get to the work/myprojects folder and click on helloContianer.txt to open it for editing.
Let me type Hello from the container! there and save the file in the Jupyter UI…
…then check the content of the file on the host computer with
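One possible way to do that check from a terminal on the host is sketched here. The helper file_has_text is a hypothetical name of mine, and the path assumes the source folder used earlier in this post.

```shell
#!/usr/bin/env bash
# Hypothetical check: does the file on the host contain the text saved in Jupyter?
file_has_text() {
  [ -f "$1" ] && grep -qF -- "$2" "$1"
}

FILE="$HOME/Projects/Notebooks/helloContianer.txt"

if file_has_text "$FILE" 'Hello from the container!'; then
  # Show the file as the host sees it.
  cat "$FILE"
else
  echo "Text not found in $FILE" >&2
fi
```

Because the folder is bind-mounted, the file edited in the container and the file on the host are the same file, so the saved text should show up right away.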
To stop the container…
…you can either press Ctrl+C in the terminal where you executed the docker run... command, or click [Quit] in the Jupyter UI and close the web browser window with it.
And the next time you start it…
…I would suggest starting it in the attached and interactive modes.
docker start -ai myjupyter01
This way you see all the messages from Jupyter in the terminal and can stop it with Ctrl+C when needed.
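The two cases, the first run and every later start, could be wrapped into one small function. This is a sketch rather than something from my setup: start_or_create is a hypothetical name, and it relies on docker container inspect returning a non-zero exit code when no container with that name exists.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: reuse the container if it already exists,
# otherwise create it with the bind mount used in this post.
start_or_create() {
  local name="$1"
  if docker container inspect "$name" >/dev/null 2>&1; then
    # The container exists: start it attached and interactive.
    docker start -ai "$name"
  else
    # First run: create the container with the bind mount.
    docker run -p 8888:8888 \
      --mount type=bind,source="$HOME/Projects/Notebooks",target=/home/jovyan/work/myprojects \
      --name "$name" jupyter/minimal-notebook
  fi
}

# Usage (commented out so that nothing starts by accident):
# start_or_create myjupyter01
```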
…we just scratched the surface this time, without doing more useful exercises with Jupyter (or even better with Jupyter Lab), like connecting it to OrientDB or SAP HANA. Nor did we dig into the details of the different types of mounts in Docker.
2020-03-11 Update: I did publish a new post, where I use this container with Jupyter to Quickly load Covid-19 data with hana_ml and see with DBeaver.
Hopefully, we will do all of this before World Statistics Day, which is celebrated on October 20th … every five years. So far it has been celebrated in 2010 and 2015, and the next one will be (you guessed!) this year! We still have 7 more months to practice our data science skills in general and statistics in particular!
-Vitaliy (aka @Sygyzmundovych)