Understanding containers (part 05): shared files between the host and containers
Today (March 9th) is the Day of Polish Statistics, celebrated here in Poland. What does it have in common with Docker containers? Well, to demonstrate today’s topic I will use Jupyter, software that is very popular among today's data scientists (formerly known as statisticians ;-)).
So far I have used Jupyter primarily as an integral part of SAP Data Intelligence. But after the recent post "Updates for the Data Scientist, building SAP HANA embedded Machine Learning scenarios from Python or R" from Christoph Morgen, I wanted to run hana_ml 1.0.8 for Python on my own local Jupyter instance.
What’s more, I wanted:
- to experiment with Jupyter and with different Python libraries and their versions; containers are ideal for that, because they avoid messing up the software and configuration on my laptop,
- to keep the so-called notebooks created with Jupyter on my laptop, even when I delete and recreate containers.
Let’s have a look at how to share files between containers and the host computer. In Docker this is done with bind mounts. Docker recommends primarily using volumes, a different mount type, but for today’s exercise bind mounts do the job: sharing artifacts between a development environment on the Docker host and a container.
Jupyter delivers several ready-to-use images…
…on Docker Hub: https://hub.docker.com/u/jupyter/, but for our needs even the one with minimal software preinstalled should be sufficient: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/selecting.html#jupyter-minimal-notebook.
Run the container with a bind-mount
To save time, here is what I already know:
- Jupyter exposes its web notebook UI on port 8888,
- these ready-to-use containers have a work folder /home/jovyan/work/.
The local folder I already created on my macOS laptop is ~/Projects/Notebooks, and I want to mount it as /home/jovyan/work/myprojects in the container.
Let me create a file helloContianer.txt in that source folder first…
touch ~/Projects/Notebooks/helloContianer.txt
…and then run the new container.
docker run -p 8888:8888 --mount type=bind,source=$HOME/Projects/Notebooks,target=/home/jovyan/work/myprojects --name myjupyter01 jupyter/minimal-notebook
As you can see, I am using the --mount option of the docker run command. The --mount syntax is self-descriptive, with three mandatory attributes for a bind mount: type (with the value bind for today’s needs) plus source and target.
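To make the three attributes easier to see, here is a minimal sketch that assembles the --mount value from its parts. The function name bind_mount_spec is hypothetical (it is not part of Docker); the paths are the ones used in this post.

```shell
# Hypothetical helper (not a Docker command): builds the value for --mount
# from a host folder (source, $1) and a container folder (target, $2).
bind_mount_spec() {
  printf 'type=bind,source=%s,target=%s\n' "$1" "$2"
}

# Print the mount specification used in this post.
bind_mount_spec "$HOME/Projects/Notebooks" /home/jovyan/work/myprojects
```

The printed string is what follows --mount in the docker run command above, so it could be passed as --mount "$(bind_mount_spec "$HOME/Projects/Notebooks" /home/jovyan/work/myprojects)".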
The Docker documentation says: “When you use a bind mount, a file or directory on the host machine is mounted into a container.” In our case the content of the directory ~/Projects/Notebooks (with just the helloContianer.txt file for now) is visible inside the container in the folder /home/jovyan/work/myprojects.
The source directory on the host must exist before it is used. With --mount, Docker does not create it automatically for you, but generates an error instead.
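One way to avoid that error is to make sure the folder exists right before starting the container. The sketch below is only an illustration: the function name ensure_mount_source is my own invention, and it relies on nothing more than the standard mkdir -p.

```shell
# Hypothetical helper: make sure the bind-mount source folder exists
# before it is handed to docker run. $1 = folder on the host.
ensure_mount_source() {
  src="$1"
  if [ -d "$src" ]; then
    echo "exists $src"
  else
    mkdir -p "$src"        # create the folder including missing parents
    echo "created $src"
  fi
}
```

For the folder from this post you would call ensure_mount_source "$HOME/Projects/Notebooks" and only then execute the docker run command.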
Open Jupyter
As you can see, the container with Jupyter printed the following text when starting:
...
[I 23:34:22.656 NotebookApp] The Jupyter Notebook is running at:
[I 23:34:22.656 NotebookApp] http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
[I 23:34:22.656 NotebookApp] or http://127.0.0.1:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
[I 23:34:22.656 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 23:34:22.661 NotebookApp]
To access the notebook, open this file in a browser:
file:///home/jovyan/.local/share/jupyter/runtime/nbserver-7-open.html
Or copy and paste one of these URLs:
http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
or http://127.0.0.1:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37
Because this URL was generated inside the container, we need to replace two things in http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37:
- the hostname b092c37f0b79 with localhost, and
- the port 8888 with the port used on the host (which in my case is 8888 as well).
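The two replacements can also be scripted. This is just a sketch under my own assumptions: to_host_url is a hypothetical function name, and the host port defaults to 8888 as in my docker run command.

```shell
# Hypothetical helper: rewrite the URL printed by Jupyter inside the container
# so that it can be opened from the host.
# $1 = URL from the container log, $2 = port published on the host (default 8888).
to_host_url() {
  url="$1"
  host_port="${2:-8888}"
  # Swap "http://<container-hostname>:<port>" for "http://localhost:<host_port>";
  # the rest of the URL (including the token) is left untouched.
  printf '%s\n' "$url" | sed -E "s#^http://[^:/]+:[0-9]+#http://localhost:${host_port}#"
}

to_host_url "http://b092c37f0b79:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37"
```

If you published the container port on a different host port, pass it as the second argument; docker port myjupyter01 8888 shows the mapping in case you forgot it.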
Let me open http://localhost:8888/?token=6bbfe71d8485565d07ab8462853e2cbb447635c6739dbc37 (in your case the token will be different) in the web browser. If the URL is correct, I am redirected to the Jupyter file browser.
Then, following the path, I can get to the work/myprojects folder and click on helloContianer.txt to open it for editing.
Let’s type Hello from the container! and save the file in the Jupyter UI…
…then check the content of the file on the host computer with cat ~/Projects/Notebooks/helloContianer.txt.
To stop the container…
…you can either press Ctrl+C in the terminal where you executed the docker run... command, or click [Quit] in the Jupyter UI and close the web browser window with it.
And the next time you start it…
…I would suggest starting it in attached and interactive mode.
docker start -ai myjupyter01
This way you see all the messages from Jupyter in the terminal and can stop it with Ctrl+C when needed.
I know…
…we just scratched the surface this time, without doing more useful exercises with Jupyter (or, even better, with JupyterLab), like connecting it to OrientDB or SAP HANA. Nor did we dig into more details of the different types of mounts in Docker.
2020-03-11 Update: I published a new post, where I use this container with Jupyter: "Quickly load Covid-19 data with hana_ml and see with DBeaver".
Hopefully, we will do all of this before World Statistics Day, which is celebrated on October 20th … every five years. So far it has been celebrated in 2010 and 2015, and the next one will be (you guessed it!) this year! We still have 7 more months to practice our data science skills in general and statistics in particular!
-Vitaliy (aka @Sygyzmundovych)