Python with packages depending on NumPy – Part I
The standard Cloud Foundry Python buildpack lets you deploy your application by listing its dependencies in a requirements.txt file.
You can add numpy to that file and the buildpack installs it without any issues. However, if you also list a package that itself depends on numpy, staging fails, like so:
Flask==1.0.2
numpy
matplotlib
scikit-fuzzy==0.4.0
Downloading https://files.pythonhosted.org/packages/09/36/4938f2...eb5ff/scikit-fuzzy-0.4.0.tar.gz (994kB)
  Complete output from command python setup.py egg_info:
  To install scikit-fuzzy from source, you will need numpy.
  Install numpy with pip:
  pip install numpy
  Or use your operating system package manager.
Apparently the buildpack isn't prepared to install prerequisites such as numpy before the packages that need them. The above output is from Python buildpack version 1.6.25. As of this writing the most recent version is 1.6.28, but even when you specify the latest version the behavior is pretty much the same.
Some folks have forked an older version of this buildpack and never bothered to update their forks to incorporate the changes made since.
So my quest was to find a way to use such dependencies with the standard buildpack.
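The error text suggests that the scikit-fuzzy source build checks for numpy while pip is still collecting metadata, before anything has been installed. The following is only an illustrative sketch of that kind of guard, not the package's actual setup.py; the function name `missing_build_requirements` is hypothetical.

```python
import importlib.util


def missing_build_requirements(modules):
    """Return the names in `modules` that cannot be imported right now."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pip gathers metadata for every requirement before installing any of
    # them, so a check like this runs while numpy is still absent.
    missing = missing_build_requirements(["numpy"])
    if missing:
        print("Build would stop, missing: " + ", ".join(missing))
    else:
        print("numpy is importable, a source build could proceed")
```

Run in a fresh virtual environment, this reports numpy as missing, which mirrors the situation the buildpack is in when it processes requirements.txt.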
Solution 1 – Miniconda to the rescue
After a lot of digging around I came across documentation I hadn't known about, describing the Python buildpack's ability to run your application just as you would with Anaconda or Miniconda. However, the documentation lacked detail. All it said was to use an environment.yml file.
Finally, after more investigation, I got things working:
For your convenience, I've shared my repository here:
So the solution would involve the following steps:
- Remove the files ‘requirements.txt’ and ‘runtime.txt’ from your project
- Create a new file with the name ‘environment.yml’
- Add the following contents:
name: flask-skfuzzy
channels:
  - conda-forge
dependencies:
  - python
  - pip
  - pytest
  - icu
  - flask
  - numpy
  - nomkl
  - scipy
  - matplotlib
  - scikit-fuzzy
NOTE: In my experience I had to add icu to avoid library linking errors during startup. The nomkl package is also there in an effort to reduce the disk space occupied by the conda environment in the cloud.
The buildpack handles this file much as you would yourself with the following conda command:
conda env create -f environment.yml -n py3
This feels less error-prone and will probably make your application behave as if it were running in the conda environment created by the above command.
It also feels like the correct way to publish Python applications, as you don't need to think about the differences between a cloud environment and your own local environment.
Once Miniconda has downloaded all dependencies and created the environment locally, you can check that the application runs by activating the environment and starting the app:
conda activate py3
python app.py
With the example cloned from my GitHub repo, you should see something like:
 * Serving Flask app 'server' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: xxx-xxx-xxx
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://localhost:5000/ (Press CTRL+C to quit)
The result would be a bunch of graphs:
Once your Miniconda environment works locally, you can move on to testing it on Cloud Foundry.
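Before pushing, it can help to confirm that the activated environment really contains every package the app relies on. Below is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`) and assuming the conda packages ship standard distribution metadata, which they normally do. The `installed_versions` helper and the `REQUIRED` list are illustrative names, not part of the repository.

```python
from importlib import metadata

# Distribution names taken from the environment.yml in this post.
REQUIRED = ["flask", "numpy", "scipy", "matplotlib", "scikit-fuzzy"]


def installed_versions(names):
    """Return {name: version string, or None when the package is absent}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:15} {version or 'MISSING'}")
```

Running it inside the activated environment should list a version for each package. Any line marked MISSING points at a dependency that the environment.yml did not provide.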
Create a manifest.yml file to start using the newest buildpack:
---
applications:
- name: flask-skfuzzy
  memory: 4G
  disk_quota: 4G
  buildpacks:
  - https://github.com/cloudfoundry/python-buildpack
  stack: cflinuxfs3
  env:
    #BP_DEBUG: "True"
NOTE: I’ve included the environment variable BP_DEBUG here for your convenience. If you should face issues deploying your application, un-comment this variable and it will start giving lots of information so you can understand what is going on.
I've used 4G of memory and disk quota for the initial deploy. Feel free to tune these down to the minimum requirement later.
Once you do a cf push, you will notice that the buildpack works differently.
First, it will load miniconda:
Waiting for API to complete processing files...
Staging app and tracing logs...
Cell 08338ca7-...-42e442150e8a creating container for instance 05553d49-...-8b658c06bdd5
Cell 08338ca7-...-42e442150e8a successfully created container for instance 05553d49-...-8b658c06bdd5
Downloading app package...
Downloaded app package (6.8K)
-----> Download go 1.11.4
-----> Running go build supply
/tmp/buildpackdownloads/c011a0fe8e55069cdbeb0a3d00e21875 ~
~
-----> Python Buildpack version 1.6.28
-----> Supplying conda
-----> Installing miniconda2 4.5.12
Download [https://repo.continuum.io/miniconda/Miniconda2-4.5.12-Linux-x86_64.sh]
-----> Installing Miniconda
PREFIX=/tmp/contents148130992/deps/0/conda
installing: python-2.7.15-h9bab390_6 ...
installing: ca-certificates-2018.03.07-0 ...
installing: conda-env-2.6.0-1 ...
installing: libgcc-ng-8.2.0-hdf63c60_1 ...
...
NOTE: Don't worry about Python being on version 2.7 here. Miniconda2 4.5.12 is built on top of it, and this Python version isn't the one CF will use as the runtime for your app.
Miniconda will then figure out what is needed for the environment based on your environment.yml:
-----> Installing Dependencies
-----> Installing conda environment from environment.yml
Solving environment: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
After it builds the environment, it will present a list of tarballs that will be removed from the filesystem. These are the packages that will be available in your environment:
python-3.6.3-1.tar.bz2                                   19.1 MB
...
numpy-1.12.1-py36_blas_openblash1522bff_1001.tar.bz2      3.8 MB
scipy-1.2.0-py36_blas_openblash1522bff_1201.tar.bz2      18.2 MB
...
pip-19.0.2-py36_0.tar.bz2                                 1.8 MB
...
setuptools-40.8.0-py36_0.tar.bz2                          626 KB
flask-1.0.2-py_2.tar.bz2                                   66 KB
---------------------------------------------------
Total:                                                  217.9 MB
NOTE: Check out the size of each tarball. Huge, right? Think about how much disk space they will take once they are uncompressed.
After the software is installed in the conda environment, you will see the buildpack removing files that are no longer needed and preparing to run the start command:
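To see which packages dominate the download, the listing can be parsed and summed. This is a rough sketch that covers only the entries visible in this post, so its total is far below the 217.9 MB reported for the full list. It assumes binary units (1 MB = 1024 KB), which may not match conda's own rounding. The name `tarball_sizes_kb` is made up for this illustration.

```python
import re

# Assumption: sizes are binary multiples (1 MB = 1024 KB).
_UNIT_TO_KB = {"KB": 1.0, "MB": 1024.0, "GB": 1024.0 * 1024.0}
_SIZE_PATTERN = re.compile(r"(\S+\.tar\.bz2)\s+([\d.]+)\s+(KB|MB|GB)")


def tarball_sizes_kb(listing):
    """Map each tarball name found in `listing` to its size in KB."""
    return {
        name: float(value) * _UNIT_TO_KB[unit]
        for name, value, unit in _SIZE_PATTERN.findall(listing)
    }


# Only the entries shown in the staging output of this post.
SAMPLE = """
python-3.6.3-1.tar.bz2 19.1 MB
numpy-1.12.1-py36_blas_openblash1522bff_1001.tar.bz2 3.8 MB
scipy-1.2.0-py36_blas_openblash1522bff_1201.tar.bz2 18.2 MB
pip-19.0.2-py36_0.tar.bz2 1.8 MB
setuptools-40.8.0-py36_0.tar.bz2 626 KB
flask-1.0.2-py_2.tar.bz2 66 KB
"""

if __name__ == "__main__":
    sizes = tarball_sizes_kb(SAMPLE)
    # Largest packages first, to show where the space goes.
    for name, size_kb in sorted(sizes.items(), key=lambda item: -item[1]):
        print(f"{size_kb / 1024.0:8.1f} MB  {name}")
    total_mb = sum(sizes.values()) / 1024.0
    print(f"{len(sizes)} tarballs, {total_mb:.1f} MB in total")
```

Pasting the complete staging output into `SAMPLE` would give the full picture for your own deployment.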
Proceed ([y]/n)?
removing conda-env-2.6.0-1
removing nomkl-3.0-0
removing blas-1.1-openblas
-----> Done
-----> Running go build finalize
/tmp/buildpackdownloads/c011a0fe8e55069cdbeb0a3d00e21875 ~
~
Exit status 0
Uploading droplet, build artifacts cache...
Uploading droplet...
Uploading build artifacts cache...
Uploaded build artifacts cache (242.3M)
Uploaded droplet (293M)
Uploading complete
Now Cloud Foundry attempts to start your application. When it succeeds it will show you some statistics:
Waiting for app to start...

name:              flask-skfuzzy
requested state:   started
routes:            flask-skfuzzy.cfapps.eu10.hana.ondemand.com
last uploaded:     Fri 15 Feb 14:30:53 DST 2019
stack:             cflinuxfs3
buildpacks:        python

type:            web
instances:       1/1
memory usage:    512M
start command:   python app.py
     state     since                  cpu    memory          disk       details
#0   running   2019-02-15T16:31:43Z   1.2%   98.6M of 512M   1G of 2G
NOTE: Check the overall disk usage of this application: 1G. That's why I had to use 2G in my manifest file, and why I added the nomkl package, although I didn't try to figure out whether it was really helping. 1G to run a simple Python application was simply unacceptable to me.
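One way to find out whether nomkl actually helps would be to create the environment locally with and without it and compare the sizes on disk. The sketch below sums file sizes under a directory. The environment path is a hypothetical default, and `directory_size_bytes` is a name chosen for this example.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below `root`, skipping symlinks."""
    total = 0
    for current_dir, _subdirs, files in os.walk(root):
        for file_name in files:
            path = os.path.join(current_dir, file_name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Hypothetical location; `conda env list` shows where an environment lives.
    env_path = os.path.expanduser("~/miniconda3/envs/py3")
    if os.path.isdir(env_path):
        size_mb = directory_size_bytes(env_path) / (1024 * 1024)
        print(f"{env_path}: {size_mb:.1f} MB")
    else:
        print(f"{env_path} not found, adjust env_path")
```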
There has to be another way!
Solution 2 – Fork the Buildpack
Read about it in Part II, my next blog post: