Python with packages depending on NumPy – Part I
The standard Cloud Foundry Python buildpack lets you deploy your application by listing its dependencies in a requirements.txt file.
You can add numpy here and it will install without any issues. However, if you also list a package that needs numpy in order to be installed, the buildpack fails, like so:
Flask==1.0.2
numpy
matplotlib
scikit-fuzzy==0.4.0
Downloading https://files.pythonhosted.org/packages/09/36/4938f2...eb5ff/scikit-fuzzy-0.4.0.tar.gz (994kB)
  Complete output from command python setup.py egg_info:
  To install scikit-fuzzy from source, you will need numpy.
  Install numpy with pip:
  pip install numpy
  Or use your operating system package manager.
Apparently the buildpack isn’t prepared to install prerequisites such as numpy before the packages that need them. The above output is from Python buildpack version 1.6.25. As of this writing the most recent version is 1.6.28, but even with the latest version the behavior is pretty much the same.
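The most likely reason is that pip runs `python setup.py egg_info` on the scikit-fuzzy source archive before any package from the file has been installed, so numpy cannot be imported yet at that moment. The following is a minimal Python sketch of that kind of build-time guard. It is an illustration only, not scikit-fuzzy's actual setup.py, and the function name `check_build_dependency` is hypothetical.

```python
import importlib


def check_build_dependency(module_name):
    """Return None if the module imports, else an error message.

    Mimics a setup.py that needs a package at build time
    (hypothetical helper, for illustration only).
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return (
            f"To install from source, you will need {module_name}.\n"
            f"Install {module_name} with pip: pip install {module_name}"
        )
    return None


if __name__ == "__main__":
    # During staging, a guard like this runs before numpy is installed,
    # so it trips even though numpy is listed in requirements.txt.
    message = check_build_dependency("numpy")
    print(message or "numpy is importable; the build can continue")
```

Run locally in an environment that already has numpy, the guard passes, which is why the problem only shows up when everything is installed from scratch in one go.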
Some folks have forked an older version of this buildpack and never bothered to update their forks to incorporate the changes made since.
So my quest was to find a way to use such dependencies with the standard buildpack.
Solution A – Miniconda to the rescue
After a lot of digging around I came across documentation I had not known about, describing the ability of the Python buildpack to run your application just as you would with Anaconda or Miniconda. However, the documentation lacked detail. All it said was to use the file environment.yml.
Finally, after more investigation, I got things working:
For your convenience, I’ve shared my repository here:
So the solution would involve the following steps:
- Remove the files ‘requirements.txt’ and ‘runtime.txt’ from your project
- Create a new file with the name ‘environment.yml’
- Add the following contents:
name: flask-skfuzzy
channels:
  - conda-forge
dependencies:
  - python
  - pip
  - pytest
  - icu
  - flask
  - numpy
  - nomkl
  - scipy
  - matplotlib
  - scikit-fuzzy
NOTE: In my experience I had to add icu to avoid library linking errors during startup. nomkl is also there in an effort to reduce the disk space occupied by the conda environment in the cloud.
This works the same way as creating an environment locally with the following conda command:
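The file changes listed in the steps above can be checked before pushing. Here is a minimal Python sketch, assuming the project lives in a single directory; the helper name `check_conda_layout` is hypothetical, and the check only mirrors the steps in this post rather than anything the buildpack itself verifies.

```python
from pathlib import Path

# Files the steps above say to remove, and the file to add.
PIP_FILES = ("requirements.txt", "runtime.txt")
CONDA_FILE = "environment.yml"


def check_conda_layout(project_dir):
    """Return a list of problems that conflict with the steps above."""
    root = Path(project_dir)
    problems = []
    if not (root / CONDA_FILE).is_file():
        problems.append(f"missing {CONDA_FILE}")
    for name in PIP_FILES:
        if (root / name).exists():
            problems.append(f"remove {name}")
    return problems


if __name__ == "__main__":
    # Check the current directory and print one problem per line.
    for problem in check_conda_layout("."):
        print(problem)
```

An empty result means the directory matches the layout described here; it says nothing about whether conda can actually resolve the dependencies.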
conda env create -f environment.yml -n py3
This feels less “error-prone” and will probably make your application work just as if it were running in the conda environment created by the above command.
It also feels like the correct way to publish Python applications, as you don’t need to think about the differences between a cloud environment and your own local environment.
Once Miniconda has downloaded all dependencies and created the environment locally, you can check whether the application runs by activating the environment and starting the app:
conda activate py3
python app.py
In the example you cloned from my GitHub repo, you should see something like:
 * Serving Flask app 'server' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: xxx-xxx-xxx
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://localhost:5000/ (Press CTRL+C to quit)
The result would be a bunch of graphs:
Once your Miniconda environment works locally, you can move on to testing it on Cloud Foundry.
Create a manifest.yml file to start using the newest buildpack:
---
applications:
- name: flask-skfuzzy
  memory: 4G
  disk_quota: 4G
  buildpacks:
  - https://github.com/cloudfoundry/python-buildpack
  stack: cflinuxfs3
  env:
    #BP_DEBUG: "True"
NOTE: I’ve included the environment variable BP_DEBUG here for your convenience. If you face issues deploying your application, uncomment this variable and the buildpack will log much more information so you can understand what is going on.
I’ve used 4G of memory and disk quota for the initial deploy. If you feel like tuning this down to the minimum requirement later, please do so.
Once you do a cf push, you will notice that the buildpack works differently.
First, it will load miniconda:
Waiting for API to complete processing files...
Staging app and tracing logs...
   Cell 08338ca7-...-42e442150e8a creating container for instance 05553d49-...-8b658c06bdd5
   Cell 08338ca7-...-42e442150e8a successfully created container for instance 05553d49-...-8b658c06bdd5
   Downloading app package...
   Downloaded app package (6.8K)
   -----> Download go 1.11.4
   -----> Running go build supply
   /tmp/buildpackdownloads/c011a0fe8e55069cdbeb0a3d00e21875 ~
   ~
   -----> Python Buildpack version 1.6.28
   -----> Supplying conda
   -----> Installing miniconda2 4.5.12
          Download [https://repo.continuum.io/miniconda/Miniconda2-4.5.12-Linux-x86_64.sh]
   -----> Installing Miniconda
          PREFIX=/tmp/contents148130992/deps/0/conda
          installing: python-2.7.15-h9bab390_6 ...
          installing: ca-certificates-2018.03.07-0 ...
          installing: conda-env-2.6.0-1 ...
          installing: libgcc-ng-8.2.0-hdf63c60_1 ...
          ...
NOTE: Don’t worry about Python being on version 2.7 here. Miniconda2 4.5.12 is built on top of it, and this Python version isn’t the one CF will use as the runtime for your app.
Miniconda will then figure out what is needed for the environment based on your environment.yml:
-----> Installing Dependencies
-----> Installing conda environment from environment.yml
       Solving environment: ...working... done
       Preparing transaction: ...working... done
       Verifying transaction: ...working... done
       Executing transaction: ...working... done
After it builds the environment, it presents a list of tarballs that will be removed from the filesystem. These are the packages that will be available to your environment:
python-3.6.3-1.tar.bz2                                  19.1 MB
...
numpy-1.12.1-py36_blas_openblash1522bff_1001.tar.bz2     3.8 MB
scipy-1.2.0-py36_blas_openblash1522bff_1201.tar.bz2     18.2 MB
...
pip-19.0.2-py36_0.tar.bz2                                1.8 MB
...
setuptools-40.8.0-py36_0.tar.bz2                         626 KB
flask-1.0.2-py_2.tar.bz2                                  66 KB
---------------------------------------------------
Total:                                                 217.9 MB
NOTE: Check out the size of each tarball. Huge, right? Think about how much disk space they will take once they are uncompressed.
After the software is installed in the conda environment, you will see the buildpack removing files that are no longer needed and preparing to run the start command:
Proceed ([y]/n)?
removing conda-env-2.6.0-1
removing nomkl-3.0-0
removing blas-1.1-openblas
-----> Done
-----> Running go build finalize
/tmp/buildpackdownloads/c011a0fe8e55069cdbeb0a3d00e21875 ~
~
Exit status 0
Uploading droplet, build artifacts cache...
Uploading droplet...
Uploading build artifacts cache...
Uploaded build artifacts cache (242.3M)
Uploaded droplet (293M)
Uploading complete
Now Cloud Foundry attempts to start your application. When it succeeds it will show you some statistics:
Waiting for app to start...

name:              flask-skfuzzy
requested state:   started
routes:            flask-skfuzzy.cfapps.eu10.hana.ondemand.com
last uploaded:     Fri 15 Feb 14:30:53 DST 2019
stack:             cflinuxfs3
buildpacks:        python

type:            web
instances:       1/1
memory usage:    512M
start command:   python app.py
     state     since                  cpu    memory          disk       details
#0   running   2019-02-15T16:31:43Z   1.2%   98.6M of 512M   1G of 2G
NOTE: Check the overall disk usage of this application: 1G. That’s the reason I had to use 2G in my manifest file, and it is also why I added the nomkl package, although I didn’t try to figure out whether it really helped. 1G to run a simple Python application was simply unacceptable to me.
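To judge how much headroom the quotas leave, the memory and disk columns of the instance row can be turned into fractions. This is a minimal Python sketch; the helper names `parse_size` and `usage_fraction` are hypothetical, and it assumes the suffixes in the output are binary multiples (1G = 1024M).

```python
import re

# Assumed binary multipliers for the size suffixes in the `cf app` output.
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a string such as '98.6M' or '1G' to a number of bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([BKMGT])\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def usage_fraction(column):
    """Return used/quota for a column such as '1G of 2G'."""
    used, quota = column.split(" of ")
    return parse_size(used) / parse_size(quota)


if __name__ == "__main__":
    # Values copied from the instance row shown above.
    print(f"memory: {usage_fraction('98.6M of 512M'):.1%}")  # memory: 19.3%
    print(f"disk:   {usage_fraction('1G of 2G'):.1%}")       # disk:   50.0%
```

With the numbers from this deployment, memory is far below its quota while disk already uses half of the 2G quota.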
There has to be another way!
Solution B – Fork the Buildpack
Read about it in my next blog, Part II:
Python with packages depending on Numpy – part II
I tried your blog, but I am not able to deploy the app to SAP CF. I have attached the DEBUG messages from the deployment.
Hi Pankaj Kumar,
Were you able to resolve all dependencies locally?
When you download a miniconda distribution, it is likely to come with a specific version of python.
However, the environment.yml I shared contains a specific Python version along with its dependencies, like icu = 56 and python 3.6. At the time I wrote this blog, there wasn't a numpy wheel compatible with Python 3.9.
Therefore, try using the following dependency versions in the environment file and make sure Miniconda is able to resolve all dependencies. It cannot fail, or else you won't be able to push it to Cloud Foundry.
To test your application locally, you may start it by enabling the conda environment with the following command:
Check my github repo for a sample application running numpy here:
Also, if your conda environment didn't succeed during the first deploy (mine did due to increased memory consumption), try deleting the app before you try pushing it again.
NOTE: I have made some updates on this blog.
Thanks a lot for your swift response. I am not using flask. I have a different architecture.
I am calling Python from my nodejs app via child_process. In the Python file I am using numpy. It works well in BAS, but after deployment to SAP CF, when I call my nodejs endpoint, it throws a numpy-related error: "ImportError: No module named numpy".
I have tried using environment.yml but it doesn't deploy to CF. I don't want to use Python as a REST service from my nodejs app.
Your help with this would be highly appreciated.
Hi Pankaj Kumar,
I understand what you mean. However, to the best of my knowledge there isn't a buildpack that has been constructed in such a way that it is capable of running both Python and NodeJS code at the same time. Furthermore, I think this would violate one or more principles of the 12-factor approach for microservices.
I don't really see a good reason not to use a separate Python app that your NodeJS app calls over REST. Perhaps the memory consumption of running both runtimes on a single box would be similar or close to that of two applications running on Diego cells in Cloud Foundry. In addition, your microservice would be capable of servicing more requests when you scale out. Or you could even think about things like a message queue and resilience for self-restoration without service disruption.
Otherwise, you will end up trying to build yourself a numpy enabled buildpack that would also install nodejs to support your app - which would be way more difficult to maintain than just simply having two apps deployed.
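A minimal sketch of the Python half of such a two-app setup could look like the following. To stay dependency-free it uses the standard library's wsgiref instead of Flask, and statistics.mean stands in for the numpy computation; the /mean path and the "values" field are hypothetical names chosen for illustration.

```python
import json
import os
import statistics
from wsgiref.simple_server import make_server

JSON_HEADERS = [("Content-Type", "application/json")]


def app(environ, start_response):
    """Answer POST /mean with the mean of the JSON field 'values'."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/mean":
        start_response("404 Not Found", JSON_HEADERS)
        return [b'{"error": "not found"}']
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length))
        values = [float(v) for v in payload["values"]]
        # In a real service, this is where numpy would do the work.
        result = statistics.mean(values)
    except (ValueError, KeyError, TypeError):
        start_response("400 Bad Request", JSON_HEADERS)
        return [b'{"error": "expected a JSON object with a non-empty values list"}']
    start_response("200 OK", JSON_HEADERS)
    return [json.dumps({"mean": result}).encode()]


if __name__ == "__main__":
    # Cloud Foundry passes the port to listen on in the PORT variable.
    port = int(os.environ.get("PORT", "5000"))
    make_server("0.0.0.0", port, app).serve_forever()
```

The NodeJS app would then send an HTTP POST to this route instead of spawning a Python process, so each app can be pushed with its own buildpack.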
Thanks Ivan! I finally resolved it by building two separate apps, one for Node and one for Python, and then calling the Python service, exposed over REST using Flask, from the Node module.