Ivan Mirisola

Python with packages depending on NumPy – Part I

Overview

The standard Cloud Foundry Python buildpack lets you deploy your application by declaring its dependencies in the following file:

requirements.txt

Flask==1.0.2

You may add numpy here and the buildpack will install it without any issues. However, if you add any package that requires numpy in order to be installed, staging will fail, like so:

requirements.txt

Flask==1.0.2
numpy
matplotlib
scikit-fuzzy==0.4.0

Downloading https://files.pythonhosted.org/packages/09/36/4938f2...eb5ff/scikit-fuzzy-0.4.0.tar.gz (994kB)
Complete output from command python setup.py egg_info:
To install scikit-fuzzy from source, you will need numpy.
Install numpy with pip:
pip install numpy
Or use your operating system package manager.

Apparently the buildpack isn’t prepared to install prerequisites such as numpy before the packages that need them. The above output is from Python buildpack version 1.6.25. As of this writing, the most recent version is 1.6.28, but even with the latest version specified the behavior is pretty much the same.
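
The message in the log comes from the package’s own setup.py, which runs during the "egg_info" step, before the package is installed. A likely explanation is that numpy is not yet importable at that moment even though it is listed in requirements.txt. The following is a minimal sketch of that kind of guard, not the actual scikit-fuzzy setup.py; the function names require_at_build_time and can_build are hypothetical and exist only for illustration.

```python
import importlib.util


def require_at_build_time(module_name, package_name):
    """Fail early, like a guarded setup.py, if a build-time dependency is absent."""
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(module_name) is None:
        raise RuntimeError(
            f"To install {package_name} from source, you will need {module_name}."
        )


def can_build(module_name, package_name):
    """Return (ok, message) instead of raising, which is handy for a quick check."""
    try:
        require_at_build_time(module_name, package_name)
    except RuntimeError as exc:
        return False, str(exc)
    return True, ""
```

Calling can_build("numpy", "scikit-fuzzy") in an environment without numpy returns False together with a message similar to the one in the staging log.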

Some folks have forked an older version of this buildpack and never bothered to update their forks to incorporate the changes made since.

So my quest was to find out how to use such dependencies with the standard buildpack.

Solution A – Miniconda to the rescue

After a lot of digging around, I came across documentation I had not known about, covering the ability of the Python buildpack to run your application just like you would with Anaconda or Miniconda. However, the documentation lacked detail. All it said was to use the file environment.yml.

Finally, after more investigation, I got things working.

For your convenience, I’ve shared my repository here:

https://github.com/ivanmir/fuzzy

The solution involves the following steps:

  1. Remove the files ‘requirements.txt’ and ‘runtime.txt’ from your project
  2. Create a new file with the name ‘environment.yml’
  3. Add the following contents:

name: flask-skfuzzy

channels:
- conda-forge

dependencies:
- python
- pip
- pytest
- icu
- flask
- numpy
- nomkl
- scipy
- matplotlib
- scikit-fuzzy

NOTE: In my experience, I had to add icu to avoid library linking errors during startup. nomkl is also there in an effort to reduce the disk space occupied by the conda environment in the cloud.
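
The three steps above can be checked before a push with a few lines of Python. This is only a sketch: the function name check_conda_layout is hypothetical, and it merely looks for the files named in the steps without validating the YAML itself.

```python
from pathlib import Path


def check_conda_layout(project_dir):
    """Return a list of problems that would keep the project off the conda path."""
    root = Path(project_dir)
    problems = []

    # Step 1: the pip-related files must be gone.
    for leftover in ("requirements.txt", "runtime.txt"):
        if (root / leftover).exists():
            problems.append(f"remove {leftover}")

    # Steps 2 and 3: environment.yml must exist and list dependencies.
    env_file = root / "environment.yml"
    if not env_file.is_file():
        problems.append("create environment.yml")
    elif "dependencies:" not in env_file.read_text(encoding="utf-8"):
        problems.append("environment.yml has no dependencies section")

    return problems
```

An empty list from check_conda_layout(".") means the project folder matches the layout described in the steps.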

This works just like what you are used to doing locally with the following conda command:

conda env create -f environment.yml -n py3

This feels less “error-prone” and will probably make your application work just as if it were running in the conda environment created by the above command.

It also feels like the correct way to publish Python applications, as you don’t need to think about the various differences between a cloud environment and your own local environment.

Once Miniconda has downloaded all dependencies and created the environment locally, you can check whether the application runs by activating the new environment and starting the app:

conda activate py3
python app.py

With the example cloned from my GitHub repo, you should see something like:

 * Serving Flask app 'server' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: xxx-xxx-xxx
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://localhost:5000/ (Press CTRL+C to quit)

The result is a bunch of graphs.

Once your Miniconda environment works locally, you can move on to testing it on Cloud Foundry.
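
One way to confirm that the environment "works locally" beyond looking at the graphs is an import smoke test, run inside the activated environment. The sketch below is an illustration under assumptions: the helper name import_report is hypothetical, and the list holds the import names I assume for the packages in environment.yml (scikit-fuzzy is imported as skfuzzy).

```python
import importlib


def import_report(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = str(getattr(module, "__version__", "unknown"))
    return report


# Assumed import names for the dependencies listed in environment.yml.
EXPECTED = ["flask", "numpy", "scipy", "matplotlib", "skfuzzy"]
```

Printing import_report(EXPECTED) shows at a glance which packages resolved; any entry that is None points at a dependency conda did not install.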

Create a manifest.yml file to start using the newest buildpack:

---
applications:
- name: flask-skfuzzy
  memory: 4G
  disk_quota: 4G
  buildpacks:
    - https://github.com/cloudfoundry/python-buildpack
  stack: cflinuxfs3
  env:
    #BP_DEBUG: "True"

NOTE: I’ve included the environment variable BP_DEBUG here for your convenience. If you face issues deploying your application, uncomment this variable and the buildpack will start logging lots of information so you can understand what is going on.

I’ve used 4G of memory and disk quota for the initial deploy. If you feel like tuning this down to the minimum requirement later, please do so.
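
One difference between the local run and Cloud Foundry is the port. Locally the development server listens on port 5000, while Cloud Foundry passes the port to bind in the PORT environment variable. The sketch below shows one way an app.py could handle both cases. It is not the code from the sample repository, which may already deal with this, and the names resolve_port and main are hypothetical.

```python
import os


def resolve_port(environ=None, default=5000):
    """Return the port to bind: PORT when the platform sets it, else the local default."""
    if environ is None:
        environ = os.environ
    return int(environ.get("PORT", default))


def main():
    # Imported here so resolve_port can be used even where Flask is not installed.
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def index():
        return "ok"

    # Bind to all interfaces so the platform router can reach the app.
    app.run(host="0.0.0.0", port=resolve_port())
```

Calling main() starts the server; with no PORT variable set it falls back to port 5000, matching the local output shown earlier.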

Once you do a cf push, you will notice that the buildpack behaves differently.

First, it will load miniconda:

Waiting for API to complete processing files...

Staging app and tracing logs...
   Cell 08338ca7-...-42e442150e8a creating container for instance 05553d49-...-8b658c06bdd5
   Cell 08338ca7-...-42e442150e8a successfully created container for instance 05553d49-...-8b658c06bdd5
   Downloading app package...
   Downloaded app package (6.8K)
   -----> Download go 1.11.4
   -----> Running go build supply
   /tmp/buildpackdownloads/c011a0fe8e55069cdbeb0a3d00e21875 ~
   ~
   -----> Python Buildpack version 1.6.28
   -----> Supplying conda
   -----> Installing miniconda2 4.5.12
          Download [https://repo.continuum.io/miniconda/Miniconda2-4.5.12-Linux-x86_64.sh]
   -----> Installing Miniconda
          PREFIX=/tmp/contents148130992/deps/0/conda
          installing: python-2.7.15-h9bab390_6 ...
          installing: ca-certificates-2018.03.07-0 ...
          installing: conda-env-2.6.0-1 ...
          installing: libgcc-ng-8.2.0-hdf63c60_1 ...
...

NOTE: Don’t worry about Python being on version 2.7 here. Miniconda2 4.5.12 is built on top of it, and this Python version isn’t the one CF will use as the runtime for your app.

Miniconda will then figure out what is needed for the environment based on your environment.yml:

   -----> Installing Dependencies
   -----> Installing conda environment from environment.yml
          Solving environment: ...working... done
          Preparing transaction: ...working... done
          Verifying transaction: ...working... done
          Executing transaction: ...working... done

After it builds the environment, it will present a list of tarballs that will be removed from the filesystem. These are the packages that will be available in your environment:

          python-3.6.3-1.tar.bz2                      19.1 MB
...
          numpy-1.12.1-py36_blas_openblash1522bff_1001.tar.bz2     3.8 MB
          scipy-1.2.0-py36_blas_openblash1522bff_1201.tar.bz2    18.2 MB
...
          pip-19.0.2-py36_0.tar.bz2                    1.8 MB
...
          setuptools-40.8.0-py36_0.tar.bz2             626 KB
          flask-1.0.2-py_2.tar.bz2                      66 KB

          ---------------------------------------------------
          Total:                                     217.9 MB

NOTE: Check out the total size of the tarballs. Huge, right? Think about how much disk space they will take once they are uncompressed.
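
To see which packages contribute most, the listing can be parsed with a short script. This is a sketch under assumptions: the helper name tarball_sizes_mb is hypothetical, lines are expected in the "name size unit" shape shown above, and 1 MB is taken to be 1024 KB. Because the listing in this post is elided, summing only the visible lines gives less than the 217.9 MB total.

```python
# Assumed conversion factors to megabytes.
UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def tarball_sizes_mb(listing):
    """Return {tarball name: size in MB} for every tarball line in the listing."""
    sizes = {}
    for line in listing.splitlines():
        parts = line.split()
        # Skip separators, "..." placeholders and the "Total:" line.
        if len(parts) != 3 or not parts[0].endswith(".tar.bz2"):
            continue
        name, value, unit = parts
        sizes[name] = float(value) * UNIT_TO_MB[unit]
    return sizes
```

Sorting the result by size, for example with sorted(sizes.items(), key=lambda item: item[1], reverse=True), puts the heaviest packages first.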

After the software is installed in the conda environment, you will see the buildpack remove files that are no longer needed and prepare to run the start command:

          Proceed ([y]/n)?
          removing conda-env-2.6.0-1
          removing nomkl-3.0-0
          removing blas-1.1-openblas
   -----> Done
   -----> Running go build finalize
   /tmp/buildpackdownloads/c011a0fe8e55069cdbeb0a3d00e21875 ~
   ~
   Exit status 0
   Uploading droplet, build artifacts cache...
   Uploading droplet...
   Uploading build artifacts cache...
   Uploaded build artifacts cache (242.3M)
   Uploaded droplet (293M)
   Uploading complete

Now Cloud Foundry attempts to start your application. When it succeeds, it will show you some statistics:

Waiting for app to start...

name:              flask-skfuzzy
requested state:   started
routes:            flask-skfuzzy.cfapps.eu10.hana.ondemand.com
last uploaded:     Fri 15 Feb 14:30:53 DST 2019
stack:             cflinuxfs3
buildpacks:        python

type:            web
instances:       1/1
memory usage:    512M
start command:   python app.py
     state     since                  cpu    memory          disk       details
#0   running   2019-02-15T16:31:43Z   1.2%   98.6M of 512M   1G of 2G

NOTE: Check the overall disk usage of this application: 1G. That’s the reason I had to use a 2G disk quota in my manifest file. It is also why I added the nomkl package, although I didn’t try to figure out whether it was really helping or not. 1G to run a simple Python application was simply unacceptable to me.
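
The "used of quota" cells in the output above can be turned into ratios, which helps when tuning the quotas. This is a minimal sketch: the helper names to_mb and usage_ratio are hypothetical, and the K/M/G suffixes are assumed to be powers of 1024.

```python
# Assumed size suffixes, expressed in megabytes.
UNITS_IN_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def to_mb(size):
    """Convert a size such as '98.6M' or '1G' to megabytes."""
    return float(size[:-1]) * UNITS_IN_MB[size[-1].upper()]


def usage_ratio(cell):
    """Return used/quota for a cell such as '1G of 2G'."""
    used, _, quota = cell.partition(" of ")
    return to_mb(used) / to_mb(quota)
```

For the values shown above, usage_ratio("1G of 2G") is 0.5 for disk, and usage_ratio("98.6M of 512M") is roughly 0.19 for memory, so the disk is the tighter of the two quotas.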

There has to be another way!

Solution B – Fork the Buildpack

Read about it in part II of this blog:

Python with packages depending on Numpy – part II

Enjoy!

      5 Comments
      Pankaj Kumar

      Hi Ivan,

      I tried following your blog, but I am not able to deploy the app to SAP CF. I have attached the DEBUG messages from the deployment.

      Please advise!

      Ivan Mirisola
      Blog Post Author

      Hi Pankaj Kumar,

      Were you able to resolve all dependencies locally?

      When you download a miniconda distribution, it is likely to come with a specific version of python.

      However, the environment.yml I shared contains a specific Python version with its dependencies, like icu 56 and Python 3.6. At the time I wrote this blog, there wasn't a numpy wheel compatible with Python 3.9.

      Therefore, try using the following dependency versions in the environment file and make sure Miniconda is able to resolve all dependencies. It cannot fail, or else you won't be able to push it to Cloud Foundry.

      name: flask-skfuzzy
      
      channels:
      - conda-forge
      
      dependencies:
      - python=3.10.*
      - pip
      - pytest
      - icu=68.*
      - flask
      - numpy
      - nomkl
      - scipy
      - matplotlib
      - scikit-fuzzy

      To test your application locally, you may start it after activating the conda environment with the following commands:

      conda activate py3
      python app.py

      Check my github repo for a sample application running numpy here:

      https://github.com/ivanmir/fuzzy

      Also, if your conda environment didn't succeed during the first deploy (mine did due to increased memory consumption), try deleting the app before you try pushing it again.

      NOTE: I have made some updates on this blog.

      Best regards,
      Ivan

      Pankaj Kumar

      Hi Ivan,

      Thanks a lot for your swift response. I am not using Flask. I have a different architecture.

      I am calling Python from my Node.js app via child_process. In the Python file I am using numpy. It works well in BAS, but after deployment to SAP CF, when I call my Node.js endpoint, it throws a numpy-related error: "ImportError: No module named numpy".

      I have tried using environment.yml, but it doesn't deploy to CF. I don't want to expose Python as a REST service to my Node.js app.

      Your help with this would be highly appreciated.

      thanks

      Pankaj

      Ivan Mirisola
      Blog Post Author

      Hi Pankaj Kumar,

      I understand what you mean. However, to the best of my knowledge, there isn't a buildpack that has been built to run both Python and Node.js code at the same time. Furthermore, I think this would violate one or more rules of the 12-factor approach for microservices.

      I don't really see a good reason not to use a separate Python application that you call over REST. The memory consumption of running both runtimes in a single container would perhaps be similar or close to that of two applications running on Diego cells in Cloud Foundry. In addition, your microservice would be able to serve more requests when you scale out. You could even think about things like message queues and resilience for self-restoration without service disruption.

      Otherwise, you will end up trying to build yourself a numpy-enabled buildpack that also installs Node.js to support your app, which would be way more difficult to maintain than simply having two apps deployed.

      Best regards,
      Ivan

      Pankaj Kumar

      Thanks Ivan! I finally resolved it by building two separate apps for Node and Python, and then calling the Python service, exposed over REST using Flask, from the Node module.

      Thanks

      Pankaj