HANA AutoML library

Let’s assume you have to prepare a machine learning model for a classification or regression task.
All your data is already in HANA or in a flat (CSV) file.
Everything you need – (This library is an open-source research project and is not part of any official SAP products.)

That is a joke, but hana_automl does go through all (well, not yet all) of the AutoML steps and makes data science work easier.

This library is based on Python and built on top of other awesome libs:

  • hana_ml
  • Optuna
  • BayesianOptimization
  • Streamlit

To install it, you just need:

pip3 install Cython
pip3 install hana_automl
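
If you want to double-check that the install worked before going further, a small sketch like this one can help. It only uses the standard library; the module names hana_ml and hana_automl are the ones imported in the examples in this post, and the helper name is_installed is made up for illustration.

```python
from importlib import util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return util.find_spec(module_name) is not None


# Report the two packages this post relies on.
for name in ("hana_ml", "hana_automl"):
    print(name, "installed" if is_installed(name) else "missing")
```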

After installation, it is quite easy to get started:

from hana_automl.utils.scripts import setup_user
from hana_ml.dataframe import ConnectionContext

cc = ConnectionContext(address='address', user='user', password='password', port=39015)

# replace with credentials of user that will be created or granted a role to run PAL.
setup_user(connection_context=cc, username='user_new', password="password_new")

setup_user is an optional helper for when you need to create a new user for experiments.
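
Hardcoded credentials are fine for a demo, but you may prefer to keep them out of your scripts. Here is a minimal sketch that reads them from environment variables instead. The HANA_* variable names and both helper functions are assumptions made up for this example; only the ConnectionContext keyword arguments (address, port, user, password) come from the snippet above.

```python
import os


def hana_connection_kwargs(env=None):
    """Build keyword arguments for ConnectionContext from environment variables.

    The HANA_* names are hypothetical; use whatever fits your setup.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "39015")),  # default matches the example port
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
    }


def connect(env=None):
    """Open a connection; hana_ml is imported lazily, only when you actually connect."""
    from hana_ml.dataframe import ConnectionContext

    return ConnectionContext(**hana_connection_kwargs(env))
```

With the variables exported in your shell, cc = connect() gives you the same kind of object as the hardcoded version.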

After that, all you need is fit/predict and a bit of waiting…

from hana_automl.automl import AutoML

model = AutoML(cc)
model.fit(
  file_path='path to training dataset', # it may be HANA table/view, or pandas DataFrame
  steps=10, # number of iterations
  target='target', # column to predict
  time_limit=120 # time limit in seconds
)

model.predict(file_path='path to test dataset', id_column='ID', verbose=1)
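
To get a feeling for what steps and time_limit mean, here is a toy sketch of a budgeted search loop. It is not hana_automl's implementation: the library builds on Optuna and BayesianOptimization, while this sketch uses plain random search as a stand-in, and it assumes the search stops at whichever budget runs out first. All function names here are made up for illustration.

```python
import random
import time


def budgeted_search(objective, sample_params, steps=10, time_limit=120, seed=0):
    """Try up to `steps` random configurations, stopping early after `time_limit` seconds."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_limit
    best_params, best_score = None, float("-inf")
    tried = 0
    while tried < steps and time.monotonic() < deadline:
        params = sample_params(rng)
        score = objective(params)
        tried += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, tried


def toy_objective(params):
    """Stand-in for 'train a model and return its validation score'."""
    return 1.0 - (params["learning_rate"] - 0.1) ** 2


def toy_sampler(rng):
    """Draw one random configuration."""
    return {"learning_rate": rng.uniform(0.01, 0.5)}


best_params, best_score, tried = budgeted_search(
    toy_objective, toy_sampler, steps=10, time_limit=120
)
print(tried, best_params, round(best_score, 4))
```

More steps or a longer time limit give the search more chances to find a good configuration, which is the trade-off you control with the two parameters above.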

You can find all documentation here –

It is also possible to run all these steps not from Python but from a UI, with the help of Streamlit.

The UI looks like this: [screenshot: Streamlit client]

To start the UI, you need three steps:

  1. Clone repository: git clone
  2. Install dependencies: pip3 install -r requirements.txt
  3. Run GUI: streamlit run ./

OK, why should you try it?

Have a look at this example –

APL is awesome, but it has a strong focus on speed; for more accurate models you need some time and PAL. That is where hana_automl can help.

It is also possible to build not just a single model but a blend of models. To enable the ensemble, just pass ensemble=True to the function when creating the AutoML model.
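
If blending is new to you, the idea can be sketched in a few lines of plain Python: several models each make predictions, and the final answer is a weighted average (regression) or a majority vote (classification). This is a generic illustration, not how hana_automl builds its ensemble internally, and both function names are made up for the example.

```python
from collections import Counter


def blend_regression(predictions, weights=None):
    """Weighted average of per-model predictions (one list of numbers per model)."""
    n_models = len(predictions)
    weights = weights or [1.0 / n_models] * n_models
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*predictions)
    ]


def blend_classification(predictions):
    """Majority vote across models; on a tie, the label seen first wins."""
    return [Counter(row).most_common(1)[0][0] for row in zip(*predictions)]


# Two regressors and three classifiers, each predicting for two rows.
print(blend_regression([[1.0, 2.0], [3.0, 4.0]]))
print(blend_classification([["a", "b"], ["a", "c"], ["b", "c"]]))
```

A blend often smooths out the mistakes of individual models, at the cost of extra training and prediction time.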

There is big potential for improvement, and contributions are very welcome!

If you have any ideas –

P.S. This is a project by @While-true-codeanything and @dan0nchik, two very talented students…

Don’t wait: try it on your own dataset and share your results…

