Sangeetha Krishnamoorthy

FedML – The Federated Machine Learning Libraries for Hyperscalers 2.0

According to Gartner, more than 75% of midsize and large organizations use two or more public cloud providers today and plan to expand.

A multi-cloud strategy helps companies address cost, security, and regulatory concerns, while still providing consumption flexibility and helping enterprises avoid vendor lock-in.

The Multi Cloud Challenge  

One side effect of a multi-cloud strategy is that businesses need to extract their business data to inexpensive cloud storage to lay the foundation for analytics and data science experiments on each cloud platform.

This forced data replication also happens because predictive modelling and machine learning model building work most smoothly when the data resides in the platform's native cloud storage, and cross-cloud data access is currently lacking.

This in turn creates the need for expensive ETL and data pipelines to move data across systems, which leads to data inconsistency issues. It also takes time and focus away from data scientists, who are the ones who end up tackling data sourcing issues.

Why FedML?  

Github: https://github.com/SAP-samples/datasphere-fedml

SAP FedML, the Federated Machine Learning libraries, helps avoid extracting and migrating training data from business systems to hyperscaler ML platforms in order to build and train machine learning models.

The library applies the data federation architecture of SAP Datasphere and provides functions that enable businesses and data scientists to build, train and deploy machine learning models on hyperscalers, thereby eliminating the need for replicating or migrating data out from its original source. 

FedML Solution Diagram

The FedML 2.0 library is now available free of charge for use with Amazon SageMaker, Azure Machine Learning and Google Vertex AI.

In version 1.0, FedML supported automating data sourcing and model training on the respective hyperscalers.

What’s NEW in 2.0? 

FedML 2.0 adds the following features:

  • Support for pip installing the library from the PyPI repository.
  • Support for deploying models on the native hyperscaler platform.
  • Support for deploying models on the SAP BTP Kyma platform.
  • Support for inferencing / predicting from both the native hyperscaler deployment and the Kyma deployment.
  • Support for writing inference results back to SAP Datasphere.

In a nutshell, FedML 2.0 lets data scientists automate the end-to-end flow, from data sourcing through model training, deployment and prediction to persisting the results back in SAP Datasphere, all with just a few lines of code.
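To make the shape of that end-to-end flow concrete, here is a minimal sketch in plain Python. It does not use the FedML API: `FakeDatasphere`, `train`, `deploy` and `run_pipeline` are hypothetical stand-ins invented for illustration, and the "model" is a trivial price-per-unit estimate. It only shows the five stages in order: read the data in place, train, deploy, predict, and write the results back.

```python
from dataclasses import dataclass, field


# NOTE: every name below is a hypothetical stand-in, not part of FedML.
@dataclass
class FakeDatasphere:
    """In-memory stand-in for views and tables exposed by SAP Datasphere."""
    tables: dict = field(default_factory=dict)

    def query(self, view):
        # Stage 1: read rows where they live, without exporting a file.
        return list(self.tables[view])

    def write(self, table, rows):
        # Stage 5: persist prediction results next to the source data.
        self.tables[table] = list(rows)


def train(rows):
    # Stage 2: "train" a trivial model, the average revenue per unit.
    total_units = sum(units for units, _ in rows)
    total_revenue = sum(revenue for _, revenue in rows)
    return total_revenue / total_units


def deploy(price_per_unit):
    # Stage 3: wrap the model in a callable that plays the role of an endpoint.
    def endpoint(units):
        return units * price_per_unit
    return endpoint


def run_pipeline(store, source_view, new_units, target_table):
    rows = store.query(source_view)
    model = train(rows)
    endpoint = deploy(model)
    # Stage 4: predict for unseen inputs.
    predictions = [(units, endpoint(units)) for units in new_units]
    store.write(target_table, predictions)
    return predictions


store = FakeDatasphere({"SALES_VIEW": [(1, 10.0), (2, 20.0), (3, 30.0)]})
result = run_pipeline(store, "SALES_VIEW", [4, 5], "SALES_PREDICTIONS")
print(result)
```

In a real setup each stage would be handled by the FedML library for the chosen hyperscaler; the sample notebooks in the GitHub repository show the actual calls.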

How do I install & use FedML?

Sample notebooks and documentation for using FedML with each hyperscaler are available in the GitHub repository: https://github.com/SAP-samples/datasphere-fedml

Please follow the blogs below to try out FedML on the respective cloud platforms.
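Since version 2.0 can be installed with pip, a small helper like the one below can check which platform library is present in a notebook environment and print the matching install command. This is only a sketch: the distribution and module names in `FEDML_PACKAGES` are assumptions, so confirm them against the repository documentation before relying on them.

```python
import importlib.util
import sys

# ASSUMPTION: PyPI distribution names and import names per platform.
# Verify these against the FedML repository documentation.
FEDML_PACKAGES = {
    "aws": ("fedml-aws", "fedml_aws"),
    "gcp": ("fedml-gcp", "fedml_gcp"),
    "azure": ("fedml-azure", "fedml_azure"),
}


def pip_command(platform: str) -> list[str]:
    """Build the pip command that installs the library for one platform."""
    distribution, _ = FEDML_PACKAGES[platform]
    return [sys.executable, "-m", "pip", "install", distribution]


def is_installed(platform: str) -> bool:
    """Return True if the platform's module can be found for import."""
    _, module = FEDML_PACKAGES[platform]
    return importlib.util.find_spec(module) is not None


if __name__ == "__main__":
    for platform in FEDML_PACKAGES:
        status = "installed" if is_installed(platform) else "missing"
        print(f"{platform}: {status} -> {' '.join(pip_command(platform))}")
```

The helper only prints the command instead of running it, so it is safe to execute in any environment.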

Federated Machine Learning using SAP Datasphere and Amazon SageMaker 2.0 

Federated Machine Learning using SAP Datasphere and Google Vertex AI 2.0

Federated Machine Learning using SAP Datasphere and Azure Machine Learning 2.0

FedML was also recently released for the Databricks ML platform:

Using FedML library with SAP Datasphere and Databricks

What’s FedML’s Value Proposition? 

FedML helps organizations realize value by eliminating the need to duplicate data for machine learning. This saves costs and lets data scientists focus on training machine learning models while having instant access to multiple data sources.

It also helps the organization avoid vendor lock-in, reduce hyperscaler storage costs, and adhere to GDPR policies, because data migration is eliminated. In addition, it enables instant access to cross-cloud data sources, combined with SAP business data managed through SAP Datasphere’s unified semantic models.

For more information about this topic or to ask a question, please contact us at paa@sap.com 

      6 Comments
      Priyanshu Srivastava

      Hi Sangeetha,

      Thank you. It is very insightful. May I please also know if FedML can also be tried out in SAP AICore?

      Regards,

      Priyanshu

      Sangeetha Krishnamoorthy
      Blog Post Author

      Hello Priyanshu,

      Thank you for reaching out. FedML is designed to be used directly on hyperscaler machine learning platforms (e.g. AWS SageMaker, GCP Vertex AI) and for use cases where model training and deployment happen in the native hyperscaler environments.

      SAP AI Core is closely coupled with SAP BTP and, together with SAP AI Launchpad, helps integrate AI capabilities into SAP solutions. The two serve different purposes. Hope that helps.

      Thanks,

      Sangeetha

      Suresh Kumar Raju

      Hi Sangeetha,

      Thanks for the blog post, quite interesting.

      My question is: how can one bring in additional Python runtime dependencies at training and serving time? For instance, I need a particular Python library to pre-process my data before feeding it to scikit-learn training. Basically, how can I prepare my runtime environment with all my required dependencies?

      Sangeetha Krishnamoorthy
      Blog Post Author

      Hi Suresh Kumar Raju,

      Thanks for your question. Yes, the FedML libraries provide the flexibility to bring in additional runtime dependencies. Please refer to the documentation of the individual library for the specifics.

      As an example, FedML-Azure lets you create an environment with any Python library dependency included, and the same environment with the installed dependencies is used for both training and serving. The FedML-Azure documentation shows how to create such an environment: https://github.com/SAP-samples/data-warehouse-cloud-fedml/blob/main/Azure/docs/fedml_azure.md#create_environment

      For any further details or to discuss further, please reach out to us at paa@sap.com.

      ELISA MASETTI

      Hello Sangeetha,

      So, if I have understood correctly, the "federated ML" SAP is referring to is not the same as what we find in the literature, right?

      Thank you

      Elisa

      Sangeetha Krishnamoorthy
      Blog Post Author

      Hello Elisa, yes, our library helps with ML on hyperscalers using "federated data" (both SAP and non-SAP) via SAP Datasphere. Thanks.