One of my favourite subjects is building resilient, high-performance services. I wrote about running Akka Actors in a previous post; now I’ll present the Netflix open source software that can be used to build services on top of SAP HANA Cloud Platform.

Let’s consider the following as the main features of a resilient service:

  1. A failure in a dependency shouldn’t break the flow of the service
  2. We must be able to track what’s happening in our environment right now, through metrics published by the services
  3. Automatic corrective actions should be taken when a dependency fails
  4. The service must provide and maintain an acceptable level of service in the face of faults and challenges to normal operation

To illustrate a service with its dependencies, and consequently its points of failure, see the next image:

Service Dependencies.png

In this scenario we have a service that depends on 3 different systems:

  • SAP ERP: via RFC, SAP Gateway, PI, etc.
  • SAP HANA: JDBC or REST services
  • 3rd party: REST service (credit card system, supply chain, etc.)

If we don’t take care when architecting the service, a failure in one dependency could break the entire service and leave users without a response. The only certainty we have is that some dependency will fail at some point, and our service must be prepared to deal with it.

Suppose our service confirms an order: the user information must be retrieved, the products must be verified, a payment must be authorized, and offers based on the user’s activity must be presented to the user.

The next image shows the impact of a failure in the SAP HANA dependency (red dependency C) on the entire service flow. Once this dependency fails, many user requests fail too and users are left without a response.

ServiceDepencyWithFailure.png

When architecting the service, we could decide that a failure in the offer suggestions dependency (SAP HANA, Dependency C) shouldn’t affect the other steps, and that if Dependency A (retrieve user information from ERP) fails, a cached value can be used instead. With these decisions we reduce the impact on users when a service dependency fails.
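As a rough illustration of those two decisions, here is a minimal hand-written sketch in plain Java, with no library involved. The class OrderConfirmationSketch, its method names and the String stand-ins for users and offers are all hypothetical; they only show the shape of the rules.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Illustrative only: hand-written degradation rules for two of the dependencies.
final class OrderConfirmationSketch {
    private OrderConfirmationSketch() {}

    // Dependency A: if the ERP lookup fails, fall back to a previously cached user.
    static String userOrCached(Supplier<String> erpLookup, String cachedUser) {
        try {
            return erpLookup.get();
        } catch (RuntimeException e) {
            return cachedUser;
        }
    }

    // Dependency C: offers are optional, so a failure simply means "no offers".
    static List<String> offersOrNone(Supplier<List<String>> offersLookup) {
        try {
            return offersLookup.get();
        } catch (RuntimeException e) {
            return Collections.emptyList();
        }
    }

    // Confirms the order even when dependency A or C is down.
    static String confirm(Supplier<String> erpLookup, String cachedUser,
                          Supplier<List<String>> offersLookup) {
        String user = userOrCached(erpLookup, cachedUser);
        List<String> offers = offersOrNone(offersLookup);
        return "Order confirmed for " + user + " with " + offers.size() + " offer(s)";
    }
}
```

Writing this by hand for every dependency gets repetitive quickly, and it still leaves out timeouts, isolation and metrics.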

To handle this, the Netflix team has implemented a fantastic library called Hystrix (https://github.com/Netflix/Hystrix). As you can see on its GitHub page, Hystrix covers 3 important topics:

  1. Latency and Fault Tolerance
  2. Realtime Operations
  3. Concurrency

The purposes of Hystrix are:

  • Give protection from and control over latency and failure from dependencies (typically accessed over network) accessed via 3rd party client libraries.
  • Stop cascading failures in a complex distributed system.
  • Fail fast and rapidly recover.
  • Fallback and gracefully degrade when possible.
  • Enable near real-time monitoring, alerting and operational control.

Implementing a HystrixCommand

Each dependency in the previous image should be wrapped in a HystrixCommand, which adds fault and latency tolerance, capture of statistics and performance metrics, a circuit breaker, and bulkhead functionality.

The code snippet below shows a pseudo-implementation of a HystrixCommand that gets the user information.

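import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

// Wraps the user lookup so Hystrix can isolate it and collect metrics.
// User and SAPGatewayService are placeholders for your own classes.
public class GetUserDetailCommand extends HystrixCommand<User> {
    private final String userId;

    public GetUserDetailCommand(String userId) {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.userId = userId;
    }

    // Executed by Hystrix; an exception thrown here counts as a failure.
    @Override
    protected User run() {
        return SAPGatewayService.getUserById(userId);
    }
}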

When implementing a HystrixCommand we can define a fallback that is executed when the run method fails. In our example we’ll get the user from a cache system:

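import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class GetUserDetailCommand extends HystrixCommand<User> {
    private final String userId;

    public GetUserDetailCommand(String userId) {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.userId = userId;
    }

    // Primary path: ask SAP Gateway for the user.
    @Override
    protected User run() {
        return SAPGatewayService.getUserById(userId);
    }

    // Called by Hystrix when run() fails or times out: serve the user from the cache instead.
    // CacheSystem, like SAPGatewayService, is a placeholder.
    @Override
    protected User getFallback() {
        return CacheSystem.getUserById(userId);
    }
}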


This way, when the service fails, a user previously stored in the cache is returned until SAP Gateway is working again.
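The fallback only helps if the cache already holds the user when SAP Gateway goes down. One way to get there is to refresh the cache on every successful lookup. The sketch below shows that idea in plain Java, without Hystrix; LastGoodValueCache is a hypothetical name, and the loader stands in for a call such as SAPGatewayService.getUserById.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative only: remembers the last good value per key and serves it when the loader fails.
final class LastGoodValueCache<K, V> {
    private final Map<K, V> lastGood = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the real dependency call, e.g. a gateway lookup

    LastGoodValueCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns a fresh value when the loader succeeds, otherwise the last good value, if any.
    Optional<V> get(K key) {
        try {
            V fresh = loader.apply(key);
            if (fresh != null) {
                lastGood.put(key, fresh); // refresh the cache on every successful call
            }
            return Optional.ofNullable(fresh);
        } catch (RuntimeException e) {
            // Loader failed: the result is empty if this key was never loaded successfully.
            return Optional.ofNullable(lastGood.get(key));
        }
    }
}
```

Note the cold-cache case: a user who was never loaded successfully has no cached value, so the caller still needs a plan for an empty result.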

Calling a HystrixCommand

A HystrixCommand can be called synchronously or asynchronously:

  • Synchronous

     To call a service synchronously, use the execute method of a HystrixCommand:

User user = new GetUserDetailCommand("IS001").execute();

It blocks until the run method finishes.

  • Asynchronous

     To call a service asynchronously, use the queue method of a HystrixCommand:

Future<User> userFuture = new GetUserDetailCommand("IS001").queue();

In this case a Future is returned, and blocking occurs only when the get method is called on the Future object:

User user = userFuture.get();
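Because the asynchronous call returns immediately, several dependencies can be started before waiting on any of them. The following sketch shows that start-then-wait pattern with a plain java.util.concurrent executor rather than Hystrix; ParallelCallsSketch and its parameters are hypothetical names. With Hystrix, the equivalent would be to call queue on two commands before calling get on either Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: start two dependency calls, then wait for both results.
final class ParallelCallsSketch {
    private ParallelCallsSketch() {}

    static String fetchBoth(Callable<String> userCall, Callable<String> offersCall) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> user = pool.submit(userCall);     // returns immediately
            Future<String> offers = pool.submit(offersCall); // returns immediately
            // Blocking happens only here, after both calls are already running.
            return user.get() + " / " + offers.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("dependency call failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Since both calls run at the same time, the total wait is roughly that of the slower call instead of the sum of the two.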

To keep this post simple, I won’t cover the other way of execution (reactive); that will wait for a future post. The Hystrix wiki on GitHub is very detailed and has a lot of interesting information on topics such as threads and thread pools, the flow chart, and the circuit breaker.
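Of those topics, the circuit breaker is the easiest to illustrate in isolation. Below is a deliberately simplified sketch in plain Java; it is not how Hystrix implements it. This version opens after a number of consecutive failures, whereas Hystrix decides from an error percentage over a rolling window. SimpleCircuitBreaker and its parameters are hypothetical names, and the clock is injected only to make the behaviour easy to test.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Illustrative only: a consecutive-failure circuit breaker, not Hystrix's implementation.
final class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures that open the circuit
    private final long sleepWindowMillis; // how long calls are rejected before a trial call
    private final LongSupplier clock;     // injectable time source, in milliseconds

    private int consecutiveFailures = 0;
    private long openedAt = -1;           // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    synchronized boolean isOpen() {
        return openedAt >= 0;
    }

    // Synchronized for simplicity, which also serializes calls; a real breaker would not.
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen() && clock.getAsLong() - openedAt < sleepWindowMillis) {
            return fallback.get();        // fail fast: do not touch the dependency
        }
        try {
            T result = primary.get();     // circuit closed, or a trial call after the sleep window
            consecutiveFailures = 0;
            openedAt = -1;                // success closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (isOpen() || consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // open, or re-open after a failed trial call
            }
            return fallback.get();
        }
    }
}
```

While the circuit is open the failing dependency is not called at all, which gives it room to recover and keeps callers from waiting on a call that is likely to fail.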

To complement Hystrix, there is the Hystrix Dashboard, which shows every HystrixCommand graphically with details about its executions. See the next image for a sample:

/wp-content/uploads/2013/09/dashboard_annoted_circuit_640_278081.png

Image source: https://raw.github.com/wiki/Netflix/Hystrix/images/dashboard-annoted-circuit-640.png

The image above explains each piece of information in the graph.

I’ve created a simple application that simulates an order confirmation process, for which the Hystrix Dashboard generates a visualization like the one shown further below. You can get the application from my GitHub (https://github.com/isaias/netflixosshc/); the instructions to run it are there.

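If you have a cluster, you’ll also need another Netflix project called Turbine (https://github.com/Netflix/Turbine), which aggregates the metrics streams of the individual instances into a single stream for the dashboard.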

The great news is that all of this software is fully compatible with SAP HANA Cloud Platform.

Here is the dashboard of my application (I used ApacheBench (ab) and JMeter to generate load on the service):

Hystrix-Dashboard.png

So, did you know that the Netflix team produces open source software that can be used on SAP HANA Cloud Platform? Do you think it would be useful in your company?

My SAP HANA Cloud account is a trial account, so only one application can run at a time, but I’ll keep this one running for a few days in case you want to point the Hystrix Dashboard at it.

The application’s stream link is: https://netflixosshcs0004616922trial.hanatrial.ondemand.com/netflixosshc/hystrix.stream

Enjoy it.
