Achieving High Availability Using Modern Practices

Zain Khan

The bank’s customers expect our services to be available 24/7, i.e. to be highly available. High availability (HA) is the ability of a system to operate continuously without failing for a designated period of time. This paper explores modern architectural best practices that enable HA.

Greenfield applications

HA should not be an infrastructure dependency. Applications must provide application-level fault tolerance and resilience. This can be achieved by:

1. Stateless user session architecture

State in this context refers to state specific to a particular user session. Ensure your business logic doesn’t depend on storing session state in memory or on disk between operations. If session state needs to be stored, it should be stored client-side via token-based authentication rather than session-based authentication. Complex applications may wish to use the token to carry a session ID and store session state server-side in a separate database or caching layer.
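As an illustration of client-side session state, the sketch below signs a small set of session claims with HMAC-SHA256 so that any instance holding the same key can verify them without a server-side lookup. It is a minimal sketch, not a production token format: the names SECRET, issue_token and verify_token are hypothetical, and a real system would more likely use an established standard such as JWT with keys held in a secrets store.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical shared key; every instance behind the load balancer would load
# the same value from a secrets store rather than hard-coding it.
SECRET = b"demo-secret-change-me"


def _sign(body: str) -> str:
    """Return the URL-safe base64 HMAC-SHA256 signature of the token body."""
    digest = hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def issue_token(claims: dict, ttl_seconds: int = 900) -> str:
    """Serialise the session claims, add an expiry, and sign the result."""
    payload = dict(claims, exp=int(time.time()) + ttl_seconds)
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    body = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{body}.{_sign(body)}"


def verify_token(token: str) -> Optional[dict]:
    """Return the claims if the signature is valid and unexpired, else None."""
    try:
        body, signature = token.split(".")
    except ValueError:
        return None  # not exactly two dot-separated parts
    # Constant-time comparison on bytes to avoid leaking timing information.
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(body).encode("utf-8")):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] < time.time():
        return None
    return payload
```

Because verification needs only the shared key, a request carrying the token can be served by any instance, which is what removes the need for in-memory session storage.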

This type of configuration enables HA because multiple business logic processes, containers or servers can run simultaneously behind a load balancer; if one instance fails, another takes over without downtime. The load balancer should be configured without persistence, sticky sessions or session affinity; being able to run this way is an indication of a truly stateless architecture.
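The failover behaviour described above can be sketched in a few lines. This is a toy in-process model under stated assumptions, not how a real load balancer is implemented: Instance and RoundRobinBalancer are hypothetical names, and a failed instance is simulated with a flag rather than a network error.

```python
import itertools


class Instance:
    """A stateless business-logic instance; it keeps nothing between calls."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True  # toggled in the demo to simulate a failure

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class RoundRobinBalancer:
    """Round-robin with no session affinity: any instance may serve any request."""

    def __init__(self, instances) -> None:
        self._instances = list(instances)
        self._cycle = itertools.cycle(self._instances)

    def dispatch(self, request: str) -> str:
        # Try each instance at most once per request before giving up.
        for _ in range(len(self._instances)):
            instance = next(self._cycle)
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # fail over to the next instance
        raise RuntimeError("no healthy instances available")
```

Since no request depends on which instance served the previous one, the balancer is free to route around a failed instance without the caller noticing.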

2. Store application state in a persistence tier

Session state is specific to a single user session, whereas application state applies to all users and sessions. If the application needs to store state, this should be done via a separate persistence tier; usually this is a database. Multiple business logic processes, containers or servers can connect to the database to read and write application state. Delegating state to the database makes the business logic easy to scale horizontally and hence fault tolerant.
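A minimal sketch of this separation follows, assuming SQLite as a stand-in for the persistence tier purely so that the example is self-contained; a single SQLite file is not itself highly available, and a real deployment would point at a replicated database service. The Worker class, the init_db helper and the counters table are hypothetical.

```python
import sqlite3


def init_db(db_path: str) -> None:
    """Create the shared table that holds application state."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS counters ("
                "key TEXT PRIMARY KEY, value INTEGER NOT NULL)"
            )
    finally:
        conn.close()


class Worker:
    """A business-logic instance that holds no application state itself."""

    def __init__(self, db_path: str) -> None:
        self._db_path = db_path  # connection details only, never data

    def increment(self, key: str) -> int:
        # Open a connection per operation so nothing is cached in the worker.
        conn = sqlite3.connect(self._db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT OR IGNORE INTO counters(key, value) VALUES (?, 0)",
                    (key,),
                )
                conn.execute(
                    "UPDATE counters SET value = value + 1 WHERE key = ?",
                    (key,),
                )
                row = conn.execute(
                    "SELECT value FROM counters WHERE key = ?", (key,)
                ).fetchone()
            return row[0]
        finally:
            conn.close()
```

Two Worker objects created with the same path see each other's updates, and discarding one of them loses nothing, because the only copy of the state lives in the database.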

3. Appropriate software architecture pattern

There are a variety of software architecture patterns (e.g. monolith, microkernel, event-driven, microservices). No one pattern is better than the others; each is suited to a different use case, so careful consideration is needed when picking one. A monolithic architecture can be used to develop a small, low-criticality internal application. However, a microservice-based architecture is a better approach for a large-scale, customer-facing application where HA is important.

4. Utilize cloud technologies for hosting

Both private and public cloud provide PaaS and CaaS offerings to host applications. PaaS and CaaS are designed to provide high availability if applications are configured appropriately. Because PaaS and CaaS take care of building and maintaining highly available infrastructure to host code, developers can focus on delivering business value.

Legacy applications

For legacy applications the following approach should be used (in this order):

1. Refactor the application

Refactor the application using the techniques in the “Greenfield applications” section above. For example, refactor to stateless microservices on PaaS with a database persistence tier.

2. Containerise

Where the application can be containerised, we should use CaaS in preference to IaaS. The CaaS platform has the added benefit of being able to provide resilience.

For third-party apps, the vendor may provide a container image for the app, or the app can be containerised using NWG capabilities. The vendor must support the containerised version of the app.

For legacy applications that are stateful (i.e. they store session or application state where the business logic runs), the CaaS deployment should be configured to use a Persistent Volume to store state, with a “replica set” value of 1 and appropriate probes configured for application-level monitoring.
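To make this configuration concrete, the sketch below builds such a deployment as a Python dictionary that can be dumped to JSON. It assumes the CaaS platform is Kubernetes-based, which the terms Persistent Volume, replica and probe suggest but the text does not state. The function name, the /healthz probe path, the /var/lib/app mount path and the port are all hypothetical placeholders, and the Recreate strategy is an added assumption so that an old and a new pod never mount the volume at the same time.

```python
import json


def stateful_legacy_deployment(name: str, image: str, claim_name: str,
                               port: int = 8080) -> dict:
    """Build a Deployment manifest: one replica, a persistent volume, probes."""
    # Hypothetical health endpoint; the application must actually expose it.
    probe = {"httpGet": {"path": "/healthz", "port": port}, "periodSeconds": 10}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,  # a single instance owns the state
            "strategy": {"type": "Recreate"},  # stop the old pod before starting a new one
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        "livenessProbe": probe,   # restart the container if it hangs
                        "readinessProbe": probe,  # withhold traffic until it is ready
                        "volumeMounts": [
                            {"name": "state", "mountPath": "/var/lib/app"}
                        ],
                    }],
                    "volumes": [{
                        "name": "state",
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = stateful_legacy_deployment(
        "legacy-app", "registry.example/legacy-app:1.0", "legacy-app-state"
    )
    print(json.dumps(manifest, indent=2))
```

With a single replica the platform can restart a failed instance and reattach its state, but there is still a gap in service while it restarts, which is why this option ranks below refactoring.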

3. ECP IaaS Fallback

Where no other option is available, the application can be deployed onto IaaS. IaaS will provide protection against physical host and guest OS failure, but this cannot be considered an HA setup as there is no application-level protection.