Software Architecture in Go: Circuit Breaker, Cloud Design Pattern for Reliability
Jul 09, 2021

Disclaimer: This post includes Amazon affiliate links. If you click on one of them and make a purchase, I’ll earn a commission. Please note that your final price is not affected by using those links.

Welcome to another post in the series covering Quality Attributes / Non-Functional Requirements. This time I’m talking about a Cloud Design Pattern for improving reliability: the Circuit Breaker.



What is a Circuit Breaker?

A Circuit Breaker is a pattern that prevents an operation that is likely to fail from being executed, and keeps track of failures to avoid wasting resources. In a Microservice Architecture, for example, this pattern is useful when dealing with remote resources, especially when those services are outside of our bounded context and may therefore be maintained by a different team.

Take the following example. We maintain a service called Service A, which depends on two external ones: Service 1 and Service 2. We use those services to augment data. Let’s assume Service 1 fails: should we fail too? Whether we do depends on the business logic and on the data we fetch from that service. In this case our requirements indicate that, although not ideal, it’s still OK to proceed when Service 1 goes down. However, this raises another question: how can we determine when to go back to Service 1 and start requesting data from it again?
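As a rough illustration of that decision, the sketch below treats Service 1 as optional and, purely as an assumption for this example, Service 2 as required. The names `fetcher`, `Record`, `augment`, `healthy` and `unavailable` are hypothetical and are not taken from the To-Do Microservice code.

```go
package main

import (
	"errors"
	"fmt"
)

// fetcher is a hypothetical stand-in for a client of an external service.
type fetcher func(id string) (string, error)

// Record is what Service A returns after augmenting its own data.
type Record struct {
	ID       string
	FromSvc1 string // optional: left empty when Service 1 is down
	FromSvc2 string // assumed to be required in this sketch
}

var errDown = errors.New("service down")

// unavailable simulates an external service that is currently failing.
func unavailable(string) (string, error) { return "", errDown }

// healthy simulates an external service that answers normally.
func healthy(id string) (string, error) { return "data for " + id, nil }

// augment builds a Record, tolerating a failure of Service 1 but not of Service 2.
func augment(id string, service1, service2 fetcher) (Record, error) {
	rec := Record{ID: id}

	// Service 1 is optional: on error we simply continue without its data.
	if v, err := service1(id); err == nil {
		rec.FromSvc1 = v
	}

	v, err := service2(id)
	if err != nil {
		return Record{}, fmt.Errorf("service 2: %w", err)
	}
	rec.FromSvc2 = v

	return rec, nil
}

func main() {
	// Service 1 is down, yet Service A still returns a (partial) Record.
	rec, err := augment("42", unavailable, healthy)
	fmt.Printf("%+v %v\n", rec, err)
}
```

Note that this version still calls Service 1 on every request, even while it is down, which is exactly the kind of wasted work the question above is about.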

Circuit Breaker

This is where the Circuit Breaker pattern comes into play: it allows us not only to detect failures, but also to keep track of them over a period of time to determine when to retry. This pattern is usually implemented as a state machine with three states:

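  • Closed: requests go through to the remote service while failures are counted,
  • Open: requests are rejected immediately, without calling the remote service, and
  • Half-Open: trial requests are allowed through to check whether the remote service has recovered.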

Depending on the implementation you may see other attributes or states added, but this is how it is implemented most of the time.
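To make the three states more concrete, here is a minimal sketch of such a state machine using only the standard library. It assumes the breaker opens after a fixed number of consecutive failures and allows a trial call once a fixed timeout has elapsed; the `Breaker` type, the `fakeClock` helper and the `maxFailures` and `openTimeout` values are illustrative assumptions, not the API of any particular package, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// State is one of the three states of the breaker.
type State string

const (
	Closed   State = "closed"
	Open     State = "open"
	HalfOpen State = "half-open"
)

// Illustrative values, not taken from any library.
const (
	maxFailures = 3
	openTimeout = 2 * time.Minute
)

var errCall = errors.New("call failed")

// fakeClock lets the example move time forward without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// Breaker is a single-goroutine sketch; a real one needs locking.
type Breaker struct {
	state    State
	failures int // consecutive failures seen while closed
	openedAt time.Time
	now      func() time.Time
}

func NewBreaker(now func() time.Time) *Breaker {
	return &Breaker{state: Closed, now: now}
}

// State returns the current state, moving Open to HalfOpen once
// openTimeout has elapsed.
func (b *Breaker) State() State {
	if b.state == Open && b.now().Sub(b.openedAt) >= openTimeout {
		b.state = HalfOpen
	}
	return b.state
}

// Record feeds the outcome of a call into the state machine.
// Outcomes reported while the breaker is open are ignored.
func (b *Breaker) Record(err error) {
	switch b.State() {
	case Closed:
		if err == nil {
			b.failures = 0 // a success resets the consecutive count
			return
		}
		b.failures++
		if b.failures >= maxFailures {
			b.open()
		}
	case HalfOpen:
		if err != nil {
			b.open() // the trial call failed: back to Open
			return
		}
		b.state = Closed // the trial call succeeded
	}
}

func (b *Breaker) open() {
	b.state, b.failures, b.openedAt = Open, 0, b.now()
}

func main() {
	clock := &fakeClock{}
	b := NewBreaker(clock.Now)

	for i := 0; i < maxFailures; i++ {
		b.Record(errCall)
	}
	fmt.Println(b.State()) // open

	clock.Advance(openTimeout)
	fmt.Println(b.State()) // half-open

	b.Record(nil)
	fmt.Println(b.State()) // closed
}
```

Injecting the clock as a function is a design choice made here only so the transitions can be exercised without waiting for real time to pass.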

Circuit Breaker Implementation

How can we implement the Circuit Breaker pattern in Go?

The code used for this post is available on GitHub.

There are a few different implementations of the Circuit Breaker in Go. For this example I’m using mercari/go-circuitbreaker because, in my opinion, it is one of the packages that provides the most flexibility and options, and it integrates nicely with the context package.

The concrete implementation for this example uses the Elasticsearch Repository we defined for our To-Do Microservice, which I updated to add support for the Circuit Breaker:

circuitbreaker.New(&circuitbreaker.Options{
	// Trip (open) the breaker after 3 consecutive failures.
	ShouldTrip: circuitbreaker.NewTripFuncConsecutiveFailures(3),
	// Keep the breaker open for 2 minutes before trying again.
	OpenTimeout: 2 * time.Minute,
	// Log every state transition.
	OnStateChange: func(oldState, newState circuitbreaker.State) {
		logger.Info("state changed",
			zap.String("old", string(oldState)),
			zap.String("new", string(newState)),
		)
	},
})

The configuration above indicates that the Circuit Breaker should trip (that is, open) after 3 consecutive failures, and that it should stay open for 2 minutes before trying again. The method using it does the following:

// By searches Tasks matching the received values.
func (t *Task) By(ctx context.Context, args internal.SearchArgs) (_ internal.SearchResults, err error) {
	// XXX: Instrumentation omitted for brevity

	// Fail fast while the Circuit Breaker is not accepting calls.
	if !t.cb.Ready() {
		return internal.SearchResults{}, internal.NewErrorf(internal.ErrorCodeUnknown, "service not available")
	}

	// Report the outcome (the named return value err) to the Circuit Breaker.
	defer func() {
		err = t.cb.Done(ctx, err)
	}()

	res, err := t.search.Search(ctx, args)
	if err != nil {
		// XXX: Error wrapping omitted for brevity
		return internal.SearchResults{}, err
	}

	return res, nil
}

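The key parts of this change are:

  1. Checking whether the Circuit Breaker is ready to accept calls (that is, not open) before calling Elasticsearch, and
  2. Reporting the result of that call to the Circuit Breaker afterwards, so it can update its state.

When the Circuit Breaker opens, and when it allows calls again, depends on the configuration we defined above when we instantiated the circuitbreaker type.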
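The same check-then-report flow can be pulled into a small helper so that each repository method does not have to repeat it. The sketch below is only an illustration: the `breaker` interface is a hypothetical one that mirrors the `Ready` and `Done` calls as they are used in the method above, `guard` is a made-up name, and `stubBreaker` is a test double rather than a real Circuit Breaker.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// breaker is a hypothetical interface covering only the two calls used above.
type breaker interface {
	Ready() bool
	Done(ctx context.Context, err error) error
}

var (
	errUnavailable = errors.New("service not available")
	errRemote      = errors.New("remote call failed")
)

// guard runs fn only when the breaker is ready, then reports fn's outcome to it.
func guard(ctx context.Context, cb breaker, fn func(context.Context) error) error {
	if !cb.Ready() {
		return errUnavailable // fail fast: fn is never called
	}
	return cb.Done(ctx, fn(ctx))
}

// stubBreaker is a test double: it only records what it is told.
type stubBreaker struct {
	ready    bool
	reported []error
}

func (s *stubBreaker) Ready() bool { return s.ready }

func (s *stubBreaker) Done(_ context.Context, err error) error {
	s.reported = append(s.reported, err)
	return err
}

// succeed and fail simulate a remote call.
func succeed(context.Context) error { return nil }
func fail(context.Context) error    { return errRemote }

func main() {
	ctx := context.Background()
	cb := &stubBreaker{ready: true}

	fmt.Println(guard(ctx, cb, succeed)) // <nil>
	fmt.Println(guard(ctx, cb, fail))    // remote call failed

	cb.ready = false // pretend the breaker has opened
	fmt.Println(guard(ctx, cb, succeed)) // service not available
	fmt.Println(len(cb.reported))        // 2: the rejected call was never reported
}
```

A caller that needs the result of the remote call can capture it in a variable from inside the function passed to `guard`.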

Conclusion

The Circuit Breaker is a Cloud Design Pattern meant to improve the reliability of our services when we depend on services that we don’t necessarily maintain. If a service we connect to keeps failing, we probably shouldn’t keep calling it, but we also need a way to retry in case it is back to normal. The Circuit Breaker pattern handles that complexity and lets us define those rules so we don’t waste resources when interacting with external services. More importantly, it improves our users’ experience, because our service can now return data as soon as possible.

If you’re looking to sink your teeth into more Software Architecture-related topics I recommend the following books:

