Circuit breaker and monitoring of a gRPC service in Ruby (Part 1)
gRPC is a well-suited framework for a micro-services architecture, where small independent components talk to each other over the network. When I started building multiple components that communicate over gRPC, I realized that some basic tools around a gRPC call (both server side and client side) were not available out of the box: for example, instrumentation of gRPC calls, or a simple health API. I was having to write these components every time I built a new service. This brought me to the idea of building grpc-commons, a set of basic tools required in a micro-services architecture. In this post, I will focus on two of them: circuit breaker and monitoring.
Circuit Breaker
A circuit breaker is a common requirement in an architecture where independent modules are separate services talking over the network. The difference between a normal method call and an RPC call over the network is that an RPC call can fail or become unresponsive. From Martin Fowler's post,
The basic idea behind the circuit breaker is very simple. You wrap a protected function call in a circuit breaker object, which monitors for failures. Once the failures reach a certain threshold, the circuit breaker trips, and all further calls to the circuit breaker return with an error, without the protected call being made at all. Usually you'll also want some kind of monitor alert if the circuit breaker trips.
This is how a circuit breaker state diagram usually looks:
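The state machine is small enough to sketch in plain Ruby. The class below is a minimal, single-threaded illustration of the closed, open and half-open states from Fowler's description. It is not how Stoplight is implemented, and the names SimpleCircuitBreaker, threshold, cool_off_time and clock are made up for this example.

```ruby
# Minimal circuit breaker sketch (illustrative only, not thread-safe).
class SimpleCircuitBreaker
  class OpenError < StandardError; end

  attr_reader :failures

  # threshold:     consecutive failures that trip the breaker
  # cool_off_time: seconds to stay open before allowing one trial call
  # clock:         callable returning the current time in seconds (injectable for tests)
  def initialize(threshold: 3, cool_off_time: 60,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @threshold = threshold
    @cool_off_time = cool_off_time
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  # :closed    -> calls pass through
  # :open      -> calls fail fast without touching the protected code
  # :half_open -> cool-off elapsed, the next call is a trial
  def state
    return :closed if @opened_at.nil?

    (@clock.call - @opened_at) >= @cool_off_time ? :half_open : :open
  end

  def run
    raise OpenError, 'circuit is open' if state == :open

    begin
      result = yield
    rescue StandardError
      record_failure
      raise
    end
    reset
    result
  end

  private

  def record_failure
    @failures += 1
    # A failed trial call, or reaching the threshold, (re)opens the circuit.
    @opened_at = @clock.call if state == :half_open || @failures >= @threshold
  end

  def reset
    @failures = 0
    @opened_at = nil
  end
end
```

A caller would wrap the remote call as `breaker.run { stub.snip_it(req) }` and rescue `SimpleCircuitBreaker::OpenError` to fall back quickly while the downstream service is unhealthy.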
Monitoring
Monitoring is crucial to any software system. I will not be talking about general monitoring systems in this post; instead, I will discuss points of monitoring in a gRPC call that can be set up without much developer effort.
In Part 1, I will explain the idea behind grpc-commons and its implementation. I will focus on monitoring in Part 2.
grpc-commons: Under the hood
I have mostly been writing gRPC services in Ruby recently. grpc-commons is currently a set of tools that does the above (and more in the future) for a gRPC server or client written in Ruby. I have not packaged the client and server tooling separately, but I will explain them separately here.
If you look at the generated Ruby file for a gRPC service, all the gRPC-related methods of the Service class come from the GRPC::GenericService module.
And this is how we call a service defined like above:
stub = Snip::UrlSnipService::Stub.new('0.0.0.0:50052', :this_channel_is_insecure)
req = Snip::SnipRequest.new(url: 'http://shiladitya-bits.github.io')
resp_obj = stub.snip_it(req)
To know more about the implementation details of a basic gRPC client-server call in Ruby, head over to [my earlier post](https://shiladitya-bits.github.io/Building-Microservices-from-scratch-using-gRPC-on-Ruby/).
I wanted the easiest way to plug in my tools on the client side. The Snip::UrlSnipService::Stub class is what gets called when you make the RPC call. So I wrote a new module, GrpcCommons::GenericService, that includes GRPC::GenericService and overrides the method that builds the stub class:
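require 'grpc'
require 'stoplight'
require 'statsd-instrument'

module GrpcCommons
  module GenericService
    def self.included(klass)
      klass.class_eval do
        include GRPC::GenericService

        # This is an override for the same method in GRPC::GenericService
        def self.rpc_stub_class
          stub_claz = super
          instance_meths = stub_claz.instance_methods(false)
          # Subclass the generated stub so each wrapper can reach the original via super
          alt_stub_claz = Class.new(stub_claz) do
            instance_meths.each do |meth|
              define_method(meth) do |*args|
                # e.g. "Snip::UrlSnipService::snip_it" for Snip::UrlSnipService::Stub#snip_it
                tracker_name = "#{self.class.name[0..self.class.name.rindex("::")]}:#{meth}"
                # Circuit breaker around the actual RPC call.
                # Note: an unset variable gives nil.to_i == 0, i.e. a cool-off time of 0 seconds.
                stoplight_lambda = Stoplight(tracker_name) {
                  super(*args)
                }.with_cool_off_time(ENV['GRPC_COMMONS_COOLOFF_INTERVAL'].to_i)
                # Time the whole call (including the breaker check) and return its result
                response = StatsD.measure(tracker_name) do
                  stoplight_lambda.run
                end
                response
              end
            end
          end
          alt_stub_claz
        end
      end
    end
  end
end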
As you can see, I am wrapping each RPC method defined in the Service class: whenever a Service class includes the GrpcCommons::GenericService module instead of GRPC::GenericService, its Stub class gets overridden by the above implementation. Each RPC call goes through a circuit breaker check (using the Stoplight gem), and the actual run method is wrapped in a StatsD call for instrumentation. If you are not aware of StatsD and how it helps in monitoring your apps, read this blog by Datadog. I will connect the dots on how I have been doing my monitoring in Part 2.
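The wrapping trick itself does not depend on gRPC, so here is a small sketch of it with stand-ins only. Demo::BaseStub plays the role of the generated stub class and Demo.measure is a hypothetical replacement for StatsD.measure; neither is part of grpc-commons. It also shows why the tracker name works: an anonymous class only gets a name once it is assigned to a constant.

```ruby
module Demo
  # Stand-in for a generated gRPC stub class.
  class BaseStub
    def snip_it(req)
      "snipped:#{req}"
    end
  end

  # Collected [name, elapsed_seconds] pairs; stand-in for a metrics backend.
  MEASUREMENTS = []

  # Times the block, records the duration, and returns the block's value.
  def self.measure(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    MEASUREMENTS << [name, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
    result
  end

  # Builds an anonymous subclass that wraps every method defined directly on base.
  def self.wrap(base)
    Class.new(base) do
      base.instance_methods(false).each do |meth|
        define_method(meth) do |*args|
          klass_name = self.class.name
          tracker_name = "#{klass_name[0..klass_name.rindex('::')]}:#{meth}"
          # Arguments must be passed explicitly: implicit super is not
          # supported inside a method created with define_method.
          Demo.measure(tracker_name) { super(*args) }
        end
      end
    end
  end

  # Assigning the anonymous class to a constant gives it the name "Demo::Stub".
  Stub = wrap(BaseStub)
end

puts Demo::Stub.new.snip_it('http://example.com') # prints "snipped:http://example.com"
```

Because the wrapper is a subclass with the same method names, callers keep using `Stub.new` and `snip_it` exactly as before; here the recorded metric name would be "Demo::snip_it".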
Now, let's use this tool in our gRPC client. The modified service definition file includes GrpcCommons::GenericService in place of GRPC::GenericService.
Make sure the grpc-commons gem is included in your Gemfile. Full code is available in a sample gRPC app (snip), which I have discussed in previous posts as well.
Now, you are set to use the new wrapper (with Stoplight and StatsD tracking). There is nothing you need to change in the way the gRPC call is made. Just changing your service definition to use the new module does all the work under the hood!
# The client call is exactly the same as before!
stub = Snip::UrlSnipService::Stub.new('0.0.0.0:50052', :this_channel_is_insecure)
req = Snip::SnipRequest.new(url: 'http://shiladitya-bits.github.io')
resp_obj = stub.snip_it(req)
We are done with a basic setup of these common tools for circuit breaking and monitoring. I will continue with the implementation details of each in Part 2.