I’ve been using Polly for a number of years now. For the most part, my usage of the library has been to solve some quite basic problems; commonly to implement retry logic when calling external services for example. In this post, I want to explore a requirement I had when using Polly within a library that would be shared with various other internal projects.
The library acts as an “SDK” of sorts, wrapping an API we have written. This enables our consuming services to reference the package if they need to consume the API, avoiding repetition of the code required to interact with it.
The scenario that this post will focus on is how we can capture diagnostic information during policy execution for use in application monitoring.
Let’s take a simple example. We want to record application metrics about the number of retries that each attempt to call a third party service requires. I want to be able to get this information after the execution of the code that is wrapped in the policy is complete. In my case, some of my consuming applications will record this data to a third-party service called DataDog. We can then track and monitor this metric over time to understand if calls to the API are degraded.
I don’t want to force my SDK library to depend on DataDog directly since I can’t assume all consumers will need, nor want to record metrics. Instead, I want to capture data during policy execution which my SDK can pass back as part of a result object to the caller. Consuming projects can then choose to use that information if they need to, or disregard it if they don’t.
NOTE: One thing I should highlight at this point is that the Polly team are actively planning work on a diagnostics feature for Polly. Once that work is completed and becomes available then solving this requirement will become simpler.
The solution which I’ve come up with in the meantime is to utilise the Polly Context. Essentially the Context allows you to pass in a set of objects which can then be accessed during policy execution. The context includes a (lazily-initialised) dictionary to store any data/objects that you want.
The way I chose to implement my requirement was to set up the context and attach it to the policy executing the retry around the HTTP request.
Creating the policy
NOTE: Our library makes use of the new HttpClientFactory feature in .NET Core 2.1, so the examples here will mostly focus on that use case.
First, we’ll define a policy which will execute our HTTP request and utilise the Context to record the retries.
Update – 26-07-2018
Since publishing this post I’ve discussed this sample with Dylan Reisenberger who expertly suggested this can be simplified if we instead use the built-in retryCount to set the value on our context.
We no longer need to increment our own count and can set/update the value for the “retrycount” key with less code.
Here we’ve used the HttpPolicyExtensions to help create a policy which will retry any transient errors that occur when making the request.
The WaitAndRetryAsync method, as one of its overloads, accepts an Action delegate, which as one of its arguments includes the Context object. In the preceding example, I try to access an item with the key “retrycount” from the Context dictionary. Using pattern matching we can check that the value is an integer and if so, assign it to a local variable called retries.
In the case where this value is available, we can then increment the retries value and assign it back into the context against the retrycount key.
Executing the policy
Before executing code wrapped in the policy, we need to create a Context to pass to the execution. Creating a Context is as simple as allocating one and adding an entry in its internal dictionary.
In the preceding code, we’ve created a context and added a retrycount integer to it, initialised with a value of zero. This can then be used to track the number of attempts made during an execution of a retry based policy.
With a Context object created, we can go ahead and pass it into the policy execution. The standard way to do this is to pass it as an argument to the Execute or ExecuteAsync method when utilising the policy. For example:
However, in my case, I am using the new HttpClientFactory feature. When using HttpClientFactory, clients are defined in the ConfigureServices method with any required Polly policies being added using the various Polly extension methods on the IHttpClientBuilder. See my previous post for more detail of how to use Polly with IHttpClientFactory.
With HttpClientFactory, we don’t directly execute the policy. That is done for us within the handlers.
To support the use of the context with HttpClientFactory, an extension method on the HttpRequestMessage is provided called SetPolicyExecutionContext. This accepts a Polly Context object which it then adds it to the request properties (a Dictionary<string, object>). During execution, the handler can access the context from the request and pass it into the policy it is executing.
We’ll use that approach in this case. Firstly we create the request and then call the SetPolicyExecutionContext to apply our context object:
We can then get a client from the HttpClientFactory. There are various ways to achieve this which I’ve covered in my HttpClientFactory series. For this example we manually use the factory to create a fresh client:
This client has had the retry policy added to it when defining it in the ConfigureServices method…
After the policy has executed, the retrycount can be accessed from the original reference to context object that we attached to the request.
At this point, I can add the retryCount value to an object which my library passes back to the caller. The caller can use that information if it needs to in order to log events or record DataDog metrics. I won’t include that code here.
Hopefully, this post demonstrates how easy it is to use the Polly context to pass data into and back out of the execution of policies. This is proving useful for my current scenarios as it allows general policies to be defined centrally which can then be used in multiple places. Remember, in future versions of Polly we can expect some new diagnostic functionality, perhaps in the form of events, which we can hook into to give a richer insight into details such as the number of executed retry attempts. For now, this quite straightforward approach is a solution which I’m pretty happy with.