Working with Polly – Using the Context to Obtain the Retry Count for Diagnostics

Exploring a basic use case for the Polly Context object to track diagnostic data

I’ve been using Polly for a number of years now. For the most part, my usage of the library has been to solve some quite basic problems; commonly to implement retry logic when calling external services for example. In this post, I want to explore a requirement I had when using Polly within a library that would be shared with various other internal projects.

The library acts as an “SDK” of sorts, wrapping an API we have written. This enables our consuming services to reference the package if they need to consume the API, avoiding repetition of the code required to interact with it.

The scenario that this post will focus on is how we can capture diagnostic information during policy execution for use in application monitoring.

The Requirement

Let’s take a simple example. We want to record application metrics about the number of retries that each call to a third-party service requires. I want to be able to get this information after the code wrapped in the policy has finished executing. In my case, some of my consuming applications will record this data to a third-party service called DataDog. We can then track and monitor this metric over time to understand if calls to the API are degraded.

I don’t want to force my SDK library to depend on DataDog directly, since I can’t assume all consumers will need, or want, to record metrics. Instead, I want to capture data during policy execution which my SDK can pass back as part of a result object to the caller. Consuming projects can then choose to use that information if they need to, or disregard it if they don’t.

NOTE: One thing I should highlight at this point is that the Polly team are actively planning work on a diagnostics feature for Polly. Once that work is completed and becomes available then solving this requirement will become simpler.

The solution which I’ve come up with in the meantime is to utilise the Polly Context. Essentially the Context allows you to pass in a set of objects which can then be accessed during policy execution. The context includes a (lazily-initialised) dictionary to store any data/objects that you want.

The way I chose to implement my requirement was to set up the context and attach it to the policy executing the retry around the HTTP request.

Creating the policy

NOTE: Our library makes use of the new HttpClientFactory feature in .NET Core 2.1, so the examples here will mostly focus on that use case.

First, we’ll define a policy which will execute our HTTP request and utilise the Context to record the retries.
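A minimal sketch of such a policy might look like the following, assuming the Polly and Polly.Extensions.Http packages. The limit of three retries, the exponential back-off and the `RetryPolicies` class name are illustrative assumptions, not details of the library described in this post.

```csharp
using System;
using System.Net.Http;
using Polly;
using Polly.Extensions.Http;

public static class RetryPolicies
{
    // Key under which the running retry count is stored in the Context.
    public const string RetryCountKey = "retrycount";

    public static IAsyncPolicy<HttpResponseMessage> CreateRetryPolicy()
    {
        return HttpPolicyExtensions
            .HandleTransientHttpError() // HttpRequestException, 5xx and 408 responses
            .WaitAndRetryAsync(
                3, // illustrative retry limit
                retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
                onRetry: (outcome, delay, context) =>
                {
                    // Only increment when the caller has seeded an integer count.
                    if (context.TryGetValue(RetryCountKey, out var value) && value is int retries)
                    {
                        context[RetryCountKey] = retries + 1;
                    }
                });
    }
}
```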


Update – 26-07-2018

Since publishing this post I’ve discussed this sample with Dylan Reisenberger who expertly suggested this can be simplified if we instead use the built-in retryCount to set the value on our context. 

We no longer need to increment our own count and can set/update the value for the “retrycount” key with less code.
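A sketch of that simplified form, under the same assumptions as before (Polly and Polly.Extensions.Http, three retries and exponential back-off chosen purely for illustration; `SimplifiedRetryPolicies` is a hypothetical name):

```csharp
using System;
using System.Net.Http;
using Polly;
using Polly.Extensions.Http;

public static class SimplifiedRetryPolicies
{
    public static IAsyncPolicy<HttpResponseMessage> CreateRetryPolicy()
    {
        return HttpPolicyExtensions
            .HandleTransientHttpError()
            .WaitAndRetryAsync(
                3, // illustrative retry limit
                retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
                onRetry: (outcome, delay, retryCount, context) =>
                {
                    // Polly supplies the 1-based number of the retry about to be made,
                    // so the value can be stored directly.
                    context["retrycount"] = retryCount;
                });
    }
}
```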


Here we’ve used the HttpPolicyExtensions to help create a policy which will retry any transient errors that occur when making the request.

The WaitAndRetryAsync method, as one of its overloads, accepts an Action delegate, which as one of its arguments includes the Context object. In the preceding example, I try to access an item with the key “retrycount” from the Context dictionary. Using pattern matching we can check that the value is an integer and if so, assign it to a local variable called retries.

In the case where this value is available, we can then increment the retries value and assign it back into the context against the retrycount key.

Executing the policy

Before executing code wrapped in the policy, we need to create a Context to pass to the execution. Creating a Context is as simple as allocating one and adding an entry in its internal dictionary.
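For instance, a context seeded with a retry count could be created like this (a small sketch using an index initialiser):

```csharp
using Polly;

// Seed the context with a zero retry count before the policy runs.
var context = new Context { ["retrycount"] = 0 };
```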

In the preceding code, we’ve created a context and added a retrycount integer to it, initialised with a value of zero. This can then be used to track the number of attempts made during an execution of a retry based policy.

With a Context object created, we can go ahead and pass it into the policy execution. The standard way to do this is to pass it as an argument to the Execute or ExecuteAsync method when utilising the policy. For example:
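The sketch below, written as top-level statements, shows the context being passed to ExecuteAsync. The policy is a stand-in that retries immediately on any non-success response, and `CallServiceAsync` is a hypothetical operation that fails twice before succeeding, so no network access is needed to try it.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

// Stand-in policy: retry up to three times on any non-success response, with no delay.
var policy = Policy
    .HandleResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .RetryAsync(3, onRetry: (outcome, retryCount, ctx) => { ctx["retrycount"] = retryCount; });

var context = new Context { ["retrycount"] = 0 };

// Hypothetical operation that returns 503 twice and then 200.
var calls = 0;
Task<HttpResponseMessage> CallServiceAsync()
{
    calls++;
    var status = calls < 3 ? HttpStatusCode.ServiceUnavailable : HttpStatusCode.OK;
    return Task.FromResult(new HttpResponseMessage(status));
}

// The context is passed as the second argument to ExecuteAsync.
var response = await policy.ExecuteAsync(ctx => CallServiceAsync(), context);

// Two retries were needed before the call succeeded.
var retries = (int)context["retrycount"];
```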

However, in my case, I am using the new HttpClientFactory feature. When using HttpClientFactory, clients are defined in the ConfigureServices method with any required Polly policies being added using the various Polly extension methods on the IHttpClientBuilder. See my previous post for more detail of how to use Polly with IHttpClientFactory.

With HttpClientFactory, we don’t directly execute the policy. That is done for us within the handlers.

To support the use of the context with HttpClientFactory, an extension method on the HttpRequestMessage is provided called SetPolicyExecutionContext. This accepts a Polly Context object which it then adds to the request properties (a Dictionary<string, object>). During execution, the handler can access the context from the request and pass it into the policy it is executing.

We’ll use that approach in this case. Firstly we create the request and then call the SetPolicyExecutionContext to apply our context object:
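A short sketch of those two steps, assuming the Microsoft.Extensions.Http.Polly package is referenced; the URL is a placeholder:

```csharp
using System.Net.Http;
using Polly;

var context = new Context { ["retrycount"] = 0 };

// The URL is a placeholder for the API endpoint being called.
var request = new HttpRequestMessage(HttpMethod.Get, "https://www.example.com");

// Attach the context so the Polly handler can pass it into the policy.
request.SetPolicyExecutionContext(context);
```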

We can then get a client from the HttpClientFactory. There are various ways to achieve this which I’ve covered in my HttpClientFactory series. For this example we manually use the factory to create a fresh client:

This client has had the retry policy added to it when defining it in the ConfigureServices method…
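As an illustration, the registration could look something like the sketch below. The client name "MyClient", the `AddMyClient` extension method and the retry settings are assumptions made for this example.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;

public static class HttpClientRegistration
{
    // Intended to be called from Startup.ConfigureServices.
    public static IServiceCollection AddMyClient(this IServiceCollection services)
    {
        var retryPolicy = HttpPolicyExtensions
            .HandleTransientHttpError()
            .WaitAndRetryAsync(
                3, // illustrative retry limit
                retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
                onRetry: (outcome, delay, retryCount, context) =>
                {
                    context["retrycount"] = retryCount;
                });

        // "MyClient" is a hypothetical client name.
        services.AddHttpClient("MyClient")
            .AddPolicyHandler(retryPolicy);

        return services;
    }
}
```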

After the policy has executed, the retrycount can be accessed from the original reference to the context object that we attached to the request.
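Putting the steps together, a sketch of the whole flow might look like this. The `ApiClient` class, the "MyClient" name and the tuple return type are hypothetical; a real SDK would more likely return its own result object.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

public class ApiClient
{
    private readonly IHttpClientFactory _httpClientFactory;

    public ApiClient(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public async Task<(HttpResponseMessage Response, int RetryCount)> GetAsync(string url)
    {
        var context = new Context { ["retrycount"] = 0 };

        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.SetPolicyExecutionContext(context);

        // "MyClient" is assumed to be registered with the retry policy attached.
        var client = _httpClientFactory.CreateClient("MyClient");
        var response = await client.SendAsync(request);

        // The policy updated the same Context instance we still hold a reference to.
        var retryCount = 0;
        if (context.TryGetValue("retrycount", out var value) && value is int count)
        {
            retryCount = count;
        }

        return (response, retryCount);
    }
}
```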

At this point, I can add the retryCount value to an object which my library passes back to the caller. The caller can use that information if it needs to in order to log events or record DataDog metrics. I won’t include that code here.

Summary

Hopefully, this post demonstrates how easy it is to use the Polly context to pass data into and back out of the execution of policies. This is proving useful for my current scenarios as it allows general policies to be defined centrally which can then be used in multiple places. Remember, in future versions of Polly we can expect some new diagnostic functionality, perhaps in the form of events, which we can hook into to give a richer insight into details such as the number of executed retry attempts. For now, this quite straightforward approach is a solution which I’m pretty happy with.

HttpClientFactory in ASP.NET Core 2.1 (Part 5): Logging

Exploring the default request and response logging and how to replace the logging implementation

In the 2.1 release of IHttpClientFactory, the ASP.NET team included some built-in logging of the HTTP calls made via HttpClients created by the factory. This can be useful for the diagnosis of failures, as well as to understand the time taken to complete HTTP calls to external services. 

In this post, I want to explore what is available in the default logging, how we can control what gets logged, how the logging is implemented and finally, how we can replace the logging with our own implementation.

There’s quite a bit of technical detail in this post. I hope it proves useful and interesting for those working with the IHttpClientFactory. If you only want to read about customising the logging by replacing the default implementation, skip ahead to the “Customising the log messages” section.

What’s in the logs?

IHttpClientFactory includes two levels of logging. At information level, the time taken to process and send the request is included. This can be useful for monitoring slow responding external services for example. Here’s an example of the console output when information level logging is enabled:

info: System.Net.Http.HttpClient.MyClient.LogicalHandler[100]
  Start processing HTTP request GET https://api.github.com/repos/aspnet/docs/branches

info: System.Net.Http.HttpClient.MyClient.ClientHandler[100]
  Sending HTTP request GET https://api.github.com/repos/aspnet/docs/branches

info: System.Net.Http.HttpClient.MyClient.ClientHandler[101]
  Received HTTP response after 682.9818ms - OK

info: System.Net.Http.HttpClient.MyClient.LogicalHandler[101]
  End processing HTTP request after 693.1094ms - OK

If you require a deeper level of detail regarding the requests, this is available at trace level. With trace level logging enabled, details about the request and response headers will also be included in the log messages. Here’s an example from a request with trace logging enabled:

info: System.Net.Http.HttpClient.MyClient.LogicalHandler[100]
  Start processing HTTP request GET https://api.github.com/repos/aspnet/docs/branches

trce: System.Net.Http.HttpClient.MyClient.LogicalHandler[102]
  Request Headers:
  Accept: application/vnd.github.v3+json
  User-Agent: HttpClientFactory-Sample

info: System.Net.Http.HttpClient.MyClient.ClientHandler[100]
  Sending HTTP request GET https://api.github.com/repos/aspnet/docs/branches

trce: System.Net.Http.HttpClient.MyClient.ClientHandler[102]
  Request Headers:
  Accept: application/vnd.github.v3+json
  User-Agent: HttpClientFactory-Sample

info: System.Net.Http.HttpClient.MyClient.ClientHandler[101]
  Received HTTP response after 795.6954ms - OK

trce: System.Net.Http.HttpClient.MyClient.ClientHandler[103]
  Response Headers:
  Server: GitHub.com
  Date: Sun, 08 Jul 2018 09:44:09 GMT
  Status: 200 OK
  X-RateLimit-Limit: 60
  X-RateLimit-Remaining: 58
  X-RateLimit-Reset: 1531046594
  Cache-Control: public, max-age=60, s-maxage=60
  Vary: Accept
  ETag: "f0452653b55e5fef139a58372e3a7bf3"
  X-GitHub-Media-Type: github.v3; format=json
  Access-Control-Expose-Headers: ETag, Link, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
  Access-Control-Allow-Origin: *
  Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
  X-Frame-Options: deny
  X-Content-Type-Options: nosniff
  X-XSS-Protection: 1; mode=block
  Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
  Content-Security-Policy: default-src 'none'
  X-Runtime-rack: 0.029792
  X-GitHub-Request-Id: DCD6:3C9D:688D222:D064A9D:5B41DCE9
  Content-Type: application/json; charset=utf-8
  Content-Length: 2642

info: System.Net.Http.HttpClient.MyClient.LogicalHandler[101]
  End processing HTTP request after 818.4525ms - OK

trce: System.Net.Http.HttpClient.MyClient.LogicalHandler[103]
  Response Headers:
  Server: GitHub.com
  Date: Sun, 08 Jul 2018 09:44:09 GMT
  Status: 200 OK
  X-RateLimit-Limit: 60
  X-RateLimit-Remaining: 58
  X-RateLimit-Reset: 1531046594
  Cache-Control: public, max-age=60, s-maxage=60
  Vary: Accept
  ETag: "f0452653b55e5fef139a58372e3a7bf3"
  X-GitHub-Media-Type: github.v3; format=json
  Access-Control-Expose-Headers: ETag, Link, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
  Access-Control-Allow-Origin: *
  Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
  X-Frame-Options: deny
  X-Content-Type-Options: nosniff
  X-XSS-Protection: 1; mode=block
  Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
  Content-Security-Policy: default-src 'none'
  X-Runtime-rack: 0.029792
  X-GitHub-Request-Id: DCD6:3C9D:688D222:D064A9D:5B41DCE9
  Content-Type: application/json; charset=utf-8
  Content-Length: 2642

As you can see and hopefully would expect, at trace level, the details are quite verbose. This can be useful to gain a more complete understanding of the headers during development. It is not recommended that you enable this in production since it will not only quickly fill logs, but may also expose sensitive data such as authorisation tokens.

Each log message includes an event ID so that you can quickly filter out the events you are interested in. There are two loggers in use by default. There is an outer “LogicalHandler” logger which wraps the entire handler pipeline. This will allow for the timing of the entire pipeline to be included in the logs. Additionally, as this will run first in the pipeline, it will log the request headers as they appear before the request passes through any other handlers. The other handlers in the pipeline may modify those headers. Using the trace level logging it’s possible to capture that information to your logs which can be useful if you need to diagnose failures.

The inner logger has the suffix, “ClientHandler” in the category name. This will be the innermost handler and therefore be the last custom handler to run before the request is sent over the network. As a result, this will be able to record the final request headers as they appear before the request is sent over the wire.

For reference, the event IDs included by these loggers are as follows:

Outer “LogicalHandler”

100 RequestPipelineStart
101 RequestPipelineEnd
102 RequestPipelineRequestHeader
103 RequestPipelineResponseHeader

Inner “ClientHandler”

100 RequestStart
101 RequestEnd
102 RequestHeader
103 ResponseHeader

How does IHttpClientFactory logging work?

The logging in IHttpClientFactory is applied to the pipeline just before the HttpMessageHandlerBuilder.Build() method is called to return the final HttpMessageHandler pipeline.

This is achieved with the help of HttpMessageHandlerBuilder filters which are applied by the DefaultHttpClientFactory implementation. An interface named IHttpMessageHandlerBuilderFilter exists which can be implemented in order to define filters. By default, there is one implementation of this interface called LoggingHttpMessageHandlerBuilderFilter which is registered within DI. It’s possible to register more than one implementation against the interface. As long as the implementations are registered in DI, each one will be executed when building the pipeline.

The Configure method of LoggingHttpMessageHandlerBuilderFilter is responsible for creating the two loggers and passing them to logging handlers, which themselves are implementations of the DelegatingHandler abstract base class. It works as follows.
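Paraphrased as a sketch rather than copied from the library source, a filter with that behaviour has roughly the following shape. The class name `SketchLoggingFilter` is hypothetical, and the logger category names are inferred from the console output shown earlier.

```csharp
using System;
using Microsoft.Extensions.Http;
using Microsoft.Extensions.Http.Logging;
using Microsoft.Extensions.Logging;

// Hypothetical class: a paraphrase of the default filter's behaviour, not its source.
public class SketchLoggingFilter : IHttpMessageHandlerBuilderFilter
{
    private readonly ILoggerFactory _loggerFactory;

    public SketchLoggingFilter(ILoggerFactory loggerFactory)
    {
        _loggerFactory = loggerFactory ?? throw new ArgumentNullException(nameof(loggerFactory));
    }

    public Action<HttpMessageHandlerBuilder> Configure(Action<HttpMessageHandlerBuilder> next)
    {
        if (next == null) throw new ArgumentNullException(nameof(next));

        return builder =>
        {
            // Let the rest of the filter chain configure the pipeline first.
            next(builder);

            var outerLogger = _loggerFactory.CreateLogger(
                $"System.Net.Http.HttpClient.{builder.Name}.LogicalHandler");
            var innerLogger = _loggerFactory.CreateLogger(
                $"System.Net.Http.HttpClient.{builder.Name}.ClientHandler");

            // Outer handler goes first so that it surrounds everything else.
            builder.AdditionalHandlers.Insert(0, new LoggingScopeHttpMessageHandler(outerLogger));

            // Inner handler goes last so that it sees the final request.
            builder.AdditionalHandlers.Add(new LoggingHttpMessageHandler(innerLogger));
        };
    }
}
```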

When the delegate chain is called, the next(builder) call will execute the next delegate (an Action<HttpMessageHandlerBuilder>) to ensure the entire handler pipeline is configured.

Then, the two loggers are created, using the name from the builder. This will be the name given to the named client or the type of the typed client.

The outer handler is added at index 0 to the AdditionalHandlers list so that it surrounds all other handlers and is the first to execute. The inner handler is added to the end of the AdditionalHandlers list, so it will be the last to execute before the internal handlers responsible for making the HTTP request.

Each of these logging handlers is responsible for logging their messages before and after the SendAsync calls to the other handlers. Using this approach a timer can be started before the SendAsync call and used afterwards to record the total HTTP request time as well as the overall handler pipeline execution time. I won’t copy the code for those handlers here as they are quite long. Instead, if you are interested you can view them on GitHub.

The source for the outer “LogicalHandler”, the LoggingScopeHttpMessageHandler class, can be viewed on GitHub.

This class creates a logging scope as well as recording the log messages. Optionally, if Trace logging is enabled, it will iterate over the request and response headers, recording those to the logger also. The pattern used in this class is an example of the LoggerMessage approach to provide caching of the logging delegates for better performance. You can read more about this approach in the official documentation at docs.microsoft.com. It’s a little outside the scope of this post to go any deeper here.

The inner “ClientHandler” logger uses a very similar approach to record its log messages, and its source can also be viewed on GitHub.

Configuring the logging output

As with all logging which uses the Microsoft.Extensions.Logging library, you can control the log messages that are generated using configuration.

In the appsettings.json file, you can control and filter the logging which is recorded. The default production JSON file looks like this:
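In a project created from the ASP.NET Core 2.1 template, the logging section of that file typically looks like the following (any other top-level settings are omitted here):

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Warning"
    }
  }
}
```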

In this configuration, only warning messages and higher will be logged and as a result, no logging from the IHttpClientFactory handlers will be included. To enable logging we can add an extra log level configuration setting. If you read the “How does it work” section above you’ll recall that the loggers which are used to log the HTTP request log messages are defined with the category name beginning with “System.Net.Http”.

An option, therefore, is to enable the Information or Trace log levels for the System namespace:
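A sketch of that configuration, using Information as the example level:

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Warning",
      "System": "Information"
    }
  }
}
```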

However, this will also enable informational messages from all other components in the System root namespace. Therefore it may be better to configure the logging by limiting the configuration to “System.Net.Http.HttpClient” so that you only add in the messages from the HTTP requests through clients created via the HttpClientFactory:
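For example, the narrower category could be configured like this:

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Warning",
      "System.Net.Http.HttpClient": "Information"
    }
  }
}
```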

We can take this filtering a step further and filter down to a specific named or typed client. For a typed client, the name will match the name of the registered type.

Let’s imagine we are only interested in logging the requests via a typed client named MyClient. Also, perhaps we only want the raw timing of the HTTP request itself. In this example we can enable logging just for the ClientHandler of our MyClient:
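A configuration along these lines would do that, since the category name combines the client name with the ClientHandler suffix:

```json
{
  "Logging": {
    "LogLevel": {
      "Default": "Warning",
      "System.Net.Http.HttpClient.MyClient.ClientHandler": "Information"
    }
  }
}
```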

Customising the log messages

There may be cases where you want to add additional logging around HTTP requests through the clients managed by IHttpClientFactory. An option, in this case, is to introduce an extra handler into the pipeline. In part 3 of this series, I explored adding additional outgoing middleware handlers to your client configuration. Using that handler you can inspect the requests and responses, logging any data as necessary.

If you want to replace the default logging entirely to fully customise the message output, the recommended approach from the team is to replace the default implementation of the IHttpMessageHandlerBuilderFilter interface. In fact, this section of the post was inspired by an issue on the IHttpClientFactory GitHub repository. Let’s take a look at how we replace the logging so that we can record a correlation ID into the console messages. We want to replace the default implementation here since we don’t want additional log messages.

First, we’ll need to create a new implementation of IHttpMessageHandlerBuilderFilter:
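A sketch of such a filter is shown below. It assumes that the CustomLoggingScopeHttpMessageHandler class discussed in this post is defined elsewhere in the project with a constructor that accepts an ILogger; the logger category name is inferred from the console output.

```csharp
using System;
using Microsoft.Extensions.Http;
using Microsoft.Extensions.Logging;

public class CustomLoggingFilter : IHttpMessageHandlerBuilderFilter
{
    private readonly ILoggerFactory _loggerFactory;

    public CustomLoggingFilter(ILoggerFactory loggerFactory)
    {
        _loggerFactory = loggerFactory ?? throw new ArgumentNullException(nameof(loggerFactory));
    }

    public Action<HttpMessageHandlerBuilder> Configure(Action<HttpMessageHandlerBuilder> next)
    {
        if (next == null) throw new ArgumentNullException(nameof(next));

        return builder =>
        {
            // Run the rest of the filter chain first, then decorate the result.
            next(builder);

            var outerLogger = _loggerFactory.CreateLogger(
                $"System.Net.Http.HttpClient.{builder.Name}.LogicalHandler");

            // Only one (outer) logging handler is added in this simplified version.
            // CustomLoggingScopeHttpMessageHandler is assumed to be defined elsewhere.
            builder.AdditionalHandlers.Insert(0, new CustomLoggingScopeHttpMessageHandler(outerLogger));
        };
    }
}
```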

I’ve pretty much copied the default implementation for this filter. The main difference is that for simplicity I’m only using one outer logger for this example. We create a logger and then add in a new CustomLoggingScopeHttpMessageHandler to the handler pipeline.

The CustomLoggingScopeHttpMessageHandler class is as follows:
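A sketch of what such a handler could look like is shown below. The header name X-Correlation-ID and the "Not set" fallback are assumptions made for illustration; the message templates mirror the console output shown later in the post.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;

public class CustomLoggingScopeHttpMessageHandler : DelegatingHandler
{
    private readonly ILogger _logger;

    public CustomLoggingScopeHttpMessageHandler(ILogger logger)
    {
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request == null) throw new ArgumentNullException(nameof(request));

        // The whole operation runs inside a logging scope.
        using (Log.BeginRequestPipelineScope(_logger, request))
        {
            Log.RequestPipelineStart(_logger, request);
            var response = await base.SendAsync(request, cancellationToken);
            Log.RequestPipelineEnd(_logger, response);
            return response;
        }
    }

    private static class Log
    {
        // Cached delegates created once via LoggerMessage.
        private static readonly Func<ILogger, HttpMethod, Uri, string, IDisposable> _beginRequestPipelineScope =
            LoggerMessage.DefineScope<HttpMethod, Uri, string>("HTTP {HttpMethod} {Uri} {CorrelationId}");

        private static readonly Action<ILogger, HttpMethod, Uri, string, Exception> _requestPipelineStart =
            LoggerMessage.Define<HttpMethod, Uri, string>(
                LogLevel.Information,
                new EventId(100, "RequestPipelineStart"),
                "Start processing HTTP request {HttpMethod} {Uri} [Correlation: {CorrelationId}]");

        private static readonly Action<ILogger, HttpStatusCode, Exception> _requestPipelineEnd =
            LoggerMessage.Define<HttpStatusCode>(
                LogLevel.Information,
                new EventId(101, "RequestPipelineEnd"),
                "End processing HTTP request - {StatusCode}");

        public static IDisposable BeginRequestPipelineScope(ILogger logger, HttpRequestMessage request)
        {
            var correlationId = GetCorrelationIdFromRequest(request);
            return _beginRequestPipelineScope(logger, request.Method, request.RequestUri, correlationId);
        }

        public static void RequestPipelineStart(ILogger logger, HttpRequestMessage request)
        {
            var correlationId = GetCorrelationIdFromRequest(request);
            _requestPipelineStart(logger, request.Method, request.RequestUri, correlationId, null);
        }

        public static void RequestPipelineEnd(ILogger logger, HttpResponseMessage response)
        {
            _requestPipelineEnd(logger, response.StatusCode, null);
        }

        // Returns the correlation ID header value if present (header name is an assumption).
        private static string GetCorrelationIdFromRequest(HttpRequestMessage request)
        {
            var correlationId = "Not set";

            if (request.Headers.TryGetValues("X-Correlation-ID", out var values))
            {
                correlationId = values.First();
            }

            return correlationId;
        }
    }
}
```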

There’s quite a lot to this class, but most of that is the static Log class and its methods. I won’t go into those too deeply here since they follow the LoggerMessage advice for more performant logging that you can read in the docs. For the most part, I took the existing LoggingScopeHttpMessageHandler class and tweaked it for my needs.

The first point to focus on here is that the whole operation is wrapped in a logging scope. Before and after the SendAsync method is called on the base, we use the static Log methods to record the log events.

Within the Log class, a few private delegates are defined to format the expected log messages.

Both _beginRequestPipelineScope and _requestPipelineStart accept a string which will be the correlation ID. They use the value to record the correlation ID into the scope properties as well as on the request started message.

A simple helper method has been added which inspects an HTTP request for the expected correlation ID header and, if present, returns its value. The BeginRequestPipelineScope and RequestPipelineStart methods both use this helper to extract the correlation ID.

The final step now that we have our filter implementation is to register it in DI, replacing the existing default filter applied by the HttpClientFactory library.

Inside the ConfigureServices method of the Startup class, we can call the Replace extension method on the IServiceCollection to swap out the default implementation for our own:
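A sketch of that registration follows. It assumes the CustomLoggingFilter class from this post is available, uses a hypothetical client name, and calls Replace after AddHttpClient so that the default filter registration already exists to be replaced.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Http;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // "MyClient" is a hypothetical client name.
        services.AddHttpClient("MyClient");

        // Swap the default logging filter for CustomLoggingFilter.
        services.Replace(
            ServiceDescriptor.Singleton<IHttpMessageHandlerBuilderFilter, CustomLoggingFilter>());
    }
}
```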

The Replace method will find the first registered service of IHttpMessageHandlerBuilderFilter and replace that registration with this new one, where our CustomLoggingFilter is the implementation.

Now, when we run the application, the console logs include our correlation ID:

info: System.Net.Http.HttpClient.MyClient.LogicalHandler[100]
      Start processing HTTP request GET https://api.github.com/repos/aspnet/docs/branches [Correlation: 447c8d6b-e280-4538-bd31-56d508266b5b]

info: System.Net.Http.HttpClient.MyClient.LogicalHandler[101]
      End processing HTTP request - OK

As a side note, this filter approach is a great way to add common cross-cutting concerns for your whole application. It’s possible to register additional filters, each of which could add their own common handlers onto all clients created via HttpClientFactory.

Summary

In this post, we looked at the type of information available to us from the built-in logging, included as part of the HttpClientFactory library. We looked at how we can use log configuration to control which log messages we see and also looked at how logging has been implemented within the library. Finally, we explored using the IHttpMessageHandlerBuilderFilter interface to replace the default logging filter. I hope this has been helpful. I’ll be keeping an eye on the progress for the 2.2 release where we may see more logging, including some Polly integration making its way into the product.

Other Posts in this Series

Part 1 – An introduction to HttpClientFactory
Part 2 – Defining Named and Typed Clients
Part 3 – Outgoing request middleware with handlers
Part 4 – Integrating with Polly for transient fault handling
Part 5 – This post