IHttpClientFactory Patterns: Using Typed Clients from Singleton Services
Exploring an approach to reuse transient typed clients within singleton services

I’ve been following IHttpClientFactory for some time and have written a number of blog posts on its various features, based on sample applications. Since then, with ASP.NET Core 2.1 now released, I’ve begun using IHttpClientFactory in real production applications. In this post, I want to start covering some patterns I’ve begun to apply as I develop my applications.

This time I want to look at an issue regarding the scope of typed clients. This is based on something I’ve now directly encountered and that was raised as a comment on an earlier post by a reader named Yair.

When defining typed clients in your ConfigureServices method, the typed service is registered with transient scope. This means that a new instance is created by the DI container every time one is needed. The reason this occurs is that a HttpClient instance is injected into the typed client instance. That HttpClient instance is intended to be short lived so that the HttpClientFactory can ensure that the underlying handlers (and connections) are released and recycled.

This works in cases where you plan to consume the typed service from another transient service. A common place to use typed clients in ASP.NET Core is a controller. That works as expected since the controller is created by the framework for each request.

However, what if you want to use the typed client within a singleton service? That presents a problem. The typed client would be created and injected once, and then held onto by the singleton service for the lifetime of the application, along with the HttpClient it wraps. This is not the behaviour we want.

Originally, when asked by Yair about how to use HttpClientFactory in singleton services I suggested instead using the named client approach and then injecting the IHttpClientFactory directly to your singleton service. From there, you can call CreateClient on the factory within methods on that singleton service, so that for each invocation, a new HttpClient is created for only a short lifetime.

The problem is that we then lose the typed client behaviour, where we can encapsulate the work necessary to interact with a third party API as a service. Instead, we would have to make a service that depends on the IHttpClientFactory as I suggested above and then pass that into the necessary places in our code.

Having now tackled this in a real project, where I am essentially building an SDK around an internal API that I’ve developed, I have reassessed the options. What I ended up doing was using the typed client approach, but also providing my own factory which can return instances of that typed client. This is actually very simple to do by leveraging the DI service provider directly.

Here is an example of a typed client and its interface:
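A minimal sketch of such a typed client might look like the following. The GetConfigurationAsync method and the config/{key} route are hypothetical names chosen for illustration, and the relative URI assumes that a BaseAddress is set on the injected HttpClient when the client is registered.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public interface IConfigurationService
{
    // Hypothetical operation: fetch a single configuration value by key.
    Task<string> GetConfigurationAsync(string key);
}

public class ConfigurationService : IConfigurationService
{
    private readonly HttpClient _httpClient;

    // The HttpClient is supplied by IHttpClientFactory via the DI container.
    public ConfigurationService(HttpClient httpClient)
    {
        _httpClient = httpClient ?? throw new ArgumentNullException(nameof(httpClient));
    }

    public async Task<string> GetConfigurationAsync(string key)
    {
        // Assumes BaseAddress was configured at registration; the route is illustrative.
        using (var response = await _httpClient.GetAsync($"config/{Uri.EscapeDataString(key)}"))
        {
            // Validate the response before reading the content.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```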

I won’t dive deeply into how this service works. It’s a typed client and you can read my earlier post about named and typed clients for more information.

In short, this service expects to have a HttpClient instance injected when it is created by the DI container. It then wraps the logic needed to call various endpoints of a remote API. Within this service, we can include the code needed to validate the response and deserialise the returned content from the request.

We can register the typed client in our ConfigureServices method as follows:

At this point, we have a typed client which can be consumed from other transient services and controllers. To make this accessible to singleton services in our application we can add a basic factory.

Here we have created a basic interface for an IConfigurationServiceFactory which defines a single GetConfigurationService method.

The implementation takes an IServiceProvider in its constructor, which will be injected by DI. With access to the service provider, we can use it to return an instance of IConfigurationService from the GetConfigurationService method. As this is a transient typed client, a new instance will be returned each time this method is called.

In ConfigureServices, we can register the factory with DI as a singleton:

As this is a singleton, we can consume this from any class where we need access to an instance of the typed client, even if that class is registered with singleton scope in DI.

The important thing here is that we don’t create an instance of the IConfigurationService and hold onto it for the lifetime of the singleton service. We can hold the IConfigurationServiceFactory and then we must use that whenever a method needs to get access to the IConfigurationService.
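Pulling the pieces above together, the whole pattern can be sketched roughly as follows. This assumes the Microsoft.Extensions.Http package is referenced. The compact typed client, the FeatureFlagCache consumer and the base address are hypothetical stand-ins for illustration; only the IConfigurationService, IConfigurationServiceFactory and GetConfigurationService names come from the description above.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

// Compact stand-in for the typed client (hypothetical method and route).
public interface IConfigurationService
{
    Task<string> GetConfigurationAsync(string key);
}

public class ConfigurationService : IConfigurationService
{
    private readonly HttpClient _httpClient;
    public ConfigurationService(HttpClient httpClient) => _httpClient = httpClient;
    public Task<string> GetConfigurationAsync(string key) =>
        _httpClient.GetStringAsync($"config/{Uri.EscapeDataString(key)}");
}

public interface IConfigurationServiceFactory
{
    IConfigurationService GetConfigurationService();
}

public class ConfigurationServiceFactory : IConfigurationServiceFactory
{
    private readonly IServiceProvider _serviceProvider;

    public ConfigurationServiceFactory(IServiceProvider serviceProvider) =>
        _serviceProvider = serviceProvider;

    // The typed client is registered as transient, so every call resolves
    // a new instance wrapping a newly created HttpClient.
    public IConfigurationService GetConfigurationService() =>
        _serviceProvider.GetRequiredService<IConfigurationService>();
}

// Hypothetical singleton consumer: it stores the factory, never the typed client.
public class FeatureFlagCache
{
    private readonly IConfigurationServiceFactory _factory;

    public FeatureFlagCache(IConfigurationServiceFactory factory) => _factory = factory;

    public Task<string> RefreshAsync(string key)
    {
        // Ask the factory for a typed client each time one is needed.
        var configurationService = _factory.GetConfigurationService();
        return configurationService.GetConfigurationAsync(key);
    }
}

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Typed client: registered with transient lifetime by AddHttpClient.
        services.AddHttpClient<IConfigurationService, ConfigurationService>(client =>
            client.BaseAddress = new Uri("https://config.example.internal/")); // placeholder URL

        // Factory and consumer: both singletons.
        services.AddSingleton<IConfigurationServiceFactory, ConfigurationServiceFactory>();
        services.AddSingleton<FeatureFlagCache>();
    }
}
```

The key design choice is that the singleton only ever keeps a reference to the factory, so no HttpClient outlives the method call that needed it.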

I’m sure there may be other ways to achieve similar results but I’m fairly happy with this approach for now.

Related Posts

Part 1 – An introduction to HttpClientFactory
Part 2 – IHttpClientFactory – Defining Named and Typed Clients
Part 3 – IHttpClientFactory – Outgoing request middleware with handlers
Part 4 – IHttpClientFactory – Integrating with Polly for transient fault handling

How to record metrics to DataDog from ASP.NET Core
Recording metrics and events to DataDog in development and production in AWS ECS.

I’m currently architecting and building a new microservices based system at work. A priority of mine has been to learn from the experience of building our first microservices project by putting a greater focus on logging and monitoring. We freely acknowledge that we didn’t get this as good as we would have liked in our first implementation. Logging is hugely important in a microservices design. Tracking down failures and bugs can be very difficult without good logging and even more importantly, a way to search and view those logs.

But logging isn’t what I want to cover in this post. Alongside logging, something we felt we were missing in our previous project was detailed performance and usage metrics. We can gain some insight into the state of the system using the built-in AWS metrics for our ECS services and by monitoring, again via CloudWatch, things such as SQS queue metrics. This is all reasonably useful, but at times it would be great to know a bit more about the services themselves.

With this new project, I’ve started instrumenting it with DataDog so that the application can record useful metrics that allow us to track its overall health, as well as understand how things are performing and what’s actually being used. The goal is to build up a much better profile of the application so that we can more easily see whether changes that we have deployed have improved the performance or whether they have degraded it. Having proper baselines of the application performance is key to being able to make such determinations.

The choice of DataDog was made simply because our systems team use it already for monitoring the overall AWS environment. After a few quick enquiries, I found that recording metrics from the application code would be possible thanks to the availability of a C# library from DataDog called “dogstatsd-csharp-client”. My only concern is that the library seems to be a little stalled, with no commits since Nov 30th 2017. There is also a third party library called DatadogSharp which I may investigate in the future. Their readme suggests it is more performant than DataDog’s own library. Putting that aside for now, I decided I’d go ahead and prototype based on the DataDog library and revisit these concerns if they proved founded once I was underway. I have built a small wrapper library over the static DogStatsd methods so I have a layer of abstraction if I decide to change the underlying client. It also allows me to centralise the use of some common tags which I wanted to include on all of my metrics.

In this post, I want to focus on the slightly higher level concepts by discussing how I got this working during development and more recently, deploying to a prototype in production running in AWS ECS.

Development

The DogStatsd client sends messages over UDP to an agent server which will collect these and eventually send them up to the DataDog service. You can read more about that flow on the DataDog DogStatsD page. Therefore, to test my code during development, I needed to have an agent server running somewhere.

In my case, I opted for a Docker image which I could run locally. After a bit of Googling, I ended up at the DogStatsD6 Docker Image GitHub repository. This image is hosted on the Docker Hub in the DataDog DogStatsD repository.

To begin working with this locally, I created a simple docker-compose file. I’ve not included some other elements I was running here for logging via the ELK stack as those are not necessary for this example. My docker-compose file looked like this:
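A minimal compose file along these lines might look like the sketch below. The image name is the datadog/dogstatsd repository on Docker Hub; the compose file version, the absence of an explicit image tag and the use of a host environment variable to supply the key are assumptions for illustration.

```yaml
version: "3"
services:
  dogstatsd:
    # DogStatsD agent image from Docker Hub (no tag pinned in this sketch).
    image: datadog/dogstatsd
    ports:
      # Map the agent's UDP port to the same port on the host machine.
      - "8125:8125/udp"
    environment:
      # The API key is read from the host environment rather than hard-coded.
      - DD_API_KEY=${DD_API_KEY}
```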

The above example specifies a dogstatsd service which will be started using the DogStatsD6 image. By default, this DogStatsD agent server will be listening on UDP port 8125 for data. I expose that port to port 8125 on my host (development) machine. For the DogStatsD server to be able to send its data to DataDog we need to provide an API key. This can be done by passing the key into the DD_API_KEY environment variable.

At this point, I could run a docker-compose up -d command to start an instance of the DogStatsD container.

Sending Metrics and Events from an ASP.NET Core based API

As I mentioned earlier, I’ve created a wrapper library for sending DataDog metrics and events via the methods from the dogstatsd-csharp-client library. I won’t be covering that here as I want to keep this post focused on the core principles. Instead, let’s look at how we can begin with recording metrics and events to DataDog.

The first step, before we can send data to DataDog, is to add the NuGet package for the dogstatsd-csharp-client library.

You can do this by searching for it via the NuGet package manager, or as I did, by editing the csproj file directly and adding the package reference.

<PackageReference Include="DogStatsD-CSharp-Client" Version="3.1.0" />

Once the library is added we must configure the client. All of the functionality for this client library is implemented as static methods. The first method we will call is the DogStatsd.Configure method. This takes a StatsdConfig object as a parameter. We’ll add a small static method in our Startup.cs class as follows:

We build up the configuration object using a server name string which we will pass into this helper method. This will be the hostname or IP of the server running the DogStatsd server. We also set the port it will be listening on. Finally, we can add a prefix which will be added to the front of any metric and event names. This is useful to make it easier to search for them in the DataDog application.

We then call the Configure method, passing in the config object.

Next, we need to call this method. We can do that from our Configure method in the Startup class. This is called once at application startup so is an appropriate place to do this. First, we’ll load the DogStatsd server name from ASP.NET Core configuration as this allows us to easily set it per environment. We’ll then call our ConfigureDataDog method, passing that server hostname through:
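Taken together, the helper method and the call from Configure could be sketched as below, assuming the DogStatsD-CSharp-Client package referenced above. The "DataDog:ServerName" configuration key and the "myapi" prefix are hypothetical values chosen for illustration.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.Configuration;
using StatsdClient;

public class Startup
{
    public Startup(IConfiguration configuration)
    {
        Configuration = configuration;
    }

    public IConfiguration Configuration { get; }

    public void Configure(IApplicationBuilder app)
    {
        // Hypothetical key; the value can be set per environment.
        var serverName = Configuration["DataDog:ServerName"];
        ConfigureDataDog(serverName);

        // ... the rest of the request pipeline would be configured here ...
    }

    private static void ConfigureDataDog(string serverName)
    {
        var config = new StatsdConfig
        {
            StatsdServerName = serverName, // hostname or IP of the DogStatsd server
            StatsdPort = 8125,             // default DogStatsD UDP port
            Prefix = "myapi"               // hypothetical prefix for metric and event names
        };

        // Configures the static client used by all later DogStatsd calls.
        DogStatsd.Configure(config);
    }
}
```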

We’ll need to include a section for this configuration in our appsettings.json file:
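As a sketch, that section could be as small as the following. The DataDog section name and the ServerName key are hypothetical; whatever names are used simply need to match the key the application reads at startup.

```json
{
  "DataDog": {
    "ServerName": "127.0.0.1"
  }
}
```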

During development, we can set this to the localhost IP address (127.0.0.1). Since we are running our DogStatsd server within a container and exposing its port through to the host, we can access it this way. In production, we can then set an appropriate production hostname for the DogStatsd server.

At this point, everything is ready to go. We can now use the static DogStatsd methods to send metrics and events from our code.

The client library readme includes examples of the methods we can call but as a basic example, here’s some code we can call from our application to record a metric whenever a profiles endpoint is called.

DogStatsd.Increment("all_profiles_endpoint_call");

At this stage, we have added basic metrics to our application and have a DogStatsd server running locally in a container to collect and forward those metrics to DataDog.

Production

The final stage is to get this working in production. I felt there might be a few options here but the one I wanted to try was to run the DogStatsd server as a sidecar container. We use AWS Elastic Container Service (ECS) to run our production services as Docker containers.

To achieve this we will add a second container to our task definition for our ECS service. ECS services are the logical structure that represents a unit of scale and deployment for containers in ECS. The service can run multiple containers which generally is not something you need or want to do since it limits your scaling options. However, for this requirement, running a second supporting container, it is a reasonable use case. This container is directly used only by our main microservice and should scale with that service.

The final ECS task definition looks like this:

The new section in my case is the dogstatsd container definition near the bottom (lines 39-51). The task definition takes one or more container definitions which define the containers that will be started as part of the tasks for this service.

The dogstatsd one is quite simple. We tell it to use the same image as we used in development, “datadog/dogstatsd”, which will be pulled from the Docker Hub. We can pass in the API Key via an environment variable. We can mark this container as non-essential since we don’t want its failure to cascade to causing our service to restart unexpectedly.

The next change is to add the links to the main API server container definition (lines 35-37). This ensures that it will have a network link with the dogstatsd container. Finally, we can add an environment variable for the API container which sets the hostname for the DogStatsD server. This will override the 127.0.0.1 setting from our appsettings.json file. Since we have linked the containers we can reference the DogStatsd server by its container name.

With these changes made it was then a case of registering this new task definition and updating our ECS service. When the service starts tasks, it will start an instance of the DogStatsd container which our application containers can use to record their metrics and events.

Summary

It’s still early days, but this is working quite nicely in the production environment. I will be reviewing it over time as I learn more about the various elements at play. This is likely not the end of the story or our journey with DataDog. Hopefully, this is enough information to get you started if you are following a similar path. 

HttpClientFactory in ASP.NET Core 2.1 (Part 4)
Integrating with Polly for transient fault handling

In the previous post in this series, I introduced the concept of outgoing middleware using DelegatingHandlers registered with named and typed clients. While that approach is available, the ASP.NET team hope that for most scenarios, we won’t need to resort to manually building our own handlers. In some cases, the built-in features of the library may provide the functionality we need. For example, it is sometimes useful to wrap requests within timing code to track how long they take to execute. This is now built into IHttpClientFactory as part of its default logging. In other cases, third-party integration may provide the functionality you require. For example, a common cross-cutting concern is handling transient faults during HTTP requests. In this case, rather than crafting our own retry logic, it’s much better to use a library such as Polly.

Polly is a popular transient fault handling library which provides a mechanism to define policies which can be applied when certain failures occur. One of the more commonly used policies is the retry policy. This allows you to wrap some code which, should a failure occur, will be retried; multiple times if necessary. This is very useful in situations where your application needs to communicate with external services. There is the ever-present risk when communicating with services over a transport such as HTTP that a transient fault will occur. A transient fault may prevent your request from being completed but is also likely to be a temporary problem. This makes retrying a sensible option in those cases.

As well as retries, Polly offers a number of other types of policy, many of which you may want to combine with retry to build up sophisticated ways to deal with failures. I will cover a few of the more general examples in this post, but if you want more comprehensive coverage I recommend you check out the Polly wiki.

The ASP.NET team have worked closely with Dylan and Joel, the primary maintainers of Polly, to include an integration pattern to make applying Polly policies to HttpClient instances really straightforward.

Before we can work with the Polly integrations we need to add a package reference to our project. The general IHttpClientFactory functionality lives inside the Microsoft.Extensions.Http package which is included as a dependency in the Microsoft.AspNetCore.App 2.1 meta package. This is a new meta package in ASP.NET Core 2.1 which doesn’t include third-party dependencies. Therefore, in order to use the Polly extensions for IHttpClientFactory we need to add the Microsoft.Extensions.Http.Polly package to our project.

After doing so in a basic project the csproj file will look something like this:

Applying a Policy

The Microsoft.Extensions.Http.Polly package includes an extension method called AddPolicyHandler on the IHttpClientBuilder that we can use to add a handler which will wrap all requests made using an instance of that client in a Polly policy. The IHttpClientBuilder is returned when we define a named or typed client.

We can then use the extensions in our ConfigureServices method…
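A minimal sketch of such a registration, assuming the Microsoft.Extensions.Http.Polly package has been added, registers a named client called “github” with a 10 second timeout policy:

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Every request made through the "github" client is wrapped in the timeout policy.
        services.AddHttpClient("github")
            .AddPolicyHandler(Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(10)));
    }
}
```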

In this example, we’re defining a client named “github” and we’ve used the AddPolicyHandler method to pass in a timeout policy. The policy you provide here must be an IAsyncPolicy<HttpResponseMessage>. This policy will timeout any requests after 10 seconds.

Reusing Policies

When using Polly, where possible, it is a good practice to define policies once and share them in cases where the same policy should be applied. This way, to change the rules for a policy, those changes only need to be made in one place. Also, it ensures that the policy is allocated only once. Certainly, policies such as the circuit breaker need to be shared if multiple callers expect to run through the same circuit breaker instance. 

For this example, we’ll declare the timeout policy from the last example once and share it with two named clients…
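One way this sharing might look is sketched below. The second client name, “google”, is a hypothetical choice for illustration.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Declared once, so the rules live in one place and the policy is allocated once.
        var timeoutPolicy = Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(10));

        services.AddHttpClient("github")
            .AddPolicyHandler(timeoutPolicy);

        // Hypothetical second client sharing the same policy instance.
        services.AddHttpClient("google")
            .AddPolicyHandler(timeoutPolicy);
    }
}
```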

We’ll look at another option for policy reuse a little later in this post when we explore using a PolicyRegistry.

Transient Fault Handling

When dealing with HTTP requests, the most common scenarios we want to handle are transient faults. As this is a common requirement, the Microsoft.Extensions.Http.Polly package includes a specific extension that we can use to quickly set up policies that handle transient faults.

For example, to add a basic retry when a transient fault occurs for requests from a named client we can register the retry policy as follows:
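A sketch of that registration using the AddTransientHttpErrorPolicy extension might look like this. The client name and the retry count of three are assumptions chosen for illustration.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Polly;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // The builder passed to the delegate already handles the common transient
        // HTTP failures; here it is turned into a policy that retries up to three times.
        services.AddHttpClient("github")
            .AddTransientHttpErrorPolicy(builder => builder.RetryAsync(3));
    }
}
```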

In this case, all requests made through the client will retry when certain failure conditions are met. The AddTransientHttpErrorPolicy method takes a Func<PolicyBuilder<HttpResponseMessage>, IAsyncPolicy<HttpResponseMessage>>. The PolicyBuilder here will be preconfigured to handle HttpRequestExceptions, any responses returning a 5xx status code and also any responses with a 408 (request timeout) status code. This should be suitable for many situations. If you require the policy to apply under other conditions, you will need to use a different overload to pass in a more specific policy.

Be aware: when performing retries we need to consider idempotency. Retrying a HTTP GET is a pretty safe operation. If we’ve made the call and not received any response, we can safely retry the call without any danger. However, consider what might happen if we retry a HTTP POST request. In that case, we have to be more careful since it’s possible that the original request was actually received, but the response we got back suggested a failure. Retrying could then lead to duplication of data, or corruption of the data stored in the downstream system. Here, you need to have more knowledge of what the downstream service will do if it receives the same request more than once. Is retrying a safe operation? When you own the downstream service, it is easier to control this. You might, for example, use some unique identifier to prevent duplicate POSTs.

When you have less control of the downstream system or you know that a duplicate POST might have negative consequences, you will need to control your policy more carefully. An option that might be suitable is to define different named/typed clients. You could create one for those requests that have no side effects and another for those that do. You can then use the correct client for the action being taken. However, this might become a little difficult to manage. A better option is to use an overload of AddPolicyHandler which gives us access to the HttpRequestMessage so that policies can be applied conditionally. That overload looks like this:

AddPolicyHandler(Func<HttpRequestMessage, IAsyncPolicy<HttpResponseMessage>> policySelector)

You’ll note that the policySelector delegate here has access to the HttpRequestMessage and is expected to return an IAsyncPolicy<HttpResponseMessage>. We don’t have access to a PolicyBuilder set up to handle transient faults as we did in our earlier example. If we want to handle the common transient errors, we’ll need to define the expected conditions for our policy. To make this easier, the Polly project includes a helper extension that sets up a PolicyBuilder ready to handle the common transient errors. To use the extension method we need to add the Polly.Extensions.Http package from NuGet.

We can then call HttpPolicyExtensions.HandleTransientHttpError() to get a PolicyBuilder that is configured with the transient fault conditions. We can use that PolicyBuilder to create a suitable retry policy which can then be conditionally applied when the request is a HTTP GET. In this example, any other HTTP methods use the NoOp policy.
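A sketch of this conditional approach, assuming both the Microsoft.Extensions.Http.Polly and Polly.Extensions.Http packages are referenced, could look like the following. The client name and the retry count are hypothetical.

```csharp
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Extensions.Http;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Retry policy built on the preconfigured transient fault conditions.
        IAsyncPolicy<HttpResponseMessage> retryPolicy =
            HttpPolicyExtensions.HandleTransientHttpError().RetryAsync(3);

        // Pass-through policy for requests that should not be retried.
        IAsyncPolicy<HttpResponseMessage> noOpPolicy =
            Policy.NoOpAsync<HttpResponseMessage>();

        // The selector runs per request: only GET requests are retried.
        services.AddHttpClient("conditionalpolicy")
            .AddPolicyHandler(request =>
                request.Method == HttpMethod.Get ? retryPolicy : noOpPolicy);
    }
}
```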

Using a PolicyRegistry

The last example I want to cover in this post is a basic demonstration of how policies can be applied from a policy registry. To support policy reuse, Polly provides the concept of a PolicyRegistry which is essentially a container for policies. These can be defined at application startup by adding policies into the registry. The registry can then be passed around and used to access the policies by name.

The extensions available on the IHttpClientBuilder also support adding Polly based handlers to a client using a registry.
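A minimal sketch of the registry approach is shown below, again assuming the Microsoft.Extensions.Http.Polly package. The “regular” policy name is the one used in the discussion that follows; the “long” policy name, the client name and both timeout durations are hypothetical values for illustration.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Creates a PolicyRegistry, registers it with DI and returns it.
        var registry = services.AddPolicyRegistry();

        // Two named timeout policies added at startup.
        registry.Add("regular", Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(10)));
        registry.Add("long", Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(30)));

        // The handler looks the policy up in the registry by name.
        services.AddHttpClient("regulartimeouthandler")
            .AddPolicyHandlerFromRegistry("regular");
    }
}
```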

First, we must register a PolicyRegistry with DI. The Microsoft.Extensions.Http.Polly package includes some extension methods to make this simple. In the above example, I call the AddPolicyRegistry method which is an extension on the IServiceCollection. This will create a new PolicyRegistry and register it in DI as the implementation for IPolicyRegistry<string> and IReadOnlyPolicyRegistry<string>. The method returns the registry so that we can add policies to it.

In this example, we’ve added two timeout policies and given them names. Now when registering a client we can call the AddPolicyHandlerFromRegistry method available on the IHttpClientBuilder. This takes the name of the policy we want to use. When the factory creates instances of this named client, it will add the appropriate handler, wrapping calls in the “regular” timeout policy which will be retrieved from the registry.

Summary

As a long time user of Polly, I’m very happy to see the integration being added with IHttpClientFactory. Together these libraries make it really easy to get up and running with HttpClient instances that are able to handle transient faults seamlessly. The examples I’ve shown are quite basic and general, but I hope they give the idea of how policies can be used and registered. For more detailed Polly documentation and examples, I recommend you check out the Polly wiki. It was great being involved in some of the early discussions with both the ASP.NET and Polly teams when this integration was being designed as I was able to suggest the usefulness of the policy registry extensions.

Other Posts in this Series

Part 1 – An introduction to HttpClientFactory
Part 2 – Defining Named and Typed Clients
Part 3 – Outgoing request middleware with handlers
Part 4 – This post