I’ve written a few blog posts about HttpClientFactory and HttpClient and these are proving popular topics. There exists some confusion about how HttpClient should be used and there is a degree of mystery around how it actually works.
In this post, I want to begin a potential series where we will explore the internals of HttpClient together and learn a little more about how it works under the hood. In this particular post, I want to start by explaining how a request is actually sent by HttpClient. We won’t dive too deep here as I want to focus on the higher-level flow and components involved. If there’s interest (let me know on Twitter or in the comments below), we can head deeper into the internals in future posts.
Please note: As much of the code involved is marked internal, it is always subject to change. This flow is correct as at March 2019 against the latest merged code under the master branch of corefx. That said, I wouldn’t expect major parts of this to change too dramatically. You can always review the code yourself in the corefx repo.
Exploring the Flow of the HttpRequestMessage
The diagram below represents the general flow of an HTTP request sent via HttpClient. It assumes that the newer SocketsHttpHandler is enabled (the default option in .NET Core 2.1 and higher). Some of the handlers are optional and some flows are conditional based on various settings in the request and the handlers in the chain. Handlers identified with the dashed border are optional and may or may not be called for your requests.
As you can see, there are many layers to this flow:
When using HttpClient we often start by creating a HttpRequestMessage. This contains the content, metadata and configuration for the request we wish to send.
Once we have the HttpRequestMessage, we can pass it as an argument to the HttpClient.SendAsync(…) method. There are a few overloads of this method and, for the purpose of this post, we’ll just worry about the simplest one, which accepts just the HttpRequestMessage.
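To make these first two steps concrete, here is a minimal sketch of building a request and handing it to SendAsync. The StubHandler class, the example.com URL and the X-Trace-Id header are my own illustrative choices; the stub only exists so the sample runs without touching the network.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub handler: returns a canned response instead of using the network.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            RequestMessage = request,
            Content = new StringContent("stubbed")
        };
        return Task.FromResult(response);
    }
}

public static class RequestExample
{
    public static async Task<HttpStatusCode> SendAsync()
    {
        using (var client = new HttpClient(new StubHandler()))
        using (var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com/api/items"))
        {
            // Metadata for the request lives on the message itself.
            request.Headers.Add("X-Trace-Id", "abc123");

            // The simplest overload: just the HttpRequestMessage.
            using (HttpResponseMessage response = await client.SendAsync(request))
            {
                return response.StatusCode;
            }
        }
    }
}
```

Calling await RequestExample.SendAsync() from an async method should hand back the status code set by the stub. In a real application you would drop the stub and let HttpClient use its default handler.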
HttpClient derives from HttpMessageInvoker. Within the SendAsync method of HttpClient, it calls the base.SendAsync(…) method on the HttpMessageInvoker.
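A quick way to see this relationship is to use HttpMessageInvoker directly and then treat an HttpClient as one. This is only a sketch; CountingHandler is a hypothetical stub of mine that counts calls rather than sending anything.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub handler that records how many requests reach it.
public sealed class CountingHandler : HttpMessageHandler
{
    public int Calls { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Calls++;
        return Task.FromResult(new HttpResponseMessage(HttpStatusCode.NoContent));
    }
}

public static class InvokerExample
{
    public static async Task<int> RunAsync()
    {
        var handler = new CountingHandler();

        // The base class on its own: it forwards the request to the handler.
        using (var invoker = new HttpMessageInvoker(handler, disposeHandler: false))
        using (var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com/"))
        using (await invoker.SendAsync(request, CancellationToken.None))
        {
        }

        // HttpClient is an HttpMessageInvoker, so it can be used through the base type.
        using (HttpMessageInvoker client = new HttpClient(handler, disposeHandler: false))
        using (var request = new HttpRequestMessage(HttpMethod.Get, "https://example.com/"))
        using (await client.SendAsync(request, CancellationToken.None))
        {
        }

        return handler.Calls;
    }
}
```

Both calls end up in the same handler, so RunAsync should report two calls. HttpClient layers extra behaviour (such as timeouts and default headers) on top before delegating to the base class.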
As I covered in a previous post looking at the creation and disposal of HttpClient, you can optionally provide it with a chain of HttpMessageHandlers as a constructor argument. If you don’t do this, a new HttpClientHandler will be created on your behalf, which is the start of a default chain of handlers. It is expected that the handler chain which you provide should wrap the HttpClientHandler with one or more delegating handlers. If you’re using HttpClientFactory (and you should be), it will do this for you when creating HttpClient instances for any named or typed clients which have been configured with additional handlers.
In the flow above, if you provide a handler chain, the handlers will be called in the order in which they were chained together. Each handler will have a SendAsync method which can inspect/modify the request. The handler will then generally pass the HttpRequestMessage on to the next handler in the chain. Your chain may include one handler or many depending on your requirements.
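The ordering is easier to see with a small sketch. LabelHandler and TerminalHandler are hypothetical names of mine; TerminalHandler stands in for HttpClientHandler so that nothing is sent over the network.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical delegating handler that logs before and after calling the next handler.
public sealed class LabelHandler : DelegatingHandler
{
    private readonly string _name;
    private readonly List<string> _log;

    public LabelHandler(string name, List<string> log, HttpMessageHandler inner) : base(inner)
    {
        _name = name;
        _log = log;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        _log.Add(_name + " request");
        // base.SendAsync passes the request on to the next handler in the chain.
        HttpResponseMessage response = await base.SendAsync(request, cancellationToken);
        _log.Add(_name + " response");
        return response;
    }
}

// Hypothetical end of the chain; in a real chain this would be HttpClientHandler.
public sealed class TerminalHandler : HttpMessageHandler
{
    private readonly List<string> _log;

    public TerminalHandler(List<string> log)
    {
        _log = log;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        _log.Add("terminal");
        return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
    }
}

public static class ChainExample
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var chain = new LabelHandler("outer", log,
            new LabelHandler("inner", log, new TerminalHandler(log)));

        using (var client = new HttpClient(chain))
        using (await client.GetAsync("https://example.com/"))
        {
        }

        return log;
    }
}
```

The log should read: outer request, inner request, terminal, inner response, outer response. The request travels down the chain and the response travels back up through the same handlers in reverse.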
Eventually, a handler in your chain will pass the request on to the HttpClientHandler.SendAsync(…) method.
HttpClientHandler is responsible for choosing the next part of the flow depending on the runtime and configuration your application is running under. From .NET Core 2.1, the default behaviour is for sockets-based managed code to be used to send the HTTP request over the network. This is the flow my diagram shows and which we’ll cover in this post. On older versions of .NET Core, the sending of HTTP requests is handled by unmanaged OS-level APIs. On Windows, there is a WinHttpHandler which calls into the WinHTTP API. On Linux and Mac, a CurlHandler will be used which calls into LibCurl.
When the Sockets handler is enabled, the HttpClientHandler will create a new instance of SocketsHttpHandler which it stores in a private field. Note that there is another possible flow here if any listeners are configured to listen to events from HttpHandlerDiagnosticListener. I’ll ignore this case for the purpose of keeping this post a little simpler to follow!
From SocketsHttpHandler.SendAsync(…), the flow becomes conditional based on the HttpConnectionSettings which have been configured on the SocketsHttpHandler. Appropriate internal handlers will be chained together, reflecting whether compression, authentication or redirects have been enabled/disabled. This happens once for the first request through the SocketsHttpHandler, after which point the handler chain is reused. Therefore I think it’s reasonable to say that the first request made via the SocketsHttpHandler will have a small additional overhead compared to subsequent requests.
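As a rough illustration of the settings involved, the sketch below creates a SocketsHttpHandler directly and sets the public properties that relate to compression, redirects and authentication. The class name, the credential values and the specific numbers are placeholders of mine, not recommendations, and how each property maps to an internal handler is an implementation detail.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class SocketsHandlerConfig
{
    public static SocketsHttpHandler Create()
    {
        return new SocketsHttpHandler
        {
            // Compression: ask for response bodies to be decompressed automatically.
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,

            // Redirects: follow them, up to a limit.
            AllowAutoRedirect = true,
            MaxAutomaticRedirections = 5,

            // Authentication: placeholder credentials for illustration only.
            Credentials = new NetworkCredential("demo-user", "demo-password"),

            // Pooling: how long a pooled connection may be reused.
            PooledConnectionLifetime = TimeSpan.FromMinutes(2)
        };
    }
}
```

You can pass the result to new HttpClient(SocketsHandlerConfig.Create()). These properties need to be set before the first request is sent through the handler, which fits with the chain being built once and then reused.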
The SendAsync(…) method on the optional handlers will be called if they are included in the handler chain. Eventually, the flow enters HttpConnectionHandler. Below here, the pool of socket connections is managed and used to send the HTTP requests. The pooling is controlled within the HttpConnectionPoolManager. Its SendAsync(…) method is called from the SendAsync(…) method on the HttpConnectionHandler.
The HttpConnectionPoolManager attempts to locate an existing pool for the connection. The exact details of this are probably worth leaving for a future post. In short, a key is created based on the request which needs to be sent and a ConcurrentDictionary of HttpConnectionPool objects is used to get an existing pool if one is available.
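The lookup can be pictured with a simplified sketch like the one below. SketchPool and SketchPoolManager are hypothetical types of mine, and the key here is just scheme, host and port; I’m assuming the real internal key carries more information than this.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a per-endpoint connection pool.
public sealed class SketchPool
{
    public SketchPool(string host, int port)
    {
        Host = host;
        Port = port;
    }

    public string Host { get; }
    public int Port { get; }
}

// Hypothetical stand-in for the pool manager: one pool per key.
public sealed class SketchPoolManager
{
    private readonly ConcurrentDictionary<(string Scheme, string Host, int Port), SketchPool> _pools =
        new ConcurrentDictionary<(string Scheme, string Host, int Port), SketchPool>();

    public int PoolCount => _pools.Count;

    public SketchPool GetPool(Uri requestUri)
    {
        // Build a key from the request, then reuse the pool if it exists or create it if not.
        var key = (requestUri.Scheme, requestUri.Host, requestUri.Port);
        return _pools.GetOrAdd(key, k => new SketchPool(k.Host, k.Port));
    }
}
```

With this sketch, two requests to https://example.com/ with different paths share a pool, while a request to a different host gets its own.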
Once a HttpConnectionPool is found or created, its SendAsync method is called. There’s quite a detailed internal flow which happens inside the HttpConnectionPool at this point, but ultimately it ends up trying to get a suitable existing HttpConnection. If one is not available, a new HttpConnection is created.
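The “reuse one if available, otherwise create one” idea looks roughly like this. It is a deliberately naive sketch with hypothetical types; the real pool also has to deal with connection limits, lifetimes and connections that have gone stale, none of which is modelled here.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a pooled connection.
public sealed class SketchConnection
{
    public SketchConnection(int id)
    {
        Id = id;
    }

    public int Id { get; }
}

// Hypothetical pool: hands out an idle connection when it has one, else creates a new one.
public sealed class SketchConnectionPool
{
    private readonly ConcurrentQueue<SketchConnection> _idle = new ConcurrentQueue<SketchConnection>();
    private int _created;

    public int CreatedCount => Volatile.Read(ref _created);

    public SketchConnection Rent()
    {
        if (_idle.TryDequeue(out SketchConnection existing))
        {
            return existing;
        }

        // Nothing idle, so a new connection is created.
        return new SketchConnection(Interlocked.Increment(ref _created));
    }

    public void Return(SketchConnection connection)
    {
        _idle.Enqueue(connection);
    }
}
```

Renting, returning and renting again gives back the same connection, whereas renting twice without returning creates a second one.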
The SendAsync(…) method is called on the HttpConnection and at this point, the HttpRequestMessage is converted into the raw HTTP request bytes and streamed to the socket. The response is received via the socket and parsed into the final HttpResponseMessage object which is returned back through the stack to the original calling code.
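To give a feel for what “converted into the raw HTTP request bytes” means, here is a simplified sketch that writes the request line and headers of an HTTP/1.1 request as text and encodes them. RawRequestSketch is a hypothetical helper of mine; the real HttpConnection writes into buffers and handles content, encodings and many edge cases that this ignores.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Text;

public static class RawRequestSketch
{
    // Builds the request line and headers only; the body is not handled in this sketch.
    public static string BuildRequestHead(HttpRequestMessage request)
    {
        var sb = new StringBuilder();

        // Request line, e.g. "GET /path?query HTTP/1.1".
        sb.Append(request.Method.Method)
          .Append(' ')
          .Append(request.RequestUri.PathAndQuery)
          .Append(" HTTP/1.1\r\n");

        // HTTP/1.1 requires a Host header; derive it from the URI if none was set.
        if (request.Headers.Host == null)
        {
            sb.Append("Host: ").Append(request.RequestUri.Authority).Append("\r\n");
        }

        foreach (KeyValuePair<string, IEnumerable<string>> header in request.Headers)
        {
            sb.Append(header.Key)
              .Append(": ")
              .Append(string.Join(", ", header.Value))
              .Append("\r\n");
        }

        // A blank line marks the end of the headers.
        sb.Append("\r\n");
        return sb.ToString();
    }

    public static byte[] BuildRequestHeadBytes(HttpRequestMessage request)
    {
        return Encoding.ASCII.GetBytes(BuildRequestHead(request));
    }
}
```

For a GET to https://example.com/api/items?page=2, the text begins with the request line GET /api/items?page=2 HTTP/1.1, followed by the headers and a blank line. Bytes along these lines are what end up being written to the socket.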
This has been a fairly high-level look at the flow of a HttpRequestMessage through HttpClient, the HttpMessageHandler chain and out over the network via the SocketsHttpHandler. For most developers, it’s not entirely necessary to know how all of these components are wired up, but I find it interesting to learn about this kind of thing. I hope it proves useful to a few people who, like me, want to understand in a little more detail how HttpClient works.
Let me know in the comments or on Twitter if further deep dives into more of the internal specifics are of interest to you. I’m sure I’ll be poking around some more and, if there’s value for people, I’m happy to write up my notes as future blog posts.