Custom JSON Serialisation with System.Text.Json Converters

This post is my contribution to the .NET Advent calendar. Make sure you check out the other amazing posts in the lead-up to Christmas!

At the time of writing, I am deep into work on some significant changes in the Elasticsearch .NET client. One of the changes is moving to System.Text.Json as the default serialiser used inside the client.

There are two “levels” of serialisation involved in the .NET client. Firstly, we have the serialisation of our own types: the request and response models, along with the query DSL. For that, we will rely entirely on System.Text.Json. We also have to consider the serialisation of the consumer types, such as the model for the data being read from or written to Elasticsearch. By default, we will use System.Text.Json for these too; however, consumers may opt to use a different serialiser such as Newtonsoft.Json.

With the 8.0 client, we are now generating most of the models from a common schema. This means that we can also generate custom serialisation logic which would otherwise be a lot of work to code and maintain manually.

In this post, I want to cover one of the more complex concepts I’ve had to handle regarding serialisation: aggregations.

NOTE: The final design for the types and converters shown in this post is still a work in progress. The current design is sufficient to illustrate custom serialisation techniques using System.Text.Json.

Elasticsearch Aggregations

Aggregations can be included in the JSON body of search requests to summarise and group data. Requests may include zero or more aggregations which Elasticsearch executes. The resulting aggregated data is then included in the JSON response. Example use cases include grouping a set of blog posts to get a count of posts within each category or aggregating data to understand the average load time for a web page over a specified time period.

Serialising Aggregations

Given that each aggregation in a request is uniquely named, a logical construct for modelling them on the request is to use a dictionary. The `AggregationDictionary` uses a string as the key and an `AggregationContainer` as the value. An aggregation container is our way to model the polymorphic nature of aggregations. The container can logically store any aggregation variants supported by Elasticsearch, which are then modelled with the appropriate properties.

For this post, we’ll concentrate on one approach to handling serialisation of the polymorphic AggregationContainer and its variants. In a future post, we can discuss how to handle deserialisation, which is a little more involved.

The definition for the AggregationContainer is very simple. It includes an internal property that will hold an instance of the variant supported by this container. In this case, all variants are expected to derive from the abstract AggregationBase type.

[JsonConverter(typeof(AggregationContainerConverter))]
public partial class AggregationContainer
{
	public AggregationContainer(AggregationBase variant) => Variant = variant ?? throw new ArgumentNullException(nameof(variant));

	internal AggregationBase Variant { get; }
}

This is where things start to get interesting when we consider serialising this type. We need to serialise the variant as the object in the JSON. To support this, a reasonably simple converter is needed. The serialisation side of this converter is not too complicated, but polymorphic deserialisation is a little more challenging. We’re focusing on serialisation for this post, so let’s dive into that.

Here is the converter class:

internal sealed class AggregationContainerConverter : JsonConverter<AggregationContainer>
{
	public override AggregationContainer Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
	{
		// Deserialisation is not covered in this post.
		throw new NotImplementedException();
	}

	public override void Write(Utf8JsonWriter writer, AggregationContainer value, JsonSerializerOptions options)
	{
		if (value is null)
		{
			writer.WriteNullValue();
		}
		else if (value.Variant is not null)
		{
			var type = value.Variant.GetType();
			JsonSerializer.Serialize(writer, value.Variant, type, options);
		}
		else
		{
			throw new JsonException("Invalid container cannot be serialised");
		}
	}
}

Converters are a feature of System.Text.Json which allow us to customise how a type or property is read from and written as JSON. They must derive from JsonConverter<T> and implement the Read and Write methods.
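
To make that shape concrete, here is a minimal sketch of a complete converter, separate from the client code. The Tag type, TagConverter and ConverterDemo are hypothetical names I've invented purely for illustration; they are not part of the Elasticsearch client. The converter writes a Tag as a plain JSON string rather than as an object.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical type used only for this illustration.
[JsonConverter(typeof(TagConverter))]
public sealed class Tag
{
	public Tag(string value) => Value = value;

	public string Value { get; }
}

// Reads and writes a Tag as a bare JSON string instead of an object.
public sealed class TagConverter : JsonConverter<Tag>
{
	public override Tag Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
	{
		// The reader is already positioned on the token for this value.
		var value = reader.GetString() ?? throw new JsonException("Expected a JSON string.");
		return new Tag(value);
	}

	public override void Write(Utf8JsonWriter writer, Tag value, JsonSerializerOptions options)
		=> writer.WriteStringValue(value.Value);
}

public static class ConverterDemo
{
	public static string Serialise(Tag tag) => JsonSerializer.Serialize(tag);

	public static Tag Deserialise(string json) => JsonSerializer.Deserialize<Tag>(json)!;

	public static void Main()
	{
		var json = Serialise(new Tag("dotnet"));
		Console.WriteLine(json);                    // "dotnet" (including the quotes)
		Console.WriteLine(Deserialise(json).Value); // dotnet
	}
}
```

Because the JsonConverter attribute is applied to the type, System.Text.Json picks up the converter without any extra configuration on the JsonSerializerOptions.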

The code above writes a null value if the AggregationContainer is null. If, for some reason, a container without a variant has been created, it throws a JsonException. Otherwise, we serialise the variant. JsonSerializer.Serialize has overloads that accept an existing Utf8JsonWriter and JsonSerializerOptions, which allow us to continue serialising complex types into the main writer. The generic Serialize overloads use the compile-time type when serialising the object. Because the Variant property is declared as AggregationBase, the serialiser would, by default, treat the object as the AggregationBase type. That base type looks like this:

public abstract class AggregationBase
{
	protected AggregationBase(string name) => Name = name;

	[JsonIgnore]
	public Dictionary<string, object>? Meta { get; set; }

	[JsonIgnore]
	public string? Name { get; internal set; }
	
	// Other code omitted for brevity
}

This is a problem for us: we want to serialise the derived type, not just treat it as this abstract base type. Because both properties are marked as JsonIgnore, an empty object would be created using the default behaviour of System.Text.Json.

During serialisation, we can control this as I have done in the custom converter code above. We first get the actual type of the object. With this in hand, we can call an overload of Serialize which accepts the type we want to use during serialisation. This will ensure our aggregation is serialised fully.
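
The difference between the two calls can be seen in a small, standalone sketch. SimpleAggregationBase and SimpleMinAggregation are hypothetical, cut-down stand-ins for the real client types, and no custom converter is involved here; the only point is to compare serialising by declared type against serialising by runtime type.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical, simplified stand-ins for the types discussed in this post.
public abstract class SimpleAggregationBase
{
	[JsonIgnore]
	public string? Name { get; set; }
}

public sealed class SimpleMinAggregation : SimpleAggregationBase
{
	public string? Field { get; set; }
}

public static class RuntimeTypeDemo
{
	// Serialises using the declared (compile-time) type, so only base members are considered.
	public static string AsDeclaredType(SimpleAggregationBase aggregation)
		=> JsonSerializer.Serialize<SimpleAggregationBase>(aggregation);

	// Serialises using the runtime type, so members of the derived type are included.
	public static string AsRuntimeType(SimpleAggregationBase aggregation)
		=> JsonSerializer.Serialize(aggregation, aggregation.GetType());

	public static void Main()
	{
		SimpleAggregationBase aggregation = new SimpleMinAggregation { Name = "min_value", Field = "price" };

		Console.WriteLine(AsDeclaredType(aggregation)); // {}
		Console.WriteLine(AsRuntimeType(aggregation));  // {"Field":"price"}
	}
}
```

The property name is written as "Field" here because the sketch uses the default JsonSerializerOptions, which apply no naming policy.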

We’ll use a simple ‘min’ aggregation to look deeper at the custom serialisation we need.

[JsonConverter(typeof(MinAggregationConverter))]
public partial class MinAggregation : AggregationBase
{
	public MinAggregation(string name, Field field) : base(name) => Field = field;


	public MinAggregation(string name) : base(name)
	{
	}

	public string? Format { get; set; }

	public Field? Field { get; set; }

	public Missing? Missing { get; set; }

	public Script? Script { get; set; }
}

The min aggregation type includes several properties that represent options for this aggregation. It also includes members defined on the base class, such as the Meta property. You’ll notice that this type also includes a custom converter, identified on the type by adding the JsonConverter attribute.

For each of the 50+ aggregation types, the code generator can produce a corresponding converter. The custom converters contain the logic to properly format the aggregation in the request.

internal sealed class MinAggregationConverter : JsonConverter<MinAggregation>
{
	public override MinAggregation Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
	{
		// Deserialisation is not covered in this post.
		throw new NotImplementedException();
	}

	public override void Write(Utf8JsonWriter writer, MinAggregation value, JsonSerializerOptions options)
	{
		writer.WriteStartObject();
		writer.WritePropertyName("min");
		writer.WriteStartObject();

		if (!string.IsNullOrEmpty(value.Format))
		{
			writer.WritePropertyName("format");
			writer.WriteStringValue(value.Format);
		}

		if (value.Field is not null)
		{
			writer.WritePropertyName("field");
			JsonSerializer.Serialize(writer, value.Field, options);
		}

		if (value.Missing is not null)
		{
			writer.WritePropertyName("missing");
			JsonSerializer.Serialize(writer, value.Missing, options);
		}

		if (value.Script is not null)
		{
			writer.WritePropertyName("script");
			JsonSerializer.Serialize(writer, value.Script, options);
		}

		writer.WriteEndObject();

		if (value.Meta is not null)
		{
			writer.WritePropertyName("meta");
			JsonSerializer.Serialize(writer, value.Meta, options);
		}

		writer.WriteEndObject();
	}
}

This time, the converter is more involved. It directly uses the Utf8JsonWriter to write out the required JSON tokens. It begins by writing a start object token, the ‘{’ character. It then writes a property whose name identifies the specific aggregation being written; in this case, ‘min’. This aligns with the aggregation type name used by Elasticsearch. Another object is started, which will contain the aggregation fields. Each of these is only written if a value has been set on the aggregation instance.

Meta-information for aggregations is not included in the main aggregation object, but at the outer object level. In the code above, this is handled by first ending the inner object, then writing the meta value, before the final end object token. This custom formatting would not be possible with the default System.Text.Json behaviour, which serialises all properties inside a single object.
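
The token ordering is easier to follow in isolation. The sketch below is not the generated client code; WriterDemo and WriteMinAggregation are hypothetical helpers that write the same inner-then-outer structure to an in-memory stream, so the resulting JSON can be inspected directly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

public static class WriterDemo
{
	// Hypothetical helper: writes a 'min' object, then 'meta' as its sibling in the outer object.
	public static string WriteMinAggregation(string field, Dictionary<string, object>? meta)
	{
		using var stream = new MemoryStream();

		using (var writer = new Utf8JsonWriter(stream))
		{
			writer.WriteStartObject();           // outer object begins
			writer.WritePropertyName("min");
			writer.WriteStartObject();           // inner object begins
			writer.WriteString("field", field);
			writer.WriteEndObject();             // inner object ends

			if (meta is not null)
			{
				// Written after the inner object has ended, so it sits at the outer level.
				writer.WritePropertyName("meta");
				JsonSerializer.Serialize(writer, meta);
			}

			writer.WriteEndObject();             // outer object ends
			writer.Flush();
		}

		return Encoding.UTF8.GetString(stream.ToArray());
	}

	public static void Main()
	{
		var meta = new Dictionary<string, object> { { "item_1", "value_1" } };

		// Prints {"min":{"field":"lastActivity"},"meta":{"item_1":"value_1"}}
		Console.WriteLine(WriteMinAggregation("lastActivity", meta));
	}
}
```

Mixing direct writer calls with JsonSerializer.Serialize on the same writer, as the converter above also does, lets the simple values be written by hand while more complex values are still handled by the serialiser.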

To see the result of this custom serialisation, let’s create a basic search request with a simple min aggregation. In the Elasticsearch .NET client, this can be achieved with the following object initialiser code.

var request = new SearchRequest("my-index")
{
	Size = 0,
	Query = new TermQuery
	{
		Field = Field<Project>(p => p.Type),
		Value = "project"
	},
	Aggregations = new MinAggregation("min_last_activity", Field<Project>(p => p.LastActivity))
	{
		Format = "yyyy",
		Meta = new Dictionary<string, object> { { "item_1", "value_1" } }
	}
};

When the client transport layer begins serialising the request, System.Text.Json will use the appropriate custom converters to handle serialisation. In this example, the final JSON is as follows.

{
    "aggregations": {
        "min_last_activity": {
            "min": {
                "format": "yyyy",
                "field": "lastActivity"
            },
            "meta": {
                "item_1": "value_1"
            }
        }
    },
    "query": {
        "term": {
            "type": {
                "value": "project"
            }
        }
    },
    "size": 0
}

As we can see, the min aggregation from the AggregationDictionary is included under its name, ‘min_last_activity’. Its properties have been serialised as part of the inner object. The meta information is written within the outer object to align with the format Elasticsearch expects.

Summary

Custom converters are extremely powerful and allow us to fully control the (de)serialisation of types when using System.Text.Json. Many of the more complex components of the Elasticsearch .NET client for v8.0 require either manually crafted or code-generated converters. Using these techniques, I have been able to meet the sometimes complex JSON requirements involved in our move to System.Text.Json.



Steve Gordon

Steve Gordon is a Pluralsight author, 6x Microsoft MVP, and a .NET engineer at Elastic where he maintains the .NET APM agent and related libraries. Steve is passionate about community and all things .NET related, having worked with ASP.NET for over 21 years. Steve enjoys sharing his knowledge through his blog, in videos and by presenting talks at user groups and conferences. Steve is excited to participate in the active .NET community and founded .NET South East, a .NET Meetup group based in Brighton. He enjoys contributing to and maintaining OSS projects. You can find Steve on most social media platforms as @stevejgordon