ASP.NET MVC Core: HTML Encoding a JSON Request Body How to HTML encode deserialized JSON content from a request body

On a recent REST API project built using ASP.NET Core 1.0 we wanted to add some extra security around the inputs we were accepting. Specifically around JSON data being sent in the body of POST requests. The requirement was to ensure that we HTML encode any of the deserialized properties to prevent an API client sending in HTML and script tags which would then be stored in the database. Whilst we HTML escape all JSON we output as standard, we wanted to limit this extra security vector since we never expect to accept HTML data.

Initially I assumed this would be something simple we can configure but the actual solution proved a little more fiddly than I first expected. A lot of Googling did not yield many examples of similar requirements. The final code, may not be the best way to achieve the goal but seems to work and is the best we could come up with. Feel free to send any improved ideas through!

Defining the Requirement

As I said above; we wanted to ensure that any strings that JSON.NET deserializes from our request body into our model classes do not get bound with any un-encoded HTML or script tags in the values. The requirement was to prevent this by default and enable it globally so that other developers do not have to explicitly remember to set attributes, apply any code on the model or write any validators to encode the data. Any manual steps or rules like that can easily get forgotten so we wanted a locked down by default approach. The expected outcome was that the values from the properties on any deserialized models is sanitised and HTML encoded as soon as we get access to objects in the controllers.

For example the following JSON body should result in the title property being encoded once bound onto our model.

{
	"Title" : "<script>Something Nasty</script>",
	"Description" : "A long description.",
	"Reference": "REF12345"
}

The resulting title once deserialized and bound should be “&lt;script&gt;Something Nasty&lt;/script&gt;”.

Here’s an example of a Controller and input model that might be bound up to such data which I’ll use during this blog post.

public class Thing
{
	public string Title { get; set; }
	public string Description { get; set; }
	public string Reference { get; set; }
}

[HttpPost("CreateThing")]
public IActionResult CreateThing([FromBody] Thing theThing)
{
	// Save the thing to the database here
}

In the case of the default binding and JSON deserialization if we debug and break within the CreateThing method the Title property will have a value of “<script>Something Nasty</script>”. If we save this directly into our database we risk someone later consuming this and potentially rendering it out directly into a browser. While the risk is pretty edge case we wanted to cover it off.

Defining a Custom JSON ContractResolver

The first stage in the solution was to define a custom JSON.net contract resolver. This would allow us to override the CreateProperties method and apply HTML encoding to any string properties.

A simplified example of the final contract resolver looks like this:

using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.Encodings.Web;

namespace JsonEncodingExample
{    
    public class HtmlEncodeContractResolver : DefaultContractResolver
    {
        protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
        {
            var properties = base.CreateProperties(type, memberSerialization);

            foreach (var property in properties.Where(p => p.PropertyType == typeof(string)))
            {
                var propertyInfo = type.GetProperty(property.UnderlyingName);
                if (propertyInfo != null)
                {
                    property.ValueProvider = new HtmlEncodingValueProvider(propertyInfo);
                }
            }

            return properties;
        }

        protected class HtmlEncodingValueProvider : IValueProvider
        {
            private readonly PropertyInfo _targetProperty;

            public HtmlEncodingValueProvider(PropertyInfo targetProperty)
            {
                this._targetProperty = targetProperty;
            }
            
            public void SetValue(object target, object value)
            {
                var s = value as string;
                if (s != null)
                {
                    var encodedString = HtmlEncoder.Default.Encode(s);
                    _targetProperty.SetValue(target, encodedString);
                }
                else
                {
                    // Shouldn't get here as we checked for string properties before setting this value provider
                    _targetProperty.SetValue(target, value);
                }
            }

            public object GetValue(object target)
            {
                return _targetProperty.GetValue(target);
            }
        }
    }
}

Stepping through what this does – On the override of CreateProperties first we call CreateProperties on the base DefaultContractResolver which returns a list of the JsonProperties. Without going too deep into the internals of JSON.net this essentially checks all of the serializable members on the object being deserialized/serialized and adds JsonProperies for them to a list.

We then iterate over the properties ourselves, looking for any where the type is a string. We use reflection to get the PropertyInfo for the property and then set a custom IValueProvider for it.

Our HtmlEncodingValueProvider implements IValueProvider and allows us to manipulate the value before it is set or retrieved. In this example we use the default HtmlEncoder to encode the value when it is set.

Wiring Up The Contract Resolver

With the above in place we need to set things up so that when our JSON request body is deserialized it is done using our HtmlEncodeContractResolver instead of the default one. This is where things get a little complicated and while the approach below works, I appreciate there may be a better/easier way to do this.

In our case we specifically wanted to only use the HtmlEncodeContractResolver for deserialization of any JSON from a request body, and not be used for any JSON deserialization that occurs elsewhere in the application. The way I found that we could do this was to replace the JsonInputFormatter which MVC uses to handle incoming JSON with a version using our new HtmlEncodeContractResolver. The way we can do this is on the MvcOptions that is accessible when adding the MVC service in our ConfigureServices method.

Rather than heap all of the code into startup class, we chose to create a small extension method for the MvcOptions class which would wrap the logic we needed. This is how our extension class ended up.

public static class MvcOptionsExtensions
{
    public static void UseHtmlEncodeJsonInputFormatter(this MvcOptions opts, ILogger<MvcOptions> logger, ObjectPoolProvider objectPoolProvider)
    {
        opts.InputFormatters.RemoveType<JsonInputFormatter>();

        var serializerSettings = new JsonSerializerSettings
        {
            ContractResolver = new HtmlEncodeContractResolver()
        };

        var jsonInputFormatter = new JsonInputFormatter(logger, serializerSettings, ArrayPool<char>.Shared, objectPoolProvider);

        opts.InputFormatters.Add(jsonInputFormatter);
    }
}

First we remove the input formatter of type JsonInputFormatter from the available InputFormatters. We then create a new JSON.net serializer settings, with the ContractResolver set to use our HtmlEncodeContractResolver. We can then create a new JsonInputFormatter which requires two parameters in its constructor, an ILogger and an ObjectPoolProvider. Both will be passed in when we use this extension method.

Finally with the new JsonInputFormatter created we can add it to the InputFormatters on the MvcOptions. MVC will then use this when it gets some JSON data on the body and we’ll see that our properties are nicely encoded.

To wire this up in the application we need to use our options from the configure service method as follows…

public void ConfigureServices(IServiceCollection services)
{
    var sp = services.BuildServiceProvider();
    var logger = sp.GetService<ILoggerFactory>();
    var objectPoolProvider = sp.GetService<ObjectPoolProvider>();

    services
        .AddMvc(options =>
            {
                options.UseHtmlEncodeJsonInputFormatter(logger.CreateLogger<MvcOptions>(), objectPoolProvider);
            });
}

As we now have an extension it’s quite easy to call this from the AddMvc options. It does however require those two dependencies. As we are in the ConfigureServices method our DI is not fully wired up. The best approach I could find was to create an intermediate ServiceProvider which will allow us to resolve the types currently registered. With access the ServiceProvider I can ask it for a suitable ILoggerFactory and ObjectPoolProvider.

Update: Andrew Lock had written a followup post which describes a cleaner way to configure these dependencies. I recommend you check it out.

I use the LoggerFactory to create a new ILogger and pass the ObjectPoolProvider in directly.

If we run our application and send in our demo JSON object I defined earlier the title on the resulting Thing class will be “&lt;script&gt;Something Nasty&lt;/script&gt;”.

Summary

Getting to the above code took a little trial and error since I couldn’t find any official documentation suggesting approaches for our requirement on the ASP.NET core docs. I suspect in reality our requirement is fairly rare, but for a small amount of work it does seem reasonable to sanitise JSON input to encode HTML when you never expect to receive it. By escaping everything on the way out the chances of our API returning un-encoded, un-escaped data were very low indeed. But we don’t know what another consumer of the data may do in the future. By storing the data HTML encoded in the database we have hopefully avoided that risk.

If there is a better way to modify the JsonInputFormatter and wire it up, I’d love to know. We took the best approach we could find at the time but accept there could be intended methods or flows we could have used instead. Hopefully this was useful and even if your requirement differs, perhaps you will have requirements where overriding an input or output formatter might be a solution.