ASP.NET MVC Core: HTML Encoding a JSON Request Body – How to HTML encode deserialized JSON content from a request body

On a recent REST API project built using ASP.NET Core 1.0 we wanted to add some extra security around the inputs we were accepting, specifically around JSON data sent in the body of POST requests. The requirement was to ensure that we HTML encode any of the deserialized properties to prevent an API client from sending in HTML and script tags which would then be stored in the database. Whilst we HTML escape all JSON we output as standard, we wanted to close off this extra attack vector since we never expect to accept HTML data.

Initially I assumed this would be something simple to configure, but the actual solution proved a little more fiddly than I first expected. A lot of Googling did not yield many examples of similar requirements. The final code may not be the best way to achieve the goal, but it seems to work and is the best we could come up with. Feel free to send any improved ideas through!

Defining the Requirement

As I said above, we wanted to ensure that any strings that JSON.NET deserializes from our request body into our model classes do not get bound with any un-encoded HTML or script tags in the values. The requirement was to prevent this by default and enable it globally, so that other developers do not have to explicitly remember to set attributes, apply any code on the model or write any validators to encode the data. Any manual steps or rules like that can easily get forgotten, so we wanted a locked-down-by-default approach. The expected outcome was that the values of the properties on any deserialized models are sanitised and HTML encoded as soon as we get access to the objects in the controllers.

For example, the following JSON body should result in the Title property being encoded once bound onto our model.

{
	"Title" : "<script>Something Nasty</script>",
	"Description" : "A long description.",
	"Reference": "REF12345"
}

The resulting title once deserialized and bound should be “&lt;script&gt;Something Nasty&lt;/script&gt;”.

Here’s an example of a controller action and input model that might be bound to such data, which I’ll use throughout this blog post.

public class Thing
{
	public string Title { get; set; }
	public string Description { get; set; }
	public string Reference { get; set; }
}

[HttpPost("CreateThing")]
public IActionResult CreateThing([FromBody] Thing theThing)
{
	// Save the thing to the database here
}

With the default binding and JSON deserialization, if we debug and break within the CreateThing method the Title property will have a value of “<script>Something Nasty</script>”. If we save this directly into our database we risk someone later consuming it and potentially rendering it out directly in a browser. While the risk is a pretty edge case scenario, we wanted to cover it off.

Defining a Custom JSON ContractResolver

The first stage in the solution was to define a custom JSON.net contract resolver. This would allow us to override the CreateProperties method and apply HTML encoding to any string properties.

A simplified example of the final contract resolver looks like this:

using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.Encodings.Web;

namespace JsonEncodingExample
{    
    public class HtmlEncodeContractResolver : DefaultContractResolver
    {
        protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
        {
            var properties = base.CreateProperties(type, memberSerialization);

            foreach (var property in properties.Where(p => p.PropertyType == typeof(string)))
            {
                var propertyInfo = type.GetProperty(property.UnderlyingName);
                if (propertyInfo != null)
                {
                    property.ValueProvider = new HtmlEncodingValueProvider(propertyInfo);
                }
            }

            return properties;
        }

        protected class HtmlEncodingValueProvider : IValueProvider
        {
            private readonly PropertyInfo _targetProperty;

            public HtmlEncodingValueProvider(PropertyInfo targetProperty)
            {
                this._targetProperty = targetProperty;
            }
            
            public void SetValue(object target, object value)
            {
                var s = value as string;
                if (s != null)
                {
                    var encodedString = HtmlEncoder.Default.Encode(s);
                    _targetProperty.SetValue(target, encodedString);
                }
                else
                {
                    // Shouldn't get here as we checked for string properties before setting this value provider
                    _targetProperty.SetValue(target, value);
                }
            }

            public object GetValue(object target)
            {
                return _targetProperty.GetValue(target);
            }
        }
    }
}

Stepping through what this does – in the override of CreateProperties, first we call CreateProperties on the base DefaultContractResolver, which returns a list of the JsonProperties. Without going too deep into the internals of JSON.net, this essentially checks all of the serializable members on the object being deserialized/serialized and adds a JsonProperty for each of them to a list.

We then iterate over the properties ourselves, looking for any where the type is a string. We use reflection to get the PropertyInfo for the property and then set a custom IValueProvider for it.

Our HtmlEncodingValueProvider implements IValueProvider and allows us to manipulate the value before it is set or retrieved. In this example we use the default HtmlEncoder to encode the value when it is set.

Wiring Up The Contract Resolver

With the above in place we need to set things up so that when our JSON request body is deserialized it is done using our HtmlEncodeContractResolver instead of the default one. This is where things get a little complicated and while the approach below works, I appreciate there may be a better/easier way to do this.

In our case we specifically wanted to only use the HtmlEncodeContractResolver for deserialization of any JSON from a request body, and not be used for any JSON deserialization that occurs elsewhere in the application. The way I found that we could do this was to replace the JsonInputFormatter which MVC uses to handle incoming JSON with a version using our new HtmlEncodeContractResolver. The way we can do this is on the MvcOptions that is accessible when adding the MVC service in our ConfigureServices method.

Rather than heap all of the code into the startup class, we chose to create a small extension method for the MvcOptions class which would wrap the logic we needed. This is how our extension class ended up.

using System.Buffers;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.Formatters;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.ObjectPool;
using Newtonsoft.Json;

public static class MvcOptionsExtensions
{
    public static void UseHtmlEncodeJsonInputFormatter(this MvcOptions opts, ILogger<MvcOptions> logger, ObjectPoolProvider objectPoolProvider)
    {
        opts.InputFormatters.RemoveType<JsonInputFormatter>();

        var serializerSettings = new JsonSerializerSettings
        {
            ContractResolver = new HtmlEncodeContractResolver()
        };

        var jsonInputFormatter = new JsonInputFormatter(logger, serializerSettings, ArrayPool<char>.Shared, objectPoolProvider);

        opts.InputFormatters.Add(jsonInputFormatter);
    }
}

First we remove the input formatter of type JsonInputFormatter from the available InputFormatters. We then create new JSON.net serializer settings, with the ContractResolver set to use our HtmlEncodeContractResolver. We can then create a new JsonInputFormatter; its constructor takes an ILogger and an ObjectPoolProvider (alongside the serializer settings and a char ArrayPool), and both of those will be passed in when we use this extension method.

Finally with the new JsonInputFormatter created we can add it to the InputFormatters on the MvcOptions. MVC will then use this when it gets some JSON data on the body and we’ll see that our properties are nicely encoded.

To wire this up in the application we need to use our extension method from the ConfigureServices method as follows…

public void ConfigureServices(IServiceCollection services)
{
    var sp = services.BuildServiceProvider();
    var loggerFactory = sp.GetService<ILoggerFactory>();
    var objectPoolProvider = sp.GetService<ObjectPoolProvider>();

    services
        .AddMvc(options =>
            {
                options.UseHtmlEncodeJsonInputFormatter(loggerFactory.CreateLogger<MvcOptions>(), objectPoolProvider);
            });
}

As we now have an extension method it’s quite easy to call this from the AddMvc options. It does however require those two dependencies. As we are in the ConfigureServices method, our DI is not fully wired up yet. The best approach I could find was to create an intermediate ServiceProvider which allows us to resolve the types currently registered. With access to the ServiceProvider I can ask it for a suitable ILoggerFactory and ObjectPoolProvider.

Update: Andrew Lock has written a follow-up post which describes a cleaner way to configure these dependencies. I recommend you check it out.

I use the LoggerFactory to create a new ILogger and pass the ObjectPoolProvider in directly.

If we run our application and send in the demo JSON object I defined earlier, the Title on the resulting Thing class will be “&lt;script&gt;Something Nasty&lt;/script&gt;”.
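
To sanity check the resolver in isolation, a quick illustrative snippet such as the one below can be used (this is just a sketch reusing the Thing class and sample payload from earlier, not part of the actual API project).

var settings = new JsonSerializerSettings
{
    ContractResolver = new HtmlEncodeContractResolver()
};

var json = "{ \"Title\": \"<script>Something Nasty</script>\", \"Description\": \"A long description.\", \"Reference\": \"REF12345\" }";

// Deserialize using the custom contract resolver so string values are HTML encoded as they are set.
var thing = JsonConvert.DeserializeObject<Thing>(json, settings);

// Prints: &lt;script&gt;Something Nasty&lt;/script&gt;
Console.WriteLine(thing.Title);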

Summary

Getting to the above code took a little trial and error since I couldn’t find any official documentation suggesting approaches for our requirement in the ASP.NET Core docs. I suspect in reality our requirement is fairly rare, but for a small amount of work it does seem reasonable to sanitise JSON input to encode HTML when you never expect to receive it. Because we escape everything on the way out, the chances of our API returning un-encoded, un-escaped data were very low indeed. But we don’t know what another consumer of the data may do in the future. By storing the data HTML encoded in the database we have hopefully avoided that risk.

If there is a better way to modify the JsonInputFormatter and wire it up, I’d love to know. We took the best approach we could find at the time but accept there could be intended methods or flows we could have used instead. Hopefully this was useful and even if your requirement differs, perhaps you will have requirements where overriding an input or output formatter might be a solution.

GitHub Contributor Tips and Tricks (Gitiquette) – My thoughts on being a better GitHub citizen

I made my first GitHub pull request back in November 2015. After a code review I needed to make some changes to the code, and the learning curve of Git/GitHub bit me. I had to close my PR and open a fresh one in order to get my code accepted since I couldn’t figure out rebasing. Now, a year on, I have had over 100 pull requests accepted on GitHub and have even started helping with code reviews on the allReady project. The journey over the last year started with a steep curve as I got to grips with GitHub and, to some extent, Git itself.

Back in February in my post about contributing to allReady I briefly covered some rebasing steps useful when contributing on GitHub. In this blog post I wanted to share some further tips and tricks, plus offer my personal views and conventions I follow while attempting to be a good GitHub citizen. I can’t claim to be an expert in GitHub etiquette and I’m still learning as I go but hopefully my experiences can help others new to GitHub to work productively and reduce the pitch of the learning curve a little.

Contributing on GitHub

There is a great post from another allReady contributor and ASP.NET monster Dave Paquette at http://www.davepaquette.com/archive/2016/01/24/Submitting-Your-First-Pull-request.aspx. Dave has covered the main steps and commands needed to work with GitHub and Git in general when contributing to opensource projects. I suggest you read his post for a detailed primer of the commands and GitHub UI.

Git / GitHub Etiquette (Gitiquette)

Rather than repeat advice and steps that Dave has given, in this post I intend to try and share small tips and tricks that I hope allow you to get the best out of opensource and GitHub. These are by no means formal rules and will differ from project to project, so be aware of that when applying them. As an aside; I was quite proud of inventing the term gitiquette before I Googled it and found other people have already used the term!

Raising GitHub Issues

If you find a bug or have an enhancement suggestion for a project, the place to start is to raise an issue. Before doing so it’s good practice to search for any issue (open or closed) that might address a similar point. Managing issues is a large part of running a project on GitHub and it’s good etiquette not to overload the project owners by repeating something that has already been asked and answered. This frees up the owners to respond on new issues.


If you can’t find a similar issue then go ahead and raise one. Try to give it a short descriptive title, summarising the issue or area of the application you are referring to. Try to choose something easy and likely to be searched so that anyone who is checking for related issues can easily find it and avoid repeating a question/bug.


If your issue is more of a question than a bug it can be useful to mark it as such by putting “(question)” or “(discussion)” in the title. Each project may have their own convention around this, so check through some prior issues first and see if there is a preferred pattern in use.


In the content of the issue, try to be as detailed (but also specific) as possible to allow others to understand the bug. Include steps to reproduce the problem if it is a bug. Also try to include screenshots where applicable to share what you’re seeing. If your issue is more of a general question, give enough detail to allow others to understand the reason behind your question. For suggestions, try to back them up with examples of why you believe what you’re suggesting is a good idea to help others make informed arguments for or against.


Keep your language polite and constructive. Remember that a lot of opensource projects are driven by very small teams or individuals around their paid jobs and families. In my opinion, issues are not the right place to complain about things but a place to register problems with a view to getting them solved. “Speak” how you would like to be spoken to.


Once you’ve raised an issue remember that the project leaders may take a while to respond to it, especially if they are a small or one person team. Give them a reasonable period of time to respond and after that time has passed, consider adding a comment as a gentle reminder. Sometimes new issue alerts can be missed so it’s usually reasonable to prompt if the issue goes quiet or has no response.


Each project may have their own process to respond to and to triage issues so try to familiarise yourself with how other issues have been handled to get an idea of what to expect. In some cases issues may be tagged initially and then someone may respond with comments later on.


Once someone responds to your issue, try to answer any follow up questions they have as soon as possible.


If the project owner disagrees with your suggestion or that a bug is actually “by design” then respect their position. If you feel strongly, then politely present your point of view and discuss it openly but once a final decision is made you should respect it. On all of the issues I’ve seen I can’t think of any occasion where things have turned nasty. It’s a great community out there and respect plays a big part in why things function smoothly.


When you start to work on an issue it’s a good idea and helpful to leave a comment on the issue so that others can quickly see that it’s being worked on. This avoids the risk of duplicated effort and wasted time. In some cases the project owners may assign you to the issue as a way of formally tracking assignments.


If you want to work on an issue but are not sure about your potential solution, consider first summarising your thoughts within the issue’s comments and get input from the project owners and other contributors before starting. This can avoid you spending time going down a route not originally intended by the project owners.


If you feel an issue is quite large and might be better broken out into smaller units of work then this can be a reasonable approach. Check how the project prefers to handle this, but often it’s reasonable to make a sub issue(s) for the work, referencing back to the original master issue so that GitHub provides links between them. Smaller PRs are often easier to review and merge, so it’s often reasonable, and quite often preferred, to break work down like this.


If someone has said that they are working on an issue but they haven’t updated it or submitted a PR in some time, first check their GitHub fork and see if you can find the work in progress. If it’s being updated and progressing then they are probably still working on the issue. If not, leave a comment and politely check if they are still working on it. It’s not unusual for people to start something with the best intentions and then life comes along and they don’t have time to continue.


If you stop working on an issue part of the way through and may not be able to pick it up for a while, update the issue so others know the status. If you think you will be able to get back to it, then try to give a time frame so that the project owners can determine if someone else might need to pick it up from you. If this is the case you might want to share a link to your fork and branch so others can see what you’ve done so far. Also remember that the project may accept a partial PR as a starting point to resolving the issue. If your code could be committed without breaking the main code base, offer that option.

Git Commits

Next I want to share a few thoughts on best practices to follow when making local commits while working on an issue:

Remember to make a branch from master before starting work on a new issue. Give it a useful name so you can find it again. I personally have ended up using the issue number for most of my branches. It’s a personal choice though, so use what works for you.


It’s not unreasonable to make many work in progress commits locally whilst working on an issue as you progress through it. It can, though, be helpful to squash them down before submitting your final PR (see below for a couple of techniques I use to squash my commits).


Your final commit messages that will be included in your PR should be clear and appropriately worded. Try to summarise the commit in the first line and then provide additional detail after a line break and empty line.
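
As an illustration (this message is made up for this post rather than taken from a real commit), a final commit message following that shape might look like:

Add validation to itinerary editing

Guards against a missing itinerary when the edit command is handled
and adds unit tests covering the create and edit scenarios.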

GitHub Pull Requests

After you’ve finished working on an issue you’re ready for a pull request. Again I’m sharing my personal thoughts on PRs here and each person and project will differ in style and preference. The main thing is to respect the convention preferred by the project so that they can easily review and accept your work.

It’s a good practice to limit a PR to a single issue. Don’t try to solve too much in one request as it makes reviewing and testing it harder for the project owners. If you spot other bugs along the way then open new issues for them as you go. You can then include the fixes in separate PR(s).


Before submitting your PR I feel it’s good practice to squash or tidy your working commits into logical final commits. This makes reviewing your steps and the content of the PR a bit easier for reviewers. Lots of work in progress or very granular commits can make a PR harder to interpret during code review. Including too many commits will also make the commit history for the project harder to follow once the PR is merged. I aim for 2-3 commits per PR as a general rule (sometimes even a single commit makes sense). Certainly there are exceptions to this rule where it may not make sense to group the commits. See below for advice on how to squash a commit.


I try to separate my work on the actual application code and the unit tests into two separate commits. Personally I feel that it’s then easier for a reviewer to first review the commit with your application code, then to review any tests that you wrote. Instead of viewing the whole set of files changed and wading through it, a reviewer can view each commit independently.


Ensure the commits you do include in your PR are clearly described. This makes understanding what each commit is doing easier during review and once they are merged into the project.


If your PR is a work in progress, mention this on the comment and in the title so that the team know it’s not ready for final merge. Submitting a partial PR can be good if you want to check what you’ve started is in the right direction and to get technical guidance. For example you might use “Fixing Bug in Controller – WIP” as your title. Each project may have a style they would like you to apply so check past PRs and see if any pattern already exists. Once you’re ready for it to be formally reviewed and merged you can update the title and leave a new comment for the project owners.


When submitting a PR, try to describe clearly what you’ve added or changed so that before even reviewing the code the reviewer has an idea of what you’re addressing and how. The PR description should contain enough detail for a reviewer to understand what areas to test and if your work includes UI elements, consider including screenshot(s) to illustrate the changes.


Include a line in your comment stating Fixes #101 (where 101 is the issue number). This will link the PR to the issue to ensure the issue is closed when the PR is accepted. If your PR is a partial fix and should not close the issue, use something like Relates to #101 instead so that the master issue is not automatically closed.


Depending on the PR and project it may be good practice or even required to include unit tests to prove your changes are working.


Before submitting your PR it’s a good idea to ensure your code is rebased from a current version of the master branch. This ensures that your code will merge easily and allows you to test it against the latest master codebase. In my earlier post I included steps explaining how to rebase. This is only necessary if the master branch has had commits added since you branched from it.


Once you receive feedback on your PR don’t take any suggestions or critique personally. It can be hard receiving negative feedback, especially in an open forum such as GitHub but remember that the reviewer is trying to protect the quality and standards of their project. By all means share your point of view if you have reason to disagree with comments, but do so with the above in mind. In my experience feedback has always been valuable and valid.


Once you have received feedback, try to apply any changes in a timely fashion while the review is fresh. You can simply make relevant changes in your local branch, make a new commit and push it to your origin. The PR will update to include the pushed changes. It’s better not to rebase and merge the commit with the original commits so that a reviewer can quickly see the changes. They may ask you to rebase/squash it down after review and before final acceptance to keep the project history clean.

Reviewing GitHub Pull Requests

I’ve started helping to review some of the allReady pull requests in recent months. Here are a few things I try to keep in mind:

Try to include specific feedback using GitHub’s line comment feature. You can click the small plus sign next to the line of code you are commenting on so the review is easy to follow. The recent review changes that GitHub has added make reviewing code even clearer.


Leave a general summary explaining any main points you have or questions about the code.


Think of the contributor’s feelings and remember to keep the comments constructive. Give explanations for why you are proposing any changes and be ready to accept that your opinion may not always be the most valid.


Try to review pull requests as soon as possible while the code is still fresh in the contributor’s mind. It’s harder to come back to an old PR and remember what you did and why. It’s also nice to give feedback promptly when someone has taken the time to contribute.


Remember to thank people for their contributions. People are generously giving their time to contribute code and this shouldn’t be forgotten.

Git Command Reference

That’s the end of my long list of advice. I hope some of it was valuable and useful. Before I close I wanted to share some steps to handle a couple of the specific Git related tasks that I mention in some of the points above.

How to Squash Commits

I wrote earlier about squashing your commits – reducing a large number of small commits into a single or set of combined commits. There are two main techniques I use when doing this which I’ll now cover. I normally do some of these operations in a visual tool such as SourceTree but for the sake of this post I’ll cover the actual Git commands.

Squashing to a single commit

In cases where you have two or more commits in your branch that you want to squash into a single final commit I find the quickest and safest approach is to do a soft reset back to the commit you branched from on master.

Here’s an example set of three commits on a new branch where I’m fixing an issue…

C:\Projects\HTBox-AllReady>git log --pretty=format:"%h - %an, %ar : %s" -3

be7fdfe - stevejgordon, 2 minutes ago : Final cleanup
2e36425 - stevejgordon, 2 minutes ago : Ooops, fixing typo!
85148d7 - stevejgordon, 2 minutes ago : Adding a new feature

All three commits combined result in a fix, but two of these are really just fixing up and tidying my work. As I’m working locally I want to squash these into a single commit before I submit a PR.

First I perform a soft reset using…

C:\Projects\HTBox-AllReady>git reset --soft HEAD~3

This removes the last three commits, with the number after the ~ symbol controlling how many commits to reset. The --soft switch tells Git to leave all of the changed files intact (we don’t want to lose the work we did!). It also leaves the files marked as changes to be committed. I can then perform a new commit which incorporates all of the changes.

C:\Projects\Other\HTBox-AllReady>git commit -a -m "Adding a new feature"
[#123 2159822] Adding a new feature
1 file changed, 1 insertion(+), 1 deletion(-)

C:\Projects\Other\HTBox-AllReady>git status
On branch #123
nothing to commit, working directory clean

C:\Projects\Other\HTBox-AllReady>git log --pretty=format:"%h - %an, %ar : %s" -1

2159822 - stevejgordon, 20 seconds ago : Adding a new feature

We now have a single commit which includes all of our work. Viewing it visually in SourceTree we can see this single commit more clearly.

Example of branch after a Git squash

We can now push this and create a clean PR.

Interactive Rebasing

In cases where you have commits you want to combine but you want also to control which commits are squashed together we have to do something a bit more advanced. In this case we can use interactive rebasing as a very powerful tool to rewrite our commit history. Here’s an example of 4 commits in a new branch making up a fix for an issue. In this case I’d like to end up with two final commits, one for the main code and one for the unit tests.

C:\Projects\Other\HTBox-AllReady>git log --pretty=format:"%h - %an, %ar : %s" -4

fb33aaf - stevejgordon, 6 seconds ago : Adding a missed unit test
8704b06 - stevejgordon, 14 seconds ago : Adding unit tests
01f0876 - stevejgordon, 42 seconds ago : Oops, fixing a typo
2159822 - stevejgordon, 7 minutes ago : Adding a new feature

Use the following command to start an interactive rebase

git rebase -i HEAD~4

In this case I’m using ~4 to say that I want to include the last 4 commits in my rebasing. The result of this command opens the following in an editor (whatever you have Git configured to use).

pick 2159822 Adding a new feature
pick 01f0876 Oops, fixing a typo
pick 8704b06 Adding unit tests
pick fb33aaf Adding a missed unit test

# Rebase 6af351a..fb33aaf onto 6af351a (4 command(s))
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

In our case we want to squash the 2nd commit into the first and the 4th into the 3rd, so we update the commit lines at the top of the file to…

pick 2159822 Adding a new feature
squash 01f0876 Oops, fixing a typo
pick 8704b06 Adding unit tests
squash fb33aaf Adding a missed unit test

Saving and closing the file results in the next step

# This is a combination of 2 commits.
# The first commit's message is:

Adding a new feature

# This is the 2nd commit message:

Oops, fixing a typo

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date: Fri Sep 30 08:10:19 2016 +0100
#
# interactive rebase in progress; onto 6af351a
# Last commands done (2 commands done):
# pick 2159822 Adding a new feature
# squash 01f0876 Oops, fixing a typo
# Next commands to do (2 remaining commands):
# pick 8704b06 Adding unit tests
# squash fb33aaf Adding a missed unit test
# You are currently editing a commit while rebasing branch '#123' on '6af351a'.
#
# Changes to be committed:
# modified: AllReadyApp/Web-App/AllReady/Controllers/AccountController.cs
#

This gives us the chance to reword the final commit message of the combined 1st and 2nd commits. I will use the # to comment out the typo message since I just want this commit to read as “Adding a new feature”. If you wanted to include both messages then you don’t need to make any changes. They will both be included on separate lines in your commit message.

# This is a combination of 2 commits.
# The first commit's message is:

Adding a new feature

# This is the 2nd commit message:

#Oops, fixing a typo

I then can do the same for the other squashed commit

# This is a combination of 2 commits.
# The first commit's message is:

Adding unit tests

# This is the 2nd commit message:

#Adding a missed unit test

Saving and closing this final file I see the following output in my command prompt.

C:\Projects\Other\HTBox-AllReady>git rebase -i HEAD~4
[detached HEAD a2e69d6] Adding a new feature
Date: Fri Sep 30 08:10:19 2016 +0100
1 file changed, 2 insertions(+), 2 deletions(-)
[detached HEAD a9a849c] Adding unit tests
Date: Fri Sep 30 08:16:45 2016 +0100
1 file changed, 2 insertions(+), 2 deletions(-)
Successfully rebased and updated refs/heads/#123.

When we view the visual git history in SourceTree we can now see we have only two final commits. This looks much better and is ready to push up.

Git branch after interactive rebase

When interactive rebasing you can even reorder the commits so you have full control to craft a final set of commits that represent the public commit history you want to include.

How to pull a PR locally for review

When starting to help review PRs I needed to learn how to pull down code from a pull request in order to run it locally and test the changes. In the end it wasn’t too complicated. The basic structure of the command we can use looks like…

git fetch origin pull/ID/head:BRANCHNAME

Where origin is the name of the remote where the PR has been made, ID is the pull request id and BRANCHNAME is the name of the new branch that you want to create locally. On allReady for example I’d use something like…

git fetch htbox pull/123/head:PR123

This pulls PR #123 from the htbox remote (this is how I’ve named the remote on my system which points to the main htbox/allready repo) into a local branch called PR123. I can then checkout the branch using git checkout PR123 to test and review the code.

In Summary

Hopefully some of these tips are useful. In a lot of cases I’m sure they are pretty much common sense. A lot of it is also personal preference and may differ from project to project. Hopefully if you’re just starting out as a GitHub contributor you can put some of them to use. GitHub for the most part is a great community and is a great place to learn and develop your skills.

I’d love to be able to acknowledge the numerous fantastic blog posts and Stack Overflow threads that helped me along my way; unfortunately I didn’t keep notes of the sources as I learned things on my 1 year journey. A big thanks nonetheless to anyone who has contributed guidance on Git and GitHub. Every little bit of shared knowledge is a big help to developers who are learning the ropes. It truly is a journey and as you contribute more and more on GitHub it gets easier! Happy GitHub-ing!

Exploring Entity Framework Core 1.0.0 RTM Changes – Understanding a breaking change in the Update method behaviour between RC1 and RTM

It’s been a while since my last post but finally I’ve found some time to get this one together, albeit a shorter post this time around.

Outside of my day job, when time permits, I like to code for the open source charity project allReady. This ASP.NET Core web application has been developed during the betas of .NET Core through to RC1. Recently, with the help of the very knowledgeable Shawn Wildermuth, the project has been upgraded to run against the final 1.0.0 RTM version of .NET Core.

In this post I’m going to talk about one specific change in Entity Framework Core 1.0.0 between RC1 and RTM which caused some breaks in our code.

Before diving into the issue, I need to briefly explain the structure of our code. We have been working to move a lot of our database logic in allReady into Mediatr handlers. This has proven to be a great way to separate the concerns and split up the logic. Our controllers can send messages (commands or queries) via Mediatr to perform actions against the database. The controllers have no dependencies on the database layers and are therefore nice and slim. If you want to read more about how we’ve used this pattern, I covered Mediatr in my previous blog post. For this post, we’ll be looking at the code in a particular handler. The code we’re looking at is not handler specific; I point it out just in case the class seems a little confusing as to where it fits into our project. We’ll focus in on a few specific lines of code within the handler.

In a number of places within our code we need to handle the creation or update of a record stored in the database. For example, we have the concept of Itineraries. In allReady an itinerary represents a series of work items (requests) that are grouped together in order to be worked on by volunteers.

In our .NET Core RC1 code base we had the following handler:

public class EditItineraryCommandHandlerAsync : IAsyncRequestHandler<EditItineraryCommand, int>
{
	private readonly AllReadyContext _context;

	public EditItineraryCommandHandlerAsync(AllReadyContext context)
	{
		_context = context;
	}

	public async Task<int> Handle(EditItineraryCommand message)
	{
		try
		{
			var itinerary = await GetItinerary(message) ?? new Itinerary();

			itinerary.Name = message.Itinerary.Name;
			itinerary.Date = message.Itinerary.Date;
			itinerary.EventId = message.Itinerary.EventId;

			_context.Update(itinerary);
			await _context.SaveChangesAsync().ConfigureAwait(false);

			return itinerary.Id;
		}
		catch (Exception)
		{
			// There was an error somewhere
			return 0;
		}
	}

	private async Task<Itinerary> GetItinerary(EditItineraryCommand message)
	{
		return await _context.Itineraries
			.SingleOrDefaultAsync(c => c.Id == message.Itinerary.Id)
			.ConfigureAwait(false);
	}
}

This handler is called from both the create and edit POST actions on our itinerary controller and is intended to handle both scenarios. Within the Handle method we first try to retrieve an existing itinerary based on the Id of the itinerary object being passed in as part of our message. If this does not return an existing itinerary we null coalesce and create a new empty Itinerary object. We then set the properties of our itinerary object based on those coming in via the message (populated by the user in the front end admin page). Then we call Update on the EF context, passing in the Itinerary object and finally call SaveChangesAsync to apply the changes to the database.

This is where things broke for us after beginning to use the RTM version of the EF Core library. During RC1 and prior, the Update method would check the value of the key property on the model and if it was determined it couldn’t be an existing record (i.e. an Id of zero in our case) then the Update method marked the object as Added in the dbContext change tracking, otherwise it would be set as Modified.

Between RC1 and RTM, the Entity Framework team have tightened up on the behaviour of the Update method and made it perform a little more rigidly. It now only performs the action implied by its name. Any object passed in will be marked as modified, even those objects with an Id of zero. It’s up to the caller to call this method correctly.

The resulting exception thrown when calling SaveChangesAsync (after adding a new record) when running against RTM code is…

Database operation expected to affect 1 row(s) but actually affected 0 row(s). Data may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=527962 for information on understanding and handling optimistic concurrency exceptions.

Essentially this tells us that we sent in an object marked as modified and EF therefore expected to get a count of 1 row being modified against the database. However, since the id on our new object is zero (this is a new record), it won’t match any existing records in the database and as such, no records are actually updated.

That explains the break we experienced. On reflection, the change makes sense as it avoids any assumptions being made by EF about our intentions. We’re calling update, so it marks the object as modified. We’re expected to use the Add method for new objects.

So, with the problem understood, what do we do about it? There were a number of possible options that I considered when putting in a fix for this issue. I won’t go into great detail here since ultimately I was directed to a very sensible and simple option which I’ll share in a few minutes, but at a high level we could have…

  1. Moved from a single shared handler to two separate handlers, one specifically for creating and one specifically for editing an Itinerary. In that case each handler would know whether to call either Add or Update explicitly. Note that Update in this case would not need to be called since the object is already tracked by the context after we get it from the database – more on that later!
  2. Added some logic within our own code to check if the Id is zero and if so, assume we want to Add the object to the context instead of Update.
  3. Utilised a context extension we have in the project which tries to determine whether to call Add or whether to call Update based on the EntityState of the object (a rough sketch of this idea follows the list). This is similar to option 2, but would allow shared use of similar logic.
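
As a rough illustration of option 3, an extension along these lines (the names here are hypothetical and this is not the actual allReady extension) could inspect the key value to decide which call to make:

public static class ContextExtensions
{
	public static void AddOrUpdate(this DbContext context, Itinerary entity)
	{
		// A zero key indicates a record that has never been saved, so add it;
		// otherwise mark the entity as modified via Update.
		if (entity.Id == default(int))
		{
			context.Add(entity);
		}
		else
		{
			context.Update(entity);
		}
	}
}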

All three would have worked to some extent, although not without some possible further issues we’d have needed to address. However after opening an issue on the EF core GitHub repository to try to understand the change, Arthur Vickers suggested a much cleaner solution for our case.

Arthur proposed the following replacement code:

var itinerary = await GetItinerary(message) ?? _context.Add(new Itinerary()).Entity;

itinerary.Name = message.Itinerary.Name;
itinerary.Date = message.Itinerary.Date;
itinerary.EventId = message.Itinerary.EventId;

await _context.SaveChangesAsync().ConfigureAwait(false);

It’s a small but elegant change which actually only touches two lines.

First change:

var itinerary = await GetItinerary(message) ?? _context.Add(new Itinerary()).Entity;

What this now does is first try to get an itinerary as we had before. If we have this then the itinerary variable is set and we can change the values of the properties as required. It’s important to realise at this point that since we queried the db for this object, it’s now already being tracked by the context. As such, if we adjust properties on the object, EF will detect these changes and mark them as modified. We therefore would not need to call Update which has a different intended use case.

If we don’t get an existing record back from the query, then we’re working with a new record. In that case, the code above adds a new empty itinerary object to the context. Add returns an EntityEntry and we can use the .Entity property of that to return the actual entity object (an Itinerary). It is assigned to our local variable and we can then set its properties. Since the Add method on the context has already been called it is already being tracked by the context with an EntityState of added.

Second change:

We can remove the _context.Update(itinerary); line entirely. Since we now have a correctly tracked entity in the context after our first line (either modified or added) we don’t need to try and attach it at this stage. We have re-ordered the logic a little which makes things simpler and cleaner. We can just call SaveChangesAsync() which will send SQL commands to add or update as necessary, based on the change tracking information.

In Summary

This issue highlighted for me personally that I still need to think carefully about how EF works under the covers. I’ve tried to read a lot on EF Core and feel I have a better understanding of how it works at a medium-to-high level. In this case, our code took advantage of behaviour in EF RC1 which was in reality hiding a bit of an issue in our code. I don’t think the code was “bad” exactly, just that, as we’ve explored, with a bit of thinking about the change tracking behaviour we could improve it. At the time we wrote the original code, using Update for both the add and edit scenario was valid, although perhaps a little naïve. We relied on EF correctly assessing our intention and marking the object with the correct state.

When working with EF I think it’s important to have a basic understanding of how the change tracking works and what it does for us. If we query for a record via the context, then that record starts being tracked. We don’t need to expressly call Update since the context is already aware of the object and the change tracker can manage any modified properties during SaveChanges.

Next steps

There is certainly more for me to personally learn about EF and its API in general. For example, in this case I learned about the Entity property that EF exposes on an EntityEntry. Beyond the basics, EF Core exposes many ways to manage the tracking of entities and those do warrant exploration and experimentation to find the right performance vs complexity balance for each scenario.

The above code still has room for improvement as well. One thing that stands out is that we are performing a db query to get an object, in order to update it and save it. This is slightly inefficient in our case. When building the edit page, we’ve already queried for the object to set the form fields in the UI. On our post we’re then querying again, purely to attach the object in its current state to the context. A pattern I’ve started using elsewhere for a more performant update is to manually attach an object and mark its properties as modified without the need to query it first. In this case, it may be unnecessarily complex just to remove a pretty light db query, but as always, it’s worth considering.
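
For reference, a sketch of that attach-without-query pattern might look something like the following (illustrative only, assuming the incoming message already carries the itinerary Id along with the new values):

var itinerary = new Itinerary
{
	Id = message.Itinerary.Id,
	Name = message.Itinerary.Name,
	Date = message.Itinerary.Date,
	EventId = message.Itinerary.EventId
};

// Attach starts tracking the entity without querying the database for it first.
_context.Itineraries.Attach(itinerary);

// Explicitly flag only the properties we intend to update.
_context.Entry(itinerary).Property(i => i.Name).IsModified = true;
_context.Entry(itinerary).Property(i => i.Date).IsModified = true;
_context.Entry(itinerary).Property(i => i.EventId).IsModified = true;

await _context.SaveChangesAsync().ConfigureAwait(false);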

My thanks go out to Arthur Vickers for his response to my EF issue. It’s extremely helpful being able to reach out to the team directly as we all learn the nuances of the changes in the .NET core libraries.

CQRS with Mediatr and ASP.NET Core – Implementing basic CQRS with ASP.NET Core

I was first introduced to the Mediatr library when I started contributing to the allReady project. It is now being used quite extensively within that application. It has proven to be very useful in decoupling code and separating the concerns. Contributors to the project have recently worked through a good chunk of the codebase and moved many database commands and queries over to the Mediatr request/response pattern. This is allowing us to move away from a large data access wrapper to multiple handlers that clearly handle one function and which are much easier to maintain. This has led to smaller, more testable classes and made the code easier to read as a result.

CQRS Overview

Before going into Mediatr specifically I feel it’s worth briefly talking about Command Query Responsibility Segregation or CQRS for short. CQRS is a pattern that seeks to separate the code and models which perform query logic from the code and models which perform commands such as an insert or update. In each case the model to define the input and output usually differs. By separating the commands and queries it allows the input/output models to be more focused on the specific task they are performing. This makes testing the models simpler since they are less generalised and are therefore not bloated with additional code. Rather than returning an entire database model, a query response model will usually contain only a subset of a table’s fields and possibly data from many related objects, all needed to form a particular view. The input model for a query may be very small. Commands on the other hand will usually require larger input models which more closely map to a full database table and have slimmer response models. Commands may perform some business logic on the properties in order to validate the object before saving it into a database. By contrast the models used for a query will generally contain less business logic.
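
To make the model separation a little more concrete, a purely illustrative sketch (the type names are made up for this post, not taken from allReady) might look like this:

public class ThingSummaryQuery
{
	// Query input: just enough information to locate the data.
	public int Id { get; set; }
}

public class ThingSummaryViewModel
{
	// Query output: only the fields the view actually needs.
	public string Title { get; set; }
	public string Reference { get; set; }
}

public class CreateThingCommand
{
	// Command input: closer to the full shape of the underlying table.
	public string Title { get; set; }
	public string Description { get; set; }
	public string Reference { get; set; }
}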

As with any pattern, there are pros and cons to consider. Some may feel that the complexity added by having to manage different models may outweigh the benefits of separating them. Also, as with all patterns, the concept can be taken too far and start to become a burden on productivity and readability of the code. Therefore the degree to which one uses the CQRS pattern should be governed by each use case. If it’s not providing value, then don’t use it!

Coming back to the allReady project; the approach taken there has been to separate the querying of data used to build the view models from the commands used to update the database. Queries occur far more often than commands, as each page load will need to build up a view model, often with calls to the database to pull in relevant data. By keeping the queries distinct from the commands we can manage the exact shape of the input as well as the size of the data being returned. Queries need to perform quickly since they have a direct effect on user experience and page loads times. Keeping the models as slim as possible and only querying for the required database columns can help the overall performance.

Back to Mediatr

The Mediatr library provides us with a messaging solution and is a nice fit to help us introduce some concepts from the CQRS pattern into our code. In allReady it has allowed the team to greatly simplify the controllers and in many cases they now have a single dependency on Mediatr which is injected by the built in ASP.NET Core dependency injection. The MVC actions use Mediatr to send messages for the data they need to populate the view (queries) or to perform actions that update the database (commands).

Mediatr has the concept of handlers which are responsible for dealing with a query or command message. A handler is setup to handle a particular message which will contain the input needed for the command or query. A query message will usually need only a few properties, perhaps just an id of the object to query for. A command message may contain a more complete object with all of the model’s properties that need to be updated by the handler.

Using Mediatr with ASP.NET Core

Using Mediatr in an ASP.NET Core project is pretty straightforward. There are a couple of steps required in order to set things up.

Firstly we need to bring in the Mediatr package from Nuget. The quickest way is to use the package manager console by issuing the command “Install-Package MediatR”. At the time of writing the current version is 2.0.2.

Now that we have Mediatr added to our project we need to register its classes with the ASP.NET Core Dependency Injection (DI) container. The exact way you do this will depend on which DI container you are using. I’m going to show how I’ve got it working in ASP.NET Core with the default container. I ended up pretty much following a great Gist that I found. It got me started with registering Mediatr and its delegate factories, so all credit to the author.

Within the Startup.cs class ConfigureServices method I added the following code to register Mediatr.

services.AddScoped<IMediator, Mediator>();
services.AddTransient<SingleInstanceFactory>(sp => t => sp.GetService(t));
services.AddTransient<MultiInstanceFactory>(sp => t => sp.GetServices(t));
services.AddMediatorHandlers(typeof(Startup).GetTypeInfo().Assembly);

First I add the Mediatr component itself. There are also two delegate types for the Mediatr factories which must be registered. The final line calls an extension method which will look through the assembly and ensure that any class which is a type of IRequestHandler or IAsyncRequestHandler is registered. By reflecting through the assembly in this way we avoid having to manually map each handler in DI when we create it.

public static class MediatorExtensions
{
	public static IServiceCollection AddMediatorHandlers(this IServiceCollection services, Assembly assembly)
	{
		var classTypes = assembly.ExportedTypes.Select(t => t.GetTypeInfo()).Where(t => t.IsClass && !t.IsAbstract);

		foreach (var type in classTypes)
		{
			var interfaces = type.ImplementedInterfaces.Select(i => i.GetTypeInfo());

			foreach (var handlerType in interfaces.Where(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IRequestHandler<,>)))
			{
				services.AddTransient(handlerType.AsType(), type.AsType());
			}

			foreach (var handlerType in interfaces.Where(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(IAsyncRequestHandler<,>)))
			{
				services.AddTransient(handlerType.AsType(), type.AsType());
			}
		}

		return services;
	}
}

The AddMediatorHandlers method first finds all class types in the assembly. It loops through each class and gets its interfaces. If any of the interfaces are an IRequestHandler or IAsyncRequestHandler then we add a transient mapping to the services collection.

If you need further details or samples for registering Mediatr with a different DI container I recommend you check out the wiki on Github which contains some setup guidance and links to samples.

Messages and Handlers

The pattern we’ve employed in allReady is to use the Mediatr handlers to return ViewModels needed by our actions. An action will send a message of the correct type to the Mediatr instance and expect a ViewModel in return. All of the logic to handle the DB queries which fetch the data needed to build up the view model are contained within the handler. We also use Mediatr to issue and handle commands for HTTP post/put/delete request actions. These actions will often need to update a record in the database. We send the created/updated object in the message and a handler picks it up, processes it and returns a success or failure result back to the action.

You can also chain Mediatr handlers by having a handler send out its own message, which allows you to compose queries to get the data you need. For example, if you have a handler which reads a user record from a database, this same user model may be needed as part of multiple view models. Rather than code the same database query each time within each handler, you can place your data access query inside a single handler. This handler can then return the user data to any other handler which sends a message for the user data. This allows us to adhere to the don’t repeat yourself principle by writing the code and logic only once. We can also test that logic to ensure that it works as expected and be confident that as everyone uses it they can expect consistent responses.
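
As a rough sketch of that chaining idea (the ProfilePageQuery and ProfilePageViewModel types here are hypothetical, and the UserQuery message is the one defined a little further down), a composing handler simply takes an IMediator dependency and sends its own message:

public class ProfilePageQuery : IAsyncRequest<ProfilePageViewModel>
{
	public int UserId { get; set; }
}

public class ProfilePageViewModel
{
	public UserViewModel User { get; set; }
	// ...other properties needed by the profile page
}

public class ProfilePageQueryHandlerAsync : IAsyncRequestHandler<ProfilePageQuery, ProfilePageViewModel>
{
	private readonly IMediator _mediator;

	public ProfilePageQueryHandlerAsync(IMediator mediator)
	{
		_mediator = mediator;
	}

	public async Task<ProfilePageViewModel> Handle(ProfilePageQuery message)
	{
		// Reuse the shared user query handler rather than duplicating its data access logic here.
		var user = await _mediator.SendAsync(new UserQuery { Id = message.UserId });

		return new ProfilePageViewModel { User = user };
	}
}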

To create a request message in Mediatr you create a basic class marked as an implementation of the IRequest or IAsyncRequest interface. I try to use async methods for everything I do in ASP.NET Core so I’ll stick to async examples in this post. You can optionally specify the return type you expect from the handler. An async handler will return that object wrapped in a task which can be awaited.

Your message class will define all of the properties expected to be in the message. Here is an example of a basic message which will send an Id out and which expects the response from the handler to be a UserViewModel.

public class UserQuery : IAsyncRequest<UserViewModel>
{
	public int Id { get; set; }
}

With a request message defined we can now go ahead and create a handler that will respond to any messages of that type. We need to make our class implement the IRequestHandler or in my case IAsyncRequestHandler interface, defining the input and output types.

public class UserQueryHandlerAsync : IAsyncRequestHandler<UserQuery, UserViewModel>
{
    public async Task<UserViewModel> Handle(UserQuery message)
    {
        // Could query a db here and get the columns we need.
        
        var viewModel = new UserViewModel();
        viewModel.UserId = 100;
        viewModel.Username = "sgordon";
        viewModel.Forename = "Steve";
        viewModel.Surname = "Gordon";

        return viewModel;
    }
}

This interface defines a single method named Handle which returns a Task of your output type. It expects your request message object as its parameter.

In my example I’m simply newing up a UserViewModel object, setting its properties and returning it. In the real world this would be where I query the database using Entity Framework and build up my view model from the resulting data.

I personally have been in the habit of keeping my request message and my response handler classes together in the same physical .cs file, but you can split them if you prefer. I’m normally keen on keeping one class to one file, but in this case since the two classes are very interrelated I’ve found it quicker to work when I can see both in the same file.

We now have everything wired up so finally it’s time to send a message from our controller.

public class UsersController : Controller
{
    private readonly IMediator _mediator;

    public UsersController(IMediator mediator)
    {
        if (mediator == null)
            throw new ArgumentNullException(nameof(mediator));

        _mediator = mediator;
    }

    [HttpGet]
    [Route("users/{userId}")]
    public async Task<IActionResult> UserDetails(int userId)
    {
        UserViewModel model = await _mediator.SendAsync(new UserQuery { Id = userId });

        if (model == null)
            return HttpNotFound();

        return View(model);
    }
}

The key thing to highlight here is the controller’s constructor accepting an IMediator object. This will be injected by the ASP.NET Core DI when the application runs. What’s very useful is that we can easily mock an IMediator and its responses, which makes testing a breeze.

The UserDetails action itself expects a user id when it is called. This id gets bound from the route parameter by MVC.

The key line in the code above is where we send the mediator message. We do this by calling SendAsync on the IMediator object. We send a UserQuery object with the Id property set. This message will now be managed by Mediatr. It will locate the suitable handler, pass it the request message and return the response to our action.

As you can see, this has made our controller very light. The only code left is a basic check to return an appropriate not found response if the response to our Mediatr request is null. That won’t ever be true in my example, but in a real world app, if the database doesn’t find an object with the id provided, I return null instead of a UserViewModel. This is exactly how I like a controller to be; its single responsibility is to send the client an HTTP response of some kind to the user’s request. It doesn’t and shouldn’t need to know about our database or have any concerns with building up its view model directly.

Testing

Being good citizens we should always consider the testing process. Testing when using MediatR and a CQRS style pattern is very simple. My approach has been to ensure that each handler has appropriate unit tests around the Handle method, testing the logic within. To do this we can new up a MediatR handler in our test class, call the Handle method directly and run assertions on the returned object to verify the result.

[Fact]
public async Task HandlerReturnsCorrectUserViewModel()
{ 
    var sut = new UserQueryHandlerAsync();
    var result = await sut.Handle(new UserQuery { Id = 100 });

    Assert.NotNull(result);
    Assert.Equal("Steve", result.Forename);
}

This is a bit of a contrived example, especially as my handler example really doesn’t perform any logic. However we can test for whatever is necessary on the returned result. You can check out the allReady code on GitHub to see some real examples of tests around the handlers used there. In those cases we often use an in-memory Entity Framework DbContext object so that we can test that the handler’s EF query returns the expected data from a known set of test data.
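
As a purely hypothetical sketch of that style of test, reusing the database-backed handler sketched earlier and assuming the hypothetical AppDbContext has a constructor accepting DbContextOptions: the in-memory provider shown here comes from the Microsoft.EntityFrameworkCore.InMemory package and its API differed slightly in the RC1-era bits.

[Fact]
public async Task HandlerReturnsUserFromSeededData()
{
    // Configure a named in-memory database so the handler's query runs against known test data.
    var options = new DbContextOptionsBuilder<AppDbContext>()
        .UseInMemoryDatabase("UserQueryHandlerTests")
        .Options;

    using (var context = new AppDbContext(options))
    {
        context.Users.Add(new User { UserId = 100, Username = "sgordon", Forename = "Steve", Surname = "Gordon" });
        await context.SaveChangesAsync();
    }

    using (var context = new AppDbContext(options))
    {
        var sut = new UserQueryHandlerAsync(context);

        var result = await sut.Handle(new UserQuery { Id = 100 });

        Assert.NotNull(result);
        Assert.Equal("Steve", result.Forename);
    }
}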

We can also test the controllers very easily by passing in a mock of the IMediator.

[Fact]
public async Task UserDetails_SendsQueryWithTheCorrectUserId()
{
    const int userId = 1;
    var mediator = new Mock<IMediator>();
    mediator.Setup(x => x.SendAsync(It.IsAny<UserQuery>())).ReturnsAsync((UserViewModel)null);
    var sut = new UsersController(mediator.Object);

    await sut.UserDetails(userId);

    mediator.Verify(x => x.SendAsync(It.Is<UserQuery>(y => y.Id == userId)), Times.Once);
}

We create a mock IMediator using Moq and pass that in when instantiating the controller. Here I’ve called the UserDetails action with an Id and verified that a query containing that Id has been sent to the mediator.

If necessary you can set up your IMediator mock so that you define the data that is returned in response to a message. This can be useful if you want to validate your action’s behaviour for different responses. You can mock up the response object using code such as…

var user = new UserViewModel
{
    UserId = 100,
    Username = "sgordon",
    Forename = "Steve",
    Surname = "Gordon"
};

var mediator = new Mock<IMediator>();
mediator.Setup(x => x.SendAsync(It.IsAny<UserQuery>())).ReturnsAsync(user);

If your controller performs any logic based on the returned object you can now easily specify the different scenarios to test. Something I often do is to write a test that verifies that when the MediatR response is null the action returns an HttpNotFound result. In a simple example that can be done in the following way…

[Fact]
public async Task UserDetailsReturnsHttpNotFoundResultWhenUserIsNull()
{
    var mediator = new Mock<IMediator>();
    mediator.Setup(x => x.SendAsync(It.IsAny<UserQuery>())).ReturnsAsync((UserViewModel)null);

    var sut = new UsersController(mediator.Object);

    var result = await sut.UserDetails(1);

    Assert.IsType<HttpNotFoundResult>(result);
}

Summing Up

I’ve really taken to the pattern that MediatR allows us to implement so easily. It’s a personal choice of course, but my view is that it keeps my controllers clean and allows me to create handlers that have a single responsibility. It keeps things nicely separated as nothing is too tightly bound together. I can easily change the behaviour of a handler and, as long as it still returns the correct object type, my controllers never care.

As I’ve shown, the testing process is pretty nice and if we ensure each handler is tested as well as the controllers, then we have good coverage of the behaviours we expect from the classes. A big bonus is that it already supports ASP.NET Core and is pretty simple to set up with the built-in DI container.

MediatR also supports a publisher/subscriber pattern which I’ve yet to need in my code. It’s worth taking a look at if you need multiple handlers to respond when an event occurs, and it’s something that I plan to look into at some point.
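
For a flavour of that, here is a minimal sketch using the async notification types from the MediatR version current at the time of writing; the interface names have changed in later releases, so treat this as illustrative rather than definitive.

// A notification (event) which can be handled by zero or more handlers.
public class UserCreated : IAsyncNotification
{
    public int UserId { get; set; }
}

// One of potentially many handlers that react to the event.
public class SendWelcomeEmailHandler : IAsyncNotificationHandler<UserCreated>
{
    public Task Handle(UserCreated notification)
    {
        // e.g. queue up a welcome email for notification.UserId here.
        return Task.FromResult(0);
    }
}

// Publishing the event from calling code; every registered handler is invoked:
// await _mediator.PublishAsync(new UserCreated { UserId = 100 });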

I highly recommend trying out the MediatR library and reviewing the pattern being used on the allReady project. It takes little time to set up and quickly becomes a comfortable flow when writing code. It’s made me think about what my models are involved in and helped me keep them focused and more robust.

NOTE: This post was written based on RC1 of ASP.NET Core and may not be current by the time RC2 and RTM are released.

Extending the ASP.NET Core 1.0 Identity SignInManager Adding basic user auditing to ASP.NET Core

So far I have written a couple of posts in which I dive into the code for the ASP.NET Core 1.0 Identity library. In this post I want to do something a little more practical and look at extending the default identity functionality. I’m working on a project at the moment which will be very reliant on a strong user management system. As I move forward with that and build up the requirements I will need to handle things not currently available in the Identity library. Something missing from the current Identity library is user security auditing, an important feature for many real world applications where compliance auditors may expect such information to be available.

Before going further, please note that this code is not final, production ready code. At this stage I want to prove my concept and meet some initial requirements that I have. I expect I’ll end up extending and refactoring this code as my project develops. Also, at the time of writing ASP.NET Core 1.0 is at release candidate 1. We can expect some changes in RC2 and RTM which may require this code to be adjusted. Feel free to do so, but copy and paste at your own risk!

At this stage in my project, my immediate requirement is to store successful login, failed login and logout events in an audit table within my database. I would also like to collect the visitor’s IP address. This data might be useful after some kind of security breach; for example to review who was logged into the system as well as where from. It would also allow for some analysis of who is using the application and how often / at what times of day. Such data may prove useful to plan upgrades or to encourage more use of the application. Remember that if you record this information, particularly within a public facing SaaS style application, you may well need to include details of what data you’re recording and why in your privacy policy.

I could implement this auditing functionality within my controllers. For example I could update the Login action on the Account controller to write into an audit table directly. However I don’t really like that solution. If anyone implements a new controller/action to handle login or logout then they would need to remember to also add code to update the audit records. It makes the Login action method more responsible than it should be for performing the audit logic, when really this belongs deeper in the application.

If we take a look at the Login action on the Account controller we can see that it calls into an instance of a SignInManager. In a default MVC application this is setup in the dependency injection container by the call to AddIdentity within the Startup.cs class. The SignInManager provides the default implementations of sign in and sign out logic. Therefore this is a better candidate in which to override some of those methods to include my additional auditing code. This way, any calls to the sign in manager, from any controller/action will run my custom auditing code. If I need to change or extend my audit logic I can do so in a single class which is ultimately responsible for handling that activity.

Before doing anything with the SignInManager I needed to define a database model to store my audit records. I added a UserAudit class which defines the columns I want to store:

public class UserAudit
{
	[Key]
	public int UserAuditId { get; private set; }

	[Required]
	public string UserId { get; private set; }

	[Required]
	public DateTimeOffset Timestamp { get; private set; } = DateTime.UtcNow;

	[Required]
	public UserAuditEventType AuditEvent { get; set; }

	public string IpAddress { get; private set; }   

	public static UserAudit CreateAuditEvent(string userId, UserAuditEventType auditEventType, string ipAddress)
	{
		return new UserAudit { UserId = userId, AuditEvent = auditEventType, IpAddress = ipAddress };
	}
}

public enum UserAuditEventType
{
	Login = 1,
	FailedLogin = 2,
	LogOut = 3
}

In this class I’ve defined an Id column (which will be the primary key for the record), a column which will store the user Id string, a column to store the date and time of the audit event, a column for the UserAuditEventType which is an enum of the 3 available events I will be auditing and finally a column to store the user’s IP address. Note that I’ve made the UserAuditId a basic auto-generated integer for simplicity in this post; however, in my final code I’m very likely going to use fluent mappings to define a composite primary key based on the user id and the timestamp instead.

I’ve also included a static method within the class which creates a new audit event record by taking in the user id, event type and IP address. For a class like this I prefer this approach versus exposing the property setters publicly.

Now that I have a class which represents the database table I can add it to the entity framework DbContext:

public class ApplicationDbContext : IdentityDbContext<ApplicationUser>
{
	public DbSet<UserAudit> UserAuditEvents { get; set; }
}
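
If I do later move to the composite primary key mentioned above, the fluent mapping might look something like this (a sketch only, not part of the current code):

public class ApplicationDbContext : IdentityDbContext<ApplicationUser>
{
	public DbSet<UserAudit> UserAuditEvents { get; set; }

	protected override void OnModelCreating(ModelBuilder builder)
	{
		// IdentityDbContext configures the Identity tables, so always call the base implementation first.
		base.OnModelCreating(builder);

		// Composite primary key of user id + timestamp, replacing the auto-generated UserAuditId.
		builder.Entity<UserAudit>()
			.HasKey(a => new { a.UserId, a.Timestamp });
	}
}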

At this point, I have a new table defined in code which needs to be physically created in my database. I will do this by creating a migration and applying it to the database. As of ASP.NET Core 1.0 RC1 this can be done by opening a command prompt from my project directory and then running the following two commands:

dnx ef migrations add "UserAuditTable"

dnx ef database update

This creates a migration which will create the table within my database and then runs the migration against the database to actually create it. This leaves me ready to implement the logic which will create audit records in that new table. My first job is to create my own SignInManager which inherits from the default SignInManager. Here’s what that class looks like before we extend the functionality:

public class AuditableSignInManager<TUser> : SignInManager<TUser> where TUser : class
{
	public AuditableSignInManager(UserManager<TUser> userManager, IHttpContextAccessor contextAccessor, IUserClaimsPrincipalFactory<TUser> claimsFactory, IOptions<IdentityOptions> optionsAccessor, ILogger<SignInManager<TUser>> logger)
		: base(userManager, contextAccessor, claimsFactory, optionsAccessor, logger)
	{
	}
}

I define my own class which inherits from the base SignInManager class. This class is generic and requires the type representing the user to be provided. I also have to implement a constructor, accepting the components which the original SignInManager needs in order to function. I pass these objects into the base constructor.

Before I implement the logic and override some of the SignInManager’s methods I need to register this custom SignInManager class with the dependency injection framework. After checking out a few sources I found that I could simply register this after the AddIdentity services extension in my Startup.cs class. This will then replace the SignInManager previously registered by the Identity library.

Here’s what my ConfigureServices method looks like with this code added:

public void ConfigureServices(IServiceCollection services)
{
	// Add framework services.
	services.AddEntityFramework()
		.AddSqlServer()
		.AddDbContext<ApplicationDbContext>(options =>
			options.UseSqlServer(Configuration["Data:DefaultConnection:ConnectionString"]));

	services.AddIdentity<ApplicationUser, IdentityRole>()
		.AddEntityFrameworkStores<ApplicationDbContext>()
		.AddDefaultTokenProviders();

	services.AddScoped<SignInManager<ApplicationUser>, AuditableSignInManager<ApplicationUser>>();

	services.AddMvc();

	// Add application services.
	services.AddTransient<IEmailSender, AuthMessageSender>();
	services.AddTransient<ISmsSender, AuthMessageSender>();
}

The important line is services.AddScoped<SignInManager<ApplicationUser>, AuditableSignInManager<ApplicationUser>>(); where I specify that whenever a class requires a SignInManager<ApplicationUser>, the DI container will return our custom AuditableSignInManager<ApplicationUser> class. This is where dependency injection really makes life easier, as I don’t have to update multiple classes with concrete instances of the SignInManager. This one change in my Startup.cs file ensures that all dependent classes get my custom SignInManager.

Going back to my AuditableSignInManager I can now make some changes to implement the auditing logic I require.

public class AuditableSignInManager<TUser> : SignInManager<TUser> where TUser : class
{
	private readonly UserManager<TUser> _userManager;
	private readonly ApplicationDbContext _db;
	private readonly IHttpContextAccessor _contextAccessor;

	public AuditableSignInManager(UserManager<TUser> userManager, IHttpContextAccessor contextAccessor, IUserClaimsPrincipalFactory<TUser> claimsFactory, IOptions<IdentityOptions> optionsAccessor, ILogger<SignInManager<TUser>> logger, ApplicationDbContext dbContext)
		: base(userManager, contextAccessor, claimsFactory, optionsAccessor, logger)
	{
		if (userManager == null)
			throw new ArgumentNullException(nameof(userManager));

		if (dbContext == null)
			throw new ArgumentNullException(nameof(dbContext));

		if (contextAccessor == null)
			throw new ArgumentNullException(nameof(contextAccessor));

		_userManager = userManager;
		_contextAccessor = contextAccessor;
		_db = dbContext;
	}

	public override async Task<SignInResult> PasswordSignInAsync(TUser user, string password, bool isPersistent, bool lockoutOnFailure)
	{
		var result = await base.PasswordSignInAsync(user, password, isPersistent, lockoutOnFailure);

		var appUser = user as IdentityUser;

		if (appUser != null) // We can only log an audit record if we can access the user object and its ID
		{
			var ip = _contextAccessor.HttpContext.Connection.RemoteIpAddress.ToString();

			UserAudit auditRecord = null;

			switch (result.ToString())
			{
				case "Succeeded":
					auditRecord = UserAudit.CreateAuditEvent(appUser.Id, UserAuditEventType.Login, ip);
					break;

				case "Failed":
					auditRecord = UserAudit.CreateAuditEvent(appUser.Id, UserAuditEventType.FailedLogin, ip);
					break;
			}

			if (auditRecord != null)
			{
				_db.UserAuditEvents.Add(auditRecord);
				await _db.SaveChangesAsync();
			}
		}

		return result;
	}

	public override async Task SignOutAsync()
	{
		await base.SignOutAsync();

		var user = await _userManager.FindByIdAsync(_contextAccessor.HttpContext.User.GetUserId()) as IdentityUser;

		if (user != null)
		{
			var ip = _contextAccessor.HttpContext.Connection.RemoteIpAddress.ToString();

			var auditRecord = UserAudit.CreateAuditEvent(user.Id, UserAuditEventType.LogOut, ip);
			_db.UserAuditEvents.Add(auditRecord);
			await _db.SaveChangesAsync();
		}
	}
}

Let’s step through the changes.

Firstly I specify in the constructor that I will require an instance of the ApplicationDbContext, since we’ll directly need to work with the database to add audit records. Again, constructor injection makes this nice and simple as I can rely on the DI container to supply the appropriate object at runtime.

I’ve also added some private fields to store some of the objects the class receives when it is constructed. I need to access the UserManager, DbContext and IHttpContextAccessor objects in my overrides.

The default SignInManager defines its public methods as virtual, which means that since I’ve inherited from it, I can supply overrides for those methods. I do exactly that to implement my auditing logic. The first method I override is the PasswordSignInAsync method, keeping the signature the same as the original base method. I await and store the result of the base implementation, which actually performs the sign in logic. The base method returns a SignInResult object with the result of the sign in attempt. Now that I have this result I can use it to perform some audit logging.

I cast the user object to an IdentityUser so that I can access its Id property. Assuming this cast succeeds, I can go ahead and log an audit event. I get the remote IP address from the context, then I inspect the result by calling its ToString() method. I use a switch statement to make an appropriate call to the CreateAuditEvent method, passing in the correct UserAuditEventType. If a UserAudit object has been created I then write it into the database via the DbContext that was injected into this class when it was constructed.

I have a very similar override for the SignOutAsync method as well. In this case though I have to get the user id from the HttpContext and use the UserManager to look up the IdentityUser before writing a logout audit record into the database. Running my application at this stage and performing some logins, some login attempts with an incorrect password and some logouts, I can check the database and see the audit records being stored.


Summing Up

Whilst not yet fully featured, this blog post hopefully demonstrates the initial steps we can follow to quite easily extend and override the ASP.NET Core Identity SignInManager class with our own implementation. I expect to refactor and extend this code further as my requirements develop.

For example, while the correct place to trigger the auditing logic is the SignInManager, I will likely create an AuditManager class which has the responsibility of actually creating and writing the audit records. If I do this I will still need my overridden SignInManager class, which would take an injected instance of the AuditManager. As my audit needs grow, so will the AuditManager class, and some code will likely be reused within it.
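
A very rough, hypothetical sketch of the shape that might take (the IAuditManager interface, its method name and its registration are illustrative only and not part of the current code):

public interface IAuditManager
{
	Task CreateAuditEventAsync(string userId, UserAuditEventType eventType, string ipAddress);
}

public class AuditManager : IAuditManager
{
	private readonly ApplicationDbContext _db;

	public AuditManager(ApplicationDbContext db)
	{
		_db = db;
	}

	public async Task CreateAuditEventAsync(string userId, UserAuditEventType eventType, string ipAddress)
	{
		// Centralise creation and persistence of audit records so the overridden
		// SignInManager only has to make a single call per event.
		var auditRecord = UserAudit.CreateAuditEvent(userId, eventType, ipAddress);
		_db.UserAuditEvents.Add(auditRecord);
		await _db.SaveChangesAsync();
	}
}

The overridden SignInManager would then take an IAuditManager in its constructor instead of the ApplicationDbContext, and the class would be registered with the DI container in ConfigureServices, for example with services.AddScoped<IAuditManager, AuditManager>();.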

Building out that extra class at this stage would have made this post a bit more complex and taken me away from my initial goal of showing how we can extend the functionality of the SignInManager class. I hope that this post and the code samples prove useful to others looking to make similar extensions to the default behaviour.