In my previous post, I introduced this series in which I’ll share my experiences while learning about the new performance features in C# and the .NET Core (corefx) framework. In this post, I want to focus on benchmarking existing code and establishing baselines.
Why Benchmark Code?
The reason that I’ve started with benchmarking is that before we start optimising code, we should first understand our current position. This is critical for validating that our changes are having the impact we desire and, most importantly, are not making performance worse. In my experience, performance work is very much an iterative process of measuring, making small changes and measuring again to check the effect of each change.
Arguably, there are other places that I could have started in this series, perhaps with profiling, tracing or metrics gathering. All of these may be necessary in order to identify the services which should be optimised and, at the code level, the classes and methods which should be in your sights. I’ve decided to skip past these higher-level techniques for now, partly because they are areas I’m not confident enough in to provide good guidance on. They are also vast topics of their own which I feel would distract from my focus on the language and framework features.
For real-world scenarios, you’ll likely need to use such techniques to first narrow down the places where you should spend time optimising. Sometimes good guesses can be made, but whenever possible it’s best to be scientific in your endeavours and back up theories with actual data. I may one day come back to these broader areas but for now, I’ll assume that you have some idea of the code paths you want to improve. If you do want to learn more about profiling your code, I learned a lot from reading “Pro .NET Memory Management: For Better Code, Performance, and Scalability” by Konrad Kokosa.
Baselining is the process of establishing current performance under typical conditions for your code. In .NET, at the code level, there are a number of techniques which may work. Sometimes the use of a simple stopwatch will be a starting point to gather general timing data. Be aware that many conditions could affect your measurements and their accuracy. A benefit is that stopwatches are simple to use and can provide quick results. There’s nothing wrong in my opinion with gathering some basic data in this way, as long as the compromises in accuracy are understood.
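To illustrate the stopwatch approach, here’s a rough sketch of the kind of quick timing loop I mean. The iteration count and the work inside the loop are placeholders; substitute the code you actually want to measure.

```csharp
using System;
using System.Diagnostics;

public static class StopwatchTiming
{
    public static void Main()
    {
        const int iterations = 1_000_000;

        var stopwatch = Stopwatch.StartNew();

        for (var i = 0; i < iterations; i++)
        {
            // Placeholder for the code being measured.
            var parts = "John Smith".Split(' ');
            var lastName = parts[parts.Length - 1];
        }

        stopwatch.Stop();

        Console.WriteLine($"Total: {stopwatch.ElapsedMilliseconds} ms for {iterations} iterations");
    }
}
```

Bear in mind that this measures the loop as a whole, includes no warm-up and is affected by whatever else the machine is doing, which is exactly why a dedicated benchmarking library is the better tool once you get serious.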
Once you have narrowed your focus to particular areas of your code, you start to get down to the method level. At this point, it’s useful to begin recording more accurate and specific benchmarks for your existing methods and code. This is where benchmarking should become your tool of choice. In C#, we have a fantastic option in the form of Benchmark.NET. This library provides a vast array of benchmark tooling that can be used to measure and benchmark .NET code. Benchmark.NET is now regularly used by the teams at Microsoft to measure their code.
What is a Benchmark?
A benchmark is simply a measurement, or set of measurements, relating to the execution of some code. Benchmarks allow you to compare the relative performance of code as you begin making efforts to improve it. A benchmark can be quite wide in scope or, as is often the case, you may find yourself testing small changes in micro-benchmarks. The main thing is to ensure that you have a mechanism to compare proposed changes against the original code, which can then guide your optimisation work. It’s important to use data, not assumptions, when optimising code.
How to Benchmark C# Code
Hopefully, by now you are sold on the concept of benchmarks so let’s start with a simple example. If you want to follow along, the complete code for this post is available on the “Benchmarks” branch of this sample repository.
Let’s imagine we have identified the following NameParser as an area of our application under heavy load and a potential performance bottleneck.
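The full source is in the sample repository; a minimal sketch consistent with the description below might look like this (the exact implementation in the repository may differ):

```csharp
public class NameParser
{
    // Returns the last name from a full name string, assuming the
    // final space-separated word represents the last name.
    public string GetLastName(string fullName)
    {
        var parts = fullName.Split(' ');
        return parts[parts.Length - 1];
    }
}
```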
This code is a naïve implementation used to return the last name from an input string which is assumed to be the full name of a person. For the purposes of this demo, it assumes that the last word, after any spaces, represents the last name. This is very much a simplified example and it’s likely that the methods you’ll want to benchmark will be doing more complex work! Sometimes you’ll be able to directly reference and benchmark code from your existing codebase, where the methods are small enough and publicly exposed. At other times, I’ve found myself creating benchmarks by copying relevant sections of code into my benchmark project in order to narrow the focus to particular lines of code. This is an area I need to give more time to, in order to identify good practices for structuring my benchmarks.
The first step is to install the Benchmark.NET library. Typically, as you’ll likely already be doing for unit tests, you’ll create a separate project to hold your benchmarks. From this benchmarking project, you’ll reference your projects containing the code you want to benchmark. To keep my sample quite simple, I’ve left everything in a single project for now.
For general benchmarks, you will need just the main BenchmarkDotNet package from NuGet. I installed mine by adding it to my sample project using “dotnet add package BenchmarkDotNet --version 0.11.3” from the command line.
The next step is to create your benchmarks by creating a new class to contain them. The benchmark class will be run by Benchmark.NET and the results from any benchmark methods will be included in the output. Here’s my NameParserBenchmarks class.
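Reconstructed from the description in the following paragraphs, it looks something like this (the sample name is a placeholder; the version in the repository may use a different value):

```csharp
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class NameParserBenchmarks
{
    // The name to parse, kept outside the benchmark method so its
    // creation isn't included in the measurement. Placeholder value.
    private static readonly string FullName = "John Smith";

    // A single shared parser instance, so the allocation of the
    // parser itself isn't measured either.
    private static readonly NameParser Parser = new NameParser();

    [Benchmark(Baseline = true)]
    public string GetLastName() => Parser.GetLastName(FullName);
}
```

Note that the benchmark method returns the parsed string rather than discarding it; this helps prevent the JIT from optimising the call away as dead code.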
The class itself is marked with an attribute from the BenchmarkDotNet.Attributes namespace. Benchmark.NET has the concept of diagnosers to control the things which are measured and included in the results. Without any additional diagnosers attached it will provide just timing data for the code being benchmarked. The memory diagnoser supports the additional measurement of allocations and GC collections which can be extremely helpful when optimising code.
I have a single method in the preceding code called GetLastName which benchmarks the existing GetLastName method in my NameParser class by calling it. I’ve marked this method with the Benchmark attribute so that it is executed and included in the results by Benchmark.NET. I can supply a value for the baseline property as I’ve done here to mark this particular method as my baseline. This is the existing code we’re measuring and this will be useful later as all other benchmarks will be compared in relation to this initial code.
To support the benchmark I’ve included a static string value of the name to be parsed in the benchmark. I’ve also included a static field holding a reference to a new NameParser instance. I don’t want to include these within the Benchmark method itself since I want to measure the performance and allocations of the GetLastName method in isolation.
The final step is to set up and trigger the runner for Benchmark.NET. In this sample, I’m running everything from a single project so I’ll update the Main method of the Program class.
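The updated Main method is a one-liner:

```csharp
using BenchmarkDotNet.Running;

public class Program
{
    public static void Main(string[] args)
    {
        BenchmarkRunner.Run<NameParserBenchmarks>();
    }
}
```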
The call to the generic BenchmarkRunner.Run method accepts the class for which any benchmarks should be run. By default, the results of the benchmarks will be logged to the console.
At this stage, we’re ready to run the benchmarks. For best results, it’s recommended that you do this on a device with as little else running as possible. Closing all other applications and killing unnecessary processes will yield the most stable results. On my development machine, once everything is closed I’ll trigger running the benchmarks from the command line.
Benchmarks should be run against release code to ensure all optimisations are included. From my projects directory, I’ll run “dotnet build -c Release” to create a release build.
Once the build completes I can navigate into the folder containing the built code: “cd bin/Release/netcoreapp2.2”
Finally, I can run the benchmark by running the built assembly using “dotnet BenchmarkAndSpanExample.dll” for my sample application.
The length of time that it’ll take to run your benchmarks will depend on your machine and the code under test. Benchmark.NET performs a number of stages to warm up the code and ensure that multiple iterations are run to provide consistent statistical data. It uses a pilot stage to work out the optimal number of iterations to run, although you can configure this if you need to.
Interpreting the Results
Once it completes, you should have the summary results written to your console window. If you prefer, various outputs are generated in the BenchmarkDotNet.Artifacts folder under the location where you have run the application. This includes an HTML version of the summary which can be more easily shared.
The summary of my machine looks like this:
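The exact output varies by machine and BenchmarkDotNet version, but an abridged, illustrative summary using the figures discussed below would look roughly like this (the Gen 0 value here is illustrative, and the error and standard deviation columns are omitted):

```
|      Method |     Mean | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
|------------ |---------:|------------:|------------:|------------:|--------------------:|
| GetLastName | 125.8 ns |      0.0505 |           - |           - |               160 B |
```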
For each benchmarked method you’ll have a row with the result data. Here I have a single line for my benchmark of the GetLastName method. Its mean execution time is 125.8 nanoseconds; not too shabby! Other statistical data is available showing the error and standard deviation of the timing data across the iterations.
Because I included the memory diagnoser attribute, I have some extra columns containing memory-related statistics. The first three columns relate to GC collections. They are scaled to show the number per 1,000 operations. In this case, my method would have to be called very often to trigger a Gen 0 collection and is not likely to cause Gen 1 or Gen 2 collections. The final column is very helpful: it shows the allocated memory per operation. My name parser code currently allocates 160 bytes every time it is called. In the grand scheme of things, that’s not much at all, but we’ll see in a future post how we can reduce this. Remember that whilst allocations in .NET are cheap, there may be more impact caused by GC work to collect and clean up these objects. In hot paths (highly called methods) this can soon add up.
In my first post, I mentioned a worker process that I maintain which processes between 17 and 20 million events per day. If I needed to call this GetLastName method when processing each event, then at 20 million events, 160 bytes per call works out to 3.2GB of allocations per day! At scale, such small numbers can quickly add up!
Before attempting any optimisation work on code, it’s valuable and important to always establish baselines first. That way you can truly see whether the improved code is faster and/or allocates less than your original code. Measurement of the improvements can help to guide further optimisations and also provide crucial data to justify the time spent making such improvements. Benchmarking with a tool like Benchmark.NET is pretty straightforward for simple measurements and, with little work, it makes comparing code performance a painless process.
In this post, we’ve seen how we can use Benchmark.NET to baseline some existing code to understand how quickly it runs and how much memory it allocates. In the next post, I’ll introduce Span<T> and we’ll use Benchmark.NET to measure the improvement.