Benchmarking Applications with BenchmarkDotNet – Introduction

TL;DR

BenchmarkDotNet is a library that enables developers to define performance tests for their applications. It abstracts much of the complexity away from the developer while allowing a degree of extensibility and customisation through its API. Developers can get started quickly and refer to its documentation for advanced features.

This post does not actually perform any benchmarking; rather, it introduces BenchmarkDotNet to developers. The sources used are available on GitHub.


Introduction

Typically, when we develop a piece of software, some degree of testing and measuring is warranted; how much depends on the complexity of what we are developing, the desired coverage, and so on. Unit and integration tests are part of most projects by default and on the mind of any software developer; when it comes to benchmarking, however, things tend to be a little different.

Now, as a very personal opinion, benchmarking and testing are not the same. The goal of testing is functionality, comparing expected against actual results; this is a defining aspect of unit tests, while integration tests present different characteristics and goals.

Benchmarking, on the other hand, is about measuring execution. We will likely establish a baseline that we can compare against, at least once, and there are things we are interested in such as execution time, memory allocation and other performance counters. Accomplishing this can be challenging, and doing it right even more so. This is where BenchmarkDotNet comes into play: as with other recurring problems in software development, it is a library that abstracts these concerns away from the developer, who can then define, maintain and leverage performance tests.

Some facts about BenchmarkDotNet

  • Part of the .NET Foundation.
  • Supported runtimes: .NET Framework (4.6.x+), .NET Core (1.1+), Mono.
  • Supported operating systems: Windows, macOS, Linux.


Benchmarking our Code

While it is unlikely that every piece of code in a system will need to be benchmarked, there are scenarios where we can identify modules that form part of a critical path in our system and, given their nature, are subject to benchmarking; what to measure, and how intensively, will vary on a case-by-case basis.

Whatever our expectations, we need to ensure that critical modules in a system perform optimally, or at least acceptably, and establish a baseline that we can compare against as we make gradual, incremental improvements. We need to define a measurable, quantifiable truth for our performance indicators.

Usage of a Library

The reasons are many and apply to practically any library: favour reusability, avoid reinventing the wheel, and most importantly rely on something that is heavily tested and proven to work. In this particular case we care about how accurate the results are, and that depends on the approach. We have all seen examples out there where something like Stopwatch is used; while that is not entirely bad, it is unlikely to ever provide the accuracy of BenchmarkDotNet, nor its flexibility or extensibility. To mention some features BenchmarkDotNet provides:

  • Allows the developer to target multiple runtimes through Jobs, for instance various versions of the .NET Framework and .NET Core; this is instrumental to prevent extrapolating results (a sketch follows this list).
  • Generation of reports in various formats that can be analysed.
  • Provides execution isolation to ensure running conditions are optimal.
  • Takes care of aspects like performing several iterations of execution and warm-up.
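
As a preview of the first point, with the attribute shorthand of the v0.10-era API used throughout this post, targeting two runtimes from one class could look like the following sketch (the class and method names are illustrative, not part of the sample solution):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Attributes.Jobs;

// One job per runtime: every [Benchmark] method runs on both the full
// .NET Framework (Clr) and .NET Core, so results from one runtime are
// never extrapolated to the other.
[ClrJob, CoreJob]
public class MultiRuntimeBenchmarks
{
    [Benchmark]
    public string Concat() => string.Concat("Benchmark", "Dot", "Net");
}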

More information about the actual flow can be found in the How it works section in the Documentation.

BenchmarkDotNet

Elements

The library provides the necessary components so we can get up and running quickly with almost no effort; on the other hand, it also provides Advanced Features that expand its usage, to mention some (a sketch of a couple of these follows the list):

  • Define Baselines.
  • Target multiple Runtimes through Jobs.
  • Extensive customisation of Jobs through their Configuration API.
  • Control the results output format, through Columns.
  • Collect additional information through Diagnostics.
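
As a brief taste of two of these, the sketch below (illustrative names, v0.10-era API) marks one method as the baseline, which adds a column scaling the other results against it, and attaches the built-in memory diagnoser:

using BenchmarkDotNet.Attributes;

// MemoryDiagnoser adds allocation and GC columns to the summary table.
[MemoryDiagnoser]
public class StringCompositionBenchmarks
{
    // Baseline = true makes the other benchmarks report results
    // scaled relative to this one.
    [Benchmark(Baseline = true)]
    public string Interpolation() => $"{1} plus {2}";

    [Benchmark]
    public string Format() => string.Format("{0} plus {1}", 1, 2);
}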

Quick Setup

Like many other libraries, BenchmarkDotNet comes in the form of a NuGet package. We install it in the project where we will define our tests; one way is through the Package Manager Console, where Install-Package BenchmarkDotNet is all we need to do.
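
Where the .NET CLI is preferred, the equivalent is:

dotnet add package BenchmarkDotNet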

Designing a Benchmark

There are several ways to structure and run benchmarks. In its simplest form, all we need to do is annotate our target methods with the Benchmark attribute and subsequently tell the Benchmark Runner which class defines those methods (a minimal sketch follows the list below), so we have two main actors:

  • The Benchmark Runner, which is the piece that interprets a source in order to find the actual benchmarks; the source can vary: a class, stringified source code, or a gist reachable through a URL.
  • The definition of the benchmarks which, as the previous bullet point suggests, can take various shapes, but in a typical .NET software solution will likely be a class. This class defines the various methods that represent benchmarks, not the actual application code.
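
Putting the two actors together, a minimal self-contained sketch could look like this (the class and method names are illustrative, not part of the sample solution used later):

using System.Linq;

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// The definition: a plain class where each [Benchmark] method is one benchmark.
public class SumBenchmarks
{
    [Benchmark]
    public int LinqSum() => Enumerable.Range(0, 1000).Sum();
}

class Program
{
    // The runner: pointed at the class that defines the benchmarks.
    static void Main() => BenchmarkRunner.Run<SumBenchmarks>();
}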

Best Practices

These are things that are strongly advised in order to define good, accurate benchmarks under the most realistic scenario possible:

  • Always target Release mode. This is simply to make sure we are running the optimised version of our code; debug code can run hundreds of times slower. Along the same lines, don't attach any type of debugger either, as that incurs overhead.
  • Try different environments. Don't assume the results you get in a given environment are definitive; by environment we mean hardware, OS, runtime, among others, and variations in these elements will definitely affect the outcome. The result of a benchmark is bound to the environment where it ran, so run it in different environments and compare results.
  • Avoid dead code. This is to avoid situations where our code is JITed and expressions that yield a result that is never referenced are eliminated, technically an optimisation; this will definitely have an impact when benchmarked and will likely introduce a disparity (see the sketch below).
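
A minimal illustration of the dead-code pitfall (the method names are illustrative); simply returning the computed value keeps the JIT from discarding the work:

using System;

using BenchmarkDotNet.Attributes;

public class DeadCodeBenchmarks
{
    // Risky: the result is never consumed, so the JIT is free to
    // eliminate the computation and the measurement becomes meaningless.
    [Benchmark]
    public void WithDeadCode()
    {
        Math.Sqrt(12345.6789);
    }

    // Safer: the value is returned, so the work cannot be optimised away.
    [Benchmark]
    public double WithoutDeadCode()
    {
        return Math.Sqrt(12345.6789);
    }
}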

Define and incorporate Benchmarks into our Software Solution

So the simplest way to incorporate this into your solution is as yet another project, one which will reference the assembly or assemblies with the application code to be benchmarked. So let's start by assuming we already have a solution for a typical ASP.NET Core MVC Web Application; we need to do the following (the equivalent CLI commands follow this list):

  • Define a new Console Application project that matches the runtime you are using for your application; name it according to your convention of choice, in our case abc.perftests.
  • We will use the program entry point to run our benchmarks.
  • Contextually organise our benchmarks; this means we will have a class per module, or per group of modules that do something in common.
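
With the .NET CLI, and the layout shown in the next section, these steps could look roughly like this (the paths are illustrative):

dotnet new console -o tests/abc.perftests
dotnet add tests/abc.perftests package BenchmarkDotNet
dotnet add tests/abc.perftests reference lib/abc.utils/abc.utils.csproj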

Now, moving into our Software solution, the project structure looks like the following:
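
In outline (only the paths called out in the text are exact; the remaining names are illustrative):

src/
  abc.api/           (the ASP.NET Core Web API)
lib/
  abc.utils/         (the paragraph builder implementations)
tests/
  abc.tests/         (unit and integration tests)
  abc.perftests/     (our benchmarks)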

This is a typical ASP.NET Core Web API. In this case, for the sake of the exercise, we have classes that build paragraphs from an arbitrary number of words passed in as an array; we have two implementations, concatenation-based and buffer-based, found under lib/abc.utils.

Under tests we have two projects: abc.tests, a class library for our unit and integration tests, and abc.perftests, a Console Application where we will define our benchmarks.

Below is our benchmarks class, defined in tests/abc.perftests/Benchmarks/ParagraphBuilderBenchmarks.cs; each benchmark targets an implementation of the Paragraph Builder.

The test is simple: feed each builder 1,000 words to build the paragraph. We are changing the configuration of the default Job given that the benchmark session may otherwise run for too long; below I enumerate some of the changes I am making:

  • Define a configuration class using the Configuration API, called DefaultConfig, where we specify how many iterations will execute the target method, including warm-up, as well as how many times the benchmark process will be launched; we target the default job because we are not adding any custom ones.
  • Declaratively apply the configuration to our Benchmark through the Config attribute.
using System;
using System.Linq;

using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Attributes;

using ABC.Utils;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Horology;

namespace ABC.PerfTests.Benchmarks
{
    // Tunes the default job: a single launch of the benchmark process,
    // roughly 200 ms per measured iteration, 10 warm-up iterations and
    // 10 measured (target) iterations.
    class DefaultConfig : ManualConfig
    {
        public DefaultConfig()
        {
            Add(Job.Default
                        .WithLaunchCount(1)
                        .WithIterationTime(TimeInterval.FromMilliseconds(200))
                        .WithWarmupCount(10)
                        .WithTargetCount(10));
        }
    }

    [Config(typeof(DefaultConfig))]
    public class ParagraphBuilderBenchmarks
    {
        private readonly string[] _data;
        private readonly IParagraphBuilder _concatBasedBuilder;
        private readonly IParagraphBuilder _bufferBasedBuilder;

        public ParagraphBuilderBenchmarks()
        {
            _data = Enumerable.Repeat(new string('x', 50), 1000).ToArray();

            Console.WriteLine($"Using {_data.Length} items to test.");

            _concatBasedBuilder = new ConcatenationBasedParagraphBuilder();
            _bufferBasedBuilder = new StringBuilderBasedPragraphBuilder();
        }

        [Benchmark]
        public string ParagraphBuilder_ConcatenationBased()
        {
            return _concatBasedBuilder.Create(_data);
        }
        
        [Benchmark]
        public string ParagraphBuilder_Buffered()
        {
            return _bufferBasedBuilder.Create(_data);
        }
    }
}

Our Program.cs has not changed much; I just added the invocation of BenchmarkRunner.Run<T>:

using System;

using BenchmarkDotNet.Running;

using ABC.PerfTests.Benchmarks;

namespace abc.perftests
{
    class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<ParagraphBuilderBenchmarks>();
        }
    }
}

Running the Benchmarks

As advised, we need to run the benchmarks without any debugger attached and not as part of a debugging session, and we should also make sure we target Release mode. With that said, we position ourselves at the project root in a command prompt and perform the following:

Build the Artifacts

We pass the -c option with the value Release to specify we want to target the Release configuration.

dotnet build -c Release

Publish Artifacts

This bundles everything together, that is, the target assembly and its dependencies:

dotnet publish -c Release

Run Benchmarks

Publishing drops all files into the directory bin/<configuration>/netcoreapp<runtime_version>/publish. To run the application we can change into that directory, or use the relative path:

cd tests/abc.perftests/bin/Release/netcoreapp2.0/publish
dotnet abc.perftests.dll

OR

dotnet tests/abc.perftests/bin/Release/netcoreapp2.0/publish/abc.perftests.dll

NOTE: Paths and publish output might vary depending on the OS and the Framework/Runtime targeted; make sure you check the output of every command.

Viewing and analyzing results

We will get a summary output in our console, which includes:

  • Platform and Runtime information
    • Runtime might vary if multiple Jobs were used to target different runtimes.
  • A row per method (or benchmark) will also appear, with different indicators.

The output is quite extensive; at the very bottom, though, we can focus our attention on the summary, which in our case is:

// * Summary *

BenchmarkDotNet=v0.10.13, OS=macOS 10.13.3 (17D102) [Darwin 17.4.0]
Intel Core i7-7660U CPU 2.50GHz (Kaby Lake), 1 CPU, 4 logical cores and 2 physical cores
.NET Core SDK=2.1.4
  [Host]     : .NET Core 2.0.5 (CoreCLR 4.6.0.0, CoreFX 4.6.26018.01), 64bit RyuJIT
  Job-HGGSTH : .NET Core 2.0.5 (CoreCLR 4.6.0.0, CoreFX 4.6.26018.01), 64bit RyuJIT

IterationTime=200.0000 ms  LaunchCount=1  TargetCount=10
WarmupCount=10

                              Method |         Mean |        Error |       StdDev |
------------------------------------ |-------------:|-------------:|-------------:|
 ParagraphBuilder_ConcatenationBased | 13,646.80 ms | 4,511.271 ms | 2,983.926 ms |
           ParagraphBuilder_Buffered |     31.41 ms |     4.408 ms |     2.916 ms |

// * Legends *
  Mean   : Arithmetic mean of all measurements
  Error  : Half of 99.9% confidence interval
  StdDev : Standard deviation of all measurements
  1 ms   : 1 Millisecond (0.001 sec)

As you can see, we get information about the runtimes; in this case we see the runtime information for the Host and the default job. We can also see the parameters we configured: LaunchCount, TargetCount, IterationTime and WarmupCount.

Also, various files are generated for us to analyse, from plain-text reports and logs to raw data in CSV and reports in Markdown. We can see the location of all these files in the output too, just before the summary, in the Export section:

// * Export *
  BenchmarkDotNet.Artifacts/results/ParagraphBuilderBenchmarks-report.csv
  BenchmarkDotNet.Artifacts/results/ParagraphBuilderBenchmarks-report-github.md
  BenchmarkDotNet.Artifacts/results/ParagraphBuilderBenchmarks-report.html

The folder BenchmarkDotNet.Artifacts is placed in the working directory where we ran the benchmarks, in our case the project root, where we have the following:

  • ParagraphBuilderBenchmarks.log, which is the plain-text benchmark log file.
  • A results folder with the summary seen before in different formats: HTML, GitHub-flavoured Markdown and raw data in a CSV file.


Conclusion

We've covered introductory aspects for developers, as well as important considerations when designing benchmarks in their most basic form. In a following post we'll introduce elements that enable more sophisticated benchmarks, for instance: further customisation through Configurations and other elements like Jobs, Parameters, Benchmark Switching, Run Strategies and Baselining, among others.

