Part 2 – Performance testing Microsoft Exchange using Jetstress and Load Generator

by Clive Williams on 05/04/2016 10:00


In the previous blog post in this series I outlined the performance risks with Microsoft Exchange. One way to mitigate performance risks when changing Exchange implementations is to run a performance test.  While this may not be the only way to mitigate performance risks, it’s something well worth considering. In this post I discuss the key steps for performance testing Microsoft Exchange.

1. Determine Exchange workload

One of the main issues in testing or diagnosing Exchange performance is getting details of the workload.  Without a good estimate of what users are doing and in what volume, it is difficult to mitigate performance risks. It is also important to know what the user behaviour currently is and how changes may alter this.

For example, in one upgrade, user behaviour was categorised in terms of average emails per day.  Even after applying some factors to estimate peak loads, the actual peak rate at the start of the working day was found to be much higher. Also some user groups were found to have even more particular behaviours; a small group of administrators needed to review all the emails received for executives in a very short time before summarising them for the start of the executives’ day.  This created a short but very high transient load for critical business decision makers.

It is necessary to get accurate estimates of workload for planning a good architecture and design for Exchange. The workload needs to recognise peaks versus average. It needs to take into account the behaviour of workgroups as this will influence design considerations such as mailbox databases, throttling policies, and Exchange roles that can impact performance.
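To illustrate why average figures alone can mislead, here is a minimal Python sketch that compares the peak hour with the daily average for one workgroup. The hourly counts and the workgroup are invented for illustration; in practice the numbers would come from whatever workload data you have collected.

```python
from statistics import mean

def peak_to_average(hourly_counts):
    """Return (average, peak, peak/average ratio) for hourly message counts."""
    if not hourly_counts:
        raise ValueError("need at least one hourly sample")
    avg = mean(hourly_counts)
    peak = max(hourly_counts)
    ratio = peak / avg if avg else float("inf")
    return avg, peak, ratio

# Hypothetical counts per hour for a small admin group that reviews
# executive mail in a burst at the start of the working day.
admin_group = [900, 150, 120, 100, 90, 110, 100, 80, 60]

avg, peak, ratio = peak_to_average(admin_group)
print(f"average={avg:.0f}/h peak={peak}/h peak-to-average={ratio:.1f}")
```

A design sized on the average here would see several times that load in the first hour, which is the kind of gap described in the upgrade example above.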

Even with a good measure of current workload, future changes need to be considered. For example, a number of workload changes can come from increased mobilisation. How will users change their behaviour? Will they read emails in concentrated intervals or, conversely, spread the load throughout the working day?

There is a choice of techniques for getting details of the workload. If you have access to the current set-up, collect details about the workload from perfmon counters. Refer to the Microsoft documentation for your current version of Exchange to see which counters describe the workload. The data from these counters may need to be interpreted in light of architectural changes between Exchange versions.
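As a rough sketch of the analysis step, the Python below summarises a perfmon log that has been exported to CSV, giving the mean and peak of each counter. It assumes the usual export layout (timestamp in the first column, one column per counter). The server name and counter path in the sample are placeholders, not real Exchange counter names; take the real ones from the Microsoft documentation for your version.

```python
import csv
import io
from statistics import mean

def summarise_counters(csv_text):
    """Return {counter_name: (mean, peak)} from perfmon-style CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = {name: [] for name in header[1:]}  # column 0 is the timestamp
    for row in reader:
        for name, value in zip(header[1:], row[1:]):
            try:
                samples[name].append(float(value))
            except ValueError:
                continue  # blank cell: no sample was recorded for this interval
    return {name: (mean(vals), max(vals)) for name, vals in samples.items() if vals}

# Placeholder data: the counter path below is hypothetical.
SAMPLE = r'''"(PDH-CSV 4.0) (GMT Standard Time)(0)","\\EXCH01\Example Object\Messages Delivered/sec"
"04/05/2016 08:55:00.000"," "
"04/05/2016 09:00:00.000","42.0"
"04/05/2016 09:05:00.000","18.0"
'''

summary = summarise_counters(SAMPLE)
for counter, (avg, peak) in summary.items():
    print(f"{counter}: mean={avg:.1f} peak={peak:.1f}")
```

Reporting the peak alongside the mean keeps the peak-versus-average distinction from section 1 visible in the collected data.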

There are tools available from Microsoft and other sources that can help you obtain similar data about workload. These tools are usually specific to Exchange versions, so I’ll talk about these later in the post.  Be careful though as some of these tools only collect data about part of the overall Exchange traffic so you may need to use more than one tool.

2. Performance test changes to Exchange

How can you go about performance testing your Exchange environment prior to go-live to reduce risk?

Use Microsoft Exchange specific tools for performance testing

In my experience, it is complex to script all of the transaction types supported by Exchange (email, calendar, address book, synchronisation etc.) when using conventional performance and load testing tools. I have had more success with Exchange-specific tools, including Jetstress and Load Generator, so my focus in this section is on using this set of tools.

Understand estimated system capacity requirements

Before looking at any testing, it’s worth checking how your design team has estimated the system capacity requirements for the planned Exchange platform. Microsoft provides posts on Exchange Sizing and Configuration Recommendations. Here, Microsoft recommends using an estimating spreadsheet (one version is the Exchange Server Role Requirements Calculator). This should be completed as a prerequisite for any Exchange implementation. For performance risk investigation and test design, I’ve found it useful to use the spreadsheet with the completed design to:

  • Vary some of the assumptions in the design to see the impact on delivered load (sensitivity analysis)
  • Get the calculated disk IOPS from the design
  • Look at how the workload has been defined in the design.
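The sensitivity analysis in the first bullet can be sketched in a few lines of Python. The formula here (mailboxes × IOPS per mailbox × headroom) is a deliberately simplified stand-in that I am assuming for illustration; it is not the calculation the Microsoft spreadsheet performs, and the mailbox count and per-mailbox figure are invented.

```python
def estimated_iops(mailboxes, iops_per_mailbox, headroom=0.2):
    """Simplified, assumed model: total IOPS with a fractional headroom added."""
    return mailboxes * iops_per_mailbox * (1 + headroom)

def sensitivity(mailboxes, base_iops_per_mailbox, factors):
    """Scale the per-mailbox assumption by each factor and report the change."""
    base = estimated_iops(mailboxes, base_iops_per_mailbox)
    rows = []
    for factor in factors:
        estimate = estimated_iops(mailboxes, base_iops_per_mailbox * factor)
        rows.append((factor, estimate, estimate / base - 1))
    return rows

# Hypothetical design: 5,000 mailboxes at 0.1 IOPS each.
rows = sensitivity(5000, 0.1, factors=[0.8, 1.0, 1.5, 2.0])
for factor, estimate, change in rows:
    print(f"profile x{factor}: {estimate:.0f} IOPS ({change:+.0%} vs design)")
```

The point is less the numbers than the habit: vary one assumption at a time and see how far the delivered load moves before the design runs out of headroom.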

Performance test the Exchange disk subsystem using Jetstress

Once you have the design, I’d consider how to test the implementation.

Microsoft indicates that the most critical component of a modern Exchange implementation is the disk subsystem, particularly if this is implemented via a SAN. Microsoft provides a free tool, Jetstress, to assist with testing load on the disk subsystem (see Microsoft Exchange Server Jetstress 2013 Tool for the tool and Disk performance Testing with Jetstress 2013 for information about its use). This tool provides a simple way to test the proposed Exchange disk system, which Microsoft highlights as a key area to get right. Where you’re using a SAN, this testing is essential to getting the disk subsystem set up correctly for Exchange.

The tool works well and is relatively simple to set up. It provides good summary results and it is simple to vary the load so different scenarios can be tested.

There are some considerations about using Jetstress that you need to keep in mind:

  • You need the majority of your planned, full sized Exchange environment to execute the tests. In particular the Mailbox servers and SAN need to be production like.
  • The software running on the Exchange servers is Jetstress plus some Exchange components. You are not testing the servers themselves, only the disk subsystem. The servers will need to be rebuilt after testing.
  • Populating the disk subsystem prior to testing takes an appreciable amount of time. The documentation gives estimates but think of days rather than hours!
  • Where to build the disk subsystem is a tricky consideration. The ideal place is in the same SAN as production but you could be potentially subjecting the SAN to load which could risk the rest of production. If you use a separate SAN, you need to consider the effect of not having the rest of the load the SAN needs to support from other applications.
  • Tests are recommended to be run over extended periods. I would choose to run short verification tests first but you need to plan for 8 hours minimum for tests. This has an impact on the number of tests you can run and the cycle time for execution, analysis, fix and re-test. Also be aware of any potential conflicts in your testing environment – back-ups, re-use and re-configuration of the environment will impact the success of your testing. Also make sure the design of the test environment is suitable for such testing. In a recent example, daily housekeeping processes impacted the testing as long running processes got automatically tidied up!
  • The summary results are good but the tool also produces lots of other useful metrics in the form of counters. Analysing these is highly recommended as it provides valuable insight into disk subsystem behaviour, especially if running over long periods. You are going to need a suitable tool to analyse these.
  • You do not have to worry about creating lots of individual user accounts. The tool simulates the whole load by using a system account, so no large load is placed on any Active Directory server you may have. This is both a good and bad thing. The good is that it simplifies set-up; the bad is that your Active Directory environment doesn’t get tested.
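As an example of looking beyond the summary results, the Python sketch below reviews a series of latency samples taken from the counters. The thresholds are assumptions for illustration (check the Jetstress documentation for the pass criteria that apply to your version), and the samples are invented.

```python
from statistics import mean

# Assumed limits in milliseconds; confirm against the Jetstress documentation.
ASSUMED_LIMITS_MS = {"db_read_avg": 20.0, "log_write_avg": 10.0}

def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]

def review_latency(samples_ms, limit_ms):
    """Return the average, 95th percentile and whether the average is within the limit."""
    avg = mean(samples_ms)
    return {
        "avg": avg,
        "p95": percentile(samples_ms, 95),
        "within_limit": avg <= limit_ms,
    }

# Hypothetical database read latencies (ms) with a burst, for example
# while a back-up or housekeeping job ran during the test.
db_read = [12, 14, 13, 15, 12, 38, 45, 13, 14, 12]

result = review_latency(db_read, ASSUMED_LIMITS_MS["db_read_avg"])
print(result)
```

In this made-up run the average stays inside the assumed limit while the 95th percentile exposes the burst, which is why the detailed counters are worth analysing on long tests.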

Performance test the Microsoft Exchange implementation using Exchange Load Generator

Microsoft provides a comprehensive test tool, Exchange Load Generator (see Exchange Load Generator 2013 for the tool and documentation about this tool). This allows a realistic simulated load to be sent to the whole planned Exchange implementation.

This tool is also free, but it is not for the faint-hearted! While I have used it to get very useful results, there are a number of things to bear in mind:

  • If you don’t have a good idea of your Exchange workload (see above), you could end up with the old problem of ‘garbage in, garbage out’ after spending time and cost doing the testing. This was true of one engagement I was involved in. The new system worked very well for the planned workload but in production, the actual workload was significantly different – lots of additional peaks and a different traffic mix during the day!
  • You need to have a suitable full-sized environment to test. This environment needs to be isolated from production. This means not only having the Exchange servers but also suitable ancillary components such as Active Directory and DHCP servers to facilitate whatever number of users you support. Getting this set-up can be resource consuming. You also need the environment to be stable over long periods while the set-up and testing take place – not always achievable in some test environments.
  • You will need to create a suitable number of test accounts. Scripts are provided but test user set-up can be problematic in some situations.  Also be careful of the effect of Group Policy Object (GPO) updates – I’ve found that these can alter the way users need to be set up to operate with the tool.
  • The tool is released ‘as-is’ by Microsoft with no support! Any issues need to be solved by yourself or by researching on the internet.
  • It takes a long time to set up. Before starting any tests, you need to build and populate mailboxes, which can take a fair while (again think days rather than hours). Then Exchange will start to build search indices on the Exchange databases which, as I have noted previously, takes time and consumes resources. The tool also tries to build and populate the mailboxes as quickly as possible and you can easily hit threshold issues (see the next post for notes about thresholds in later versions of Exchange).
  • The way you set up different user profiles in the tool is not fully documented and you may need some trial and error to achieve your target workloads.
  • Again there are lots of additional metrics provided by means of counters. You need a good tool to analyse these. Some of the metrics provided describe the delivered workload, so it is essential these are analysed to see what workload was actually achieved.
  • The tool does not fully test the disk subsystem as it does not deliver some I/O load. Use Jetstress (as above) for this.
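Checking the delivered workload against the target can be as simple as the Python sketch below. The task names, rates and the 10% tolerance are all hypothetical; the achieved figures would come from the counters the tool produces, and the targets from your workload estimate.

```python
def workload_deviation(target_per_hour, achieved_per_hour, tolerance=0.10):
    """Return {task: (fractional deviation, within tolerance?)} for each target task."""
    report = {}
    for task, target in target_per_hour.items():
        if target <= 0:
            raise ValueError(f"target for {task!r} must be positive")
        achieved = achieved_per_hour.get(task, 0)
        deviation = (achieved - target) / target
        report[task] = (deviation, abs(deviation) <= tolerance)
    return report

# Hypothetical target and achieved rates (operations per hour).
target = {"send_mail": 1000, "calendar_update": 200}
achieved = {"send_mail": 950, "calendar_update": 120}

report = workload_deviation(target, achieved)
for task, (deviation, ok) in report.items():
    status = "ok" if ok else "off target"
    print(f"{task}: {deviation:+.0%} ({status})")
```

A test that passes its response-time goals while delivering well under the target load for one task type has not really tested that part of the workload.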

3. Performance issues when migrating Exchange users

You are going to hit performance issues relating to migrating users both during performance testing and during actual migration.

When you migrate users to a new version of Exchange, there are special performance issues you need to be aware of that won’t reappear post migration:

  • Mailbox build – existing messages being migrated will cause mailboxes to be built. This takes time – you may want to understand the time it takes by undertaking a trial migration. Ensure the mailbox build has completed before running any performance testing.
  • Index build – as new messages are added to the mailboxes, newer versions of Exchange will automatically try to index them. This indexing can take appreciable time and can be seen to continue well after the completion of mailbox migration. This will consume system resources. In normal operation (after migration), index build reduces in intensity and you would expect it to run continuously (you may want to test this!). The service that runs indexing can be stopped temporarily in peak times and restarted when load decreases. You may want to consider this during migration.
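A trial migration gives you a throughput figure you can scale up. The Python sketch below does that arithmetic; it assumes throughput stays roughly constant as volume grows, which may not hold once indexing and throttling come into play, and all the figures are invented.

```python
def estimate_migration_hours(trial_gb, trial_hours, total_gb):
    """Scale a trial migration linearly to the full mailbox volume.

    Assumes constant throughput, which is a simplification.
    """
    if trial_gb <= 0 or trial_hours <= 0:
        raise ValueError("trial size and duration must be positive")
    throughput_gb_per_hour = trial_gb / trial_hours
    return total_gb / throughput_gb_per_hour

# Hypothetical trial: 50 GB of mailboxes migrated in 4 hours; 2,000 GB in total.
hours = estimate_migration_hours(trial_gb=50, trial_hours=4, total_gb=2000)
print(f"estimated mailbox build time: {hours:.0f} hours ({hours / 24:.1f} days)")
```

Treat the result as a lower bound for planning, since index build is likely to continue after the mailbox build itself has finished.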

The final part of this blog post series covers another performance risk mitigation approach – production monitoring of Microsoft Exchange. If you’re migrating users in phases, this may be the recommended strategy to take.

