WordPress Database Performance Showdown: MySQL vs MariaDB vs Percona

WordPress requires a MySQL-compatible database for its backend. It used to be that you could confidently choose MySQL and go on with life, but in 2019 the choice isn’t quite that simple. With MySQL, MariaDB, and Percona all being attractive options, how do you know which to choose?

Choosing a database isn’t always about performance, but for the sake of this article it will be 🙂

For this series of tests we tested database performance out of the box with no special tweaks. It is quite possible that a professional database administrator could make each database perform better, but most people hosting WordPress aren’t DBAs. No caching was enabled, because we wanted to test database performance, not cache performance.


What was tested?

The WordPress host machine was a DigitalOcean CPU Optimized droplet with 16 vCPUs (dedicated hyper-threads) and 32 GB of RAM. This monster of a machine was chosen so that we could be certain that Nginx + PHP-FPM weren’t the cause of any bottlenecks. A minor tweak to the PHP-FPM config allowed for full use of all 16 vCPUs.

The database host machines were DigitalOcean CPU Optimized droplets with 4 vCPUs (dedicated hyper-threads) and 8GB of RAM. CPU Optimized droplets were chosen because we didn’t want our tests to be at the mercy of shared CPU resources.

Each deployment was in DigitalOcean’s SFO2 region with the WordPress server communicating with the database over the internal private network. The traffic producing nodes were deployed in DigitalOcean’s NYC3 data center and communicated via the public internet.

The software versions are as follows:

  • Percona Server for MySQL 8.0.16-7
  • MariaDB 10.4.8-GA
  • MySQL 8.0.18

What tests were run?

Want to test your site’s performance? Sign up for Kernl WordPress Load Testing.

For each database that was tested, we ran a load test with the following parameters:

  • 500 concurrent users
  • 2 req/s ramp up
  • 30 minute duration

The goal here wasn’t to bring the database to its knees, but to see how it performed under sustained heavy load that stops short of making it fall over.

Left: WordPress (Nginx + PHP-FPM), Right: MySQL

MariaDB WordPress Performance

When Oracle moved to acquire MySQL (through its purchase of Sun Microsystems) in 2009, there was a lot of concern among the core developers that Oracle would eventually close off MySQL to the world (similar to Oracle’s business model). Before that could take place, a GPL fork of MySQL named MariaDB was created.

MariaDB is open source and in active development. But how does it stand up to our WordPress load tests? Let’s find out.

First we’ll take a look at the requests and failures per second.

~379 req/s with periodic errors

As you can see from the results above, performance scales up very well over time, eventually peaking at ~379 req/s. We see periodic database-related errors (the vCPUs on the database were almost completely saturated) but nothing too crazy. Next, let’s see how the response times look.

Median response time of ~190ms.

The median response time of WordPress when backed by MariaDB under heavy load stays remarkably consistent. You can see that the average is slowly creeping up by the end of the test, but nothing that would be noticeable by customers yet.

99% of requests finish in 500ms

The response time distribution is a little more interesting than the median response time. 90% of all requests finish in under 300ms and 99% of all requests finish in less than 500ms. Overall, the performance of MariaDB out of the box with no configuration is quite good.

Percona WordPress Performance

Another open-source fork of MySQL, Percona was started in 2006 and has been steadily delivering value-added features and enterprise support on top of MySQL for more than 12 years. With all that accumulated experience, how does Percona measure up?

~320 requests/s, ~3 errors/s

Overall the Percona WordPress performance was pretty good. Not quite as good as MariaDB, but nothing to scoff at. The one interesting thing was that the error rate was consistent across most of the test once the request volume passed ~200 requests/s. Let’s see if the response time graph adds to the story.

~450ms Median Response Time

The data here is actually pretty interesting. The response time for Percona was comparable to MariaDB right up until the time it started getting consistent errors. After that, it more than doubled and stayed that way for the rest of the test.

99% of requests finished in ~550ms

The response time distribution tells a similar story to the response time chart. The small difference between the 50th percentile and the 99th percentile shows that performance was very consistent across the entire test; it just wasn’t as good as MariaDB.

MySQL WordPress Performance

Last but not least (well….) in our test is MySQL. It’s been around since 1994 and is now owned by Oracle. Its latest releases have added a lot of great features, such as window functions and more JSON functionality, to compete with PostgreSQL. So let’s see how it did.

295 requests/s, 3-4 errors/s

MySQL had a similar trajectory to Percona: it did pretty well across the board, with a small but consistent series of errors once it went past 200 req/s. Also note that it never achieved higher than 300 req/s, which both MariaDB and Percona did.

500ms Median Response Time

The shape of the MySQL response time graph looks nearly identical to Percona, just about 50ms slower once the vCPUs started to get saturated. The detailed look in the response time distribution tells a similar story.

99% of requests finished in 875ms

The response time distribution is where you can see Percona and MySQL diverge a bit. At the 99th percentile MySQL is returning at 875ms, while Percona was more like 550ms. In general, the distribution matches what one would expect given how the response time graph looks.

Conclusions

The out-of-the-box numbers for MariaDB make it look like the clear winner here, and in the raw performance department it is, compared to both Percona and MySQL. This isn’t to say that you couldn’t tune Percona or MySQL to out-perform MariaDB, only that MariaDB gives you more performance with zero configuration changes.

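| Database | Req / s | Error / s | Med. Response Time | 99% Dist. |
|----------|---------|-----------|--------------------|-----------|
| MariaDB  | 379     | ~1        | 190ms              | 500ms     |
| Percona  | 320     | 3         | 450ms              | 550ms     |
| MySQL    | 295     | 3         | 500ms              | 875ms     |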
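
To put the table in relative terms, the short Python sketch below computes how much more throughput MariaDB delivered than the other two databases. The figures are copied from the table above; the dictionary and function names are only illustrative.

```python
# Peak requests per second, copied from the results table above.
PEAK_REQ_PER_SEC = {"MariaDB": 379, "Percona": 320, "MySQL": 295}


def percent_faster(winner: str, other: str) -> float:
    """Return how much higher winner's throughput is than other's, in percent."""
    return (PEAK_REQ_PER_SEC[winner] / PEAK_REQ_PER_SEC[other] - 1) * 100


for name in ("Percona", "MySQL"):
    print(f"MariaDB vs {name}: {percent_faster('MariaDB', name):.1f}% more req/s")
```

On these numbers MariaDB handled roughly 18% more requests per second than Percona and roughly 28% more than MySQL.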

WordPress Performance on DigitalOcean Managed MySQL

Want to performance test your own WordPress installation? Try Kernl’s WordPress Load Testing service.

The database is the most important part of any WordPress site. If you aren’t adept at managing MySQL, it can be a serious risk for you and your clients. One way to de-risk this portion of your WordPress site is by going with a managed MySQL offering. There are a ton of different options out there, but for this post we’re going to look at DigitalOcean’s Managed MySQL.


Why Use DigitalOcean Managed MySQL?

There are a lot of reasons to use Digital Ocean’s Managed MySQL (or some other offering) including:

  • Simple Setup – With a few simple clicks you can have a high quality MySQL cluster set up.
  • Horizontal Scalability – Site growing fast? You can spin up read-only nodes to help scale out read operations.
  • Automated Daily Backups – No need to set up your own backups. DigitalOcean takes care of it for you.
  • Automated Failover – If for some reason your primary database node fails, you will automatically fail over to your warm spare.
  • Security – MySQL best practices for security are automatically followed by DigitalOcean. In addition to that your database is isolated to your private network so that outside requests can’t access it by default.

Baseline Performance Test

To get things started let’s do a baseline performance test where the MySQL database is on the same box as the Nginx server. The server configuration is as follows:

  • 1GB RAM
  • 3 vCPU
  • Nginx
  • PHP-FPM 7.2
  • MariaDB 10.1
  • No caching

A test of 400 concurrent users was run using Kernl’s WordPress Load Testing service. Without caching the results are predictably bad, but that is to be expected.

Graph of requests per second for self-hosted MySQL with WordPress
Things fall apart at around ~70 req/sec

For completeness, let’s also take a look at the response time distribution.

Response time distribution for self hosted MySQL with WordPress
99% of requests finish in ~2 seconds

Given how many failing requests we had and zero caching, the ones that did manage to get through didn’t perform too poorly. 99% in 2 seconds or less.

DigitalOcean Managed MySQL

When setting up a managed MySQL database on DigitalOcean you get the option to select the underlying hardware that powers it. For this blog post we went with the minimum possible configuration (1GB RAM, 1vCPU).

Knowing that fewer resources were allocated, it was expected that performance would actually be a bit worse on the dedicated MySQL instance. These expectations were confirmed with the load test.

Graph of requests per second for DigitalOcean Managed MySQL with WordPress
Things go sideways at ~50 req/s.

All things considered, 50 req/s against a WordPress site with no caching isn’t all that bad. Once we hit that level, the database charts were showing 100% CPU saturation, and that’s when we started to see the error rate increase. Now let’s look at the response time distribution.

Response time distribution for managed MySQL with WordPress
99% of requests finish in less than 2.6 seconds

As expected performance is slightly worse here due to fewer resources on the dedicated MySQL machine, but not in a huge way.

Cost & Why Managed MySQL is Important

For our test we used the most basic configuration that DigitalOcean offers:

  • 1 GB RAM
  • 1 vCPU
  • 10 GB Hard Disk
  • No failover
  • Automatic backups

All of this and not having to manage MySQL for $15/month. Not too bad, but to really be getting your money’s worth (automatic failover) you need to spend more money.

  • 2 GB RAM
  • 1 vCPU
  • 25 GB Hard Disk
  • 1 Standby node
  • Automatic failover
  • Automatic backups

With automatic failover you start to reduce the risk to yourself and your customers a lot. However, this level of availability isn’t free 🙂 DigitalOcean doesn’t support failover on their cheapest node, so you have to upgrade to the next level ($30/month). After that, the failover node costs you an additional $20/month. Now we’re looking at $50/month for a managed MySQL cluster with automated failover.

If you run a WordPress based business on DigitalOcean, $50/month buys you a lot of peace of mind. It’s also easy to scale up as traffic increases and the cost is competitive in the landscape of managed MySQL. If you happen to be an excellent DBA though, you can definitely manage your own cluster at a reduced monetary cost, but increased time cost.

Conclusions

Managing your own database cluster is probably fine if cost is a problem and you are good at it. In general though, if you can afford it we recommend using a managed MySQL host so you can push the complexity of operating a highly available MySQL cluster on to someone else.


What’s New With Kernl – September 2019

I hope that everyone has had a great September! There have been some interesting changes with Kernl this month, so let’s dive in and discuss them.

Features

  • Credit Card Add Changes – For the past 3 years Kernl has had a simple credit card form (card number, expiration, etc). In an effort to help reduce fraud we’ve migrated our card addition system to use Stripe’s Checkout.js. This gives us advanced fraud protection provided by Stripe and the added benefit that your card details never hit Kernl’s servers.
  • Continuous Deployment Setup – Kernl is in the middle of a big UI change for continuous deployment setup. The first step was making the ‘connect your GitLab/GitHub/Bitbucket account’ piece a lot prettier. We’ve now made it a 3 panel selection, with nice big logos and fewer ways to get confused. Check it out by going to the “Continuous Deployment” section in Kernl.
  • Repository Pruning – Kernl will now prune repositories from your repository list that you no longer have access to or no longer have connected with Kernl. The exception here is that we’ll keep a reference to repositories that you have connected to plugins or themes.
  • Invoice Payment Failure Notifications – We recently had an issue where a customer wasn’t notified that their payment failed (card had expired) 🙁 This should be resolved now, as we’ve integrated with Stripe webhooks to catch the event immediately and send a message to the account owner. We’ve also added a notification preference so that you can silence these emails.

Other

  • All packages have been upgraded on all Kernl servers.
  • Marketing pages are now cached in Redis.
  • Fixed a customer reported bug in plugin_update_check.php that threw a warning in newer versions of PHP. Upgrade to version 1.2.2 to get the fix.
  • Feature flag setup wizard had weird behavior if the customer didn’t have any plugins or themes. You can now manually name your feature flag product in the wizard if you so choose.
  • Fixed a few bugs in the feature flag UI where adding/removing individual users from a flag would occasionally fail.
  • The buttons to manually manage deploy keys have been moved to the bottom of the continuous deployment page. They also come with a disclaimer now.

Blog Posts

Beta testing WordPress plugin features with Kernl WordPress Feature Flags

Beta testing WordPress plugin features with Kernl WordPress Feature Flags

Imagine that you are a WordPress plugin author. Maybe you work for an agency or maybe you work for yourself. The point is that you have clients or customers that have come to expect a high level of quality from your work. Great job!

Now imagine that you have been working on a complicated but much sought after feature for your plugin. Complicated means risk. Complicated means that your hard-earned reputation for high quality could be on the chopping block if you aren’t careful.

Don’t get stressed out like this guy.

So what do you do? You need to release the feature but you also want to limit the risk you take by doing so. You can do a few different things, but we’re here to talk about only one of them:

  • “Dog-food” the new feature for as long as you can.
  • Write unit and integration tests to help verify your code.
  • ⭐ Run a limited beta program with feature flags ⭐

Using Kernl Feature Flags to Manage a Beta Program

First of all, what is a feature flag anyway? A feature flag is a way of toggling sections of code on or off without doing a code deployment. At an extremely high level, it looks like this:

<?php
  if ($flagActive) {
     // enable feature
  }
?>

That’s it. In practice implementing feature flags is a bit more difficult, but it doesn’t have to be much more complicated. For instance, using Kernl’s feature flag library:

<?php
  $kff = new kernl\WPFeatureFlags($kernlFeatureFlagProductKey, $userIdentifier);
  if ($kff->active('MY_FLAG')) {
    // enable feature
  }
?>

The beauty here is that you can toggle this block of code on or off for individual users, all users, or a percentage of users without ever needing to deploy any more code.

So. Easy.

And just like that ‘jack.slingerland@gmail.com’ gained access to the beta. No code deploys. No complicated anything. Just search, click, save. And what did this look like for the end user?

Before

No Feature Flag Footer Bar

After

Feature Flag Footer Bar Beta Program!

Now this is obviously a contrived example, but it has all the building blocks you need to do far more complicated integrations.

A Beta Program

With the building blocks above you can see it isn’t hard to manage a beta program. Simply ship an update to your plugin wrapped in a feature flag. After that, add specific people to it as you see fit. As you get more feedback, continue to ship updates behind the flag. Once you are confident that the code looks good, you can remove the flag completely!

Now let’s dive in to the actual plugin code.

Tutorial

Adding feature flags to your plugin is a 3 step process.

  1. Create the product & flag in Kernl.
  2. Install the feature flag library via Composer
  3. Add the code to your plugin.

Step 1 is accomplished by signing up for Kernl and adding the product and flag. The product is just a container for all of your flags. That way your plugin can have a bunch of different flags in it but still be logically grouped together. For this case, let’s create a product called ‘Kernl Footer Flag Blog Post’ and a flag named ‘New Footer Beta’.

Step 2 is as simple as installing the Kernl WordPress feature flag library via Composer:

composer require kernl/wp-feature-flags

For step 3, you add some code to your plugin. Let’s take a look at it, followed by discussion of what it’s all doing.

<?php

require __DIR__ . '/vendor/autoload.php';
add_action('init', 'kernl_footer_flags_init');
function kernl_footer_flags_init() {
    add_action('wp_footer', 'beta_program_footer');
}

function beta_program_footer() {
    if (is_user_logged_in()) {
        $currentUser = wp_get_current_user();
        $userIdentifier = $currentUser->user_email;
    } else {
        $userIdentifier = 'Unauthenticated Users';
    }
    $kernlFeatureFlagProductKey = '5d835a2830cbb568728b9bd4';
    $cacheTimeInMinutes = 1;
    $defaultToActive = false;
    $kff = new kernl\WPFeatureFlags(
        $kernlFeatureFlagProductKey,
        $userIdentifier,
        $defaultToActive,
        $cacheTimeInMinutes
    );
    if ($kff->active('NEW_FOOTER_BETA')):
    ?>
        <style>
            .kernl-footer-flag-bar {
                background-color: #5d5d5d;
                font-size: 12px;
                text-align: right;
                color: white;
            }
        </style>
        <div class="kernl-footer-flag-bar">
            You are in the Kernl Footer Flags beta program (<?= esc_html($userIdentifier) ?>)
        </div>
    <?php endif;
}
?>

That’s the bulk of the plugin code (I excluded the Kernl updater for brevity). Let’s break it down.

Init Action and Composer Auto Load

require __DIR__ . '/vendor/autoload.php';
add_action('init', 'kernl_footer_flags_init');
function kernl_footer_flags_init() {
    add_action('wp_footer', 'beta_program_footer');
}

In this code we autoload our Composer dependency. After that we define an ‘init’ action to add our beta program footer. The reason we do this in the ‘init’ action is that if we do it too early we won’t be able to fetch the current user (which we use to create a unique identifier).

User Identifier Creation

function beta_program_footer() {
    if (is_user_logged_in()) {
        $currentUser = wp_get_current_user();
        $userIdentifier = $currentUser->user_email;
    } else {
        $userIdentifier = 'Unauthenticated Users';
    }

After we define our action we create the function that it calls. The first thing we want to do is create a user identifier. Kernl uses the user identifier to determine which feature flags the identified user should see. In our case, if the person isn’t logged in they get lumped into an ‘Unauthenticated Users’ bucket. If the person is logged in, we identify them by their email. It is by this identifier that you will enable/disable features when using the ‘individual’ type feature flag. If we were using the ‘enable for a percentage of users’ type feature flag, simply assigning each user a unique identifier (maybe a UUID) would suffice.
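
For the percentage-based flag type, the assignment logic lives on Kernl’s side and isn’t shown in this post. Purely as an illustration of the general idea (not Kernl’s actual implementation), a stable identifier can be hashed into a bucket from 0 to 99 so that the same user always gets the same answer. The sketch is in Python rather than PHP, and the function names are hypothetical.

```python
import hashlib


def bucket_for(flag_name: str, user_identifier: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    key = f"{flag_name}:{user_identifier}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def in_rollout(flag_name: str, user_identifier: str, percent_enabled: int) -> bool:
    """Return True when the user's bucket falls inside the enabled percentage."""
    return bucket_for(flag_name, user_identifier) < percent_enabled


# The same identifier always lands in the same bucket, so a user's
# experience does not flip between page loads.
print(in_rollout("NEW_FOOTER_BETA", "user-1234", 25))
```

This is why a unique, stable identifier per user matters for percentage rollouts: if every anonymous visitor shared one identifier, they would all fall into the same bucket.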

Instantiate the Kernl Feature Flag Library

$kernlFeatureFlagProductKey = '5d835a2830cbb568728b9bd4';
$cacheTimeInMinutes = 1;
$defaultToActive = false;
$kff = new kernl\WPFeatureFlags(
    $kernlFeatureFlagProductKey,
    $userIdentifier,
    $defaultToActive,
    $cacheTimeInMinutes
);

The $kernlFeatureFlagProductKey is generated by Kernl when you create your product. The $cacheTimeInMinutes variable configures how long the flags will be cached in WordPress. If this is set to ‘0’, the library will call Kernl every time the page loads, which in general you probably don’t want. And last but not least, $defaultToActive is a boolean variable. If true, the ‘active()’ function will return true if Kernl can’t find a flag.

Product Key

Active Check

if ($kff->active('NEW_FOOTER_BETA')):

The final piece is simply checking if this feature flag is active for this user.

Putting it all Together

If you want to see the source code for this plugin and/or install it yourself:

Conclusions

Kernl WordPress Feature Flags are an incredibly powerful tool for safely releasing your code into the wild. In a situation like WordPress plugins, where production is often not a machine that you are responsible for, being able to quickly toggle code on/off without a deploy is of paramount importance.

What’s New With Kernl – July & August 2019

I hope everyone has had a great summer (or winter if you are in the southern hemisphere)! Over the past 2 months we’ve gotten a lot of great stuff done, so let’s dive in.

Features & Infrastructure

  • Kernl has upgraded from Mongo 3.x to Mongo 4.2 with WiredTiger. We get improved performance and the latest features with this change.
  • Our Redis instance has moved to DigitalOcean along with the rest of our infrastructure. Prior to this we were using a managed host that lived outside the NYC3 data center. Response times decreased by ~50ms with this change.
  • The high traffic plugin and theme update check endpoints had a round of performance tuning done. Resource consumption was lowered in a meaningful way.
  • Kernl Analytics Active Plugins – Kernl Analytics will now track what plugins are most active across your install-base. You only need to be signed up for Kernl Analytics and use the latest plugin_update_check.php file to get this new feature.
  • Our MongoDB database has been moved to DigitalOcean NYC3. Prior to this we were hosting on Compose.io outside of the datacenter. Originally this decision was made because managing databases is tough, but the quality of hosting at Compose has gone down significantly in the past year. With this change we shaved ~150ms off of response times.
  • Some tweaks were made to our network firewalls to make them easier to manage. Thanks DigitalOcean!

Bug Fixes

  • Load testing machines would fail to provision if the API call to DigitalOcean failed. This has been resolved.
  • The load testing master node would fail to start sometimes if secondary nodes failed to connect. The threshold for starting tests has been lowered so that this won’t happen anymore.
  • If a credit card expired and the invoice payment failed, the account wasn’t marked as paid when a new card was added and a successful payment happened. This has been fixed.
  • When switching between themes/plugins in Kernl Analytics the domain data wasn’t reloading with the new plugin/theme.
  • Thanks to a customer bug report and code snippet, the plugin_update_check.php no longer sends headers before the license check fails.

Blog Posts

That’s it for July and August!

Load Testing Vultr’s New High Frequency Servers with WordPress

Run your own WordPress Load Tests with Kernl!

Back in June Vultr announced the general availability of their new “High Frequency” servers. Reading through the announcement I was intrigued by their claim of using “3+GHZ processors and blazing fast NVMe storage!” and immediately wondered what WordPress performance would look like versus their regular Cloud Compute offering.

What was tested?

To get a better idea of the performance characteristics of the Vultr High Frequency Compute servers, I ran several types of load tests with all Vultr servers sitting in their Silicon Valley data center:

  • 200 concurrent users from New York City (Digital Ocean NYC3)
  • 2000 concurrent users from New York City (Digital Ocean NYC3) & London (Digital Ocean LON1)

And I tested the following scenarios:

All servers with the exception of the “performance tuned” ones were built using the pre-built WordPress image that Vultr offers with no caching or performance tuning done. The “performance tuned” servers were created using Nginx, PHP-FPM, MariaDB, Memcached, and W3 Total Cache.

Results

Starting with the Vultr High Frequency boxes, let’s break down performance.

$6 32GB NVMe

The $6 Vultr High Frequency (HF) machine performed well against the roughly equivalent $5 Vultr Cloud Compute (CC) instance. If we take a look at the requests/failures per second charts, you can see that the HF machine out-performed the CC instance by quite a lot.

$6 32GB NVMe Vultr High Frequency Requests/Failures
$5 25GB SSD Vultr Cloud Compute Requests/Failures

As you can see the HF server was able to handle roughly twice as many requests per second as the CC server without any errors. Let’s check out the average and median response times as well as the response time distribution.

$6 32GB NVMe Vultr High Frequency Response Time
$6 32GB NVMe Vultr High Frequency Response Time Distribution

You can see that as the test progressed response times got steadily worse until leveling out at around 2.5s. The response time distribution shows that 99% of requests finished in 3s or under, with the 100th percentile outlier coming in at just under 6s. This is honestly pretty good performance for no tuning at all.

$5 25GB SSD Vultr Cloud Compute Response Time
$5 25GB SSD Vultr Cloud Compute Response Time Distribution

The response times for the $5 cloud compute instance were about 2 seconds worse than the high frequency instance, with the average coming in at around 4.5s. The response time distribution was worse from a performance perspective, but better from a consistency perspective, with the spread between the 50th percentile and the 100th percentile only being ~1.3s.

At a glance, it looks like the Vultr High Frequency server out-performs the cloud compute in a significant way for only $1 extra. However, our next tests show that it might not be so simple.

$24 128GB NVMe (Performance Tuned)

The next set of tests that were performed were 2000 concurrent users on a performance tuned setup. The traffic originated from a cluster of servers in both New York and London, with the host server being in Vultr’s Silicon Valley data center.

First, let’s take a look at the requests per second handled by the $24 High Frequency server and the $20 Cloud Compute server.

$24 128GB NVMe Vultr High Frequency Requests

Now, 2000 concurrent users is a lot. I think that the HF machine did quite well overall, but I confess that I did expect a bit more out of it. It topped out at around 1250 requests per second, with errors hovering in the ~175 requests per second range. If left to run for another 30 minutes I think it would have reached closer to 1350 requests per second.

$20 80GB SSD Vultr Cloud Compute Requests

This is where things start to get interesting. The Vultr marketing team bills the high frequency machines as head and shoulders above the regular cloud compute machines. Maybe they are for some workloads. But for this test the HF machine doesn’t seem much better at all. Looking at both graphs you can see that the HF machine topped out a bit higher, and had fewer errors, but not enough to warrant the extra cash (in my opinion). Let’s see what sort of story the response times tell.

$24 128GB NVMe Vultr High Frequency Response Time
$24 128GB NVMe Vultr High Frequency Response Time Distribution

On the performance tuned box you can see that the requests that did complete successfully were all quite fast. The average response time was between 100ms and 200ms throughout the entire test. The response time distribution was solid as well, with 99% of requests finishing in under 500ms. Not bad for 2000 concurrent users.

Now let’s take a look at how the comparable cloud compute server did.

$20 80GB SSD Vultr Cloud Compute Response Time
$20 80GB SSD Vultr Cloud Compute Response Time Distribution

For the Vultr Cloud Compute instance response times also hovered between 100ms and 200ms, although just a hair higher than the high frequency server. The response time distribution was excellent as well, with 99% of requests coming in under 500ms. The main difference is that there was a 100th percentile outlier here that the high frequency server didn’t have.

Thoughts & Conclusions

From what I can tell, the Vultr High Frequency servers are better than the Cloud Compute servers, but maybe only for certain types of workloads. You can see in the $5/$6 test that they easily out-performed the cloud compute instances, but in the more expensive $20/$24 test the results weren’t so cut and dry.

If we look at it from a requests per dollar standpoint, the cloud compute instance is actually a better value for this type of workload.

Requests per Dollar
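
The chart’s underlying numbers aren’t reproduced in the text, but the comparison is simple division. The Python sketch below uses the one throughput figure quoted above (~1250 req/s for the $24 High Frequency server) and computes the throughput the $20 Cloud Compute server would need to match it per dollar. The Cloud Compute throughput itself is not stated in this post, so it is left as an input rather than guessed; the function names are illustrative.

```python
def requests_per_dollar(peak_req_per_sec: float, monthly_price_usd: float) -> float:
    """Peak requests per second delivered per dollar of monthly cost."""
    return peak_req_per_sec / monthly_price_usd


def break_even_throughput(reference_rps: float, reference_price: float,
                          other_price: float) -> float:
    """Throughput the other server needs to equal the reference's req/s per dollar."""
    return requests_per_dollar(reference_rps, reference_price) * other_price


hf_value = requests_per_dollar(1250, 24)         # High Frequency, figures from this post
cc_needed = break_even_throughput(1250, 24, 20)  # Cloud Compute at $20/month
print(f"High Frequency: {hf_value:.1f} req/s per dollar")
print(f"Cloud Compute wins on value above {cc_needed:.0f} req/s")
```

Any Cloud Compute result above roughly 1,042 req/s would make it the better value per dollar, so it can top out noticeably lower than the High Frequency server and still come out ahead on this measure.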

I suggest running extensive load tests against the Vultr High Frequency machines before making any effort to switch over. They might be better for you. They might perform the same for more money.

In conclusion: ¯\_(ツ)_/¯


Digital Ocean 30 Hour WordPress Load Test for Reliability and Consistency

Perform your own 30 hour load tests with Kernl!

Over the past 5 months I’ve been writing a lot of different articles testing WordPress performance when under heavy load. One of the comments that I often receive is “Yes, but how reliable is the host over time?”. To determine that answer I made some changes to Kernl that would allow customers to do long duration tests against their providers with a steady load. Given my affinity for Digital Ocean, I figured that would be a great first host to test.

What Was Tested & How I Tested It

Digital Ocean has several data centers across the globe and I figured that I should test each of these data centers to see how reliable they were. For this test I ran a single load test against each of the following data centers for 30 hours with 25 concurrent users:

  • New York City (NYC3)
  • Toronto (TOR1)
  • Bangalore (BLR1)
  • Frankfurt (FRA1)
  • London (LON1)
  • Singapore (SGP1)
  • Amsterdam (AMS3)

All requests were made from the Digital Ocean data center in San Francisco (SFO2). The target of each load test was a simple $5 / month droplet with the WordPress image from the Digital Ocean marketplace installed on it.

Results

The table below summarizes the results for all of the long duration load tests. Click the region to see more details of the load test.

Example Load Test Results Page
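| Region | Requests | Failures | Failure % | Req/s Avg |
|--------|----------|----------|-----------|-----------|
| NYC3   | 2.49M    | 139      | 0.005%    | 23        |
| TOR1   | 2.46M    | 126      | 0.005%    | 22/23     |
| BLR1   | 2.17M    | 115      | 0.005%    | 20        |
| FRA1   | 2.28M    | 1405     | 0.06%     | 21        |
| LON1   | 2.33M    | 1335     | 0.05%     | 21        |
| SGP1   | 2.29M    | 98       | 0.004%    | 21        |
| AMS3   | 2.30M    | 228      | 0.009%    | 21        |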

As you can see from the results above Digital Ocean’s reliability is excellent across the entire testing period. Even the data centers with the highest error rate (Frankfurt, London) had an incredibly small error rate. I’m not going to add the response time distribution results here because they were uniformly excellent.
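
As a quick cross-check of the table, the failure percentages can be recomputed from the request and failure counts. This is a small Python sketch; the request totals are the rounded values the table reports (for example 2.49M), so the results are approximate.

```python
# (requests, failures) per region, copied from the table above.
# Request totals are the rounded values the table reports (e.g. 2.49M).
RESULTS = {
    "NYC3": (2_490_000, 139),
    "TOR1": (2_460_000, 126),
    "BLR1": (2_170_000, 115),
    "FRA1": (2_280_000, 1405),
    "LON1": (2_330_000, 1335),
    "SGP1": (2_290_000, 98),
    "AMS3": (2_300_000, 228),
}


def failure_percent(requests: int, failures: int) -> float:
    """Failures as a percentage of all requests."""
    return failures / requests * 100


for region, (requests, failures) in RESULTS.items():
    print(f"{region}: {failure_percent(requests, failures):.4f}% failed")

# Sanity check on volume: 23 req/s sustained for 30 hours.
expected_requests = 23 * 30 * 60 * 60
print(f"23 req/s for 30 hours = {expected_requests:,} requests")
```

The recomputed percentages line up with the Failure % column (which appears to be truncated rather than rounded), and 23 req/s over 30 hours comes to about 2.48M requests, close to the 2.49M reported for NYC3.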

Anomalies

  • The Bangalore (BLR1) test averaged only 20 req / s. Given the geographic distance I expected response times to go up, but I also expected throughput to stay similar.
  • The Toronto (TOR1) load test averaged 22 req / s for 28 hours, then jumped up to 23 req / s for the last two hours of the test. Maybe a noisy neighbor went quiet?
  • Digital Ocean’s Frankfurt (FRA1) and London (LON1) data centers had an order of magnitude more errors than the other data centers I tested.
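
To quantify the “order of magnitude” observation, the sketch below compares the Frankfurt and London failure counts with the median of the other five regions. It is a rough Python illustration using the failure counts from the results table; choosing the median as the baseline is my own assumption, not something taken from the test reports.

```python
import statistics

# Failure counts per region, copied from the results table.
FAILURES = {"NYC3": 139, "TOR1": 126, "BLR1": 115, "FRA1": 1405,
            "LON1": 1335, "SGP1": 98, "AMS3": 228}

outliers = ("FRA1", "LON1")
# Median failure count of the regions that are not flagged as outliers.
baseline = statistics.median(
    count for region, count in FAILURES.items() if region not in outliers
)

for region in outliers:
    ratio = FAILURES[region] / baseline
    print(f"{region}: {ratio:.1f}x the median of the other regions")
```

Both regions come out at roughly ten to eleven times the median failure count of the rest, even though the absolute error rates are still tiny.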

Conclusions

As a whole, Digital Ocean performed very well in all of their data centers with a moderate amount of sustained traffic to a WordPress instance. In the future I would like to try running all of these tests twice with a different origin for each test run. It’s also worth noting that I haven’t done this type of test on any other platform yet. I hope that as I test more providers I’ll find out whether or not Digital Ocean performed as well as I think it did.

Perform your own 30 hour load tests with Kernl!

Scaleway WordPress Performance Review

Test your own WordPress performance using Kernl’s WordPress Load Testing Service!

Scaleway is a European cloud service that provides an easy to use cloud infrastructure for a reasonable price. If you’ve ever used Digital Ocean they feel a lot like that but with fewer features. For this WordPress performance review I load tested 5 different server configurations (DEV1-S, DEV1-XL, GP1-XS, GP1-S, GP1-M) with caching both enabled and disabled using Kernl’s WordPress load testing service.

Server Configuration

As with most of my load tests I followed a very simple LEMP setup guide that left me with the following software versions:

  • Nginx 1.14
  • PHP FPM 7.2
  • MariaDB 10.1
  • Ubuntu 18.04 LTS

Configurations were mostly default for all of these, with the exception of Nginx, where I bumped up the file upload max size. Each server was in the Scaleway Paris data center and load was generated from Digital Ocean’s Amsterdam data center.

Tests & Test Data

I performed 3 types of tests on Scaleway:

  • Small scale performance (cached & un-cached) – A 200 concurrent user test for 30 minutes. I did this test for DEV1-S, DEV1-XL, GP1-XS, GP1-S, GP1-M machine types.
  • Large scale performance (cached) – A 1000 concurrent user test for 45 minutes. I did this test only on a GP1-XS machine.
  • Reliability (un-cached) – A low volume long duration test. 25 concurrent users for 6 hours repeated a total of 3 times. This test was used to see what reliability under moderate load was like over the course of a day.

As with most of my cloud provider tests I used this blog’s content as my data source. This means that this test skews extremely read heavy.

Best Value

The results for this battery of tests were interesting.

Cost in Euro (ignore the $ in the graph)

The clear winner without cache enabled was the DEV1-S instance, which was nearly 5x cheaper than the closest competitor. But what does that actually mean? It doesn’t mean that DEV1-S is better than GP1-L, only that it is right-sized for this type of workload. What if we look at the data in another format?

In this bubble chart, the x axis is cost in euros and the y axis is max sustained requests. The cluster in the upper-left corner of this chart is the best value for this type of workload. There isn’t a right answer here, but you can select GP1-S if performance is more important than cost, or DEV1-XL if performance is important but not quite as important as cost. It is worth noting that if we increased the volume of requests we would likely see this graph shift in dramatic ways.

To see the results that drove this graph, scroll all the way to the bottom of this page and you’ll see an image gallery that has all the raw data.

Reliability

The reliability test was performed on a GP1-XS instance in three 6 hour increments over the course of 1 day. It was a low-volume test (25 concurrent users), but enough load to keep the box busy and to test how reliable Scaleway is over an 18 hour period. Over the course of 18 hours I sent 1.5 million requests to the GP1-XS instance.
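As a rough sanity check on the 1.5 million figure, the sketch below multiplies a steady request rate by the test duration. It assumes the roughly 23 req/s reported for these runs held for the full 18 hours, which slightly overstates things since one run dipped to 22 req/s for a while. The function name is hypothetical.

```python
def expected_requests(req_per_sec: float, hours: float) -> float:
    """Total requests served at a constant rate over the given duration."""
    seconds = hours * 3600
    return req_per_sec * seconds


# 23 req/s sustained across three 6 hour runs (18 hours total)
total = expected_requests(23, 18)
print(f"{total:,.0f} requests")  # 1,490,400 requests
```

That lands just under 1.5 million, so the reported total is consistent with the throughput graphs.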

23 req/s

As you can see the machine stayed consistently at 23 req/s over the course of 6 hours. The response time distribution was good as well.

98ms @ 99th percentile

99% of requests finished in 98ms or less. Solid performance over a 6 hour period.
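If the percentile language is unfamiliar, here is a small sketch of one common way to compute it, the nearest-rank method. I don't know which method Kernl actually uses, so treat this as an assumption. The sample response times are made up for illustration and are not data from this test.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds (illustrative only)
response_times_ms = [42, 45, 47, 51, 55, 58, 63, 70, 98, 240]

print(percentile(response_times_ms, 50))  # 55
print(percentile(response_times_ms, 90))  # 98
print(percentile(response_times_ms, 99))  # 240
```

Note how a single slow request (240ms) only shows up at the very top of the distribution, which is why the 99th percentile is a useful number to watch in a long test.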

For the 2nd 6 hour test things were mostly the same with 1 minor change.

23, 22, 23

You can see that for a 2.5 hour period we had some (minor) performance degradation. Why? Maybe a noisy neighbor? While not a huge deal here, in some workloads losing ~5% of your capacity could be quite problematic.

Even though our capacity was slightly lowered, our response time distribution at 99% remained consistent at 98ms.

Our 3rd 6 hour test was similar to the first, except this time with a ~30% reduction in the 99th percentile response time (98ms to 69ms).

Extremely consistent performance

As you can see from the graph above, performance remained the same for the entire 6 hour period.

SO. FAST.

As stated above, the 99th percentile response time for this 6 hour period was ~30% lower. I believe that this is due to the time at which this test was run. It was started at ~3PM EDT, which is roughly 9PM in Paris, so most of this test happened when internet traffic in the region wasn’t very high.
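The percentages quoted in this section (the ~5% capacity dip and the ~30% response time improvement) can be reproduced from the numbers in the text. The helper name below is hypothetical.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage by which a value dropped from `before` to `after`."""
    return (before - after) / before * 100


# Throughput dip in the 2nd run: 23 req/s down to 22 req/s
print(f"capacity lost: {pct_reduction(23, 22):.1f}%")  # capacity lost: 4.3%

# 99th percentile response time in the 3rd run: 98ms down to 69ms
print(f"p99 improvement: {pct_reduction(98, 69):.1f}%")  # p99 improvement: 29.6%
```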

High Volume Performance

The final test that I ran was against a GP1-XS instance, for 45 minutes, with 1000 concurrent users. The WordPress install was using caching. Results were fairly good!

~900 requests / second

While the GP1-XS instance was having a little trouble keeping up, it still served the vast majority of the requests without error and managed to do so at 900 requests/s. The response time distribution was equally impressive.

99% @ 190ms

Conclusions

Scaleway seems like a reasonable host if you need one in Europe, although their pricing seems expensive relative to other companies (Hetzner, Digital Ocean). I was a little concerned with the consistency of performance in the long-term test, but I don’t have any data for other providers to compare it against.

I’m not sure what the difference between Scaleway DEV and GP instances is besides price (maybe a better SLA on GP instances?), but the DEV instances seem like a much better value.

In short: Check out Scaleway if you need WordPress hosting in Europe.


Vultr Cloud Compute -vs- Dedicated -vs- Bare Metal WordPress Performance

Load test your own WordPress site with Kernl! Getting started is free!

In the world of cloud computing there are a lot of different options to choose from. Normally you only need to choose how big your instance will be (2 vCPUs or 4, 2GB RAM or 6), but some cloud compute providers are upping their game and providing an even wider array of options and instance types for you to choose from.

Vultr has 3 different types of compute instances:

  • Cloud Compute – You get your own virtual server, but it is sharing hardware resources with lots of friends. Noisy neighbors can definitely be a problem.
  • Dedicated – Dedicated servers, but virtualized. I think it is still possible to run into noisy neighbor problems in this situation.
  • Bare Metal – Dedicated servers and hardware. No hypervisor and no noisy neighbors taking up your resources.

In this article we’re going to see how a very basic WordPress install performs on the different types of Vultr compute instances. We’ll do so using Kernl’s WordPress Load Testing service.

The Test

As per usual with Kernl load tests I imported this blog’s content into each load testing environment. The load test skews extremely read heavy. If you have a site that is write heavy or a mix you may see different results.

Each test was performed for 1 hour with 2000 concurrent users generating load from London and New York to Vultr’s data center in New Jersey.

Configuration

For this test I used Vultr’s pre-built WordPress image with no caching. A lot of readers might say “But you can get much better performance using X or Y!”, and they would be right! But I’m not testing Apache vs Nginx performance, or W3 Total Cache vs WP Rocket, I’m testing Vultr hardware under load in a real world scenario. I simply want to know at the end of this article if Vultr Cloud Compute, Dedicated, or Bare Metal is better for WordPress hosting.

Test 1: Vultr Cloud Compute $10 / Month

The first test I performed was against the $10 per month Vultr Cloud Compute offering. As expected of a $10/month VPS, performance wasn’t awesome, but it also wasn’t terrible.

All the red of red land

As you can see, there were lots of failed requests and the server only maintained a throughput of 16 req/s. That’s not unexpected with a single core and 1 GB of RAM. After all, I was throwing 2000 concurrent users at the server. The response time distribution was similarly bad.

Bad, but could be a lot worse.

Overall, the results for the $10 VPS were as expected. This isn’t really an apples to apples comparison (we’ll get to that later), but I wanted to give you an idea of what basic VPS instance performance looks like.

Test 2: Vultr Cloud Compute $80 / Month

With this test we’re starting to get closer to the cost of bare metal and dedicated instances. This server had 6 CPUs and 16GB of RAM. Considerably more robust than the $10 server.

Lots of red, but also blue!!!

This graph tells a much different story than the previous test. Performance peaked at 169 req/s and then leveled off at 100 req/s. We still saw a lot of errors, but once again this isn’t unexpected. Honestly if you started to get this much traffic you would likely start breaking up WordPress into its components (file system, PHP + Nginx, MySQL) and start scaling horizontally.

Much Lower Response Time Distribution

The response time distribution was much better for this server as well. The upper end was just as bad as the cheaper box, but the 90% and below ranges were pretty solid for the amount of traffic that was being received.

Test 3: Vultr Bare Metal $120 / Month

The Vultr Bare Metal server was the instance I was most excited about testing. I’ve always had a soft spot for hardware and getting access to a bare metal server is pretty cool. For $120 per month (on sale, price will rise to $300/month eventually) you get 8 CPUs and 32GB of RAM. This is a pretty serious server.

Oooh, 200 req/s.

Lots of blue on this graph but also the expected amount of red. You can see that throwing 2 more non-virtual CPUs and 2X the RAM made a pretty big difference. We peaked at 200 req/s and then leveled out at 125 req/s. For reference, the 200 req/s peak works out to roughly 17.2 million requests per day.

🙁

The lower end of the response time distribution was solid, but the upper end wasn’t great at all. With all of those errors it isn’t surprising that this is the case.

Test 4: Vultr Dedicated $120 / Month

I honestly had a tough time figuring out why Vultr priced the bare metal and dedicated instances so close to each other. Dedicated is clearly inferior (far fewer CPUs and much less RAM) so why would anyone choose it? Anyway, let’s take a look at the graph.

?????

This test peaked at 100 req/s and then leveled off at around 70. I really would expect a lot better performance for this sort of money.

Also ?, but not as much ?.

Response time distribution was similar to the other boxes. With all the failures it tends to skew pretty hard in the wrong direction. I’m sure that there is a use case for these dedicated Vultr instances, but it definitely isn’t hosting a WordPress site.

Conclusions

With all of this data it was pretty easy to graph which of these is the best value.

Value was calculated by taking the cost per month and dividing it by the maximum number of requests. Based on the performance we saw above the Vultr Cloud Compute instances seem like your best value for WordPress hosting. For WordPress hosting it looks like Vultr Bare Metal and Dedicated instances aren’t a great choice. As mentioned above, there are likely use cases where they are a good choice though (maybe workloads that require very consistent performance).
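Here is a minimal sketch of that calculation. I'm assuming "maximum number of requests" means the peak req/s quoted for each test earlier in this article, so the inputs behind the graph may differ from these, and the ordering this sketch produces doesn't have to match the graph exactly. The dictionary and function names are made up for illustration.

```python
# (monthly cost in USD, peak req/s) as quoted in the test sections of this article.
# Assumption: peak req/s stands in for "maximum number of requests".
INSTANCES = {
    "Cloud Compute $10": (10, 16),
    "Cloud Compute $80": (80, 169),
    "Bare Metal $120": (120, 200),
    "Dedicated $120": (120, 100),
}


def cost_per_req_per_sec(monthly_cost: float, max_req_per_sec: float) -> float:
    """Dollars per month paid for each request/second of peak throughput.
    Lower is better value."""
    return monthly_cost / max_req_per_sec


for name, (cost, rps) in INSTANCES.items():
    print(f"{name}: ${cost_per_req_per_sec(cost, rps):.2f} per req/s")
```

With this metric a lower number means better value, so it rewards instances that are cheap relative to the throughput they can sustain.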

As with all of these tests, your mileage may vary! I highly recommend that you run load tests on any new host that you use to get an idea of what sort of performance you can expect.


W3 Total Cache Performance Review

Test your own site’s performance with Kernl WordPress Load Testing!

In the world of WordPress performance, you can’t go far without talking about caching plugins. I’ve personally used several different caching plugins throughout my time as a WordPress developer but have never really taken the time to see how the plugins perform under pressure. Until now.

Over the next couple of months I’ll be releasing a series of blog posts detailing the performance characteristics of different WordPress caching plugins. This DOES NOT include server caching plugins (LiteSpeed, Varnish, Nginx FastCGI, etc.) because those are in a league of their own and it wouldn’t really be an “apples to apples” comparison.

The Setup

I decided to try and run this comparison as if I were looking for a cheap VPS provider and hoping to get a lot of performance out of it. Since I’m a huge Digital Ocean fanboy, I went ahead and used the pre-built WordPress image in their marketplace and dropped it on a $5/month droplet. This leaves me with a typical LAMP setup with mostly default configurations for everything. I also wanted to test the configuration with Memcached so that was installed as well.

The Tests

In order to fully test W3 Total Cache I ran the following tests with 200 concurrent users out of Digital Ocean’s NYC3 data center. The server under test was located in Amsterdam.

  • Baseline Test – No caching at all. We just need to see what performance we can get without caching enabled.
  • Fragment Cache Only (Memcached)
  • Database Cache Only (Memcached)
  • Page Cache Only (Memcached)
  • All Caches Enabled (Memcached)
  • All Caches Enabled (OPCode APC)
  • All Caches Enabled (Disk Enhanced[page] & Disk)

It is important to note that W3 Total Cache will also do HTML, CSS, and JS minification. I’m not particularly interested in that, but do know that it can compress these outputs with a variety of different tools and may decrease overall load time.

High Level Results

W3 Total Cache is a solid caching plugin that gives you a lot of different options for getting good performance out of WordPress.

Test                      Max req/s   Response Time (90%)
Baseline                  19          2000ms
Fragment Cache Only       19          2100ms
Database Cache Only       20          2100ms
Page Cache Only           60          1100ms
All Caches (Memcached)    131         110ms
All Caches (OPCode APC)   60          1800ms
All Caches (Disk)         133         90ms
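To put the table in perspective, this short sketch expresses each configuration's max req/s as a multiple of the baseline. The numbers come straight from the table. The names are hypothetical.

```python
# Max req/s per test, copied from the results table
MAX_REQ_PER_SEC = {
    "Baseline": 19,
    "Fragment Cache Only": 19,
    "Database Cache Only": 20,
    "Page Cache Only": 60,
    "All Caches (Memcached)": 131,
    "All Caches (OPCode APC)": 60,
    "All Caches (Disk)": 133,
}


def throughput_multiple(test_name: str) -> float:
    """Max req/s for a test divided by the uncached baseline's max req/s."""
    return MAX_REQ_PER_SEC[test_name] / MAX_REQ_PER_SEC["Baseline"]


for test_name in MAX_REQ_PER_SEC:
    print(f"{test_name}: {throughput_multiple(test_name):.1f}x baseline")
```

Page caching alone is worth about 3x, and the Memcached and disk configurations land at roughly 7x the uncached throughput.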

Baseline Test

As expected, the baseline load test didn’t perform very well. It is no secret that WordPress doesn’t perform well under load without caching plugins, but we needed a baseline anyway.

Lots of red.

We peaked at around 19 req/s and then started to trail off towards 5 req/s as the error rate increased. Looking at the response time distribution, you can see that things didn’t look much better.

I mean it could be worse. At least people could access your site in under 5 seconds, but most people bail in 1 second or less.

Fragment Cache Only

The fragment cache in W3 Total Cache is supposed to “reduce time for common operations”. I’m not sure what that means but by itself it didn’t seem to help me out very much.

With fragment caching enabled we still see a similar request and failure profile as when we didn’t have any caching enabled at all.

Response time distribution was similar to no caching. My guess is that fragment caching optimizes a very narrow set of operations and that my blog doesn’t really use any of them.

Database Cache Only

I expected to see some pretty good performance with the database cache enabled. Presumably this cached queries to MySQL, which should have really helped us scale. It didn’t 🙁

You can see that we were just slightly better than with the fragment cache, but not even close to where I thought we would be.

The response time distribution was bad too. It actually is a bit higher than the other ones. I assume that this was just a fluke and that if I ran this test several times it would be the same as the others.

Page Cache Only

With full page caching enabled we start to see some good performance improvements.

60 * 60 * 60 * 24 == 5.1 million per day

The graph above shows that we hit about 60 req/s before the wheels start to fall off, but even then we’re able to continue serving requests at a respectable 50-ish req/s.

The response time distribution looks a lot better here as well. 90% of requests finished in 1.1 seconds. This isn’t bad at all considering you’re going at > 50 req/s.

All Caches Enabled (Memcached)

With all caches enabled we start to see some pretty impressive performance out of W3 Total Cache.

Zoom Zoom

130 req/s is solid, but you’ll notice there is a steady stream of errors coming across as well. In general I would say that this setting is good for preventing your site from going down, but that you should look at increasing resources ASAP.

Response time distribution is where we start to see some really excellent gains. The pages were served almost immediately with the 99th percentile being 220ms. Much of that time can be attributed to network latency, because the servers are so far apart.

All Caches Enabled (OPCode APC)

I expected the results of this test to be similar to Memcached because they’re both in-memory caches, but that wasn’t even close to true.

What a strange looking chart.

We initially peaked at 60 req/s and then started to see a large uptick in errors. Requests eventually started to climb back up, but the error rate stayed consistent.

I was pretty disappointed in the response time distribution. I expected it to be similar to Memcached but it looks like OPCode APC just isn’t as efficient. 90th percentile at 1.8s isn’t terrible but it isn’t good either.

All Caches Enabled (Disk)

Disk-based caching is very likely the type you would be using if you were in a shared hosting environment. The good news is that the disk-based caching that W3 Total Cache does is quite good.

133 req/s with a fairly small error rate is great! At this level you could easily handle 99% of the traffic scenarios you’ll see as a WordPress developer.

The response time distribution was similar to the Memcached version of this test in that 99% of requests finished in under 170ms. Not bad considering WordPress was fielding > 130 requests per second.

Conclusions

W3 Total Cache is an excellent choice for caching. While some of its configuration can be confusing, you can still get very acceptable performance by enabling page caching or all of the caching that’s available. It’s also worth noting that you could very likely get excellent performance with no errors if you tweaked some Apache and MySQL settings in addition to using W3 Total Cache.
