Hello everyone! Kernl got some pretty interesting updates this month, so let’s dive in!
Average & Median Response Times for Load Tests – Kernl’s WordPress load testing services has always had this data available, we just never surfaced it in a way that was easy to consume. You’ll now see a new tab when you run a load test with information about average and median response times during the course of your test.
Progress Bar for Load Test Initialization – When you create a load test you will now see a progress bar that indicates how far in the process of instantiating your infrastructure we are. Prior to this update it was easy to think that the process had stalled out or broke.
Bug Fixes & Other
Load test site ownership verification would fail to certain types of HTML minification. This has been resolve.
Our internal analytics services has been refactors to be a singleton. This reduced each app server’s memory footprint by 5MB.
The public license validate endpoint now has a 10 second cache on it. This helps us deal with any sort of burst traffic from license validations.
All packages have been update on our servers.
Thats it for June! If you have any questions reach out to firstname.lastname@example.org.
There was a lots of work on Kernl’s WordPress Load Testing this month, so lets dive in and learn about it!
Features & Bug Fixes
Reliability Load Testing – Have you ever wondered how a WordPress host performs over an extended period of time? Kernl’s new reliability testing allows you to answer that question. You can now run low volume (25 user) load tests for up to 30 hours.
Download Full Load Test Data – You are now able to download all the data from your load test. This is the full, non-sampled, data set that Kernl produces during a load testing run.
Sampled Load Test Data – Kernl’s WordPress load testing service can create a lot of data, so for data sets over a certain size we now sample the data. This helps all of charts load faster and improve your experience.
Over the past 5 months I’ve been writing a lot of different articles testing WordPress performance when under heavy load. One of the comments that I often receive is “Yes, but how reliable is the host over time?”. To determine that answer I made some changes to Kernl that would allow customers to do long duration tests against their providers with a steady load. Given my affinity for Digital Ocean, I figured that would be a great first host to test.
What Was Tested & How I Tested It
Digital Ocean has several data centers across the globe and I figured that I should test each of these data centers to see how reliable they were. For this test I ran a single load test against the following data centers for 30 hours with a 25 concurrent users:
New York City (NYC3)
All requests were made from the Digital Ocean data center in San Francisco (SFO2). The target of each load test was a simple $5 / month droplet with the WordPress image from the Digital Ocean marketplace installed on it.
The table below summarizes the results for all of the long duration load tests. Click the region to see more details of the load test.
As you can see from the results above Digital Ocean’s reliability is excellent across the entire testing period. Even the data centers with the highest error rate (Frankfurt, London) had an incredibly small error rate. I’m not going to add the response time distribution results here because they were uniformly excellent.
The Bangalore (BLR1) test averaged only 20 req / s. Even though the geographic distance is far, I expected the response times to go up but to have a similar throughput.
The Toronto (TOR1) load test averaged 22 req / s for 28 hours, then jumped up to 23 req / s for that last two hours of the test. Maybe a noisy neighbor went quiet?
Digital Ocean’s Frankfurt (FRA1) and London (LON1) data centers had an order of magnitude more errors than the other data centers I tested.
As a whole, Digital Ocean performed very well in all of their data centers with a moderate amount of sustained traffic to a WordPress instance. In the future I would like to try running all of these tests twice with a different origin for each test run. It’s also worth noting that I haven’t done this type of test on any other platform yet. I hope that as I test more providers I’ll find out whether or not Digital Ocean performed as well as I think it did.
For quite awhile now Kernl has had the ability to throw some serious load at your WordPress site, but never a great way to share the results. Today that changes with the introduction of load test result sharing!
Why would I share my load test results?
The main use case for sharing your load test results is showing your clients that their new WordPress site can handle the traffic load that they expect. For someone like a YouTube or Twitter celebrity this can be a very real problem. Larger organizations also would enjoy this peace of mind.
How do I get started?
Easy! Just click the share button next to the load test that you want to share. You’ll be presented with your sharing URL that can be copy and pasted into Facebook, Twitter, Slack, or Email!
Scaleway is a European cloud service that provides an easy to use cloud infrastructure for a reasonable price. If you’ve ever used Digital Ocean they feel a lot like that but with fewer features. For this WordPress performance review I load tested 5 different server configurations (DEV1-S, DEV1-XL, GP1-XS, GP1-S, GP1-M) with caching both enabled and disabled using Kernl’s WordPress load testing service.
As with most of my load tests I followed a very simple LEMP setup guide that left me with the following software versions:
PHP FPM 7.2
Ubuntu 18.04 LTS
Configurations were mostly default for all of these, with the exception Nginx where I bumped up the file upload max size. Each server was in the Scaleway Paris data center and load was generated from Digital Ocean’s Amsterdam data center.
Tests & Test Data
I performed 3 types of tests on Scaleway:
Small scale performance (cached & un-cached) – A 200 concurrent user test for 30 minutes. I did this test for DEV1-S, DEV1-XL, GP1-XS, GP1-S, GP1-M machine types.
Large scale performance (cached) – A 1000 concurrent user test for 45 minutes. I did this test only on a GP1-XS machine.
Reliability (un-cached) – A low volume long duration test. 25 concurrent users for 6 hours repeated a total of 3 times. This test was used to see what reliability under moderate load was like over the course of a day.
As with most of my cloud provider tests I used this blog’s content as my data source. This means that this test skews extremely read heavy.
The results for this battery of tests were interesting.
The clear winner without cache enabled was the DEV1-S instance, which was nearly 5x cheaper than the closest competitor. But what does that actually mean? It doesn’t mean that DEV1-S is better than GP1-L, only that it is right-sized for this type of workload. What if we look at the data in another format?
In this bubble chart, the x axis is cost in euros and the y axis is max sustained requests. The cluster in the upper-left corner of this chart is the best value for this type of workload. There isn’t a right answer here, but you can select GP1-S if performance is more important than cost, or DEV1-XL if performance is important but not quite as important as cost. It is worth noting that if increased the volume of requests we would likely see this graph shift in dramatic ways.
To see the results that drove this graph, scroll all the way to bottom of this page and you’ll see an image gallery that has all the raw data.
The reliability test was performed on a GP1-XS instance in three 6 hour increments over the course of 1 day. It was a low-volume test (25 concurrent users), but enough load to keep the box busy and to test how reliable Scaleway is over an 18 hour period. Over the course of 18 hours I sent 1.5 million requests to the GP1-XS instance.
As you can see the machine stayed consistently at 23 req/s over the course of 6 hours. The response time distribution was good as well.
99% of requests finished in 98ms or less. Solid performance over a 6 hour period.
For the 2nd 6 hour test things were mostly the same with 1 minor change.
You can see that for a 2.5 hour period we had some (minor) performance degradation. Why? Maybe a noisy neighbor? While not a huge deal here, in some workloads losing ~5% of your capacity could be quite problematic.
Even though our capacity was slightly lowered, our response time distribution at 99% remained consistent at 98ms.
Our 3rd 6 hour test was similar to the first, except this time with a ~30% reduction in the 99th percentile response time (98ms to 69ms).
As you can see from the graph above that for our entire 6 hour period performance remained the same.
As stated above, the response time for this 6 hour period was 30% faster. I believe that this is due to the time at which this test was run. It was started at ~3PM EDT which is roughly ~10PM in Paris, so most of this test happened when internet traffic in the region wasn’t very high.
High Volume Performance
The final test that I ran was against a GP1-XS instance, for 45 minutes, with 1000 concurrent users. The WordPress install was using caching. Results were fairly good!
While the GP1-XS instance was having a little trouble keeping up, it still served the vast majority of the requests without error and managed to do so at 900 requests/s. The response time distribution was equally impressive.
Scaleway seems like a reasonable host if you need one in Europe, although their pricing seems expensive relative to other companies (Hetzner, Digital Ocean). I was a little concerned with the consistency of performance in the long-term test, but I don’t have any data for other providers to compare it against.
I’m not sure what the difference between Scaleway DEV and GP instances is besides price (maybe a better SLA on GP instances?), but the DEV instances seem like a much better value.
In short: Check out Scaleway if you need WordPress hosting in Europe.
Load test your own WordPress site with Kernl! Getting started is free!
In the world of cloud computing there are a lot of different options to choose from. Normally you only need to choose how big your instance will be (2 vCPUs or 4, 2GB RAM or 6), but some cloud compute providers are upping their game and providing an even wider array of options and instance types for you to choose from.
Cloud Compute – You get your own virtual server, but it is sharing hardware resources with lots of friends. Noisy neighbors can definitely be a problem.
Dedicated – Dedicated servers, but virtualized. I (think) it is possible to run in to noisy neighbor problems in this situation.
Bare Metal – Dedicated servers and hardware. No hypervisor and no noisy neighbors taking up your resources.
In this article we’re going to see how a very basic WordPress install performs on the different types of Vultr compute instances. We’ll do so using Kernl’s WordPress Load Testing service.
As per usual with Kernlloadtests I imported this blog’s content into each load testing environment. The load test skews extremely read heavy. If you have a site that is write heavy or a mix you may see different results.
Each test was performed for 1 hour with 2000 concurrent users generating load from London and New York to Vultr’s data center in New Jersey.
For this test I used Vultr’s pre-built WordPress image with no caching. A lot of readers might say “But you can get much better performance using X or Y!”, and they would be right! But I’m not testing Apache vs Nginx performance, or W3 Total Cache vs WP Rocket, I’m testing Vultr hardware under load in a real world scenario. I simply want to know at the end of this article if Vultr Cloud Compute, Dedicated, or Bare Metal is better for WordPress hosting.
Test 1: Vultr Cloud Compute $10 / Month
The first test I performed was against the $10 per month Vultr Cloud Compute offering. As expected of a $10/month VPS performance wasn’t awesome, but it also wasn’t terrible.
As you can see, lots of failed requests and only maintaining throughput of 16 req/s. Not unexpected with a single core and 1 GB of RAM. After all, I was throwing 2000 concurrent requests per second at the server. The response time distribution was similarly bad.
Overall, the results for the $10 VPS were as expected. This isn’t really an apples to apples comparison (we’ll get to that later), but I wanted to give you an idea of what basic VPS instance performance looks like.
Test 2: Vultr Cloud Compute $80 / Month
With this test we’re starting to get closer to the cost of bare metal and dedicated instances. This server had 6 CPUs and 16GB of RAM. Considerably more robust than the $10 server.
This graph tells a much different story than the previous test. Performance peaked at 169 req/s and then leveled off at 100 req/s. We still saw a lot of errors, but once again this isn’t unexpected. Honestly if you started to get this much traffic you would likely start breaking up WordPress into its components (file system, PHP + Nginx, MySQL) and start scaling horizontally.
The response time distribution was much better for this server as well. The upper end was just as bad as the cheaper box, but the 90% and below ranges were pretty solid for the amount of traffic that was being received.
Test 3: Vultr Bare Metal $120 / Month
The Vultr Bare Metal server was the instance I was most excited about testing. I’ve always had a soft spot for hardware and getting access to a bare metal server is pretty cool. For $120 per month (on sale, price will rise to $300/month eventually) you get 8 CPUs and 32GB of RAM. This is a pretty serious server.
Lots of blue on this graph but also the expected amount of red. You can see that throwing 2 more non-virtual CPUs and 2X the RAM made a pretty big difference. We peaked at 200 req/s and then leveled out at 125 req/s. For reference that is 17.2 million requests per day.
The lower end of the response time distribution was solid, but the upper end wasn’t great at all. With all of those errors it isn’t surprising that this is the case.
Test 4: Vultr Dedicated $120 / Month
I honestly had a tough time figuring out why Vultr priced the bare metal and dedicated instances so close to each other. Dedicated is clearly inferior (far fewer CPUs and RAM) so why would anyone choose it? Anyway, let’s take a look at the graph.
This test peaked at 100 req/s and then leveled off at around 70. I really would expect a lot better performance for this sort of money.
Response time distribution was similar to the other boxes. With all the failures it tends to skew pretty hard in the wrong direction. I’m sure that there is a use case for these dedicated Vultr instances, but it definitely isn’t hosting a WordPress site.
With all of this data it was pretty easy to graph which of these is the best value.
Value was calculated by taking the cost per month and dividing it by the maximum number of requests. Based on the performance we saw above the Vultr Cloud Compute instances seem like your best value for WordPress hosting. For WordPress hosting it looks like Vultr Bare Metal and Dedicated instances aren’t a great choice. As mentioned above, there are likely use cases where they are a good choice though (maybe workloads that require very consistent performance).
As with all of these tests, your mileage may vary! I highly recommend that you run load tests on any new host that you use to get an idea of what sort of performance you can expect.
Load test your own WordPress site with Kernl! Getting started is free!
March was a great month for Kernl! We did some blogging, a bunch of infrastructure work, and a little bit of unplanned work due to API deprecation at BitBucket.
Features & Updates
Resource Starvation (Load Testing) – We’ve decreased the number of users per machine that Kernl WordPress Load Testing uses. This helps prevent resource starvation on the load generation servers.
Pre-Configured Load Tests – Kernl now has 4 different pre-configured load tests to make testing your site’s performance even easier!
BitBucket API Update – BitBucket API calls now use Oauth2 and their v2 endpoints. The v1 endpoints are going away in April.
Analytics – All of the features available in Kernl Analytics Agency plan have been rolled into one single plan at the same $10/month price point as the small plan. We also increased data retention to 365 days to make our comparison tool more useful.
Load Test Working Indicator – Load tests will now show a “working” indicator when in a state before the load test has started but after you have submitted your request to start a load test.
MongoDb Driver – The MongoDB driver that we use has been upgraded for better connection retry handling.
Updates – Kernl Analytics and Kernl WordPress Load Testing have been upgraded to Node.js 10.15.3 and have had all of their packages updated.
License Management Widget – The license management widget was throwing an error when no error was present. This has been resolved.
In the world of WordPress performance, you can’t go far without talking about caching plugins. I’ve personally used several different caching plugins throughout my time as a WordPress developer but have never really took the time to see how the plugins perform under pressure. Until now.
Over the next couple of months I’ll be releasing a series of blog posts detailing the performance characteristics of different WordPress caching plugins. This DOES NOT include server caching plugins (LiteSpeed, Varnish, Nginx Fast CGI, etc) because those are in a league of their own and it wouldn’t really be an “apples to apples” comparison.
I decided to try and run this comparison as if I were looking for a cheap VPS provider and hoping to get a lot of performance out of it. Since I’m a huge Digital Ocean fanboy, I went ahead and used the pre-built WordPress image in their marketplace and dropped it on a $5/month droplet. This leaves me with a typical LAMP setup with mostly default configurations for everything. I also wanted to test the configuration with Memcachedso that was installed as well.
In order to fully test W3 Total Cache I ran the following tests with 200 concurrent users out of Digital Ocean’s NYC3 data center. The server under test was located in Amsterdam.
Baseline Test – No caching at all. We just need to see what performance we can get without caching enabled.
Fragment Cache Only (Memcached)
Database Cache Only (Memcached)
Page Cache Only (Memcached)
All Caches Enabled (Memcached)
All Caches Enabled (OPCode APC)
All Caches Enabled (Disk Enhanced[page] & Disk)
It is important to note that W3 Total Cache will also do HTML, CSS, and JS minification. I’m not particularly interested in that, but do know that it can compress these outputs with a variety of different tools and may decrease overall load time.
High Level Results
W3 Total Cache is a solid caching plugin that gives you a lot of different options for getting good performance out of WordPress.
Response Time (90%)
Fragment Cache Only
Database Cache Only
Page Cache Only
All Caches (Memcached)
All Caches (OPCode APC)
All Caches (Disk)
As expected the baseline load test didn’t perform the greatest. It is no secret that WordPress doesn’t perform well under load without caching plugins, but we needed a baseline anyways.
We peaked at around 19 req/s and then started to trail off towards 5 req/s as the error rate increased. Looking at the response time distribution, you can see that things didn’t look much better.
I mean it could be worse. At least people could access your site in under 5 seconds, but most people bail in 1 second or less.
Fragment Cache Only
The fragment cache in W3 Total Cache is supposed to “reduce time for common operations”. I’m not sure what that means but by itself it didn’t seem to help me out very much.
With fragment caching enabled we still see a similar request and failure profile as when we didn’t have any caching enabled at all.
Response time distribution was similar to no caching. My guess is that fragment caching optimizes a very narrow set of operations and that my blog doesn’t really use any of them.
Database Cache Only
I expected to see some pretty good performance with the database cache enabled. Presumably this cached queries to MySQL with should have really helped us scale. It didn’t 🙁
You can see that we were just slightly better than with the fragment cache, but not even close to where I thought we would be.
The response time distribution was bad too. It actually is a bit higher than the other ones. I assume that this was just a fluke and that if I ran this test several times it would be the same as the others.
Page Cache Only
With full page caching enabled we start to see some good performance improvements.
The graph above shows that we hit about 60 req/s before the wheels start to fall off, but even then we’re able to continue serving requests at a respectable 50-ish req/s.
The response time distribution looks a lot better here as well. 90% of requests finished in 1.1 seconds. This isn’t bad at all considering you’re going at > 50 req/s.
All Caches Enabled (Memcached)
With all caches enabled we start to see some pretty impressive performance out of W3 Total Cache.
130 req/s is solid, but you’ll notice there is a steady stream of errors coming across as well. In general I would say that this setting is good to prevent to your site from going down but that you should look to increasing resources ASAP.
Response time distribution is where we start to see some really excellent gains. The pages were served almost immediately with the 99th percentile being 220ms. Much of that time can be attributed to latency due because the servers are so far apart.
All Caches Enabled (OPCode APC)
I expected the results of this test to be similar to Memcached because they’re both in-memory caches, but that wasn’t even close to true.
We initially peaked at 60 req/s and then started to see a large uptick in errors. Requests eventually started to climb back up, but the error rate stayed consistent.
I was pretty disappointed in the response time distribution. I expected it to be similar to Memcached but it looks like OPCode APC just isn’t as efficient. 90th percentile at 1.8s isn’t terrible but it isn’t good either.
All Caches Enabled (Disk)
Disk-based caching is very likely the type you would be using if you were in a shared hosting environment. The good news is that the disk-based caching that W3 Total Cache does is quite good.
133 req/s with a fairly small error rate is great! At this level you could easily handle 99% of the traffic scenarios you’ll see as a WordPress developer.
The response time distribution was similar to the Memcached version of this test in that 99% of requests finished in under 170ms. Not bad considering WordPress was fielding > 130 requests per second.
W3 Total Cache is an excellent choice for caching. While some of its configuration can be confusing, you can still get very acceptable performance by enabling page caching or all caching thats available. Its also worth noting that you could very likely get excellent performance with no errors if you tweaked some Apache and MySQL settings in addition to using W3 Total Cache.