Load testing is fun. Breaking things is fun. Breaking WordPress with load testing is even more fun. But in the era of highly scaleable WordPress hosting solutions, can we even break WordPress anymore? Oh yes, yes we can. The Crucible Challenge can.
The Crucible WordPress Performance challenge is a deceptively simple test inspired by the poor ops teams that have to handle traffic from Super Bowl advertisements. Given a WordPress site with consistent content and URL mappings:
Handle 50,000 (@ 500 per second ramp up) concurrent authenticated users for 2 hours with load test generators in New York, London, Amsterdam, Singapore, Bangalore, San Francisco, Toronto and Frankfurt.
Have an error rate below 0.1%.
Average response time should be below 800ms.
Median response time should be below 700ms.
99th percentile response time should be below 800ms.
Half way through the test, you must flush your cache.
January was a pretty great month for Kernl. We got a lot of bug fixes, some performance improvements, and even a new beta feature out the door. Let’s dive in!
WordPress Site Health Beta
This month we released a beta of our WordPress Site Health service. The goal of this service is to help you determine where performance problems are and why they are happening. The dashboard below gives you a high level view of how performance looks on each of your sites.
Once you click in to any site you see data for the last 7 days.
The data here helps you diagnose performance issues. Using the plugin changes panel you can tie performance issues back to adding/removing/updating of plugins. Google Lighthouse scores show trends on you site’s performance, usability, SEO and more over time.
Improved performance of plugin/theme list pages – The API calls to /api/v1/plugins and /api/v1/themes were incredibly inefficient. They hadn’t really been touched since Kernl’s inception and needed some love. After doing some query optimization and stripping out unnecessary data, the payload and response time were reduced by a factor of 10. The worst case example was response size going from ~750KB to ~75KB and response time from ~4000ms to ~250ms.
Server resource increases – Kernl’s main Node.js application servers have been running with 1 vCPU and 1GB of RAM for about the last 2 years. Lately we’ve seen some resource exhaustion and decided it was time to upgrade. The new app servers now run with 2 vCPUs and 2GB of RAM.
Server disk space / inode exhaustion – The process of building plugins and themes can use up a lot of space and file system resources. We weren’t doing a great job of cleaning those resources up periodically which could cause some performance issue. We now clean up all temporary files once a day which should prevent this from happening anymore.
Easy Digital Download License Validation Error – There was a bug in EDD license validation where the source system wouldn’t send back valid JSON. This would break license validation instead of handling the error gracefully.
Profile page autocomplete – If you had form auto-completion on it would sometimes cause the profile page to reset your password. We’ve disabled autocomplete on this form to resolve the issue.
Theme tiles not showing correct build status – During the course of our performance improvement work we noticed that theme tiles were not showing the correct build status. This has now been resolved.
Welcome to the end of 2019! I hope that everyone has had as good a year as Kernl. Let’s dive in to the final update of 2019 to see what’s new.
Features, Bug Fixes, & Misc.
Improved License Management Search – License management now includes improved search functionality. The previous search functionality was flaky (at best) and not very discoverable. Search is now a first-class citizen, includes free-text search, and should greatly improve the overall usability of Kernl’s WordPress license management.
Load Testing Unit & Integration Tests – When we created the WordPress load testing service it was an experiment. Now that we have proved the viability of the service it’s time to work on stability and overall platform longevity. This month’s focus was on the authorization framework that our load testing service uses.
JS Bundle Size Reduction – Over the past year Kernl’s JS bundle size for our web app grew to over 2MB. We spent some time this month figuring out why and making changes to reduce it. In the end we were able to reduce the bundle size by over 50% down to 1.1MB.
Bug: Inconsistent Webhook & Deploy Key Behavior – After a few customer reported incidents with the automatic webhook and deploy key behavior, we discovered that Kernl wasn’t deleting local references to remote keys and hooks. If you have some issues with deploy keys or webhooks please contact us and we can help resolve the data inconsistency issues that this caused.
Node.js Upgrade to 12.14.0 – This month we upgrade all of our servers to use the latest LTS version of Node. This is includes stability and performance improvements.
Yoast SEO is a popular SEO enablement plugin for WordPress. It helps you avoid common mistakes when it comes to SEO on your blog and also handles things like social media “og:meta” tags. If your blog has a public audience, then it stands a good chance that you are using Yoast.
For this test we used the lowest-tier (1vCPU, 1GB RAM, $5) DigitalOcean droplet out of their SFO2 datacenter. Our server setup was as follows:
Ubuntu 19.04 with all updates installed.
The theme that was used was TwentyTwenty with no modifications and the content tested was 3 “What’s new with Kernl?” posts from earlier this year. We selected a small number due to the effort of filling out all of the SEO data in Yoast.
The WordPress setup was bare-bones. There were no plugins installed except for when we were running the Yoast test.
As with our previous post in this series, the load for this test was generated out of DigitalOcean’s NYC3 datacenter.
The tests were with 200 concurrent users over the course of 1 hour.
Max Requests per Second
One method for determining website performance is what is the maximum number of requests that in can field in a given second. For our purposes this is a pretty good indicator of the performance hit you take for installing plugins.
As you can see from the image above you lose about 25% of your maximum capacity from installing Yoast SEO. With no plugins installed we were able to hit 43 req/s, while with Yoast installed that number went down to 30 req/s.
It’s worth noting here that 30 req/s is 2.5 million requests a day.
First Error Occurrence
The next chart shows when we first started to see errors in our two tests.
Without the plugin installed WordPress was able to hit 38 req/s before seeing errors. Once we enabled Yoast that number went down to 28 req/s. Once again, this is consistent with the performance penalty we saw with the maximum requests per second of about 25%.
Average Response Time
The average response time with and without Yoast SEO tells a similar story to the requests per second measures we have done.
The chart above shows us that without any plugins installed, the average response time under load is around 3000ms. With the Yoast plugin installed the response time goes up to about 4300ms. We’re looking at a roughly 25% change in response time.
99th Percentile Response Time
The 99th percentile chart can be read as “99% of all requests finished in under this time”.
The chart above tells us that without any plugins installed 99% of our requests finished in under ~3600ms. With Yoast SEO installed 99% of our requests finished in ~4900ms. Once again, a roughly 25% penalty for having Yoast installed.
Yoast SEO Performance Conclusions
Yoast is a really good SEO plugin. If SEO matters to you it’s definitely a plugin you should have installed. However you should use caching if you do use it. This goes for WordPress in general, but every plugin you add to WordPress has a performance cost associated with it. If you run a site where caching is difficult you’ll have to carefully weigh the performance cost versus benefit of installing Yoast SEO.
If you are just starting with Cloudways it can be tough to figure out where your money is best spent! The prices are different for each platform and how they are sized isn’t them same either. This review gathers data using Kernl’s WordPress Load Testing tool and then massages that data into someone easy to digest.
Google Cloud – A “small” instance out of their Northern Virginia data center. $39.36 / month.
Amazon Web Services – A “small” instance out of their Northern Virginia data center. $36.51 / month.
Vultr – “2GB” plan out of their New Jersey data center. $23 / month
Linode – “2GB” plan from their New Jersey data center. $24 / month
Digital Ocean – “2GB” plan from one of the New York City data centers. $22 / month.
And we had the following load testing setup:
Content was an export of this blog
1000 concurrent users
30 minute duration
3 users per second ramp up
Load generating machines were in the San Francisco #2 Digital Ocean data center.
Overall Cloudways Performance and Value
Each provider performed extremely well in our tests. No errors were reported and they all reached over 700 requests / s sustained traffic. However not all hosts are created equal.
Costs / Month (cents)
$ / 1000 req (cents)
Given that each load test was exactly the same, you can see that some hosts performed better than others.
AWS and GCP are a bit more costly than Linode, Vultr, and Digital Ocean and they also performed worse than the lower cost providers.
Cloudways Provider Response Times
While cost per requests is a good metric we also care about the quality of those requests. Quality can be measured in a number of ways, but response time is often a “good enough” measure of how well a particular host performs. For the providers that you can get through Cloudways, the variance in request quality is extremely interesting.
The chart below shows the 99th percentile response time performance for each provider. It means that 99% of all requests during the load test finished at or below this time.
You can see that lower cost providers once again out-perform the higher cost AWS and GCP in pretty significant ways. The really interesting result here is that Vultr responded to 99% of requests in 240ms or less! Way to go Vultr!
If you aren’t sure which provider is the best value on Cloudways, from a raw cents per requests perspective you can’t go wrong with Linode, Vultr, and Digital Ocean. It’s the same story when you consider the quality (re: response time) of service: Vultr, Linode, and Digital Ocean are the clear winners here.
November was a great month for Kernl! After several years of trying, we finally launched team management and also made our pricing structure easier to understand. Let’s dive in!
Team Management – With our new Agency and Unlimited plans you can now grant users access to your account! The Agency plan allows for 3 team members and unlimited allows for unlimited team members.
Agency & Unlimited Plans – We introduced an unlimited plan for Kernl which has no usage limits (with the exception of load testing, but those limits have been increased substantially) and includes team management and Kernl Analytics. We also updated our agency plan to more closely match the old enterprise plan. The new agency plan has much higher limits than the previous agency plan as well as access to team management.
Bug Fixes & Other
Increased Load Test Generator VM Size – We’ve increased the load test generator machines from 1vCPU to 2vCPUs to allow us to scale our load testing up to 50,000 concurrent users.
Repository Sync – When we changed how you connected to your Git repositories we overlooked the ability to sync them from that same location. We’ve added that ability back in.
Load Test List Page Performance – With some clever SQL querying and awesome Postgres built-in functions, we’ve decreased the average load time on this page by 32%.
Bug: Invalid license if using licenses but no versions present – An odd edge case was found where Kernl would say your license was invalid if you didn’t have any plugin/theme versions uploaded. This has been resolved.
Bug: IPv6 Issue on Load Generators – The virtual machines that we spin up the Digital Ocean Singapore data center were having issues communicating over IPv6. We disabled IPv6 for all load generators for the time being.
Load Testing Service Node.js Upgrade – The WordPress load testing service backend and workers have been upgraded from Node.js 10.x to Node.js 12.x. All packages were upgraded to their latest with this change.
Analytics Service Node.js Upgrade – The WordPress analytics service backend was upgraded from Node.js 10.x to Node.js 12.x. All packages were upgraded to their latest with this change.
In the world of WordPress there are a lot of different plugins you can install to extend its functionality. One of the most popular plugins is Wordfence, a security plugin developed by the fine folks over at Defiant.
Most WordPress developers understand that the more plugins you add to your site the slower it goes, but exactly how much slower isn’t something that is often measured. More importantly, nobody has bothered to figure out the performance implications of installing many of the most popular plugins available to WordPress users today.
This all changes now with this series of blog posts exploring the performance implications of different WordPress plugins. In this series of posts we’ll test each plugin in isolation and then with caching enabled using Kernl’s WordPress load testing service. First up, is Wordfence.
The test machine for these tests was a $5 Digital Ocean droplet in their SFO2 (San Francisco) data center. The machine has 1GB of RAM, 1 vCPU, and a 25GB hard disk.
The software installed on the machine is as follows:
Ubuntu 19.04 with all updates installed.
For tests that required caching we used our favorite caching plugin, W3 Total Cache, with memcached as the data store.
The theme that was used was TwentyTwenty with no modifications and the content was an export of this blog.
All traffic was generated out of Digital Ocean’s NYC3 (New York City) data center.
A series of 4 tests were run to test the performance implications of installing Wordfence. They were:
Each test was with 200 concurrent users for 1 hour.
Maximum Requests per Second
The first metric that we looked at was the maximum requests per second that the site was able to handle under each situation outlined above.
As you can see the difference between having Wordfence enabled and having Wordfence disabled is huge. The non-cached site with no plugins enabled handled a maximum of 26 requests per second. With Wordfence enabled it could only handle 12.
More interesting though was that with caching enabled (W3 Total Cache) the site could handle 165 requests per second, but only 67 requests per second with Wordfence enabled.
These results were so surprising that we ran the tests twice. The results were the same (within 1%-2%) each time.
First Error Occurrence
The next metric we looked at was when did the first error occur during our load tests.
Once again we see that adding Wordfence took a pretty serious toll on our performance. For the baseline test (no plugins, no cache) we see our first error at around 26 requests / second. In the Wordfence test with no cache we saw it at 12 requests / second.
With our caching enabled test, we never saw any errors when Wordfence wasn’t enabled. However with Wordfence enabled we saw our first error at 56 requests / second.
Average Response Time
Now that we’ve looked at server capacity metrics (max requests, first error), let’s take a look at how your end user experience changes with Wordfence enabled.
These results were fairly interesting and not at all what was expected. The average response time for the baseline test was around 4.6 seconds, while the average response with Wordfence enabled was about 2.5 seconds. So why might that be? Looking through the data it appears that the baseline test had far more successful requests and that with Wordfence enabled the requests seemed to fail faster. In short, baseline test had higher throughput but slower response time. The Wordfence test had lower throughput and faster response time.
Turning our attention to the caching scenarios, we can see that enabling Wordfence is extremely problematic at scale. With caching enabled, our baseline test had an average response time of 146ms. With caching + Wordfence that time ballooned over 10x to 1874ms.
99th Percentile Response Time
Our final metric was looking at the 99th percentile response times for our different scenarios. What does that mean? It answers the question “How long does it take for 99% of requests to finish?”. This intentionally leaves out the last 1% because those are often outliers.
As you can see above, the un-cached 99th percentile hovers right around 5s when Wordfence is enabled or disabled. Since this is the 99th percentile such a high response time isn’t too surprising in both scenarios.
The more interesting scenario is when caching is enabled. For WordPress with only a caching plugin running, the 99th percentile is 370ms. That means 99% of all requests finished in 370ms. With Wordfence also enabled (Wordfence + caching), that number jumped to 2400ms. That’s a ~7X increase.
From these tests we came to a few conclusions:
Caching does not solve all of your performance problems. If your cache strategy relies on requests making it all the way through to WordPress, then you are very likely to still take a performance hit from other plugins. Things like Cloudflare, Varnish, Litespeed, or Nginx can help alleviate this problem.
Running Wordfence is expensive from a performance standpoint. The data suggests that by simply enabling Wordfence you lose about 50% of your maximum capacity and can increase your response time between 2x-7x.
Wordfence is still worth itfor a lot of people. If you’ve operated a WordPress site for any length of time you know how often they get attacked. Wordfence does a great job of reducing attack surface area and making it hard for people to attack you.
One of the most requested features that Kernl gets is Team Management. Team management can take on a lot of different forms, but for Kernl’s purposes it is the ability to share access via the Kernl interface with multiple people with different sets of credentials.
Starting today you can sign up for the Kernl Unlimited plan (see below) and start managing your team. Team management allows you to share your Kernl account with authorized users. When a user is part of your team, they have a slightly restricted view of Kernl but otherwise see the same things you do. What’s restricted?
Continuous Deployment – Adding connected Git repositories is the responsibility of the admin. Non-admin users don’t need to see this information.
Billing – Once again, this is an admin responsibility. The admin is able to change plans, add credit cards, etc.
Team Management – Only the admin can add and remove users from their account.
Aside from that everything is the same for admin and non-admin users.
The Unlimited Plan & Pricing Changes
In addition to team management we are also reducing the number of plans that Kernl has. Prior to this change we had 5 different plans (solo, agency, enterprise, huge, massive) which seemed like a bit much. To simplify Kernl’s pricing structure and give our customers better options we went down to 3 plans:
Solo – This $9 per month plan is our base plan. Most customers start here and then grow into larger plans as their needs change.
Agency – The $39 agency plan is essentially our former “Enterprise” plan but with increased limits for plugins, themes, licenses, load testing, and feature flags.
Unlimited – This new $79 unlimited plan allows you to have unlimited plugins, themes, licenses, feature flags and very high limited on load testing (50K concurrent users). This plan also includes team management so you can more easily control access to your Kernl account.
As always, current Kernl customers are grandfathered into the plan that they are currently on.
Now that Kernl has a more robust upper-level offering we hope to start rolling more features into it. For example:
In the near future we’ll include Kernl Analytics with Unlimited plan.
We’re also planning on rolling out limited team management into the Agency plan.
WordPress requires that you use a MySQL compatible database for its database backend. It used to be that you could confidently choose MySQL and go on with life, but in 2019 the choice isn’t quite that simple. With the MySQL, MariaDB, and Percona as the attractive options, how do you know which to choose?
Choosing a database isn’t always about performance, but for the sake of this article it will be 🙂
For this series of tests we tested database performance out-of-the-box with no special tweaks. It is quite possible that a professional database administrator could make each database run more performant, but most people hosting WordPress aren’t DBAs. With regards to caching, none was enabled. We wanted to test database performance, not cache performance.
What was tested?
The WordPress host machine was a DigitalOcean CPU Optimized droplet with 16 vCPUs (dedicated hyper-threads) and 32 GB of RAM. This monster of a machine was chosen so that we could be certain that Nginx + PHP-FPM weren’t the cause of any bottlenecks. A minor tweak to the PHP-FPM config allowed for full use of all 16 vCPUs.
The database host machines were DigitalOcean CPU Optimized droplets with 4 vCPUs (dedicated hyper-threads) and 8GB of RAM. CPU Optimized droplets were chosen because we didn’t want our tests to be at the mercy of shared CPU resources.
Each deployment was in DigitalOcean’s SFO2 region with the WordPress server communicating with the database over the internal private network. The traffic producing nodes were deployed in DigitalOcean’s NYC3 data center and communicated via the public internet.
For each database that was tested, we ran a load test with the following parameters:
500 concurrent users
2 req/s ramp up
30 minute duration
The goal here wasn’t to bring the database to it’s knees but instead see how it performed under sustained heavy load, but not so heavy that it falls over.
MariaDB WordPress Performance
During the MySQL acquisition of Oracle in 2009 there was a lot of concern amongst the core developers that Oracle would eventually close off MySQL to the world (similar to Oracle’s business model). Before that could take place, a GPL fork of MySQL was created named MariaDB.
MariaDB is open source and in active development. But how does it stand up to our WordPress load tests? Let’s find out.
First we’ll take a look at the requests and failures per second.
As you can see from the results above the performance scales up very well over time eventually peaking at ~379 req/s. We see periodic database-related errors (the vCPUs on the database were almost completely saturated) but nothing too crazy. Next, let’s see how the response times look.
The median response time of WordPress when backed by MariaDB under heavy load stays remarkably consistent. You can see that the average is slowly creeping up by the end of the test, but nothing that would be noticeable by customers yet.
The response time distribution is a little more interesting than the median response time. 90% of all requests finish in under 300ms and 99% of all requests finish in less than 500ms. Overall, the performance of MariaDB out of the box with no configuration is quite good.
Percona WordPress Performance
Another open-source fork of MySQL, Percona was started in 2006 and has been steadily delivering value-added features and enterprise support on top of MySQL for more than 12 years. With all that accumulated experience, how does Percona measure up?
Overall the Percona WordPress performance was pretty good. Not quite a good as MariaDB but nothing to scoff at. The one interesting thing was that the error rate was consistent across most of the test once the request volume passed ~200 requests/s. Lets see if the response time graph adds to the story.
The data here is actually pretty interesting. The response time for Percona was comparable to MariaDB right up until the time it started getting consistent errors. After that, it more than doubled and stayed that way for the rest of the test.
The response time distribution tells similar story to the response time chart. The difference between the 50th percentile and the 99th percentile shows that performance was very consistent across the entire test, it just wasn’t as good as MariaDB.
MySQL WordPress Performance
Last but not least (well….) in our test is MySQL. It’s been around since 1994 and is now owned by Oracle. In it’s latest releases there has been lot of great features such as window functions and more JSON features to compete with PostgreSQL. So let’s see how it did.
MySQL had a similar trajectory to Percona: It did pretty good across the board with a small but consistent series of errors after it started to go past 200 req/s. Also note that it never achieved higher than 300 req/s, which both MariaDB and Percona did.
The shape of the MySQL response time graph looks nearly identical to Percona, just about 50ms slower once the vCPUs started to get saturated. The detailed look in the response time distribution tells a similar story.
The response time distribution is where you can see Percona and MySQL diverge a bit. At the 99th percentile MySQL is returning at 875ms, while Percona was more like 550ms. In general, the distribution matches what one would expect given how the response time graph looks
The out of the box numbers for MariaDB make it look like the clear winner here. And compared to both Percona and MySQL it is in the raw performance department. This isn’t to say that you couldn’t tune Percona or MySQL to out-perform MariaDB, but only that you get more performance with zero configuration changes.