After 5 months of development and 2 months of closed beta, we are happy to announce the public availability of Kernl’s WordPress Site Health service.
What is it?
Kernl’s WordPress Site Health service allows you to understand changes in your site’s health. But what does that mean? We monitor your response times, time to first byte (TTFB), Google Lighthouse scores, plugin changes (added, updated, or deleted), as well as CPU, disk, and memory usage.
Using the data above, you can see when the health of your site started to degrade and why. Maybe it was when a plugin got updated or installed? Perhaps you started to run out of memory? With Kernl’s WordPress Site Health you can figure this out at a glance.
Common Use Cases
While this tool is not a monitoring service, the value it can provide to you and your customers is immense.
Identify the source of performance regressions – The data gathered by our WordPress Site Health service makes it easy to identify the source of performance regressions. Because we capture machine-level information (CPU, Memory, Disk) and when plugins were added/updated/removed, it’s easy to draw a line between response time or TTFB increases and changes in your WordPress site & host.
Monitor changes in Google Lighthouse scores – Google Lighthouse is a series of tests against your site that checks for modern web best practices. Kernl runs Lighthouse against your site once per day and graphs the results. If you see a score go down, you can look at the other data we gather to try and tie it back to a change in your site.
WordPress Site Health Pricing & Limits
Kernl’s WordPress Site Health is included in every Kernl plan. The only thing that changes are the number of sites you can add and the check frequency.
# of Site
Sign Up Now
You can sign up now for free. Kernl comes with a 30 day free trial and no credit card is required.
I hope everyone had a great February! We didn’t too much feature development this month, but there was a lot of bug fixing and performance improvements, so let’s dive in!
Feature, Bugs, and Performance
Node.js – Kernl is now on Node.js 12.16.1. This release was all about security fixes.
Load Testing Machine Provisioning – We weren’t calculating the correct number of machines to provision on DigitalOcean. This lead to some serious over-provisioning when running load tests. This has been resolved, which means more customers can run more load tests at the same time.
Load Testing Secondary Node Behavior – Kernl uses Locust under the covers to run our WordPress Load Testing service. The Locust primary node has an argument called “–expect-slaves”. It tells Locust “Don’t start the load test until at this this many secondary nodes have connected.”. We weren’t calculating this number correctly which led to some weird behavior. This is now resolved so load tests should start in every situation now.
Easy Digital Downloads Domains – Kernl wasn’t passing the domain along to EDD. We now do this, which allows you to restrict updates to specific domains while using EDD.
Load Testing Snapshots – Kernl used to build up each load testing machine from the ground up every time a load test was started. We now start from a snapshot that gets us 50% of the way there. This has improved load test start times (especially on large tests) by an average of 30%.
GitHub Authorization Changes – The GitHub API is changing how it handles authorization headers. We’ve update Kernl to handle this change, so we’ll be good going forward when GitHub deprecates the old method.
High Traffic Endpoint Audit – We did an audit of our high-traffic API endpoints and cleaned some things up. Slight performance improvements were had (1%-2%), but mostly the improvements have been in code readability and comprehension.
GitLab Deployment Issues – In a recent release of GitLab they changed the required fields when asking for an access token via a refresh token. This broke all GitLab deployments for Kernl for a few days while we tracked down the issue. This has since been resolved.
Load Testing Unit & Integration Tests – When our load testing service was launched we weren’t sure if it was going to be successful. We’ve proven that it is a worthwhile feature, so now we’re focusing on reliability. We’re in the process of adding a suite of unit and integration tests around this functionality.
Choosing between Memcached or Redis for your WordPress cache is a tough decision. Not because they have vastly different performance profiles (they don’t), but because either choice is a good one depending on your needs. In this post we’re going to explore the differences between Redis and Memcached, how they perform for WordPress, and a lot of different non-performance things you should consider when making your choice.
What is Memcached?
Memcached is an open-source, high performance, distributed memory object caching system. What does that mean? It means you can store a bunch of strings in memory and access them really fast. From a WordPress perspective, it means that using a caching plugin like W3 Total Cache we can store the results of the complicated SQL queries that WordPress does in memory and have them available instantly.
What is Redis?
Redis is an open source in-memory store that can be used as a cache or a message broker. It’s a bit different then Memcached because you get a lot more out of the box with it. For example, Redis has built in replication, transactions, disk persistence, and provides high availability and partitioning. All those features means that managing WordPress can be a little harder to do, but not much harder. Especially if you just need to use it as a cache.
Both Redis and Memcached have excellent performance. They’re both used by some of the largest websites in the world and are fully ingrained in the Fortune 500. Given that all things are not created equal, let’s see how they perform with a read-heavy WordPress site (this blog).
The load tests are performed against the DigitalOcean WordPress Marketplace image with either Redis or Memcached installed alongside of it. The machines have 2 vCPUs, 2 GB RAM and live in DigitalOcean’s SFO2 (San Francisco) data center.
The load test configuration:
500 concurrent users
2 users / second ramp up
45 minute test ( ran twice )
Traffic comes from Digital Ocean’s NYC3 data center.
The content of the load test is a copy of this blog.
Baseline Performance (No Cache)
The baseline performance for WordPress with no cache isn’t great.
The response time also isn’t great. A little over 2 seconds on average.
Once we install Redis and configure W3 Total Cache to use it, the number of requests that we can handle increases substantially.
The requests remain steady at around 300 per second and no failures are recorded. The response time also improves quite a bit.
475ms isn’t bad at all. That’s 4 times faster response times then without any caching at all.
With Memcached installed and W3 Total Cache configured to use it, we see some excellent performance.
In this situation, Memcached performs even better then Redis with 425 req/s versus Redis’ 300 req/s. Response time improvements are similar.
The Memcached response time is almost 3 times faster than the Redis response time. In general, the results where Memcached is faster than Redis are surprising. In most benchmarks Redis is equal or faster than Memcached, so it’s likely a configuration problem.
When deciding what cache to use with your WordPress setup, there are a few other considerations your should be looking at:
Ease of setup – As you can see from the performance results above, Memcached has better performance out of the box. Knowing what I know about Redis this is likely a configuration issue, but the fact that I could get that level of performance with no configuration from Memcached is a good data point.
3rd Party Hosting – Do you really want to manage your own Redis or Memcached server? If you don’t, you’ll want to look at the landscape of 3rd party providers. Redis has a robust provider ecosystem. Memcached’s is a little less robust.
Persistence – Do you need your cache to survive a reboot? This is important if the cost of re-populating your cache is too high for your system. If you do need persistence, Redis is your best option.
High Availability– If you need high availability of your caching cluster, Redis is the clear winner here. Memcached can be made to operate this way, but Redis has it baked in to the core of the application.
If you’d like to see the full results of the load testing runs on Kernl, see the links below.
Load testing is fun. Breaking things is fun. Breaking WordPress with load testing is even more fun. But in the era of highly scaleable WordPress hosting solutions, can we even break WordPress anymore? Oh yes, yes we can. The Crucible Challenge can.
The Crucible WordPress Performance challenge is a deceptively simple test inspired by the poor ops teams that have to handle traffic from Super Bowl advertisements. Given a WordPress site with consistent content and URL mappings:
Handle 50,000 (@ 500 per second ramp up) concurrent authenticated users for 2 hours with load test generators in New York, London, Amsterdam, Singapore, Bangalore, San Francisco, Toronto and Frankfurt.
Have an error rate below 0.1%.
Average response time should be below 800ms.
Median response time should be below 700ms.
99th percentile response time should be below 800ms.
Half way through the test, you must flush your cache.
January was a pretty great month for Kernl. We got a lot of bug fixes, some performance improvements, and even a new beta feature out the door. Let’s dive in!
WordPress Site Health Beta
This month we released a beta of our WordPress Site Health service. The goal of this service is to help you determine where performance problems are and why they are happening. The dashboard below gives you a high level view of how performance looks on each of your sites.
Once you click in to any site you see data for the last 7 days.
The data here helps you diagnose performance issues. Using the plugin changes panel you can tie performance issues back to adding/removing/updating of plugins. Google Lighthouse scores show trends on you site’s performance, usability, SEO and more over time.
Improved performance of plugin/theme list pages – The API calls to /api/v1/plugins and /api/v1/themes were incredibly inefficient. They hadn’t really been touched since Kernl’s inception and needed some love. After doing some query optimization and stripping out unnecessary data, the payload and response time were reduced by a factor of 10. The worst case example was response size going from ~750KB to ~75KB and response time from ~4000ms to ~250ms.
Server resource increases – Kernl’s main Node.js application servers have been running with 1 vCPU and 1GB of RAM for about the last 2 years. Lately we’ve seen some resource exhaustion and decided it was time to upgrade. The new app servers now run with 2 vCPUs and 2GB of RAM.
Server disk space / inode exhaustion – The process of building plugins and themes can use up a lot of space and file system resources. We weren’t doing a great job of cleaning those resources up periodically which could cause some performance issue. We now clean up all temporary files once a day which should prevent this from happening anymore.
Easy Digital Download License Validation Error – There was a bug in EDD license validation where the source system wouldn’t send back valid JSON. This would break license validation instead of handling the error gracefully.
Profile page autocomplete – If you had form auto-completion on it would sometimes cause the profile page to reset your password. We’ve disabled autocomplete on this form to resolve the issue.
Theme tiles not showing correct build status – During the course of our performance improvement work we noticed that theme tiles were not showing the correct build status. This has now been resolved.
Welcome to the end of 2019! I hope that everyone has had as good a year as Kernl. Let’s dive in to the final update of 2019 to see what’s new.
Features, Bug Fixes, & Misc.
Improved License Management Search – License management now includes improved search functionality. The previous search functionality was flaky (at best) and not very discoverable. Search is now a first-class citizen, includes free-text search, and should greatly improve the overall usability of Kernl’s WordPress license management.
Load Testing Unit & Integration Tests – When we created the WordPress load testing service it was an experiment. Now that we have proved the viability of the service it’s time to work on stability and overall platform longevity. This month’s focus was on the authorization framework that our load testing service uses.
JS Bundle Size Reduction – Over the past year Kernl’s JS bundle size for our web app grew to over 2MB. We spent some time this month figuring out why and making changes to reduce it. In the end we were able to reduce the bundle size by over 50% down to 1.1MB.
Bug: Inconsistent Webhook & Deploy Key Behavior – After a few customer reported incidents with the automatic webhook and deploy key behavior, we discovered that Kernl wasn’t deleting local references to remote keys and hooks. If you have some issues with deploy keys or webhooks please contact us and we can help resolve the data inconsistency issues that this caused.
Node.js Upgrade to 12.14.0 – This month we upgrade all of our servers to use the latest LTS version of Node. This is includes stability and performance improvements.
Yoast SEO is a popular SEO enablement plugin for WordPress. It helps you avoid common mistakes when it comes to SEO on your blog and also handles things like social media “og:meta” tags. If your blog has a public audience, then it stands a good chance that you are using Yoast.
For this test we used the lowest-tier (1vCPU, 1GB RAM, $5) DigitalOcean droplet out of their SFO2 datacenter. Our server setup was as follows:
Ubuntu 19.04 with all updates installed.
The theme that was used was TwentyTwenty with no modifications and the content tested was 3 “What’s new with Kernl?” posts from earlier this year. We selected a small number due to the effort of filling out all of the SEO data in Yoast.
The WordPress setup was bare-bones. There were no plugins installed except for when we were running the Yoast test.
As with our previous post in this series, the load for this test was generated out of DigitalOcean’s NYC3 datacenter.
The tests were with 200 concurrent users over the course of 1 hour.
Max Requests per Second
One method for determining website performance is what is the maximum number of requests that in can field in a given second. For our purposes this is a pretty good indicator of the performance hit you take for installing plugins.
As you can see from the image above you lose about 25% of your maximum capacity from installing Yoast SEO. With no plugins installed we were able to hit 43 req/s, while with Yoast installed that number went down to 30 req/s.
It’s worth noting here that 30 req/s is 2.5 million requests a day.
First Error Occurrence
The next chart shows when we first started to see errors in our two tests.
Without the plugin installed WordPress was able to hit 38 req/s before seeing errors. Once we enabled Yoast that number went down to 28 req/s. Once again, this is consistent with the performance penalty we saw with the maximum requests per second of about 25%.
Average Response Time
The average response time with and without Yoast SEO tells a similar story to the requests per second measures we have done.
The chart above shows us that without any plugins installed, the average response time under load is around 3000ms. With the Yoast plugin installed the response time goes up to about 4300ms. We’re looking at a roughly 25% change in response time.
99th Percentile Response Time
The 99th percentile chart can be read as “99% of all requests finished in under this time”.
The chart above tells us that without any plugins installed 99% of our requests finished in under ~3600ms. With Yoast SEO installed 99% of our requests finished in ~4900ms. Once again, a roughly 25% penalty for having Yoast installed.
Yoast SEO Performance Conclusions
Yoast is a really good SEO plugin. If SEO matters to you it’s definitely a plugin you should have installed. However you should use caching if you do use it. This goes for WordPress in general, but every plugin you add to WordPress has a performance cost associated with it. If you run a site where caching is difficult you’ll have to carefully weigh the performance cost versus benefit of installing Yoast SEO.
If you are just starting with Cloudways it can be tough to figure out where your money is best spent! The prices are different for each platform and how they are sized isn’t them same either. This review gathers data using Kernl’s WordPress Load Testing tool and then massages that data into someone easy to digest.
Google Cloud – A “small” instance out of their Northern Virginia data center. $39.36 / month.
Amazon Web Services – A “small” instance out of their Northern Virginia data center. $36.51 / month.
Vultr – “2GB” plan out of their New Jersey data center. $23 / month
Linode – “2GB” plan from their New Jersey data center. $24 / month
Digital Ocean – “2GB” plan from one of the New York City data centers. $22 / month.
And we had the following load testing setup:
Content was an export of this blog
1000 concurrent users
30 minute duration
3 users per second ramp up
Load generating machines were in the San Francisco #2 Digital Ocean data center.
Overall Cloudways Performance and Value
Each provider performed extremely well in our tests. No errors were reported and they all reached over 700 requests / s sustained traffic. However not all hosts are created equal.
Costs / Month (cents)
$ / 1000 req (cents)
Given that each load test was exactly the same, you can see that some hosts performed better than others.
AWS and GCP are a bit more costly than Linode, Vultr, and Digital Ocean and they also performed worse than the lower cost providers.
Cloudways Provider Response Times
While cost per requests is a good metric we also care about the quality of those requests. Quality can be measured in a number of ways, but response time is often a “good enough” measure of how well a particular host performs. For the providers that you can get through Cloudways, the variance in request quality is extremely interesting.
The chart below shows the 99th percentile response time performance for each provider. It means that 99% of all requests during the load test finished at or below this time.
You can see that lower cost providers once again out-perform the higher cost AWS and GCP in pretty significant ways. The really interesting result here is that Vultr responded to 99% of requests in 240ms or less! Way to go Vultr!
If you aren’t sure which provider is the best value on Cloudways, from a raw cents per requests perspective you can’t go wrong with Linode, Vultr, and Digital Ocean. It’s the same story when you consider the quality (re: response time) of service: Vultr, Linode, and Digital Ocean are the clear winners here.
November was a great month for Kernl! After several years of trying, we finally launched team management and also made our pricing structure easier to understand. Let’s dive in!
Team Management – With our new Agency and Unlimited plans you can now grant users access to your account! The Agency plan allows for 3 team members and unlimited allows for unlimited team members.
Agency & Unlimited Plans – We introduced an unlimited plan for Kernl which has no usage limits (with the exception of load testing, but those limits have been increased substantially) and includes team management and Kernl Analytics. We also updated our agency plan to more closely match the old enterprise plan. The new agency plan has much higher limits than the previous agency plan as well as access to team management.
Bug Fixes & Other
Increased Load Test Generator VM Size – We’ve increased the load test generator machines from 1vCPU to 2vCPUs to allow us to scale our load testing up to 50,000 concurrent users.
Repository Sync – When we changed how you connected to your Git repositories we overlooked the ability to sync them from that same location. We’ve added that ability back in.
Load Test List Page Performance – With some clever SQL querying and awesome Postgres built-in functions, we’ve decreased the average load time on this page by 32%.
Bug: Invalid license if using licenses but no versions present – An odd edge case was found where Kernl would say your license was invalid if you didn’t have any plugin/theme versions uploaded. This has been resolved.
Bug: IPv6 Issue on Load Generators – The virtual machines that we spin up the Digital Ocean Singapore data center were having issues communicating over IPv6. We disabled IPv6 for all load generators for the time being.
Load Testing Service Node.js Upgrade – The WordPress load testing service backend and workers have been upgraded from Node.js 10.x to Node.js 12.x. All packages were upgraded to their latest with this change.
Analytics Service Node.js Upgrade – The WordPress analytics service backend was upgraded from Node.js 10.x to Node.js 12.x. All packages were upgraded to their latest with this change.
In the world of WordPress there are a lot of different plugins you can install to extend its functionality. One of the most popular plugins is Wordfence, a security plugin developed by the fine folks over at Defiant.
Most WordPress developers understand that the more plugins you add to your site the slower it goes, but exactly how much slower isn’t something that is often measured. More importantly, nobody has bothered to figure out the performance implications of installing many of the most popular plugins available to WordPress users today.
This all changes now with this series of blog posts exploring the performance implications of different WordPress plugins. In this series of posts we’ll test each plugin in isolation and then with caching enabled using Kernl’s WordPress load testing service. First up, is Wordfence.
The test machine for these tests was a $5 Digital Ocean droplet in their SFO2 (San Francisco) data center. The machine has 1GB of RAM, 1 vCPU, and a 25GB hard disk.
The software installed on the machine is as follows:
Ubuntu 19.04 with all updates installed.
For tests that required caching we used our favorite caching plugin, W3 Total Cache, with memcached as the data store.
The theme that was used was TwentyTwenty with no modifications and the content was an export of this blog.
All traffic was generated out of Digital Ocean’s NYC3 (New York City) data center.
A series of 4 tests were run to test the performance implications of installing Wordfence. They were:
Each test was with 200 concurrent users for 1 hour.
Maximum Requests per Second
The first metric that we looked at was the maximum requests per second that the site was able to handle under each situation outlined above.
As you can see the difference between having Wordfence enabled and having Wordfence disabled is huge. The non-cached site with no plugins enabled handled a maximum of 26 requests per second. With Wordfence enabled it could only handle 12.
More interesting though was that with caching enabled (W3 Total Cache) the site could handle 165 requests per second, but only 67 requests per second with Wordfence enabled.
These results were so surprising that we ran the tests twice. The results were the same (within 1%-2%) each time.
First Error Occurrence
The next metric we looked at was when did the first error occur during our load tests.
Once again we see that adding Wordfence took a pretty serious toll on our performance. For the baseline test (no plugins, no cache) we see our first error at around 26 requests / second. In the Wordfence test with no cache we saw it at 12 requests / second.
With our caching enabled test, we never saw any errors when Wordfence wasn’t enabled. However with Wordfence enabled we saw our first error at 56 requests / second.
Average Response Time
Now that we’ve looked at server capacity metrics (max requests, first error), let’s take a look at how your end user experience changes with Wordfence enabled.
These results were fairly interesting and not at all what was expected. The average response time for the baseline test was around 4.6 seconds, while the average response with Wordfence enabled was about 2.5 seconds. So why might that be? Looking through the data it appears that the baseline test had far more successful requests and that with Wordfence enabled the requests seemed to fail faster. In short, baseline test had higher throughput but slower response time. The Wordfence test had lower throughput and faster response time.
Turning our attention to the caching scenarios, we can see that enabling Wordfence is extremely problematic at scale. With caching enabled, our baseline test had an average response time of 146ms. With caching + Wordfence that time ballooned over 10x to 1874ms.
99th Percentile Response Time
Our final metric was looking at the 99th percentile response times for our different scenarios. What does that mean? It answers the question “How long does it take for 99% of requests to finish?”. This intentionally leaves out the last 1% because those are often outliers.
As you can see above, the un-cached 99th percentile hovers right around 5s when Wordfence is enabled or disabled. Since this is the 99th percentile such a high response time isn’t too surprising in both scenarios.
The more interesting scenario is when caching is enabled. For WordPress with only a caching plugin running, the 99th percentile is 370ms. That means 99% of all requests finished in 370ms. With Wordfence also enabled (Wordfence + caching), that number jumped to 2400ms. That’s a ~7X increase.
From these tests we came to a few conclusions:
Caching does not solve all of your performance problems. If your cache strategy relies on requests making it all the way through to WordPress, then you are very likely to still take a performance hit from other plugins. Things like Cloudflare, Varnish, Litespeed, or Nginx can help alleviate this problem.
Running Wordfence is expensive from a performance standpoint. The data suggests that by simply enabling Wordfence you lose about 50% of your maximum capacity and can increase your response time between 2x-7x.
Wordfence is still worth itfor a lot of people. If you’ve operated a WordPress site for any length of time you know how often they get attacked. Wordfence does a great job of reducing attack surface area and making it hard for people to attack you.