Load Testing the CloudWays Managed WordPress Service

At the beginning of December Kernl launched the closed beta of it’s WordPress load testing service. As a test to shake out any bugs we’ve decided to run a blog series load testing managed WordPress services. Today we’re going to talk about the CloudWays managed WordPress service. In particular, CloudWays deployed to Vultr.

How is the platform judged?

Cloudways will be tested using 3 different load tests:

  • The Baseline – This is a 200 concurrent user, 10 minutes, 2 user/s ramp up test from San Francisco. This test is used to verify the test configuration and to make sure that Cloudways doesn’t go belly-up before we get started 🙂
  • The Sustained Traffic Test – This test is for 2000 concurrent users, ramps up at 2 users/s, from San Francisco, for 2 hours. The sustained traffic test represents a realistic load for a high traffic site.
  • The Traffic Spike Test – This test is intentionally brutal. It simulates 20000 concurrent users, ramps up at 10 users/s, from San Francisco, for 1 hour. It represents the sort of traffic pattern you might see if a Twitter celebrity shared a link to your blog.

What CloudWays plan was used?

For this test we used the lowest tier plan available while hosting on Vultr. The cost of the plan is $11 / month and includes full SSH access to the box that CloudWays deploys your WordPress instance on.

CloudWays $11 / month plan hosted on Vultr
Selected CloudWays Plan

Where does the traffic originate?

The traffic for this load test originates in Digital Ocean‘s SFO2 (San Francisco) data center. The Vultr server lives in their Seattle data center.

The baseline load test

200 concurrent users, 2 users / s ramp up, 10 minutes, SFO

The baseline WordPress load test that we did with CloudWays is used to test configuration. CloudWays performed well on this test. You can see from the request graph that we settled in at around 25 requests / second.

CloudWays BaseLine Load Test Requests
CloudWays Baseline Test – Requests per second

The failure graph for the baseline load test was empty, which is generally expected for the baseline test.

CloudWays BaseLine Load Test Failures
CloudWays Baseline Test – Failures

Finally the request distribution graph for the baseline test. You can see that 99% of the requests finished in ~200ms. There was at least one outlier at the ~5000ms mark, but this isn’t uncommon for load tests.

CloudWays BaseLine Load Test Response Time Distribution
CloudWays Baseline – Response Time Distribution

The sustained heavy traffic load test

2000 concurrent users, 2 users / s ramp up, 2 hours, SFO

The sustained traffic load test represents what a WordPress site with high readership might look like day over day.  The CloudWays setup responded quite well for the hardware that it was on.

CloudWays Sustained Heavy Traffic Load Test - Requests
CloudWays Sustained Load Test – Requests

You can see that performance was great for the first 10% of the test. The CloudWays setup had no trouble handling the load thrown at it. However once we started getting to around 85 requests / second the hardware had trouble keeping up with the request volume. You can see from the choppy behavior of request graph that the Varnish server which sits in front of WordPress was starting to get overwhelmed by the request volume. Considering that this particular CloudWays plan was deployed to a low-level Vultr VM, this performance isn’t bad at all.

The failure graph was a little disappointing, but not unexpected knowing the hardware that we tested on. It is very likely that if we tested on a more robust underlying Vultr box we would have had much better results. You can see that failures increased in a fairly linear rate through the whole load test.

CloudWays Sustained Heavy Traffic Load Test - Failures
CloudWays Sustained Load Test – Failures

The final graph for this test is the response distribution graph. This graph shows you for a given percentage of requests how many milliseconds they took to complete. In this case CloudWays didn’t perform great, but once again I’ll point to the fact that the underlying Vultr hardware isn’t that robust.

CloudWays Sustained Heavy Traffic Load Test - Response Time Distribution
CloudWays Sustained Load Test – Response time distribution

From the graph you can see that 99% of requests completed in ~95 seconds. Yes, you read that correctly. You can interpret this graph as you like but taking the other graphs into consideration you can see that Varnish and the underlying Vultr hardware were completely overwhelmed. Knowing that makes this a little less terrible. We suspect that a smaller load test (maybe 750 concurrent users?) might yield a far better response time distribution. Once a server becomes overwhelmed the response time distribution tends to go in a bad direction.

The traffic spike load test

20000 concurrent users, 10 users / s ramp up, 1 hour, SFO

Given what we know about the sustained traffic load test your expectations for how this test went are probably spot on. CloudWays did as good as can be expected with how the underlying hardware is allocated, but you would likely need to upgrade to a much larger plan to handle this level of traffic. We ended up stopping this load test after about 30 minutes due to the increased failure rate.

CloudWays Traffic Spike Load Test - Requests
CloudWays Traffic Spike Load Test – Requests per Second

The requests per second never really leveled out. It isn’t clear what the underlying reason was for the uneven level at the top of the graph. Regardless, top-end performance was similar to the sustained traffic test.

The failure chart looks as we expected it to. After a certain point we start to see increased failure rates. They continue up and to the right in a mostly linear fashion.

CloudWays Traffic Spike Load Test - Failures
CloudWays Traffic Spike Load Test – Requests per Second

The response time distribution is really bad for this test.

CloudWays Traffic Spike Load Test - Response Time Distribution
CloudWays Traffic Spike Load Test – Response Time Distribution

As you can see 80% of the requests finished in < 50s which means that 20% of the requests took longer than that. The 99% mark was only reached after > 200s, at which point the user is likely long gone.

Conclusions

For $11 / month the CloudWays managed WordPress installation did a great job, but there are better performers out there in the same price range (GoDaddy for instance). For the sake of this review which only looks at raw performance, CloudWays probably isn’t the best choice. But if you’re looking for good-enough performance with extreme flexibility then you would be hard pressed to find a better provider.

Want to run load tests against your own WordPress sites? Sign up for Kernl now!

What’s New With Kernl – November 2018

It has been a busy few months for Kernl. Lots of great work has gone into the WordPress load testing feature work as well as a few structural changes to increase reliability.

  • Cache moved to Redis – For as long as Kernl has existed our cache backend was powered by Memcached. We have now finished migrating to Redis hosted at Compose.io.
  • AngularJS Upgrade to 1.7.5 – Fairly straight-forward upgrade to Angular 1.7.5. We wanted to take advantage of performance improvements and few bug fixes.
  • WordPress Load Testing – Over the past few months we’ve been cooking up something new. Imagine if you could easily test performance changes to you or your client’s WordPress installation? Or be able to tell your client with confidence how many customers at a time their site can support (and what their experience will be like!). What if you could do all this without writing a single line of code or spinning up your own testing infrastructure? We’re ready to start beta testing so send an email to jack@kernl.us if you would like to be a part of it.