Building & Scaling Kernl Analytics

Over the past 3 years I’ve often received requests from new and existing Kernl customers for some form of analytics on their plugins and themes. I avoided building it for a long time because I wasn’t sure I could do so economically at the scale Kernl operates, but I eventually decided to give Kernl Analytics a whirl and see where things ended up.

Product Versions Graph

Concerns

After deciding to give the analytics offering a try, I had to figure out how to build it. When I first set out to build Kernl Analytics I had 3 main concerns:

  • Cost – I’ve never created a web service from scratch that needs to INSERT data at 75 rows per second with peaks of up to 500 rows per second. I wanted to be sure that running this service wouldn’t be prohibitively expensive.
  • Scale – How much would I need to distribute the load? This is tightly coupled to cost.
  • Speed – This project is going to generate a LOT of data by my standards. Can I query it in a performant manner?

As development progressed I realized that cost and scale were non-issues. The database that I chose to use (PostgreSQL) can easily withstand this sort of traffic with no tweaking, and I was able to get things started on a $5 Digital Ocean droplet.

Kernl Analytics Architecture & Technology

Kernl Analytics was created as its own micro-service with no public access to the outside world. All access to it sits behind a firewall so that only Kernl’s Node.js servers can send requests to it. For data storage, PostgreSQL was chosen for a few reasons:

  1. Open Source
  2. The data is highly relational
  3. Performance
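
To make the “highly relational” point concrete, here’s a minimal sketch of what a schema along these lines might look like. The table and column names (products, events, and so on) are my own stand-ins, not Kernl’s actual schema:

```typescript
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Hypothetical schema: one row per plugin/theme, and one row per
// analytics event referencing it. Names are illustrative only.
async function migrate(): Promise<void> {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS products (
      id   SERIAL PRIMARY KEY,
      name TEXT NOT NULL
    );

    CREATE TABLE IF NOT EXISTS events (
      id         BIGSERIAL PRIMARY KEY,
      product_id INTEGER NOT NULL REFERENCES products (id),
      version    TEXT NOT NULL,
      domain     TEXT NOT NULL,
      created_at TIMESTAMPTZ NOT NULL DEFAULT now()
    );
  `);
}
```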

The application that captures the data, queries it, and runs periodic tasks is a Node.js application written in TypeScript. I chose TypeScript mostly because I’m familiar with it and wanted type safety so I wouldn’t need to write as many tests.

TypeScript FTW!
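
One small illustration of the type-safety win: with a strict interface on the write path, the compiler rejects malformed events before they ever reach Postgres, no test required. This sketch reuses the hypothetical events table from above:

```typescript
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// The shape of an incoming analytics event. If a caller forgets a field
// or passes the wrong type, compilation fails instead of a runtime bug.
interface AnalyticsEvent {
  productId: number;
  version: string;
  domain: string;
}

async function recordEvent(event: AnalyticsEvent): Promise<void> {
  await pool.query(
    `INSERT INTO events (product_id, version, domain)
     VALUES ($1, $2, $3)`,
    [event.productId, event.version, event.domain]
  );
}
```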

With regard to the size of the instance that Kernl Analytics runs on, I currently pay $15/month for a 3-core Digital Ocean droplet. I upgraded to 3 cores so that Postgres could easily handle both writes and multiple read requests at the same time. So far this setup has worked out well!

Pain Points

Overall, implementing Kernl Analytics went well. In fact, it went far better than expected. But that doesn’t mean there weren’t a few pain points along the way.

  • Write Volume – Kernl’s scale is just large enough to cause some scaling and performance pains when creating an analytics service. Kernl averages 25 req/s, which translates to roughly 75 INSERTs per second into Postgres. Kernl also peaks at 150 req/s, which scales up to about 450 INSERTs per second. Postgres can easily handle this sort of load, but doing it on a $5 Digital Ocean droplet was taxing to say the least.
  • Hardware Upgrade – I tried to keep costs down as much as possible with Kernl Analytics, but in the end I had to increase the size of the droplet I was using to a $15 / 3-core droplet. I did that so one or two cores could be dedicated to writes while leaving a single core available for read requests. Postgres determines what actions are executed where, but adding more cores has led to a lot less resource contention.
  • Aggregation – Initially the data wasn’t aggregated at all. This caused some pain because even with some indexing, plucking data out of a table with > 2.5 million rows can be slow. It also didn’t help that I was constantly writing data to the table, which slowed things down further. Recently I solved this by doing daily aggregations for Kernl Analytics charts and domain data (see the sketch below), which has improved speed significantly.
  • Backups & High Availability – To keep costs down, the analytics service is not highly available. This is definitely one of those “take out some tech debt” items that will need to be addressed at a later date. Backups also happen only daily, so it’s possible to lose a day of data if something serious goes wrong.
Yay for affordable hosting
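
For the curious, here’s roughly what a daily rollup like the one described above could look like. The daily_product_versions table and its columns are assumptions on my part, not the actual Kernl schema:

```typescript
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Hypothetical nightly job: collapse yesterday's raw events into one row
// per product/version/day so charts read a handful of rows instead of
// scanning millions. Assumes a unique constraint on (product_id, version,
// day), which makes the job idempotent via ON CONFLICT.
async function aggregateDaily(): Promise<void> {
  await pool.query(`
    INSERT INTO daily_product_versions (product_id, version, day, hits)
    SELECT product_id, version, created_at::date AS day, COUNT(*) AS hits
    FROM events
    WHERE created_at::date = (now() - interval '1 day')::date
    GROUP BY product_id, version, created_at::date
    ON CONFLICT (product_id, version, day)
    DO UPDATE SET hits = EXCLUDED.hits
  `);
}
```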

Future Plans

Kernl Analytics is a work in progress and there is always room to improve. Future plans for the architecture side of analytics are:

  • Optimize Indexes – I feel that more speed can be coaxed out of Postgres with some better indexing strategies (first sketch below).
  • Writes -vs- Reads – Once I have a highly available setup for Postgres, I plan to split the responsibilities of writing and reading. Writes will go to the primary and reads will go to the secondary (second sketch below).
  • API – Right now the analytics API is completely private and firewalled off. Eventually I’d like to expose it to customers so that they can use it to do neat things.
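
To give a rough idea of the indexing work, a composite index matching the dominant read pattern is the sort of thing I have in mind. Again, the table and column names are hypothetical:

```typescript
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Hypothetical composite index for the common "one product over a date
// range" query. CONCURRENTLY avoids blocking the constant stream of
// writes while the index builds.
async function addEventIndex(): Promise<void> {
  await pool.query(`
    CREATE INDEX CONCURRENTLY IF NOT EXISTS events_product_created_idx
    ON events (product_id, created_at)
  `);
}
```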
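
And once a replica exists, the write/read split could be as simple as two connection pools. The connection strings here are placeholders, and none of this is running yet:

```typescript
import { Pool, QueryResult } from "pg";

// Hypothetical primary/replica split -- what the plan above might look
// like in practice, not current Kernl infrastructure.
const primary = new Pool({ connectionString: process.env.PRIMARY_URL });
const replica = new Pool({ connectionString: process.env.REPLICA_URL });

// INSERTs and other writes go to the primary...
const write = (sql: string, params?: unknown[]): Promise<QueryResult> =>
  primary.query(sql, params);

// ...while chart and report queries go to the read replica.
const read = (sql: string, params?: unknown[]): Promise<QueryResult> =>
  replica.query(sql, params);
```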

What’s New With Kernl – November 2016

It’s been a long time since the last Kernl update blog, so let’s get right into it.

Big Features

  • GitLab CI Support – You can now build your plugins and themes automatically on Kernl using GitLab.com! We’ve had support for GitHub and BitBucket for a long time, and finally figured out a good way to make things work for GitLab. See the documentation on how to get started.
  • Slack Build Integration – If you are a Slack user, you can now tell Kernl where to publish build status messages.
  • Replay Last Webhook – Sometimes when you’re running a CI service with Kernl it’s useful to retry the last push that Kernl received. You can now do that on the “Continuous Integration” page.

Minor Features

  • Repository Caching – We now do some minor caching of your git repositories on the Kernl front end. The first load will still reach out to the different git providers, but subsequent loads during your session will read from an in-memory cache instead (see the sketch after this list).
  • Better Webhook Log Links – Instead of displaying a UUID, the webhook build log now displays the name of the plugin or theme.
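
A minimal sketch of that style of cache, assuming a hypothetical fetchRepositories call and a made-up 5-minute lifetime:

```typescript
interface Repository {
  name: string;
  provider: "github" | "bitbucket" | "gitlab";
}

const TTL_MS = 5 * 60 * 1000; // assumed lifetime; the real value may differ
const cache = new Map<string, { expires: number; repos: Repository[] }>();

async function getRepositories(
  userId: string,
  fetchRepositories: (userId: string) => Promise<Repository[]>
): Promise<Repository[]> {
  const hit = cache.get(userId);
  if (hit && hit.expires > Date.now()) {
    return hit.repos; // subsequent loads read from memory
  }
  const repos = await fetchRepositories(userId); // first load hits the provider
  cache.set(userId, { expires: Date.now() + TTL_MS, repos });
  return repos;
}
```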

Other

  • Miscellaneous Upgrades – Underlying OS packages and Node.js packages were upgraded.
  • Payment Bug Fixes – A few minor bugs kept showing up when a customer’s credit card expired. These fixes hopefully allow for a more self-service approach.
  • Minor copy changes – A few changes were made to the wording on the Kernl landing page.

What’s next?

  • It’s been a few months since Ubuntu 16.04 LTS came out, so I’ll be spending a significant amount of time upgrading our infrastructure to the latest LTS version.
  • If our load balancer goes down right now, everything goes down with it. A floating IP address shared between two load balancers will solve that issue and provide high(er) availability.
  • Better insights into purchase code usage and activity.