As a company that runs on Ruby on Rails, with hundreds of customers who use Rails to power their businesses, I thought this was a good opportunity to share our experience of moving our Ruby on Rails application to containers running on top of Kubernetes.
Cloud 66 has been a Rails company from the very first days. Our application stack consists of a main Rails app that we call Central, a MySQL backend, Redis and Memcached for persistent and non-persistent caching, Faye as our realtime push server, and four services written in Go: Delphi for service discovery, Starter (open source) for codebase analysis, TTY as a web based terminal emulator, and White Rabbit that powers LiveLogs.
We run multiple instances of Central for different purposes: Web and API, multiple Sidekiq Pro processes and Clockwork for scheduled jobs.
We also use RabbitMQ and InfluxDB alongside another three services written in Go for our metrics backend (the metrics charts you see on the server details page).
On the storage side, we used NFS NAS backed by AWS EBS for dynamic shared storage, and AWS S3 for static shared storage.
This makes a total of 22 services (!) that power Cloud 66.
Our infrastructure before moving to Kubernetes
Before moving to Kubernetes, we were running all of the above on AWS EC2. From an infrastructure point of view we had different server types: frontend, backend, Redis, Memcached, InfluxDB, RabbitMQ as well as ELB load balancers and AWS MySQL RDS spread across multiple availability zones and VPCs.
Why the move?
Our old infrastructure setup had worked for us for more than three years, but the biggest issue we had, which was exposed further by Kubernetes, was the speed of adding new services. Take our search for example: we wanted to use Sphinx for our search. This meant we had to set up Sphinx on a different set of servers, configure load balancing internally and externally, back the servers up, set up firewalls, and so on. While this is not a difficult task for our team, it is yet another infrastructure component for us to take care of. Ultimately we wanted to achieve two things with our move to Kubernetes:
- Have only one type of server (no snowflakes)
- Be able to add preconfigured components to our infrastructure quickly and easily
Right before our move, Cloud 66 was running on ~40 servers on AWS EC2, eight ELB load balancers and four RDS servers.
Getting ready for the move
As a company dealing with containers and Kubernetes every day, we had a good idea as to how to build a production Kubernetes cluster and how to manage it. However, the biggest discussion was whether to break up our monolithic Rails application into smaller services.
After much deliberation, we decided to keep our application setup the same. Rails is built to act as a single entity taking care of a lot of what you need for a web application, from routing to domain objects, ORM, caching and much more. To make sense, a refactored Rails application would have had to be made up of smaller Ruby services (with a smaller framework focused on building APIs, like Grape on top of Sinatra for example), but that would leave a lot of duplication in each service, which would go against the conventions Rails is so strongly built on. There was no visible cost efficiency either: what we needed was more flexibility on the infrastructure side. That meant keeping our Rails application as it was.
To start the move, we built a base Kubernetes configuration file, defining all of our services internally (including databases, storage and so on) as Kubernetes services so we could run everything locally or on a single node cluster and make it work.
Building the configuration
Kubernetes configuration files can be tricky to understand at the beginning, but the good news is that they are uniform and follow a single pattern. Also, while Kubernetes is always changing, you can rely on the quality and consistency of the project's documentation to a great degree.
We created 11 configuration files for our project:
The names of the files are mostly self explanatory: setup builds the k8s namespace and any basic infrastructure requirements like storage classes, security configures the RBAC, secrets hold application configuration values, etc.
While the files are checked into our git repository, the secrets and security files are encrypted (later we moved our configuration to Gifnoc and used KMS as our Gifnoc secret backend).
The files have numerical prefixes to help with execution order. While it is possible to run the same k8s configuration file multiple times, we wanted to build some order around it to avoid false errors caused by the first runs (like when the namespace is not created yet but deployments are being configured).
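The effect of the numerical prefixes can be sketched in a few lines of Ruby. This is only an illustration: the file names below are hypothetical, and the helper simply sorts by the leading number (so that 10 comes after 2) and builds one kubectl apply command per file.

```ruby
# Sort manifest files by their numeric prefix. Files without a prefix
# sort last, then alphabetically. File names are hypothetical.
def ordered_manifests(filenames)
  filenames.sort_by do |name|
    base = File.basename(name)
    prefix = base[/\A\d+/]
    [prefix ? prefix.to_i : Float::INFINITY, base]
  end
end

# Build one "kubectl apply -f" command per file, in execution order.
def apply_commands(filenames)
  ordered_manifests(filenames).map { |f| ["kubectl", "apply", "-f", f] }
end

files = ["10_secrets.yml", "2_security.yml", "1_setup.yml"]
apply_commands(files).each { |cmd| puts cmd.join(" ") }
```

Sorting numerically rather than alphabetically matters once there are ten or more files, since a plain string sort would place 10_secrets.yml before 2_security.yml.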
Dealing with Rails specific issues
To get an end-to-end deployment of the application going we had to solve some issues around the following Rails areas:
- Database migration
- Asset pipeline compilation
- Sidekiq restarts
- Unicorn memory and thread usage
Having relied on Capistrano for our old deployments, we were used to it running our DB migrations. With Kubernetes, since we never have backward-incompatible migrations, running DB migrations before a deployment wasn't an issue. To run the migrations we simply wrote a Kubernetes job that takes in the Rails codebase and runs rake db:migrate before each deployment. If you have backward-incompatible migrations (you really shouldn't), you need to shut down operations before running your migrations.
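As a rough sketch of what such a migration job might look like, the Ruby below generates a Kubernetes Job manifest. The image name, namespace and job naming scheme are placeholders chosen for illustration, not our actual configuration.

```ruby
require "yaml"

# Build a Kubernetes Job manifest that runs the migrations once.
# Image, namespace and job name are hypothetical placeholders.
def migration_job(image:, git_ref:, namespace: "central")
  {
    "apiVersion" => "batch/v1",
    "kind" => "Job",
    "metadata" => { "name" => "db-migrate-#{git_ref}", "namespace" => namespace },
    "spec" => {
      "backoffLimit" => 0, # do not retry a failed migration automatically
      "template" => {
        "spec" => {
          "restartPolicy" => "Never",
          "containers" => [
            {
              "name" => "migrate",
              "image" => "#{image}:#{git_ref}",
              "command" => ["bundle", "exec", "rake", "db:migrate"]
            }
          ]
        }
      }
    }
  }
end

puts migration_job(image: "registry.example.com/central", git_ref: "abc1234").to_yaml
```

Naming the job after the git ref gives each deployment its own job object, so an old completed job does not clash with the new one.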
Asset Pipeline Compilation
This was a bit more tricky as Capistrano rolls back your deployment if assets fail to compile during a deployment, by keeping the old deployment folder and the current folder symlink the same.
To solve this, we moved the compilation of our assets to our image building flow (more on our image building flow later). This meant that all assets are compiled into the images. We don't use huge assets, so this is fine for us. If you have large assets, you might want to push them up to S3 or a CDN as part of your image build flow.
Sidekiq restarts were probably the trickiest part. Our Sidekiq jobs can take up to an hour to finish. We solved the deployment issues with Sidekiq in our old infrastructure by creating a flow that would allow us to run multiple worker processes for each queue and hit them with a kill signal that would make them finish their current jobs. Sidekiq processes that were drained were then automatically cleaned up.
This isn't easy to do in Kubernetes for several reasons:
- You need to configure k8s to wait considerably longer before it force-kills a process that has not yet exited, so that a Sidekiq process is not shut down before its job is done.
- The issue becomes worse if you are using k8s sidecar containers in Sidekiq pods. For example if you are running your cluster on Google's GKE with CloudSQL (Google's managed database service), you need to have a CloudSQL Proxy container next to each Sidekiq container so it can access your database. This means while you might be able to keep Sidekiq from dying when k8s wants to kill it until it's finished, the kill signal will be caught by the CloudSQL Proxy sidecar and it will dutifully exit, leaving your Sidekiq in the middle of a job and with no DB access. To solve this we had to write a shim that controls the behaviour of CloudSQL Proxy until Sidekiq is done.
We have written about this in detail: Kubernetes: (Graceful) Sidekiq Worker Lifecycle.
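For the first point, a minimal sketch of the relevant pod settings is shown below, generated from Ruby. It assumes the one hour job length mentioned above; the image name, queue name and the 60 second margin are illustrative choices. It does not address the sidecar problem, which needs the shim described above.

```ruby
require "yaml"

# Longest job we expect, per the text: about an hour.
MAX_JOB_SECONDS = 3600
# Extra time for Sidekiq itself to exit. The margin is an arbitrary choice.
SHUTDOWN_MARGIN_SECONDS = 60

# Build the pod spec fragment for a Sidekiq worker.
def sidekiq_pod_spec(image:, queue:)
  {
    # Time k8s waits between SIGTERM and SIGKILL for the pod's containers.
    "terminationGracePeriodSeconds" => MAX_JOB_SECONDS + SHUTDOWN_MARGIN_SECONDS,
    "containers" => [
      {
        "name" => "sidekiq",
        "image" => image,
        # -t is Sidekiq's shutdown timeout: how long it waits for busy
        # workers to finish after receiving SIGTERM.
        "command" => ["bundle", "exec", "sidekiq", "-q", queue, "-t", MAX_JOB_SECONDS.to_s]
      }
    ]
  }
end

puts sidekiq_pod_spec(image: "registry.example.com/central:abc1234", queue: "default").to_yaml
```

The grace period is kept slightly longer than the Sidekiq timeout so that Sidekiq, not k8s, decides when the process ends.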
Unicorn memory and thread model
Before moving to Kubernetes we were using Unicorn. We had one Nginx on each frontend server that fed Unicorn through UNIX sockets. Unicorn was in turn responsible for processing the requests in parallel using multiple child processes. Moving to Kubernetes, this didn't make sense as we could run multiple instances of Unicorn, which in turn would spin off multiple processes, potentially causing high memory usage. On top of that, we didn't want to have to influence the k8s scheduler to put only one frontend pod on each cluster server in order to avoid uneven load on the servers. To resolve these issues, we moved from Unicorn to Puma, which can use threads instead of processes and is therefore better suited to our setup.
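A threads-only Puma setup could be sketched as follows. The environment variable names and default values are assumptions made for this example and are not the values we run in production.

```ruby
# Derive Puma settings from environment variables. Variable names and
# defaults are assumptions for illustration only.
def puma_settings(env = ENV)
  max_threads = Integer(env.fetch("RAILS_MAX_THREADS", "5"))
  min_threads = Integer(env.fetch("RAILS_MIN_THREADS", max_threads.to_s))
  {
    threads: [min_threads, max_threads],
    # 0 workers means a single process that serves requests with threads.
    workers: Integer(env.fetch("WEB_CONCURRENCY", "0"))
  }
end

# In config/puma.rb this could be used as:
#   settings = puma_settings
#   threads(*settings[:threads])
#   workers settings[:workers]

p puma_settings("RAILS_MAX_THREADS" => "16", "RAILS_MIN_THREADS" => "4")
```

With one process per pod, scaling the web tier becomes a matter of changing the replica count rather than tuning child processes inside each container.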
One of the biggest challenges we had during our move was the use of dynamic shared storage. Our old setup relied on the availability of a shared folder between certain servers. This was not possible with k8s, as running NFS NAS reliably in a container is not easy. We didn't want to mount block storage onto a single pod and share it with other pods from there, as that pod would become a single point of failure. Before the move we could scale our NFS storage, but with containers this was going to be tricky. We also didn't like the idea of running NFS inside a container as we had seen issues with the Docker file system when running NFS. In the end we decided to do a bit of refactoring and remove the need for a shared storage folder altogether.
Once we had the entire system working on a single node cluster we moved to solving the deployment problem.
One thing you notice when using containers and Kubernetes is that spinning up new instances of your application becomes much easier. You can deploy an entire stack in a matter of seconds, so we started thinking: how can we benefit from that? The upsides were obvious: feature specific deployments, a full stack for each branch and PR, and so on. But there were obvious challenges too.
One side effect of deploying multiple instances of your application is that the role of environments becomes less clear: all of a sudden you might have three stacks all running on what Rails sees as staging which means the configurations are read from the same section of production.rb; this can complicate things.
To solve these issues we developed the concept of Formations: a Formation is a single instance of an application deployed to a cluster. Think of it as a combination of environment and deployment destination. We can give these Formations unique names and build our configuration files and secrets around those instead of environments, thus making our deployment configurations more fine-grained.
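To make the idea concrete, here is a small sketch of how a Formation could be modelled in Ruby. The field names and the key format are illustrative assumptions and do not reflect how our tools store Formations.

```ruby
# A Formation pairs a Rails environment with a deployment destination
# under a unique name. Field names are illustrative.
Formation = Struct.new(:name, :rails_env, :cluster, :namespace, keyword_init: true) do
  # Configuration and secrets are looked up by Formation name, not by
  # Rails environment, so two staging stacks do not share settings.
  def config_key(setting)
    "#{name}/#{setting}"
  end
end

formations = [
  Formation.new(name: "staging-main", rails_env: "staging",
                cluster: "test-cluster", namespace: "staging-main"),
  Formation.new(name: "staging-pr-42", rails_env: "staging",
                cluster: "test-cluster", namespace: "staging-pr-42")
]

formations.each { |f| puts "#{f.rails_env}: #{f.config_key('database_url')}" }
```

Both stacks report the same Rails environment, yet each resolves its own database URL because the lookup key starts with the Formation name.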
Another challenge in deploying to Kubernetes is generation and upkeep of its configuration files. We needed a process that works both for creation of new k8s configuration files as well as managing their change. We wanted to put our secrets into k8s configuration files and check them into our git repository without worrying about them being compromised. We also wanted to ensure certain policies are applied when k8s configuration files are changed: for example we wanted to make sure all of our pods have the CloudSQL Proxy sidecar container and a certain set of volumes, environment variables and configurations when deployed to a production environment while we didn't have the same requirements for our staging environment.
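The kind of policy described here can be illustrated with a short Ruby check. This is not Copper's syntax, only a sketch of the idea: it assumes the sidecar container is named cloudsql-proxy (a hypothetical name) and reports every Deployment whose pod template lacks it.

```ruby
require "yaml"

REQUIRED_SIDECAR = "cloudsql-proxy" # hypothetical container name

# Return the names of Deployments whose pod template lacks the sidecar.
def missing_sidecar(manifest_yaml, sidecar: REQUIRED_SIDECAR)
  deployments = YAML.load_stream(manifest_yaml).compact
                    .select { |doc| doc["kind"] == "Deployment" }
  offenders = deployments.reject do |doc|
    containers = doc.dig("spec", "template", "spec", "containers") || []
    containers.any? { |c| c["name"] == sidecar }
  end
  offenders.map { |doc| doc.dig("metadata", "name") }
end

manifests = <<~MANIFESTS
  kind: Deployment
  metadata:
    name: web
  spec:
    template:
      spec:
        containers:
          - name: web
          - name: cloudsql-proxy
  ---
  kind: Deployment
  metadata:
    name: sidekiq
  spec:
    template:
      spec:
        containers:
          - name: sidekiq
MANIFESTS

puts missing_sidecar(manifests).inspect
```

A check like this would run only for production Formations, since the same requirement does not apply to staging.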
We soon realised that our existing tooling was not adequate for all of these challenges. We needed an entire toolchain for generating, modifying, controlling and applying k8s configuration files.
The result of all this was two open source projects and one internal command line tool: Gifnoc, an open source configuration provider that also works with secrets and is well suited for Formations; Copper, an open source tool to ensure policy compliance for Kubernetes configuration files; and OPX, a command line tool originally written in Ruby with Thor to generate, modify and apply configurations for our deployments. I am happy to announce that this week we released OPX as an addition to Cloud 66 Skycap with support for Formations, Copper policies and much more!
To build our containers we use our own BuildGrid. Each git commit triggers a build: the code for all services is pulled in and built into Docker images. All non-built images (like Redis) are pulled and retagged with the git ref of the Central repository. With all the images built and pushed into our private Docker registry, we use a workflow to pull the images, generate the k8s configuration files, inject them with secrets from Gifnoc on a secure server and apply them to our Kubernetes cluster.
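The retagging step for non-built images can be sketched like this. The registry host is a placeholder, and the helper only builds the standard docker pull, tag and push commands rather than running them; it is not how BuildGrid is implemented.

```ruby
# Build the docker commands that retag a pulled, non-built image with the
# git ref of the main repository. The registry host is a placeholder.
def retag_commands(source_image, git_ref, registry: "registry.example.com")
  # Strip any registry path and the original tag to get the bare name.
  name = source_image.split("/").last.split(":").first
  target = "#{registry}/#{name}:#{git_ref}"
  [
    ["docker", "pull", source_image],
    ["docker", "tag", source_image, target],
    ["docker", "push", target]
  ]
end

retag_commands("redis:5.0", "abc1234").each { |cmd| puts cmd.join(" ") }
```

Tagging every image with the same git ref means one ref identifies a complete, consistent set of images for a deployment.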
Migrating from our old infrastructure to the new one took a couple of months of planning to reduce the downtime needed. Since we were moving away from AWS, we needed to change our DNS records and move our MySQL data, which would have meant some downtime. The move itself was smooth as we had had our staging environment running on Kubernetes for a while by then. However, a hardware issue on one of the nodes of our cluster meant our Redis servers had an intermittent issue that depended on which node Redis was scheduled to run on. This caused some headaches, but eventually replacing the faulty node with another one fixed the issue.
We've been running on Kubernetes for about a year now and we couldn't be happier with our decision. Even when running a monolithic app, we are now down to ~10 servers from ~40 (bigger ones, but much more efficient and better utilized), which reduced our infrastructure bills by a significant margin; also, none of them are snowflakes. But more important than that, we now have a much more flexible and uniform infrastructure setup, which lets us rest much better at night!