A status page is a necessity at Cloud 66. Our customers need to know whether all systems are up and running, and when there are any issues.
When building our status page, we had these requirements:
- Highly available. A status page that goes down along with your site is of no use.
- Simple to update from anywhere.
- Parsable by machines. We wanted our users to be able to parse the status page automatically, without resorting to page scraping, so they can integrate it into their own systems.
- Flexible structure so we can add and remove sections for sub-system statuses in seconds.
There are many solutions for this. Our main objective was the
highest availability at the lowest cost, in a system we could update
from anywhere in case of an issue.
In the end we settled on a simple JSON-backed web page, rendered on the client using Handlebars and hosted on Amazon S3.
This ticks all the boxes:
We couldn’t beat S3’s SLA no matter how hard we tried. Having the page backed by JSON meant we could edit it with a simple Ruby script: the script downloads the JSON, updates it and uploads it back to S3.
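The middle step of that download, update, upload cycle can be sketched in a few lines of Ruby. This is a minimal illustration, not our actual script: the `sections`, `status`, `message` and `updated_at` keys are an assumed schema, and `update_status` is a hypothetical helper name. The S3 download and upload are left out of the runnable part and only hinted at in comments.

```ruby
require "json"
require "time"

# Assumed (hypothetical) document shape:
#   { "updated_at": "...", "sections": { "api": { "status": "up", "message": "..." } } }
#
# Takes the raw JSON text, sets the status of one section (creating the
# section if it does not exist yet) and returns the new JSON text.
def update_status(raw_json, section, status, message = nil, now: Time.now.utc)
  doc = JSON.parse(raw_json)
  doc["sections"] ||= {}
  entry = doc["sections"][section] ||= {}
  entry["status"] = status
  entry["message"] = message if message
  doc["updated_at"] = now.iso8601
  JSON.pretty_generate(doc)
end

# A real script would wrap this call with an S3 download before it and an
# S3 upload after it, for example via the aws-sdk-s3 gem. The bucket name,
# object key and credentials are deployment-specific and not shown here.
```

Calling `update_status(raw, "api", "degraded", "Investigating slow deploys")` returns the full document with only that section and the timestamp changed, so the result can be uploaded back as-is.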
It also means our users can download the JSON directly and use it as a machine-readable data source for our status.
If we need to add a new section to the status, we simply add it to the
JSON once and it will be available and editable going forward.
Handlebars renders this JSON data structure on the client side into the simple and elegant web page you see at status.cloud66.com.
As for machine readability, you can try:
curl http://status.cloud66.com/status.json
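The same file can be consumed from a program instead of the command line. The Ruby sketch below fetches the JSON and lists every section that is not healthy. The `sections` and `status` keys and the `"up"` value are assumptions about the schema made for illustration, and `fetch_status` and `degraded_sections` are hypothetical helper names; check the real `status.json` for the actual field names before relying on it.

```ruby
require "json"
require "net/http"
require "uri"

STATUS_URL = "http://status.cloud66.com/status.json"

# Downloads the raw status JSON. This makes a network call, so it only
# runs when invoked explicitly.
def fetch_status(url = STATUS_URL)
  Net::HTTP.get(URI(url))
end

# Returns the names of all sections whose status differs from the healthy
# value. Assumes a hypothetical shape like:
#   { "sections": { "api": { "status": "up" }, "dns": { "status": "down" } } }
def degraded_sections(raw_json, healthy: "up")
  sections = JSON.parse(raw_json).fetch("sections", {})
  sections.reject { |_name, info| info["status"] == healthy }.keys
end

# Example usage (performs a live request):
#   problems = degraded_sections(fetch_status)
#   puts problems.empty? ? "All systems up" : "Issues in: #{problems.join(', ')}"
```

Keeping the parsing separate from the download makes it easy to plug the check into an existing monitoring job or to test it against a saved copy of the file.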
You can find the source code for the page, JS and the updating script
here.