
How We Do Databases

Flynn is an open source platform for deploying and managing web applications and databases. You can run it yourself or we can manage it for you.

When we created Flynn, our goal was to build a single platform that runs everything your apps need. One of the most important and most challenging parts of that vision was managing databases.

Setting up and managing databases is one of the hardest things to do in ops. Most databases are difficult to install and configure, especially if you’re trying to use best practices like high availability and automatic failover.

Configuring databases by hand is hard, but automating them so they can manage themselves is much trickier. There are a number of Database as a Service providers out there, but most are closed source and lock you into a single infrastructure provider. For databases to work in Flynn, they needed to be completely self-managing, quick and easy to set up, and able to run anywhere.

We believe the most important job of a database is to keep data safe. When an application tells a user it’s received input, the database needs to hold onto it, no matter what. Unfortunately, it’s all too easy to run databases in configurations that lose data when things go wrong, and at scale things always go wrong. Hard drives break, networks partition, uninterruptible power supplies fail, and so on.

We built Flynn’s database appliances to survive those kinds of failures and keep users’ data safe.

[Figure: state machine diagram]

Here’s how we do it:

  • High Availability: Flynn’s PostgreSQL, MariaDB, and MongoDB appliances are highly available by default. That means there are always multiple copies of the database running and ready to go. So if something happens to one, the others are ready to take over. As long as you are running three or more servers, Flynn runs multiple copies of the database on different machines, just in case.

  • State Machine: We use a state machine based on Joyent’s Manatee, so there’s a chain of replicas: a primary, a synchronous replica, and an asynchronous replica. When an application sends a write to a Flynn database appliance, the appliance confirms the write only after it has been written to both the primary and the synchronous replica. If the primary goes down, the synchronous replica is promoted to primary, the asynchronous replica is promoted to synchronous replica, and a new instance is created to become the asynchronous replica. This “chain” means that there’s never any question about which replica should take over.

  • Automatic Failover: Flynn’s health check and service discovery services monitor each of the instances in the database appliance’s state machine. When an instance fails, Flynn automatically promotes the remaining instances and creates a replacement. Because this failover is automatic, it happens nearly instantaneously, much faster than a human operator who receives a page and has to investigate what happened.

  • Whole Cluster Backups: Flynn manages your applications and databases on the same cluster, so the command that takes a cluster backup includes the databases. It takes only two commands to back up a running cluster and restore it somewhere else.

  • Cloud Independence: Flynn doesn’t use any cloud-specific features to run or manage databases. So a Flynn cluster using PostgreSQL and MongoDB runs the same on AWS as it does on DigitalOcean or your own hardware in a colo. Your product and data stay portable, so your options stay open.
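
The replica chain from the State Machine item above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this post, not Flynn’s or Manatee’s actual code: the Chain class, the write_confirmed and fail_over functions, and the db-1 through db-4 instance names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Chain:
    """Hypothetical model of the replica chain: primary, then sync, then async."""
    primary: str
    sync: str
    async_replica: str


def write_confirmed(chain: Chain, stored_on: Iterable[str]) -> bool:
    """A write is confirmed only once both the primary and the
    synchronous replica hold it."""
    holders = set(stored_on)
    return chain.primary in holders and chain.sync in holders


def fail_over(chain: Chain, new_instance: str) -> Chain:
    """Primary lost: each survivor moves up one slot, and a newly
    created instance joins at the end as the asynchronous replica."""
    return Chain(
        primary=chain.sync,
        sync=chain.async_replica,
        async_replica=new_instance,
    )


if __name__ == "__main__":
    chain = Chain(primary="db-1", sync="db-2", async_replica="db-3")
    # Held by the primary only: not yet confirmed to the application.
    print(write_confirmed(chain, ["db-1"]))
    # Held by primary and synchronous replica: confirmed.
    print(write_confirmed(chain, ["db-1", "db-2"]))
    # db-1 is lost, so db-2 takes over and db-4 is created.
    print(fail_over(chain, "db-4"))
```

Under this model, every write that was confirmed to the application is already on the synchronous replica, so the instance that gets promoted holds all confirmed data. The fixed ordering is also why the promotion needs no election: the next primary is always the current synchronous replica.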

We have a lot more features and improvements in the works for how Flynn handles databases. There’s already a solid foundation, but we want running all your production databases (and applications) in Flynn to be a no-brainer. You can subscribe to the GitHub issue or sign up for our mailing list to follow our progress.
