Discourse in a Docker container
about 11 years ago
Deploying Rails applications is something we all struggle with.
You would think that, years along, we would have made some progress on this front, but no, deploying Rails is almost as complicated as it was back in the mongrel days.
Yes, we have Passenger. However, getting it installed and working with rvm/rbenv is a bit of a black art, and let us not mention daemonizing Sidekiq or Resque. Or, god forbid, configuring PostgreSQL.
This is why we often outsource the task to application-as-a-service providers.
Last week I decided to spend some time experimenting with Docker.
###What is Docker?
> an open source project to pack, ship and run any application as a lightweight container
The concept is that developers and sysadmins can author simple images, usually built from Dockerfiles, that provide a pristine state encapsulating an application. Docker uses all sorts of trickery to make authoring these images a painless experience, and it includes a central repo where users can share images.
Think of it as a VM without the performance penalty of having a VM. Docker containers run on the same kernel as the host, unvirtualized.
When a user launches a “container”, a private unique IP is provisioned and the process runs isolated. Docker will launch a single process inside the container; however, that process may spawn others.
Docker (today: version 0.6.5) is a front end that drives Linux LXC containers and uses a copy-on-write storage engine built on AUFS. It is the “glue” that gives you a simple API to deal with containers and optionally run them in the background, persistently.
Docker is built in golang, and has a very active community.
###Restrictions
Docker version 0.6.5 is still not deemed “production ready”. The technologies it wraps are considered production ready; however, the APIs are changing rapidly, with some radical changes to come later on.
There are plans to extract the AUFS support and probably use LVM thin provisioning as the preferred storage backend.
As it stands, the only OS the Docker team recommends for running Docker is Ubuntu 12.04.3 LTS (note: LTS ships with multiple kernels, and you need at least 3.8). I have had luck with Ubuntu 13.04; however, 13.10 does not work with Docker today (since it ships with an incompatible, alpha version of lxc). Additionally, you should be aware of a networking issue in VMs that affects kernel 3.8.
The AUFS dependency is the main reason for this tough restriction. However, I feel confident this is going away; Red Hat are banking on it.
###Security
It is very important to read through the LXC security document. Depending on your version of LXC, the root user inside a container may have global root privileges. This may not matter to you, or it may be critical, depending on your application and usage.
Additionally, file mounts are a mess. If you mount a location external to the Docker container using the `-v` option for `docker run`, permissions are all a bit crazy. UIDs inside Docker do not match UIDs outside of it, so for example:
View from the outside
View from inside the container.
There are plans to mitigate this problem. It can be worked around with NFS shares, avoiding mounts or synchronizing users and groups between containers and host.
###The 42 layer problem in AUFS
AUFS only supports 42 layers. It may seem like a lot, but you hit it very early when building complex images. Dockerfiles make it very easy to reuse work when building images. For example, say I am building an image and decide to add “one more thing”. When I add a new RUN command, Docker is smart enough to re-use all my previous work, so building the image is snappy. As a result, many Dockerfiles contain lots and lots of RUN commands, and each one adds a layer.
To circumvent this issue our base image is built as a single layer. When I am experimenting with changes I add them at the end of the file, eventually rolling them into the big shell command.
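The "rolling in" step can be mechanized. Below is a rough sketch, not the script used for the Discourse base image: a hypothetical `collapse_runs` helper that merges each run of consecutive RUN lines in a Dockerfile into a single RUN joined with `&&`. It assumes one command per RUN line with no line continuations.

```shell
#!/bin/sh
# Hypothetical helper: read a Dockerfile on stdin and merge each group of
# consecutive RUN lines into one RUN instruction joined with &&.
# Other instructions (FROM, EXPOSE, ...) are passed through in order.
collapse_runs() {
  awk '
    # Print the accumulated RUN commands, if any, as a single instruction.
    function emit() { if (cmds != "") { print "RUN " cmds; cmds = "" } }
    /^RUN / { sub(/^RUN /, ""); cmds = (cmds == "" ? $0 : cmds " && " $0); next }
    { emit(); print }
    END { emit() }
  '
}
```

Something like `collapse_runs < Dockerfile.dev > Dockerfile` (file names are placeholders) would then give you one layer per group of RUN lines instead of one per command.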
###Gotchas developing with Docker
When developing with Docker it is quite easy to accumulate a pile of images you never use, and containers that have long ago stopped and are disposable. It is fairly important to stay vigilant and keep cleaning up. Any complex docker environments are going to need a very clean process for eliminating unnecessary containers and images.
While developing I found myself running the following quite a lot:
docker rm `docker ps -a | grep Exit | awk '{ print $1 }'`
remove all containers that exited
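If you run this often, it may be worth wrapping it in a couple of functions. This is a minimal sketch with hypothetical function names; it assumes, like the one-liner above, that exited containers show "Exit" in the `docker ps -a` status column.

```shell
#!/bin/sh
# Print the IDs of exited containers from `docker ps -a` output on stdin.
# Skips the header row and matches "Exit" anywhere on the line.
exited_container_ids() {
  awk 'NR > 1 && /Exit/ { print $1 }'
}

# Remove every exited container; do nothing when there are none, so
# `docker rm` is never called without arguments.
remove_exited_containers() {
  ids=$(docker ps -a | exited_container_ids)
  # $ids is left unquoted on purpose so each ID becomes its own argument.
  [ -z "$ids" ] || docker rm $ids
}
```

Splitting the parsing from the removal makes it easy to check what would be deleted first: `docker ps -a | exited_container_ids`.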
###This blog is running on Docker
There has been a previous attempt to run Discourse under Docker by srid. However I wanted to take a fresh look at the problem and in a “trial-by-fire” come up with a design that worked for me.
Note, this setup is clearly not something we will be supporting externally or would like made official quite yet. However, it has an enormous amount of appeal and potential. After working through a Docker Discourse setup with our awesome sysadmin supermathie, he described it as “20% of the work” he usually does.
This is how you would work through it:
- Install Ubuntu 12.04.3 LTS
- sudo apt-get install git
- git clone https://github.com/discourse/discourse_docker.git
- cd discourse_docker, run ./launcher for instructions on how to install docker
- Install docker
- Modify the base template to suit your needs (standalone.yml.sample):
    # this is the base template, you should not change it
    template: "standalone.template.yml"
    # which ports to expose?
    expose:
      - "80:80"
      - "2222:22"
    params:
      # ssh key so you can log in
      ssh_key: YOUR_SSH_KEY
      # git revision to run
      version: HEAD
      # host name, required by Discourse
      database_yml:
        production:
          host_names:
            # your domain name
            - www.example.com
    # needed for bootstrapping, lowercase email
    env:
      DEVELOPER_EMAILS: 'my_email@email.com'
- Save it as, say, web.yml
- Run `sudo ./launcher bootstrap web` to create an image for your site
- Run `sudo ./launcher start web` to start the site
At this point you will have a Discourse site up and running, with sshd / nginx / postgresql / redis / unicorn running in a single container and runit ensuring all the processes keep running (though I still need to build in the monitoring bits).
At no point during this setup did you have to pick the redis and postgres version, or mess around with nginx config files. It was all scripted in a completely reproducible fashion.
###This solution is 100% transparent and hackable for other purposes
The launcher shell script has no logic regarding Discourse built in. Nor does pups, the YAML-based image bootstrapper inspired by Ansible. You can go ahead and adapt this solution to your own purposes and extend it as you see fit.
I took it on myself to create the most complex setup first; however, this can easily be adapted to run separate applications per container using the single base image. You may prefer to run PostgreSQL and Redis in a single container and the web in another, for example. The base image has all the programs needed, and copy-on-write makes storage cheap.
I elected to keep all persistent data outside of the container, that way I can always throw away a container and start again from scratch, easily.
###The importance of the sshd backdoor into the container
During my work with Docker I really wanted to be able to quickly log on to a container and mess about a bit. I am not alone.
A common technique to allow users direct access into a system container is to run a separate sshd inside the container. Users then connect to that sshd directly. In this way, you can treat the container just like you treat a full virtual machine where you grant external access. If you give the container a routable address, then users can reach it without using ssh tunneling.
###One process per container
Docker will only launch a single process per container. It is your responsibility to launch any other processes you need and take care of monitoring. This is why I picked runit as the ideal process for this task:
compare that to the 105000 VSZ and 18700 RSS memory bluepill takes
VSZ and RSS numbers this low are probably very foreign to today’s programmers. This is perfect for the task and makes orchestrating a container internally very simple. It takes care of dependencies so, for example, unicorn will not launch until Postgres and Redis are running.
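To give a feel for how that dependency handling works, here is a sketch of a bootstrap step that writes a runit `run` script for unicorn. It is not lifted from the Discourse template: the function name, the service names (`postgres`, `redis`), the `discourse` user and the `/var/www/discourse` path are all assumptions. The `sv start ... || exit 1` lines follow the dependency idiom from the runit documentation.

```shell
#!/bin/sh
# Hypothetical bootstrap helper: write a runit run script for unicorn into
# the given service directory. Service names, user and app path are
# assumptions, adjust them to your own image.
write_unicorn_run_script() {
  service_dir=$1
  mkdir -p "$service_dir"
  cat > "$service_dir/run" <<'EOF'
#!/bin/sh
exec 2>&1
# runit re-runs this script whenever it exits, so bailing out here simply
# retries until both dependencies report as up.
sv start postgres || exit 1
sv start redis || exit 1
cd /var/www/discourse
exec chpst -u discourse bundle exec unicorn -c config/unicorn.conf.rb
EOF
  chmod +x "$service_dir/run"
}
```

Calling it with the service directory for your runit setup (for example `write_unicorn_run_script /etc/service/unicorn`) leaves runit to supervise unicorn and to keep retrying until its dependencies are up.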
###The upgrade problem
Docker opens a bunch of new options when it comes to application upgrades. For example, you can bootstrap a new container with a new version, stop your old container and start the new one.
You can also enable seamless upgrades on a single machine using 4 containers: a db container, an haproxy container and 2 web containers. Just notify haproxy that a web container is going down, pull it out of rotation, upgrade that container and push it back into rotation.
Since we are running sshd in each container we can still use the traditional mechanisms of upgrade as well.
In more “enterprisey” setups you can run your own Docker registry. That way your CI machine can prep the images, and the deploy process simply pulls the image on each box, shuts down old containers and starts new ones. Distributing images is far more efficient and predictable than copying thousands of files with rsync each time you deploy.
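On each box, the pull / stop / start cycle could be as small as the sketch below. The `deploy_image` function, the image name and the container ID are placeholders, and the `80:80` port mapping just mirrors the sample template; it also assumes the image defines its own default command.

```shell
#!/bin/sh
# Hypothetical deploy step: replace a running container with one started
# from a freshly pulled image.
#   $1: image to pull, e.g. one pushed to your registry by CI
#   $2: ID of the container currently serving traffic
deploy_image() {
  image=$1
  old_container=$2
  # Pull first, so a failed download never takes the old container down.
  docker pull "$image" || return 1
  docker stop "$old_container"
  docker run -d -p 80:80 "$image"
}
```

There is a short window between the stop and the run where nothing is listening, which is exactly the gap the haproxy setup above is meant to close.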
###Why yet another ansible?
While working on the process I came up with my own DSL for bootstrapping my Discourse images. I purpose-built it to solve the main issues I was hitting with a simple shell script. Multiline replace is hard in awk and grep, the syntax is scary to some, and merging YAML files is not something you could really do that easily in a shell script.
pups makes these problems quite easy to solve:
    run:
      - replace:
          filename: "/etc/nginx/conf.d/discourse.conf"
          from: /upstream[^\}]+\}/m
          to: "upstream discourse {
            server 127.0.0.1:3000;
          }"
multiline regex replace for an nginx conf file
The DSL and tool live here: https://github.com/discourse/pups. Feel free to use it where you need. I picked it over Ansible because I wanted an exact fit for my problem.
The initial image was simple enough to fit in a Dockerfile; however, the process of bootstrapping Discourse is complex. You need to spin up background processes, do lots of fancy replacing and so on. You can see the template I am using for this site here: https://github.com/SamSaffron/discourse_docker/blob/master/standalone.template.yml
###The future
I see a very bright future for Docker. A huge ecosystem is forming, with new Docker-based applications launching monthly. For example, CoreOS, Deis and others are building businesses on top of Docker. OpenStack Havana supports Docker out of the box.
Many of the issues I have raised in this post are being actively resolved. Docker is far more than a pretty front end on the decade-old BSD jail concept. It is attempting to provide a standard we can all use, in dev and production, regardless of the OS we are running, allowing us to set up environments quickly and cleanly.