Discourse in a Docker container

over 10 years ago

Deploying Rails applications is something we all struggle with.

You would think that, years on, we would have made some progress on this front, but no: deploying Rails is almost as complicated as it was back in the mongrel days.

Yes, we have Passenger; however, getting it installed and working with rvm/rbenv is a bit of a black art, and let us not mention daemonizing Sidekiq or Resque. Or, god forbid, configuring PostgreSQL.

This is why we often outsource the task to application-as-a-service providers.

Last week I decided to spend some time experimenting with Docker.

###What is Docker?

Docker describes itself as "an open source project to pack, ship and run any application as a lightweight container".

The concept is that developers and sysadmins can author simple images, usually using Dockerfiles, that provide a pristine state encapsulating an application. Docker uses all sorts of trickery to make authoring these images a painless experience, and it offers a central repo where users can share images.

Think of it as a VM without the performance penalty of having a VM. Docker containers run unvirtualized, on the same kernel as the host.

When a user launches a “container” a private unique IP is provisioned and the process runs isolated. Docker will launch a single process inside the container, however that process may spawn others.

Docker (today: version 0.6.5) is a front end that drives Linux LXC containers and uses a copy-on-write storage engine built on AUFS. It is the “glue” that gives you a simple API to deal with containers and optionally run them in the background, persistently.

Docker is written in Go and has a very active community.

###Restrictions

Docker version 0.6.5 is still not deemed “production ready”. The technologies it wraps are considered production ready; however, the APIs are changing rapidly, with some radical changes to come later on.

There are plans to extract the AUFS support and probably use LVM thin provisioning as the preferred storage backend.

As it stands, the only OS the Docker team recommends for running Docker is Ubuntu LTS 12.04.03 (note: LTS ships with multiple kernels; you need at least 3.8). I have had luck with Ubuntu 13.04; however, 13.10 does not work with Docker today (since it ships with an incompatible alpha version of lxc). Additionally, you should be aware of a networking issue in VMs that affects kernel 3.8.

The AUFS dependency is the main reason for this tough restriction; however, I feel confident it is going away, as Red Hat are banking on it.

###Security

It is very important to read through the LXC security document. Depending on your version of LXC, the root user inside a container may have global root privileges. This may not matter to you, or it may be very critical, depending on your application and usage.

Additionally, file mounts are a mess: if you mount a location external to the Docker container using the -v option of docker run, permissions are all a bit crazy. UIDs inside the container do not match UIDs outside of it, so for example:

View from the outside

View from inside the container.

There are plans to mitigate this problem. It can be worked around with NFS shares, avoiding mounts or synchronizing users and groups between containers and host.
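
Because ownership is stored as a number, one way to see the mismatch is to compare numeric IDs rather than user names. This is a minimal sketch, assuming a POSIX shell is available on both sides; `owner_uid` is a hypothetical helper, not part of Docker.

```shell
#!/bin/sh
# owner_uid PATH: print the numeric UID that owns PATH.
# "ls -n" shows numeric IDs instead of names, so the result does not
# depend on the /etc/passwd of whichever side (host or container) runs it.
owner_uid() {
  ls -ldn "$1" | awk '{ print $3 }'
}

# Hypothetical usage: run the same two lines on the host and inside the
# container against the same mounted directory, then compare the results.
#   owner_uid /path/to/mounted/dir
#   id -u
```

Without any UID remapping the number on the file should be the same from both sides; what differs is which user name, if any, that number resolves to.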

###The 42 layer problem in AUFS

AUFS only supports 42 layers. It may seem like a lot, but you hit it very early when building complex images. Dockerfiles make it very easy to reuse work when building images. For example, say I am building an image and decide to add “one more thing”. When I add a new RUN command, Docker is smart enough to re-use all my previous work, so building the image is snappy. As a result many Dockerfiles contain lots and lots of RUN commands.

To circumvent this issue our base image is built as a single layer. When I am experimenting with changes I add them at the end of the file, eventually rolling them into the big shell command.
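
A rough way to see how close a Dockerfile is getting to the limit is to count its RUN instructions. The sketch below assumes each RUN adds one layer; other instructions add layers as well, so treat the result as a lower bound. `count_run_layers` is a hypothetical helper.

```shell
#!/bin/sh
# count_run_layers FILE: count the RUN instructions in a Dockerfile.
# An instruction continued with trailing backslashes counts once, because
# only its first line starts with RUN. Commented-out lines are ignored.
# Note: grep -c still prints 0 when nothing matches, but exits non-zero.
count_run_layers() {
  grep -c -E '^[[:space:]]*RUN[[:space:]]' "$1"
}

# Hypothetical usage:
#   count_run_layers Dockerfile
```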

###Gotchas developing with Docker

When developing with Docker it is quite easy to accumulate a pile of images you never use, and containers that have long ago stopped and are disposable. It is fairly important to stay vigilant and keep cleaning up. Any complex docker environments are going to need a very clean process for eliminating unnecessary containers and images.

While developing I found myself running the following quite a lot:

docker rm `docker ps -a  | grep Exit | awk '{ print $1 }'`

remove all containers that exited
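
When no containers have exited, the one-liner above ends up calling docker rm with no arguments. A slightly more defensive variant is sketched here, assuming GNU xargs is available for the `-r` flag; `exited_ids` and `remove_exited` are hypothetical helper names.

```shell
#!/bin/sh
# exited_ids: read `docker ps -a` output on stdin and print the IDs
# (first column) of the rows whose text contains "Exit".
exited_ids() {
  awk '/Exit/ { print $1 }'
}

# remove_exited: delete exited containers; does nothing when there are none.
# "xargs -r" (GNU xargs) skips running the command on empty input.
remove_exited() {
  docker ps -a | exited_ids | xargs -r docker rm
}
```

Keeping the filter separate from the removal makes it easy to preview what would be deleted with `docker ps -a | exited_ids` before running `remove_exited`.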

###This blog is running on Docker

There has been a previous attempt to run Discourse under Docker by srid. However I wanted to take a fresh look at the problem and in a “trial-by-fire” come up with a design that worked for me.

Note, this setup is clearly not something we will be supporting externally or would like made official quite yet; however, it has an enormous amount of appeal and potential. After working through a Docker Discourse setup with our awesome sysadmin supermathie, he described it as “20% of the work” he usually does.

This is how you would work through it

Install Ubuntu 12.04.03 LTS
sudo apt-get install git
git clone https://github.com/discourse/discourse_docker.git
cd discourse_docker, then run ./launcher for instructions on how to install Docker
Install docker
Modify the base template to suit your needs (standalone.yml.sample):

# this is the base template, you should not change it
template: "standalone.template.yml"
# which ports to expose?
expose:
  - "80:80"
  - "2222:22"

params:
  # ssh key so you can log in
  ssh_key: YOUR_SSH_KEY
  # git revision to run
  version: HEAD


  # host name, required by Discourse
  database_yml:
    production:
      host_names:
        # your domain name
        - www.example.com


# needed for bootstrapping, lowercase email
env:
  DEVELOPER_EMAILS: 'my_email@email.com'

Save it as, say, web.yaml
Run sudo ./launcher bootstrap web to create an image for your site
Run sudo ./launcher start web to start the site

At this point you will have a Discourse site up and running, with sshd / nginx / postgresql / redis / unicorn running in a single container and runit ensuring all the processes keep running (though I still need to build in some monitoring bits).

At no point during this setup did you have to pick the redis and postgres version, or mess around with nginx config files. It was all scripted in a completely reproducible fashion.

###This solution is 100% transparent and hackable for other purposes

The launcher shell script has no logic regarding Discourse built in. Nor does pups, the yaml based image bootstrapper inspired by ansible. You can go ahead and adapt this solution to your own purposes and extend as you see fit.

I took it on myself to create the most complex setup first; however, this can easily be adapted to run separate applications per container using the single base image. You may prefer to run PostgreSQL and Redis in a single container and the web in another, for example. The base image has all the programs needed, and copy-on-write makes storage cheap.

I elected to keep all persistent data outside of the container, that way I can always throw away a container and start again from scratch, easily.
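
Keeping data outside the container means mounting host directories with -v, and those directories need to exist before the container starts. A minimal sketch of building the mount arguments follows; `volume_args` is a hypothetical helper, the container-side path and image name are placeholders, and paths containing spaces are not handled.

```shell
#!/bin/sh
# volume_args HOST:GUEST...: make sure each host directory exists, then
# print the matching "-v HOST:GUEST" arguments for docker run.
volume_args() {
  args=""
  for pair in "$@"; do
    host=${pair%%:*}              # text before the first colon
    mkdir -p "$host" || return 1  # create the host side if it is missing
    args="$args -v $pair"
  done
  printf '%s\n' "${args# }"       # drop the leading space
}

# Hypothetical usage (container path and image name are placeholders):
#   docker run $(volume_args /var/docker/data:/shared) some/image
```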

###The importance of the sshd backdoor into the container

During my work with Docker I really wanted to be able to quickly log on to a container and mess about a bit. I am not alone.

A common technique to allow users direct access into a system container is to run a separate sshd inside the container. Users then connect to that sshd directly. In this way, you can treat the container just like you treat a full virtual machine where you grant external access. If you give the container a routable address, then users can reach it without using ssh tunneling.

###One process per container

Docker will only launch a single process per container; it is your responsibility to launch any other processes you need and to take care of monitoring. This is why I picked runit as the ideal process for this task:

compare that to the 105000 VSZ and 18700 RSS memory bluepill takes

VSZ and RSS numbers this low are probably very foreign to today’s programmers. runit is perfect for this task and makes orchestrating a container internally very simple. It takes care of dependencies so, for example, unicorn will not launch until Postgres and Redis are running.
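
The dependency handling can be sketched as a small check at the top of a service's run script. This is an illustration under assumptions, not the script used in the image: the service names, the user, the paths and the `deps_ready` helper are hypothetical, and it relies on my understanding that `sv check` exits non-zero when a service is not yet in its requested state.

```shell
#!/bin/sh
# deps_ready SERVICE...: succeed only if every named service passes the check.
# CHECK defaults to runit's "sv check"; it is overridable for other setups.
CHECK=${CHECK:-"sv check"}
deps_ready() {
  for svc in "$@"; do
    $CHECK "$svc" >/dev/null 2>&1 || return 1
  done
}

# Hypothetical /etc/service/unicorn/run built on it. If the script exits,
# runsv runs it again shortly afterwards, so unicorn keeps waiting until
# both dependencies are up:
#   deps_ready postgres redis || exit 1
#   cd /var/www/discourse
#   exec chpst -u discourse bundle exec unicorn -c config/unicorn.conf.rb
```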

###The upgrade problem

Docker opens a bunch of new options when it comes to application upgrades. For example, you can bootstrap a new container with a new version, stop your old container and start the new one.

You can also enable seamless upgrades on a single machine using 4 containers: a db container, an haproxy container and 2 web containers. Just notify haproxy that a web is going down, pull it out of rotation, upgrade that container and push it back into rotation.
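
The ordering matters more than the individual commands: only one web container is out of rotation at a time, and a failure stops the rollout. A sketch of that loop is below; `drain`, `upgrade` and `restore` are hypothetical hooks that only print what they would do, to be replaced with the real haproxy and container commands for your setup.

```shell
#!/bin/sh
# Hypothetical hooks: replace the bodies with real commands.
drain()   { echo "drain $1"; }    # e.g. tell haproxy to stop routing to $1
upgrade() { echo "upgrade $1"; }  # e.g. re-bootstrap and restart container $1
restore() { echo "restore $1"; }  # e.g. tell haproxy to route to $1 again

# rolling_upgrade WEB...: upgrade the web containers one at a time,
# stopping at the first step that fails so the rest stay in rotation.
rolling_upgrade() {
  for web in "$@"; do
    drain "$web" && upgrade "$web" && restore "$web" || return 1
  done
}

# Hypothetical usage:
#   rolling_upgrade web1 web2
```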

Since we are running sshd in each container we can still use the traditional mechanisms of upgrade as well.

In more “enterprisey” setups you can run your own Docker registry. That way your CI machine can prep the images and the deploy process simply pulls the image on each box, shuts down old containers and starts new ones. Distributing images is far more efficient and predictable than copying thousands of files with rsync each time you deploy.

###Why yet another ansible?

While working on the process I came up with my own DSL for bootstrapping my Discourse images. I purpose-built it to solve the main issues I was hitting with a simple shell script. Multiline replace is hard in Awk and Grep, the syntax is scary to some, and merging YAML files is not something you can easily do in a shell script.

pups makes these problems quite easy to solve

run:
  - replace:
      filename: "/etc/nginx/conf.d/discourse.conf"
      from: /upstream[^\}]+\}/m
      to: "upstream discourse {
        server 127.0.0.1:3000;
      }"

multiline regex replace for an nginx conf file

The DSL and tool live here: https://github.com/discourse/pups (a simple YAML-based bootstrapper for Linux machines). Feel free to use it where you need. I picked it over ansible because I wanted an exact fit for my problem.

The initial image was simple enough to fit in a Dockerfile; however, the process of bootstrapping Discourse is complex. You need to spin up background processes, do lots of fancy replacing and so on. You can see the template I am using for this site here: https://github.com/SamSaffron/discourse_docker/blob/master/standalone.template.yml

###The future

I see a very bright future for Docker. A huge ecosystem is forming, with new Docker-based applications launching monthly. For example, CoreOS, Deis and others are building businesses on top of Docker. OpenStack Havana supports Docker out-of-the-box.

Many of the issues I have raised in this post are being actively resolved. Docker is far more than a pretty front end on the decade old BSD jail concept. It is attempting to provide a standard we can all use, in dev and production regardless of the OS we are running, allowing us to set up environments quickly and cleanly.

Posted by: Sam

Comments

Csaba Okrona over 10 years ago

This is seriously cool. I just fell in love with Docker a few months ago and I love to see how ideas and tooling improve around it! I also just happened to install Discourse, btw.
I’ve also blogged about using Docker to deploy a simple Django app - but your article is much more complex.

Hongli Lai over 10 years ago

Yeah, I agree with you. Even with Passenger, deployment is still not easy enough. The thing is that Passenger doesn’t do enough – right now it only takes care of the web processes. I find myself putting off deployment of certain apps because of this. PHP is a little bit better but it’s still not as easy as I want it to be.

While Docker solves some problems, it introduces other problems. Building a Docker container is a complete pain.

We (Phusion) are working on two projects to solve this. With Dockerizer we hope to make Docker container building a breeze. On the Phusion Passenger side, we’re introducing daemon management support by supporting Procfile.

Sam Saffron over 10 years ago

@ochronus

Thanks! Your article looks pretty good as well. How did your Discourse install go? Did you feel the pain? It does get quite involved.

It’s amazing how fast Docker is moving, I wonder how long before this blog post I just made goes out-of-date.

Sam Saffron over 10 years ago

@honglilai

I totally agree that Dockerfiles are a complicated trap. The one I created ended up having a single RUN command because I really wanted a flat image. As I was building it I added stuff to the end and eventually folded it in once I was sure it worked. Luckily, the latest Docker allows for multiline RUN commands.

I had a look at your work at https://github.com/phusion/passenger-docker (Docker base images for Ruby, Python, Node.js and Meteor web apps). By the way, this Dockerfile can be flattened some, e.g.:

RUN /build/enable_repos.sh && /build/prepare.sh && \
    /build/system_services.sh && /build/utilities.sh && \
    /build/ruby.sh && /build/python.sh && /build/nodejs.sh && \
    /build/passenger.sh && /build/finalize.sh

I discussed this problem with our expert sysadmin and I think we are going to pull our entire Docker bootstrap image into a shell script. I suspect that once the tight AUFS dependency is dropped some sanity can return to Dockerfiles. That said, I feel they are a bit limiting and am not sure the DSL is the right one for bootstrapping; I much prefer pups.

I did want to push our image to speed up deployment but really wanted to slim it down a lot before throwing a 1.5 gig image in the Docker repo.

I totally love what you have done with Passenger to simplify deployment and am very excited about Procfile support.

For my image here I chose unicorn mainly because oobgc is available out-of-the-box, something that I think only kicks in with Passenger Enterprise.

I feel one huge advantage of Docker is that it opens up much more sophisticated setups to the general public to consume. I can build a stack heavily tuned with tcmalloc and appropriate GC tuning env vars, without having to worry that people will get it wrong.

Keep up the awesome work, and let me know how you go. Very excited to hear about your work; making Rails easy to deploy addresses a huge pain in our community.

Hongli Lai over 10 years ago

We’re looking at ways to slim down passenger-docker. We’ve already split the base system into a separate project: https://github.com/phusion/baseimage-docker (a minimal Ubuntu base image modified for Docker-friendliness). Passenger-docker can probably be made smaller by making Node.js, Python, Qt4 (for capybara-webkit) and other stuff optional.

I’m thinking about releasing the image in two variants: minimal and full. Minimal contains almost nothing, and you have to opt-in for stuff. Choose this if you value size over convenience. Full contains everything: you don’t have to worry about the stack at all, you just have to wait for the download to finish.

Having said that, I don’t think the size is anything you have to worry about. Docker doesn’t redownload the base image if it’s already installed. During redeploys you only download what you’ve changed on top of passenger-docker.

As for oobgc: it’s not in Enterprise, it’s in the open source version. Passenger open source’s OOBGC improves on Unicorn’s OOBGC in a major way: it only allows one process to run OOBGC at a time, thereby avoiding situations in which all processes are busy garbage collecting and blocking your clients.

Sam Saffron over 10 years ago

Very interesting point, I may try it out then. While testing oobgc I think I am really wanting for the hack GitHub added that allows you to ask the GC for the total free slots; without this, getting it to act sanely is just too hard. Hope we get the free slot count into 2.1.

Tom Atkins over 10 years ago

Thanks for an excellent and thorough post on Docker. I’ve been using Docker quite a bit but learned some new things here.

Regarding the problems you mention with Dockerfiles, I’m quite impressed with this approach: http://zef.me/6049/nix-docker It might not be for everyone but looks like a powerful alternative.

Csaba Okrona over 10 years ago

@sam Thanks for the kind words
The install went fine, but only because I have some practice in deploying Rails apps. It’s quite cumbersome otherwise, and the install howto also missed a few steps.

Caleb Land over 10 years ago

I’m getting an error trying to get discourse up and running in a docker container.

The problem comes from the redis.conf line:

logfile stdout

I’m getting the error:

*** FATAL CONFIG FILE ERROR ***
Reading the configuration file, at line 74
>>> 'logfile stdout'
Can't open the log file: Permission denied

If I change the line to logfile "" everything starts up fine.

Sam Saffron over 10 years ago

I fixed this yesterday, along with a host of other bugs. Can you re-bootstrap? (Also, there is no need to build the base images anymore; I published samsaffron/docker.)

Caleb Land over 10 years ago

That worked great!

How are you keeping the discourse version on your docker server up to date? Are you building new images, or just using ssh and doing it the old fashioned way?

Sam Saffron over 10 years ago

I am actually using the awesome https://github.com/discourse/docker_manager plugin (for use with the Discourse Docker image). It allows me to update the container using /admin/docker; if I make a serious config change, like updating redis / postgres / ruby / nginx, I will re-bootstrap.

I am keeping the base image up to date in the docker repo.

Ben Lubar over 10 years ago

I added myself to the docker group, but it doesn’t make it work without sudo.

discourse@discourse:~/discourse_docker$ sudo usermod -aG docker discourse
discourse@discourse:~/discourse_docker$ ./launcher bootstrap app
2013/12/13 15:57:47 dial unix /var/run/docker.sock: permission denied
2013/12/13 15:57:47 dial unix /var/run/docker.sock: permission denied
Calculated ENV: 
2013/12/13 15:57:47 dial unix /var/run/docker.sock: permission denied
2013/12/13 15:57:47 dial unix /var/run/docker.sock: permission denied
2013/12/13 15:57:47 dial unix /var/run/docker.sock: permission denied
2013/12/13 15:57:47 dial unix /var/run/docker.sock: permission denied

Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...]

Remove one or more containers

  -link=false: Remove the specified link and not the underlying container
  -v=false: Remove the volumes associated to the container
2013/12/13 15:57:52 dial unix /var/run/docker.sock: permission denied
FAILED TO COMMIT

Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...]

Remove one or more containers

  -link=false: Remove the specified link and not the underlying container
  -v=false: Remove the volumes associated to the container
Successfully bootstrappd, to starup use ./launcher start app

With sudo added, I get this:

Unable to find image 'samsaffron/discourse' (tag: latest) locally
Pulling repository samsaffron/discourse

followed by a few newlines at random intervals.

Edit: The problem was that I had forgotten to mkdir -p /var/docker/data. I think.

Sam Saffron over 10 years ago

What version of Docker are you running? Be sure to be on latest. Also be sure to log out / in after adding yourself to sudo.

Will add smarts into launcher that ensures the mounted volumes exist or raises a proper error.

Ben Lubar over 10 years ago

I installed another instance of discourse/docker on another machine last night. It looks like the Unable to find image step hides most of its output, which makes the blank lines appear at seemingly random times.

I’m on Docker version 0.7.1, build 88df052, by the way. I had to log out/in after adding myself to the docker group, but I was already in sudo (the Ubuntu installer did that for me).

Side note: On the production install, I have it listening on host port 2280 instead of 80, and then I have another instance of nginx (on the host) that proxies, and I commented out the proxy_set_header X-.* lines in location @discourse. That way, I can run multiple things on port 80 on the same server.

expose:
  - "2280:80"
  - "2222:22"

location / {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header Host $http_host;
    proxy_pass http://127.0.0.1:2280;
}

Sam Saffron over 10 years ago

Any way you can PM me full logs? curious to see what is happening

Ben Lubar over 10 years ago

I don’t have logs, but I figured out the reason it happens.

launcher contains this:

      (exec echo "$input" | docker run $env -e DOCKER_HOST_IP=$docker_ip -cidfile $cidbootstrap -i -a stdin -a stdout -a stderr $volumes $image \
         /bin/bash -c "$run_command") \
         || (docker rm `cat $cidbootstrap` && rm $cidbootstrap)

which (if the image isn’t downloaded yet) eventually gets to code in this file.

Since stdin isn’t a terminal, docker omits progress messages. It does not, however, omit the newlines from the progress messages (for whatever reason).

Sam Saffron over 10 years ago

I think I know what is happening here. You must be using an old samsaffron/discourse image; can you try pulling latest?

docker pull samsaffron/discourse

Daniel over 9 years ago

@sam I’m trying to use your launcher / pups framework to set up a different rails project (http://dradisframework.org). I’ve got a couple of questions:

Would you be interested in a pull request that removes Discourse-specific stuff from discourse/discourse_docker (e.g. making some variable name changes, console output messages, etc.)
The base image you use is already prefilled with some Discourse-specific stuff (e.g. the git repo under /var/www/discourse etc.). In this blog you reference your Docker bootstrap image (broken link), but that has since been removed from your repo. Do you have/share documentation about how you create the samsaffron/discourse Docker image somewhere?

Update: not sure how I could miss the stuff under /image in the main repo. For some reason it wasn’t in my boot2docker setup.

Many thanks!

Sam Saffron over 9 years ago

I would be happy to take PRs that make it more flexible by adding switches, we already support the custom: base_image: that covers the majority of your needs.

However, we cannot nuke Discourse-specific stuff out of launcher, just make it optional and on by default.

Sam Saffron over 9 years ago

Yeah, exited containers and leftover images are a major pain. I need to automate something to keep this clean.
