Dev with Docker

I’ve been using Docker to run my local development environments for nearly two years now. In that time, it has gone from experimental and arcane, to almost stable and user-friendly.

I work on a MacBook Pro running OS X. After a brief dive into the details of what Docker is and how it works, you learn that you can’t run the Docker engine in OS X itself. Containers rely on process isolation and resource management features of the Linux kernel, so the Docker engine only runs on Linux.

There are a number of workarounds for this, all of which involve using a virtual machine. Initially, I used boot2docker, which loaded a small VM using VirtualBox. Eventually, this was subsumed into Docker Machine and the Docker Toolbox, which used the same VM in VirtualBox but with slightly improved tooling. Now I’m using Docker for Mac, which still uses a boot2docker-based virtual machine, but it runs on the hypervisor framework built into OS X, obviating the need for a separate hypervisor like VirtualBox.

With the notable exception of filesystem read performance, Docker for Mac has removed many of the more tedious aspects of running a local Docker environment, and has made development much easier.

This is the first in an anticipated series of posts documenting how I have my local development environment configured, focusing on using Docker for Mac for WordPress development.

Data Persistence with Docker Volumes

Docker containers are designed to be ephemeral. You can destroy one and spin up an exact replica in seconds. Everything that defines the container can be found in the Dockerfile that declares how to build it.
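
For example, a minimal Dockerfile for a static site could be as small as this (just a sketch; the local ./public directory is a placeholder):

FROM nginx:stable-alpine
# copy the site into the image's default docroot; rebuilding the image gives you an identical container
COPY ./public /usr/share/nginx/html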

This model does not, however, account for persistent data: things like databases or uploaded media. In a production environment, I would recommend delegating these tasks to an external service, like Amazon’s RDS or S3.

For local development, though, you can use volumes for storing persistent data. Volumes come in two main flavors: data volumes and host directory mounts.

The latter is perhaps the most straightforward. You connect a directory in your container to a directory on your host machine, so they are essentially sharing the file system. Indeed, when you’re actively working on code, this is the simplest way to share your local code with your running containers. Mount the root directory of your project as a volume in your container, and anytime you update code, your container will also have the updates.

# docker run --rm -it -v="/your/local/dir:/srv/www/public" nginx:stable-alpine /bin/sh

This runs a container that has its /srv/www/public directory shared with your host system’s /your/local/dir directory. Updates you make to files either on your local system or in the container are automatically shared with the other.

Data volumes do not map directly to your host filesystem. Docker stores the data somewhere, and you generally don’t need to know where that is. When using Docker for Mac, one of the key differences is that a host mount shares files using osxfs (which currently has some performance issues), while a data volume stores its data inside the Docker virtual machine (which is consequently much more performant for I/O). While I use host mounts for things like uploaded media, I prefer to use a data volume for storing databases.

# docker volume create --name=mysqldata

Once the volume is created, you can mount it into one or more containers.

# docker run --rm -e MYSQL_ROOT_PASSWORD=secret -v="mysqldata:/var/lib/mysql" mysql:5.5

The contents of our volume “mysqldata” will be available to MySQL in the /var/lib/mysql directory. The data volume itself doesn’t have a directory name (in contrast to the prior best practice of using a directory within a data-only container). I think of a volume as a single directory that can be mounted wherever I want in a container.
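
If you’re ever curious where that data actually lives, docker volume ls and docker volume inspect will tell you (on Docker for Mac, the Mountpoint it reports is a path inside the Docker VM, not on your Mac’s filesystem):

# docker volume inspect mysqldata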

Reaching localhost from a Docker container

Docker Engine has some built-in networking features to allow containers to communicate with each other and with the outside world. For example, you can link two containers together to allow them to talk to each other, either via the docker run command or in your docker-compose.yml.

version: "2"
services:
  memcached:
    image: memcached:1.4-alpine
  php:
    image: php:5.6-fpm
    links:
      - memcached
  nginx:
    image: nginx:stable-alpine
    links:
      - php
    ports:
      - "80"

In our example, the php container can communicate with the memcached container, the nginx container can communicate with the php container, and the nginx container exposes its port 80 to receive connections from, for example, your web browser.
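
The same links can be made without Compose using the --link flag; a rough equivalent of the php and memcached services above would be something like:

# docker run -d --name memcached memcached:1.4-alpine
# docker run -d --name php --link memcached:memcached php:5.6-fpm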

In some cases, though, your container needs to be able to reach back out to your host system. The specific use case I have is debugging with Xdebug. Using Docker for Mac, there’s not a reliable address you can use to make that connection. A container that tries to connect to localhost or 127.0.0.1 will be talking to itself, not to your host machine.

To work around this, I set up an additional IP address for my host OS’s loopback interface. I use 10.254.254.254; if that conflicts with something else in your environment, any other unused local address will work just as well.

The command to set up this address is:

sudo ifconfig lo0 alias 10.254.254.254

Running this will allow any container you have running to connect back to your host OS using the address 10.254.254.254. You can, for example, set this in your php.ini as the remote host address for Xdebug (I prefer to set it in a reverse proxy configuration, but that’s a topic for a later post).

xdebug.remote_host=10.254.254.254
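
In context, a typical Xdebug 2 block in php.ini might look something like this (port 9000 is Xdebug’s default; adjust as needed):

xdebug.remote_enable=1
; the loopback alias we created above
xdebug.remote_host=10.254.254.254
xdebug.remote_port=9000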

Verify that the loopback alias is in place with:

ifconfig lo0

Your new address should appear there along with a few other defaults.

You’ll have to run the command after every boot, unless you set up a launchd job to do it for you. There are a few versions of the launchd plist file floating around, or you can adapt one of them to make your own. Copy it into /Library/LaunchDaemons/ and your new loopback address will be set for you on every boot.
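
As a sketch, such a plist might look like the following (the label and filename are arbitrary placeholders; save it as something like /Library/LaunchDaemons/com.user.lo0-alias.plist):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<!-- arbitrary reverse-DNS label for the job -->
	<key>Label</key>
	<string>com.user.lo0-alias</string>
	<!-- the same ifconfig command as above, run at boot -->
	<key>ProgramArguments</key>
	<array>
		<string>/sbin/ifconfig</string>
		<string>lo0</string>
		<string>alias</string>
		<string>10.254.254.254</string>
	</array>
	<key>RunAtLoad</key>
	<true/>
</dict>
</plist>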

nginx as HTTPS proxy for Elasticsearch

Since Elasticsearch is exposed via an HTTP API, we can use our nginx server to proxy Elasticsearch requests using the HTTPS protocol.

Let’s say you have your local dev environment configured to use SSL. Your dev site is accessible at https://mysite.dev/. Wonderful! Now you need to add Elasticsearch to your project. Let’s add it to docker-compose.yml, something like:

version: "2"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.2.2
    environment:
      - xpack.security.enabled=false
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    mem_limit: 1g
    volumes:
      - elasticsearchindex:/usr/share/elasticsearch/data
    ports:
      - "9200"
    network_mode: "bridge"
  # and some other services like PHP, nginx, memcached, mysql
volumes:
  elasticsearchindex:

How do you make requests to Elasticsearch from the browser?

Option 1: Set up a proxy in your app. This probably resembles what you’ll ultimately get in production. You don’t really need any security on Elasticsearch for local dev, but in production it will need some sort of access control so users can’t send arbitrary requests to the server. If you’re not using a third-party service that already handles this for you, this is where you’ll filter out invalid or dangerous requests. I prefer to let more experienced hands manage server security for me, though, and this is a lot of overhead just to set up a local dev server.

Option 2: Expose Elasticsearch directly. Since I don’t need security locally, I could just open up port 9200 on my container and make requests directly to it from the browser at http://localhost:9200/. Notice the protocol there, though. If my local site is at https://mysite.dev/, then the browser will block insecure requests to Elasticsearch.
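
If you did want to go this route, it would just be a matter of publishing the container port to a fixed host port in docker-compose.yml, something like:

  elasticsearch:
    ports:
      # publish container port 9200 on host port 9200 so http://localhost:9200/ reaches it
      - "9200:9200"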

Option 3: Use nginx as a proxy. I’m already using a reverse proxy in front of my project containers. It terminates the SSL connections and then passes through unencrypted requests to each project’s nginx server. The project’s nginx container doesn’t need to deal with SSL. It listens on port 80 and passes requests to PHP with fastcgi.

server {
	listen 80 default_server;
	server_name mysite.dev;
	# ... more server boilerplate
}

Since Elasticsearch is exposed via an HTTP API, we can create another server block to proxy Elasticsearch requests. First, make sure the nginx container can talk to the Elasticsearch container. In docker-compose.yml:

  nginx:
    image: nginx:stable-alpine
    environment:
      - VIRTUAL_HOST=mysite.dev,*.mysite.dev
    volumes:
      - ./nginx/default.conf:/etc/nginx/conf.d/default.conf:ro
      - ./nginx/elasticsearch-proxy.conf:/etc/nginx/conf.d/elasticsearch-proxy.conf:ro
      - ./nginx/php.conf:/etc/nginx/php.conf:ro
    links:
      - php
      - elasticsearch
    ports:
      - "80"
    network_mode: "bridge"

And then create elasticsearch-proxy.conf to handle the requests:

upstream es {
	server elasticsearch:9200;
	keepalive 15;
}

server {
	listen 80;
	server_name search.mysite.dev;

	location / {
		proxy_pass http://es;
		proxy_http_version 1.1;
		proxy_set_header Connection "Keep-Alive";
		proxy_set_header Proxy-Connection "Keep-Alive";
	}
}

Now we can make requests to Elasticsearch from the browser at https://search.mysite.dev/. The nginx proxy will handle the SSL termination, and communicate with Elasticsearch using its standard HTTP API.
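
A quick way to confirm the proxy is working is to hit one of Elasticsearch’s standard endpoints through it, for example (add -k if your local certificate is self-signed):

curl https://search.mysite.dev/_cluster/health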