Docker 102 - Container communication.

Disclaimer: This assumes you’ve read Docker 101 or that you know Docker basics.

Although not necessary to follow along, the code for the examples is available in a repository on GitHub.

In this article I’ll talk about connecting Docker containers with the Internet and with other containers. I’ll go through an example that’s rich enough to understand the concepts but not hard to follow. To be specific, we’ll build an API with a database service running in the background, and we’ll run some tests against it.

Dockerized Flask API

For the first part, we’ll use only two files: the code for the Flask API (api.py) and the Dockerfile that will generate the image our container will run from. The contents of both files can be seen below:

#api.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
#Dockerfile
FROM python:3.5.3

RUN pip install flask

COPY api.py /app/

WORKDIR /app

CMD python api.py

Our api.py defines a single endpoint and runs the API on port 5000. The Dockerfile installs Flask, copies api.py into the container and runs it. If we build our image and run it:

docker build . -t api:1.0
docker run api:1.0

this will run our API inside the container on port 5000, but we won’t be able to reach it from our browser/shell. In order to do so, we need to map a port on our local host to the one in the container. We don’t need to rebuild the image; it’s enough to specify the mapping when running it:

docker run -p 8080:5000 api:1.0

this will map port 8080 on the local host to port 5000 in the container. With this, we should be able to navigate to localhost:8080 and successfully connect to the API.
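
If you prefer the shell, a quick check with curl (assuming the container from above is still running) should return the greeting defined in api.py:

#Quick sanity check from the host; the expected response is "Hello World!"
curl http://localhost:8080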

Adding a database service

We can easily add a database service. I’ve chosen MongoDB, but any other should be just as easy to add. I’ve modified api.py and the API’s Dockerfile slightly, as follows:

#api.py
from flask import Flask, jsonify, request
from pymongo import MongoClient
import os

# The mongo host is injected through the MONGO_URI environment variable (set in the Dockerfile)
MONGO_URI = os.environ["MONGO_URI"]
dummy_number = 0

app = Flask(__name__)
cl = MongoClient(MONGO_URI)
db = cl["test"]

@app.route("/", methods=["GET", "POST"])
def insert_dummy():
    global dummy_number
    if request.method == "POST":
        db.coll.insert_one({"number": dummy_number})
        dummy_number += 1
        return "Post succeeded"

    elif request.method == "GET":
        # A pymongo cursor isn't JSON-serializable, so return just the stored numbers
        return jsonify([doc["number"] for doc in db.coll.find({})])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
#Dockerfile
FROM python:3.5.3

ENV MONGO_URI db:27017

RUN pip install flask pymongo

COPY api.py /app/

WORKDIR /app

CMD python api.py

We’ve added a POST method to our endpoint that inserts a record into the mongo collection we set up in the first lines of the file. For pymongo to work we need a mongo instance reachable at db:27017. Note that we use ENV in the Dockerfile to define a MONGO_URI environment variable inside the container, which we then read from our API code. There’s a mongo image available on Docker Hub that we will use to create this instance.
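
As a side note, the value baked in with ENV is just a default: the -e flag of docker run lets us override it without rebuilding the image. For instance, once the image is built (tagged api as below), we could point it somewhere else; some-other-host is a made-up name, just to illustrate the flag:

#Override the MONGO_URI default from the Dockerfile at run time (some-other-host is hypothetical)
docker run -e MONGO_URI=some-other-host:27017 -p 8080:5000 api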

For containers to communicate with one another, they have to be part of the same network (legacy methods worked differently; more on this later). We can create a new network named my-api-network by typing:

docker network create my-api-network
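
To verify it was created, docker network ls lists the networks known to the Docker daemon; my-api-network should appear there, using the default bridge driver.

#List available networks; my-api-network should appear alongside the default bridge, host and none networks
docker network ls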

After doing this, we run the containers as usual, but specifying the network parameter and giving the database container a name so that we can reach it from the other container.

docker run -d --net my-api-network --name db mongo:3.4.7 

docker build . -t api
docker run --net my-api-network -p 8080:5000 api

Once we’ve done this we should be able to query the API, which will be running at http://localhost:8080.

Let’s review what we’ve done:

  • First, we run the mongo:3.4.7 image (the first time we use this image it will be pulled from Docker Hub), specifying that it will belong to the network we created. We also give it a name, db, for ease of access. This means that inside my-api-network there is a host named db with an instance of mongo listening on port 27017. We’ve also included the -d parameter so that it runs in the background and we don’t need an extra shell to run the API.

  • After this, we rebuild the image for our API and run it in the same network, publishing the service on port 8080 of our host. When we run this container, api.py will look for a database at the URI passed to the MongoClient constructor (db:27017). A quick way to double-check the wiring is shown below.
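
While both containers are running, docker network inspect shows which containers are attached to the network; db should be listed in its Containers section, which confirms the API can resolve it by name.

#Show the network's details; the attached containers appear in the Containers section
docker network inspect my-api-network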

Some important remarks

  • There can only be one container (running or stopped) with a given name. That is, if we try to run the last commands a second time, Docker will detect a conflict when running the MongoDB image, as there’s already an existing container called db. In order to remove the previous container we can do:
    docker rm db
    
  • We only have to publish the ports that need to be reachable from outside my-api-network. In other words, we don’t need to publish ports used for communication inside the created network.

  • To check that the API is working, we can use curl to send it a GET or a POST:
    #For a POST example, we specify the --data parameter
    curl --data "foo bar" localhost:8080
    #For a GET example, we don't specify anything
    curl localhost:8080
    

Adding some tests

In a real-life API, we may want to add some end-to-end tests, where we check how the whole environment behaves when everything is running together (database and API). This can easily be done with a test file and a third Docker container, as follows.

#tests.py
import unittest
import requests
import os 

API_URL = os.environ["API_URL"]

class TestGet(unittest.TestCase):
    def runTest(self):
        r = requests.get(API_URL)
        self.assertEqual(r.status_code, 200)

class TestPost(unittest.TestCase):
    def runTest(self):
        r = requests.post(API_URL, data = "dummy data")
        self.assertEqual(r.status_code, 200)
#tests_dockerfile
FROM python:3.5.3

ENV API_URL http://myapicontainer:5000

RUN pip install requests

COPY tests.py /app/

WORKDIR /app

CMD python -m unittest tests.py

The tests will look for a host named myapicontainer and will try to reach it on port 5000.

We first bring up the mongo service as before:

docker run -d --net my-api-network --name db mongo:3.4.7

Then we run the API similarly (no need to build again if we haven’t changed api.py or its Dockerfile since the last build):

docker run -d --net my-api-network --name myapicontainer api

We run the API with the name we chose before (myapicontainer). We don’t need to publish any ports, as all the inter-container communication happens inside the same network. Finally, we build our tests image and run it in the same network.

docker build -f tests_dockerfile . -t tests-image
docker run --net my-api-network tests-image

We should see the tests run normally.
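
If you want to run the whole example again from scratch, remember the name-conflict remark above: the named containers (and, for a completely clean slate, the network) have to be removed first. A quick cleanup could look like this:

#Force-remove the named containers, even if they are still running
docker rm -f db myapicontainer
#Optionally remove the network as well
docker network rm my-api-network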

docker-compose

We can see a clear pattern here. Each time we need a service, we create a new Docker image, run a new container and attach it to the same network if we want it to interact with the other containers. For 3 services this is more or less fine, but let’s say you had 14 microservices. That’d mean 14 Dockerfiles, and you’d probably end up writing a script that runs all the commands for you; even then, it’s a hassle to make sure the network exists and to do all the clean-up after one execution so that you can execute it again.

The good news is that docker-compose exists: it’s a tool that simplifies the process of creating multi-container Docker applications. With a single file, we’ll be able to connect as many services as needed. In order to use it, we first need to install it; the installation instructions are in the official Docker documentation.
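
Once installed, a quick way to check that the binary is available is to ask it for its version:

#Check that docker-compose is installed and print its version
docker-compose --version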

Once we have docker-compose installed, we can define the previous setup (database, API and tests) as follows:

#docker-compose.yml
version: '3'

services:
  db:
    image: mongo:3.4.7
  my-api-container:
    environment:
      - MONGO_URI=db:27017
    build:
      context: .
      dockerfile: api_dockerfile
    depends_on:
      - db
  tests:
    environment:
      - API_URL=http://my-api-container:5000
    build:
      context: .
      dockerfile: tests_dockerfile
    depends_on:
      - my-api-container

This docker-compose.yml describes an application formed by three services: db, my-api-container and tests. By default, docker-compose runs all the listed services in a network that’s created automatically. Specifying a service name (db, for example) is roughly equivalent to running its container with --name <service-name>: the other services can reach it by that name. For the images that will be built from a Dockerfile we need to specify a build context and, if the file is named differently than Dockerfile, the name of the dockerfile. We can also define environment variables at the service level (which is cleaner when these values are shared across several containers) and leave them out of the individual Dockerfiles.
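
Before bringing anything up, docker-compose config is a handy way to validate the file: it checks the YAML and prints the fully resolved configuration, so indentation or naming mistakes show up immediately.

#Validate docker-compose.yml and print the resolved configuration
docker-compose config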

Once we’ve written this docker-compose.yml, we can build and start our application with the following commands:

docker-compose build
docker-compose up

We should see how the database starts, how our API connects to the database and finally how our tests are run against the API. Services will start in the order defined by the depends_on directives, but at no point will one service wait for another service to be “ready”.
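
When we’re done (or before re-running everything), docker-compose also takes care of the clean-up for us: down stops and removes the containers and the network it created.

#Stop and remove the containers and the network created by docker-compose
docker-compose down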

Waiting for services

Services don’t wait for each other to be ready, mainly because the definition of “ready” is loose and changes from service to service. For services with dependencies such as ours (we need the API to be reachable) a race condition is likely: sometimes the API will be up before the tests run and sometimes it won’t. For this sort of problem, we have three kinds of solutions:

  • Leaving out the CMD directive in the tests_dockerfile, waiting for the API to be ready and then running:
    docker-compose up --build
    docker-compose run tests python -m unittest tests.py
    

    This will successfully run the command inside the container where we have configured our tests. (Note that we have to specify --build each time we change a Dockerfile for one of the containers.) This technique is similar to a plain docker run command and does the job if we want to run the tests manually, since we ourselves have to check whether the API is ready. I don’t quite like this.

  • Option number 2: coding your tests so that they check for the connection before running each test. For example, when using unittest we can add a setUp method to each test case that waits for the service at the API URL to respond, or fails if it takes too long. Something like this:
    ...
      def setUp(self):
          # Fail the test early if the API cannot be reached within 15 seconds
          r = requests.get(API_URL, timeout=15)
          if r.status_code != 200:
              self.fail("Connection couldn't be established with " + API_URL)
    ...
    
  • The third and final option is the most application-agnostic one, but it involves external code. It’s the one recommended in the Docker documentation: using existing, tested shell scripts that wait for a service to be reachable, such as wait-for-it.sh (available on GitHub). After downloading the script we can change our docker-compose.yml file to the following:
#docker-compose.yml
version: '3'

services:
  db:
    image: mongo:3.4.7
  my-api-container:
    environment:
      - MONGO_URI=db:27017
    build:
      context: .
      dockerfile: api_dockerfile
    depends_on:
      - db
  tests:
    environment:
      - API_URL=http://my-api-container:5000
    build:
      context: .
      dockerfile: tests_dockerfile
    command: "./wait-for-it.sh my-api-container:5000 -s -- python -m unittest tests.py"
    depends_on:
      - my-api-container

This configuration overrides the CMD in the tests_dockerfile and prepends the command needed to wait for the API to be reachable from the tests container (note that wait-for-it.sh has to be present and executable inside the tests image, so it needs to be copied in through the tests_dockerfile). This is not the prettiest of solutions either but… it’s all we have for now.

Final Remarks

  • A good habit is to include everything that’s prone to vary (URLs, URIs, etc.) as an environment variable inside your docker-compose file, so that you can see clearly which value is being used.

  • Legacy methods for inter-container communication used --link to explicitly connect containers over the default bridge network; this flag is now deprecated. I mention it only in case you come across older code that uses it.

  • If you have a large number of environment variables to set, or different values depending on whether you’re running locally or remotely, testing vs. deploying, etc., you can define them in an auxiliary file and pass it as a parameter to each of the containers (whether you run them with plain Docker commands or with docker-compose). A sketch of this is shown below.
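
As a quick sketch of that last point: assuming a file named variables.env (a hypothetical name) containing lines like MONGO_URI=db:27017, it can be passed with the --env-file flag of docker run; the docker-compose equivalent is the env_file option on a service.

#Pass the variables defined in variables.env (hypothetical file) to the container at run time
docker run --env-file variables.env --net my-api-network -p 8080:5000 api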

Written on September 10, 2017