Docker 102 - Container communication.
Disclaimer: This assumes you’ve read Docker 101 or that you know Docker basics.
Although not necessary to follow along, the code for the examples run here is available in a repository on GitHub.
In this article I’ll talk about connecting Docker containers with the outside world and with other containers. I’ll go through an example that’s rich enough to illustrate the concepts but still easy to follow. To be specific, we’ll build an API with a database service running in the background, and we’ll run some tests against it.
Dockerized Flask API
For our first part, we’ll use only two files: the code for the Flask API (api.py) and the Dockerfile that builds the image the API will run in. The contents of the files can be seen below:
#api.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
#Dockerfile
FROM python:3.5.3
RUN pip install flask
COPY api.py /app/
WORKDIR /app
CMD python api.py
Our api.py defines a single endpoint and runs the API on port 5000. The Dockerfile installs Flask, copies api.py into the container and runs it. If we build our image and run it:
docker build . -t api:1.0
docker run api:1.0
this will run our API inside the container on port 5000, but we won’t be able to reach it from our browser/shell. In order to do so, we need to map a port on our host to the one in our container. We don’t need to rebuild the image; it’s enough to specify the mapping when running it:
docker run -p 8080:5000 api:1.0
this will map port 8080 on the host to port 5000 in the container. With this, we should be able to navigate to localhost:8080 and successfully connect to the API.
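If everything went well, a quick check from the shell (assuming curl is available) should return the greeting defined in api.py:
curl localhost:8080
#should print: Hello World!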
Adding a database service
We can easily add a database service. I’ve chosen MongoDB, but any other should be just as easy to add. I have modified api.py and the API’s Dockerfile slightly, as follows:
#api.py
from flask import Flask, request
from pymongo import MongoClient
import os

MONGO_URI = os.environ["MONGO_URI"]
dummy_number = 0
app = Flask(__name__)
cl = MongoClient(MONGO_URI)
db = cl["test"]

@app.route("/", methods=["GET", "POST"])
def insert_dummy():
    global dummy_number
    if request.method == "POST":
        db.coll.insert_one({"number": dummy_number})
        dummy_number += 1
        return "Post succeeded"
    elif request.method == "GET":
        #Mongo documents contain ObjectIds, which Flask can't serialize directly
        return str(list(db.coll.find({})))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
#Dockerfile
FROM python:3.5.3
ENV MONGO_URI db:27017
RUN pip install flask pymongo
COPY api.py /app/
WORKDIR /app
CMD python api.py
We’ve added a POST method to our endpoint that inserts a record in the Mongo collection we create in the first lines of the file. To use pymongo we need a Mongo instance running at db:27017. Note that we use ENV in the Dockerfile to define a MONGO_URI environment variable inside the container, which we then read from our API code. There’s a mongo image available on Docker Hub that we will use to create this instance.
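As a side note, if you also want to run api.py outside a container (where MONGO_URI wouldn’t be set), an optional variation, not used in the rest of this article, is to fall back to a default:
#Optional: default to a local Mongo instance when MONGO_URI isn't set
MONGO_URI = os.environ.get("MONGO_URI", "localhost:27017")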
For containers to communicate with one another, they have to be part of the same network (legacy methods worked differently, more on this later). We can create a new network named my-api-network by typing:
docker network create my-api-network
After doing this, we run the containers as usual, but specifying the network parameter and giving the container a name so that we can communicate with it from the other container.
docker run -d --net my-api-network --name db mongo:3.4.7
docker build . -t api
docker run --net my-api-network -p 8080:5000 api
Once we’ve done this, we should be able to query the API, which will be running at http://localhost:8080.
Let’s review what we’ve done:
- First, we run the mongo:3.4.7 image (the first time, the image will be pulled from Docker Hub), specifying that it will belong to the created network. We also give it a name, db, for ease of access. This means that inside my-api-network there is a host named db with a Mongo instance listening on port 27017. We’ve also included the -d parameter so that it runs in the background and we don’t need an extra shell to run the API.
- After this, we build the image for our API again and run it on the same network, publishing the service on port 8080 of our host. When this container runs, api.py will look for a database at the URI specified in our MongoClient constructor (db:27017).
Some important remarks
- There can only be one container (running or stopped) with a given name. That is, if we try to run the last commands a second time, Docker will detect a conflict when running the MongoDB image, as there’s already an existing container called db. In order to remove the previous container we can do:
docker rm db
- We only have to publish the ports that need to be reachable from outside my-api-network. In other words, we don’t need to publish ports involved in communication inside the created network.
- To check that the API is working, we can use curl to reach it with a GET or a POST:
#For a POST example, we specify the --data parameter
curl --data "foo bar" localhost:8080
#For a GET example, we don't specify anything
curl localhost:8080
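- If the containers don’t seem to see each other, a useful sanity check (not strictly needed here) is docker network inspect, which lists the containers attached to a network:
docker network inspect my-api-network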
Adding some tests
In a real-life API we may want to add some end-to-end tests, where we test the whole environment working together (database and API). This can be easily done with a test file and a third Docker container, as follows.
#tests.py
import unittest
import requests
import os

API_URL = os.environ["API_URL"]

class TestGet(unittest.TestCase):
    def runTest(self):
        r = requests.get(API_URL)
        self.assertEqual(r.status_code, 200)

class TestPost(unittest.TestCase):
    def runTest(self):
        r = requests.post(API_URL, data="dummy data")
        self.assertEqual(r.status_code, 200)
#tests_dockerfile
FROM python:3.5.3
ENV API_URL http://myapicontainer:5000
RUN pip install requests
COPY tests.py /app/
WORKDIR /app
CMD python -m unittest tests.py
The tests will look for a host named myapicontainer and try to reach it on port 5000.
We will first raise the mongo service as before:
docker run -d --net my-api-network --name db mongo:3.4.7
Then we run the API similarly (no need to rebuild if we haven’t changed api.py or its Dockerfile since the last build):
docker run -d --net my-api-network --name myapicontainer api
We run the API with the name we chose before (myapicontainer). We don’t need to publish any ports, as all the inter-container communication happens inside the network. Finally, we build our tests image and run it on the same network.
docker build -f tests_dockerfile . -t tests-image
docker run --net my-api-network tests-image
We should see the tests run normally.
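If you want to repeat the whole run from scratch, a cleanup along these lines (using the container and network names from above) is enough:
#Remove the named containers (-f also stops them if they're still running)
docker rm -f db myapicontainer
#Remove the network once nothing is attached to it
docker network rm my-api-network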
docker-compose
We can see a clear pattern here. Each time we need a service, we write a new dockerfile, build a new image, run the container and attach it to the same network if we want it to interact with the other containers. For 3 services this is more or less fine, but let’s say you had 14 microservices. That would mean 14 dockerfiles, and you’d probably end up writing a script that runs all the commands for you; even so, it’s a hassle to make sure the network is created and to do all the clean-up after one run so that you can run it again.
The good news is that docker-compose exists: it’s a tool that simplifies the process of creating multi-container Docker applications. With a single file, we’ll be able to connect as many services as needed. In order to use it, we first need to install it. Here’s a link.
Once we have docker-compose installed, we can define the previous setup (database, API and tests) as follows:
#docker-compose.yml
version: '3'
services:
  db:
    image: mongo:3.4.7
  my-api-container:
    environment:
      - MONGO_URI=db:27017
    build:
      context: .
      dockerfile: api_dockerfile
    depends_on:
      - db
  tests:
    environment:
      - API_URL=http://my-api-container:5000
    build:
      context: .
      dockerfile: tests_dockerfile
    depends_on:
      - my-api-container
This docker-compose.yml describes an application formed by three services: db, my-api-container and tests.
By default, docker-compose runs all the services listed in a network that’s created automatically.
Specifying a service name (db, for example) is equivalent to running its container with --name <service-name>.
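For instance, a quick way to see the names compose assigns to the running containers (it prefixes them with the project name, usually the directory name) is:
docker-compose ps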
For the images that will be built from a Dockerfile, we need to specify a build context and, if it’s named differently than Dockerfile, the name of the dockerfile. We can also define environment variables at service level (which is cleaner if we’re using these variables across several containers) and leave them out of the individual dockerfiles.
Once we’ve configured this docker-compose.yml we can build our application with the following commands:
docker-compose build
docker-compose up
We should see how the database starts, our API connects to the database and finally how our tests are run against the API. Services will start in the order defined by the depends_on directives but at no point will one service wait for the other services to be “ready”.
Waiting for services
Services don’t wait for each other to be ready, mainly because this definition of ready is quite loose and changes from service to service. For services with dependencies such as ours (we need the API to be reachable) a race condition is likely: sometimes the API will be ready before the tests run and sometimes it won’t. For this sort of problem, we have three kinds of solutions:
- Leaving out the CMD directive in the tests_dockerfile, waiting for the API to be ready and running:
docker-compose up --build
docker-compose run tests python -m unittest tests.py
This will successfully run the command inside the container where we have configured our tests. (Note that we have to specify --build each time we change a Dockerfile for one of the containers.) This technique is similar to a docker run command and does the job if we want to run the tests manually, but we have to check ourselves that the API is ready. I don’t quite like this.
- The second option is coding your tests so that they check for the connection before running each test. For example, when using unittest we can add a setUp method to each test that waits for the service at the API URL to respond, or fails if it takes too long. Something like this:
...
    def setUp(self):
        r = requests.get(API_URL, timeout=15)
        if r.status_code != 200:
            self.fail("Connection couldn't be established with " + API_URL)
...
- The third and final option is the most application-agnostic one, but it brings in external code. It’s the one recommended in the Docker documentation: using bash scripts, already written and tested, that wait for a service to be reachable. You can find one of these scripts here. After downloading the wait-for-it.sh script, we can change our docker-compose.yml file to the following:
#docker-compose.yml
version: '3'
services:
  db:
    image: mongo:3.4.7
  my-api-container:
    environment:
      - MONGO_URI=db:27017
    build:
      context: .
      dockerfile: api_dockerfile
    depends_on:
      - db
  tests:
    environment:
      - API_URL=http://my-api-container:5000
    build:
      context: .
      dockerfile: tests_dockerfile
    command: "./wait-for-it.sh my-api-container:5000 -s -- python -m unittest tests.py"
    depends_on:
      - my-api-container
This configuration overrides the CMD in the tests_dockerfile and prepends the command needed to wait for the API to be reachable from the tests container. This is not the prettiest of solutions either but… it’s all we have for now.
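Note that for the command above to work, wait-for-it.sh has to exist inside the tests container. A minimal way to get it there (a sketch, assuming the script sits next to the dockerfiles in the build context) is to copy it in tests_dockerfile:
#tests_dockerfile (sketch with wait-for-it.sh included)
FROM python:3.5.3
RUN pip install requests
#Copy the tests and the wait-for-it.sh script into the image
COPY tests.py wait-for-it.sh /app/
RUN chmod +x /app/wait-for-it.sh
WORKDIR /app
#This default is overridden by the command entry in docker-compose.yml
CMD python -m unittest tests.py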
Final Remarks
- A good habit to follow is to include everything that’s prone to vary (URLs, URIs, etc.) as environment variables inside your docker-compose file, so that you can see clearly what value is being used.
- Legacy methods for inter-container communication used --link to explicitly connect containers. This relied on the default bridge network, and its usage is now deprecated; it’s only worth knowing in case you work with older code that uses this flag.
- If you have a large number of environment variables to set, or different values depending on whether the execution is local or remote, testing vs. deploying, etc., you can use an auxiliary file to define them and pass it as a parameter to each of the containers, no matter whether you run them with Docker commands or with docker-compose (a small sketch follows below).
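As a sketch of that last idea (the file name and values here are just placeholders), the variables could live in a small env file:
#variables.env
MONGO_URI=db:27017
API_URL=http://my-api-container:5000
With plain Docker the file is passed through the --env-file flag of docker run, and in docker-compose each service can reference it with the env_file key.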