I recently had the opportunity to deploy a MongoDB server on Amazon Web Services (AWS). In order to limit the problems of crash and data loss, it is also replicated with two other servers, ideally in a different geographical area to ensure high availability.
Consul: Service Discovery and Failure Detection
Consul is a product developed in Go language by the HashiCorp company and was born in 2013.
Consul has multiple components but its main goal is to manage the services knowledge in an architecture (which is service discovery) and also allows to ensure that all contacted services are always available by adding health checks on them. Basically, Consul will bring us a DNS server that will resolve IP addresses of a host's services, depending on which one will be healthy.
This method also allows to do load balancing even if we will see in this blog post that it's actually not possible to distribute priorities between services.
Another Consul service we'll use in this article is key/value storage because we'll create a Docker Swarm cluster and will use Consul as the discovery/storage for Docker Swarm.
In order to clarify the rest of the article, here are the ports used by Consul:
8302): RPC exchanges,
8400: RPC exchanges by the CLI,
8500: Used for HTTP API and web interface,
8600: Used for DNS server.
Next, we'll focus on service discovery and failure detection. To do that, we'll create a Docker Swarm cluster with the following architecture:
As you can see, we'll have 3 Docker machines:
Consulmachine (used for Swarm Discovery),
- A machine that will act as our "
node 01" with an HTTP service that will run on it (Swarm),
- A machine that will act as our "
node 02" with an HTTP service that will run on it (Swarm).
We'll also add on our two nodes a Docker container with Registrator image that will facilitate the discovery of Docker containers into Consul.
For more information about Registrator, you can visit: https://gliderlabs.com/registrator/. Let's start to install our architecture!
First machine: Consul (Swarm Discovery)
We'll start by creating our first machine: Consul
To do that, just type:
Once the machine is fully ready, prepare your environment to use this Docker machine and launch a Consul container:
Well, your Consul is ready to receive your services and also our Docker Swarm nodes!
By the way, you can open the web interface by obtaining the Consul machine IP address:
Then, open in your browser the following URL:
Second machine: Node 01
Now, it's time to create t
he machine that corresponds to our first Docker Swarm cluster node and that will also receive the master role for our cluster (we need one...):
As you can see, we've added the
--swarm-discovery option with our Consul machine IP address and port 8500 that corresponds to the Consul API. This way, Docker Swarm will use the Consul API to store machine information with the rest of the cluster.
We'll now configure our environment to use this machine and install a
Registrator container on top of it in order to auto-register our new services (Docker containers).
To do that, type the following:
Here, you can notice that we share the host Docker socket in the container. This solution could be a controversial solution but in our example case, forgive me about that ;)
If you want to register services to Consul I recommend to register them using the Consul API in order to keep control on what's added in your Consul.
-ip option allows to precise to Registrator that we want to register our services with the given IP address: so here the Docker machine IP address and not the Docker container internal IP address.
We are now ready to start our HTTP service. This one is located under the "ekofr/http-ip" Docker image which is a simple Go HTTP application that renders "hello,
In order to fit this article needs, we will also create a different network for the two Docker machines in order to clearly identify IP addresses for our two services.
Let's create a new network concerning our first node:
then you can use the newly created network to be used with your HTTP service container:
Before executing the same steps for our second node, we will ensure that our HTTP service works:
Third machine: Node 02
We'll now repeat most steps we've ran for our first node but we'll change some values. First, create the Docker machine:
Prepare your environment to use this new node 02 machine and install Registrator container on it:
Now, create the new network and launch your HTTP service:
We are all set! We can now discover what Consul brings to us.
Indeed, you can now resolve your service hostname
http-ip.service.consul by using the DNS server brought by Consul and you should see your two services appearing as a DNS record:
In other words, a kind of load balancing will be done on one of these services when you'll try to join them
Ok, but what about the load balancing repartition? Are we able to define a priority and/or weight for each services? Sadly, the answer is no, actually. An issue is currently opened about that on Github in order to bring this support. You can find it here: https://github.com/hashicorp/consul/issues/1088.
Indeed, if we look in details about
SRV DNS record type, here is what we get:
As you can see here, priority and weight are both defined to value 1 so the load balancing will be equal between all services. If you add the Consul Docker machine IP address as a DNS server on your operating system, you'll be able to perform HTTP calls and see more easily what's happening on load balancing:
Here, we have an IP address that corresponds to each HTTP service that we have registered so we can clearly see that we are load balanced between our two containers.
We'll now add a Health Check to our service in order to ensure that it can be use safely by our users.
In this case we'll start to return on our node 01 and suppress the container named
ekofr/http-ip in order to recreate it with a Health Check:
Registrator brings us some environment variables in order to add some Health Check for our containers into Consul and you can see the full list here: http://gliderlabs.com/registrator/latest/user/backends/#consul.
Idea here is to verify that port 80 is opened and application answers correctly so we'll add a script that simply executes a curl command:
You can do the same thing on your node 02 (by paying attention to modify the
node-01 values to
node-02 ) and you should now visualize these checks on the Consul web UI:
You can also use the Consul API in order to verify the good health of your services:
You are now able to install Consul on your projects architectures in order to ensure that services you contact are available and also be able to identify eventual issues that can occur on your services.
It's important to add a maximum of checks on elements that can make your services become unavailable (ensure that this one can be contacted and answer, ensure that remaining available disk space is sufficient, etc...). Consul is a tool that integrates well in your architectures by its simplicity of use and its powerful API.