⚠️ This post is still a work in progress ⚠️
About this Guide
This post will guide you through the initial setup of Nomad, Consul and Vault.
I will also cover some common extra steps for
- AWS CLI (for ECR)
- Docker (+ Authentication)
Prerequisites
CNI Bridge
Nomad uses CNI plugins to configure the network namespace used to secure the Consul service mesh sidecar proxy. All Nomad client nodes using network namespaces must have CNI plugins installed. See the Consul CNI Docs for more information.
See Nomad Install Docs for more information
curl -L -o cni-plugins.tgz "https://github.com/containernetworking/plugins/releases/download/v1.0.0/cni-plugins-linux-$( [ $(uname -m) = aarch64 ] && echo arm64 || echo amd64)"-v1.0.0.tgz && \
sudo mkdir -p /opt/cni/bin && \
sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz
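The inline architecture check in the curl command above can be pulled out into a small helper, which makes the download URL easier to read when you bump the version. This is only a minimal sketch: the function names cni_arch and cni_url are hypothetical, and like the one-liner it only distinguishes aarch64 from everything else.

```shell
#!/bin/sh
# Map a `uname -m` machine name to the architecture suffix used in the
# CNI plugin release file names (same rule as the one-liner above).
cni_arch() {
  if [ "$1" = "aarch64" ]; then
    echo arm64
  else
    echo amd64
  fi
}

# Build the release URL for a given version and machine name.
cni_url() {
  version="$1"
  machine="$2"
  echo "https://github.com/containernetworking/plugins/releases/download/${version}/cni-plugins-linux-$(cni_arch "$machine")-${version}.tgz"
}

# Print the URL for this host; pass it to `curl -L -o cni-plugins.tgz`.
cni_url v1.0.0 "$(uname -m)"
```

Printing the URL first lets you check it in a browser before downloading anything as root.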
Configure environment values
This script configures multiple env values (NOMAD_ADDR, VAULT_ADDR, CONSUL_HTTP_ADDR) so we can run CLI commands without appending the address and port every time.
PRIVATE_IP=$(/sbin/ip -o -4 addr list ens18 | awk '{print $4}' | cut -d/ -f1)
echo -e "\nexport NOMAD_ADDR=http://$PRIVATE_IP:4646" >> /root/.bashrc
echo -e "export VAULT_ADDR=https://$PRIVATE_IP:8200" >> /root/.bashrc
echo -e "export CONSUL_HTTP_ADDR=http://$PRIVATE_IP:8500" >> /root/.bashrc
source /root/.bashrc
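Running the script above a second time appends the same export lines to /root/.bashrc again. If that bothers you, an idempotent variant could look roughly like this. The helper name append_once is hypothetical, the IP is just an example value, and the demo writes to a temporary file instead of your real .bashrc.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
  file="$1"
  line="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a temporary file; use /root/.bashrc on a real host.
demo_rc="$(mktemp)"
append_once "$demo_rc" "export NOMAD_ADDR=http://10.1.10.1:4646"
append_once "$demo_rc" "export NOMAD_ADDR=http://10.1.10.1:4646"

# The export line was written only once despite two calls.
cat "$demo_rc"
```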
Installation
Get private interface IP
You need to obtain the private IP of your chosen network interface. Check if the command below fits your needs or set the IP address manually.
PRIVATE_IP=$(/sbin/ip -o -4 addr list ens18 | awk '{print $4}' | cut -d/ -f1)
echo $PRIVATE_IP
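To see what the awk/cut pipeline actually does before relying on it, you can feed it a captured line of output. The sketch below wraps the pipeline in a hypothetical first_ipv4 function; the sample line is an illustrative example of the one-line format, not output from a real host.

```shell
#!/bin/sh
# Read `ip -o -4 addr list <iface>` output on stdin and print the first
# address without its prefix length (same awk/cut pipeline as above).
first_ipv4() {
  awk '{print $4}' | cut -d/ -f1 | head -n 1
}

# Illustrative sample line; real output comes from:
#   /sbin/ip -o -4 addr list ens18
sample='2: ens18    inet 10.1.10.1/24 brd 10.1.10.255 scope global ens18\       valid_lft forever preferred_lft forever'
printf '%s\n' "$sample" | first_ipv4
```

If the function prints nothing on your machine, the interface name is probably different or has no IPv4 address, and you should set PRIVATE_IP manually.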
Install required packages
apt-get update && apt-get upgrade -y
apt-get install -y curl wget gpg gnupg coreutils ca-certificates lsb-release
# AWS CLI
apt-get install -y awscli amazon-ecr-credential-helper
Install Docker
See the official Docker install guide for more information.
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io
Add HashiCorp's APT repository
See the official install guide if you prefer to use the prebuilt binary.
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
Install Nomad, Consul & Vault
apt-get update
apt-get install -y nomad consul vault
systemctl enable nomad
systemctl enable consul
systemctl enable vault
Configuration
Docker AWS ECR Authentication
We will create a new /etc/docker/config.json file to provide Nomad with our Docker login credentials.
Replace <aws_id> and <aws_region> with your own values.
mkdir -p /etc/docker
cat <<EOT > /etc/docker/config.json
{
"credHelpers": {
"public.ecr.aws": "ecr-login",
"<aws_id>.dkr.ecr.<aws_region>.amazonaws.com": "ecr-login"
}
}
EOT
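If you set up several machines, you may prefer to generate this file from variables instead of editing the placeholders by hand. A minimal sketch, assuming a hypothetical helper called render_docker_config; the account ID and region below are made-up example values.

```shell
#!/bin/sh
# Print a Docker config that routes ECR registries to the ecr-login helper.
# $1 = AWS account ID, $2 = AWS region (both are values you supply).
render_docker_config() {
  aws_id="$1"
  aws_region="$2"
  cat <<EOT
{
  "credHelpers": {
    "public.ecr.aws": "ecr-login",
    "${aws_id}.dkr.ecr.${aws_region}.amazonaws.com": "ecr-login"
  }
}
EOT
}

# Example values only; on a real host, redirect the output to
# /etc/docker/config.json instead of printing it.
render_docker_config 123456789012 eu-central-1
```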
Nomad
Before getting started with Nomad instances, we need to configure some environment values in /etc/nomad.d/nomad.env
mkdir -p /etc/nomad.d
cat <<EOT >> /etc/nomad.d/nomad.env
AWS_ACCESS_KEY_ID=******
AWS_SECRET_ACCESS_KEY=******
AWS_DEFAULT_REGION=<aws_region>
VAULT_ADDR=http://127.0.0.1:8200
VAULT_TOKEN=
CONSUL_HTTP_ADDR=$PRIVATE_IP:8500
CONSUL_CACERT=/etc/consul.d/certs/consul-agent-ca.pem
CONSUL_CLIENT_CERT=/etc/consul.d/certs/dc1-server-consul.pem
CONSUL_CLIENT_KEY=/etc/consul.d/certs/dc1-server-consul-key.pem
CONSUL_HTTP_SSL=false
EOT
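Note that the heredoc above uses an unquoted EOT delimiter, so the shell substitutes $PRIVATE_IP at the moment the file is written. If the variable is not set in your current shell, CONSUL_HTTP_ADDR ends up as just ":8500". The following sketch shows the difference between an unquoted and a quoted delimiter; the IP is an example value.

```shell
#!/bin/sh
PRIVATE_IP=10.1.10.1   # example value; normally set by the earlier step

# Unquoted delimiter: variables are expanded when the heredoc is read.
expanded="$(cat <<EOT
CONSUL_HTTP_ADDR=$PRIVATE_IP:8500
EOT
)"

# Quoted delimiter: the text is kept literally, including the dollar sign.
literal="$(cat <<'EOT'
CONSUL_HTTP_ADDR=$PRIVATE_IP:8500
EOT
)"

echo "$expanded"
echo "$literal"
```

For nomad.env the unquoted form is what we want, as long as PRIVATE_IP has been set beforehand.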
We will now configure your Nomad instances as client and/or server.
Place this file in /etc/nomad.d/nomad.hcl
rm -f /etc/nomad.d/nomad.hcl && nano /etc/nomad.d/nomad.hcl
Nomad Client
datacenter = "dc1"
data_dir = "/opt/nomad"
bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"
server {
enabled = false
}
client {
enabled = true
template {
disable_file_sandbox = true
}
}
consul {}
plugin "docker" {
config {
volumes {
enabled = true
}
auth {
config = "/etc/docker/config.json"
}
}
}
Nomad Client & Server
datacenter = "dc1"
data_dir = "/opt/nomad"
bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"
server {
enabled = true
bootstrap_expect = 3
}
client {
enabled = true
template {
disable_file_sandbox = true
}
}
consul {}
plugin "docker" {
config {
volumes {
enabled = true
}
auth {
config = "/etc/docker/config.json"
}
}
}
Some values explained
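- The server.bootstrap_expect value defines how many Nomad server instances need to be running before the cluster bootstraps and elects a leader.
- The client.template.disable_file_sandbox option lets templates read files from anywhere on the host instead of only from the task directory, which allows you to bring host files into your job allocs.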
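As a rough illustration of why three servers is a common choice for bootstrap_expect: the servers use Raft, which needs a majority of servers to agree. bootstrap_expect itself is only the number of servers to wait for before bootstrapping, so treat the arithmetic below as related background, not as what the setting computes. The function names are hypothetical.

```shell
#!/bin/sh
# Raft majority for a cluster of N servers: floor(N / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority remains.
failure_tolerance() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 3 5; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(failure_tolerance "$n")"
done
```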
Consul
Setup TLS & encryption (optional)
To enable internal TLS encryption we need to generate a certificate using the following commands. See the Consul TLS docs for more information.
mkdir -p /etc/consul.d/certs && cd /etc/consul.d/certs
consul keygen
# UyaZRVMUdoNinDtEDxMZFiqpQmjbsIQXUeGYDWgi=
consul tls ca create -domain consul
# ==> Saved consul-agent-ca.pem
# ==> Saved consul-agent-ca-key.pem
You now need to distribute the generated consul-agent-ca.pem certificate to all Consul agents and place it in /etc/consul.d/certs/consul-agent-ca.pem.
Generate agent certificates
On your host Consul server, generate an agent certificate for each Consul agent you want to deploy.
consul tls cert create -server -dc dc1 -domain consul
# ==> WARNING: Server Certificates grants authority to become a
# server and access all state in the cluster including root keys
# and all ACL tokens. Do not distribute them to production hosts
# that are not server nodes. Store them as securely as CA keys.
# ==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
# ==> Saved dc1-server-consul-0.pem
# ==> Saved dc1-server-consul-0-key.pem
Distribute your agent certificate to the respective server
scp 10.1.10.1:/etc/consul.d/certs/dc1-server-consul-1-key.pem .
scp 10.1.10.1:/etc/consul.d/certs/dc1-server-consul-1.pem .
Configure each agent
Again, choose which servers will be used as client or server for your Consul instances.
mkdir -p /etc/consul.d/certs
chown -R consul:consul /etc/consul.d
chown -R consul:consul /opt/consul
rm -f /etc/consul.d/consul.hcl && nano /etc/consul.d/consul.hcl
chmod 640 /etc/consul.d/consul.hcl
# Paste the Consul certificate and key into these files:
#   /etc/consul.d/certs/dc1-server-consul.pem
#   /etc/consul.d/certs/dc1-server-consul-key.pem
# Bootstrap ACL
consul acl bootstrap
Consul Client
datacenter = "dc1"
data_dir = "/opt/consul"
bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"
client_addr = "{{ GetPrivateInterfaces | exclude \"name\" \"docker.*\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}"
retry_join = ["<add all of your consul clients & server ip addresses>"]
ca_file = "/etc/consul.d/certs/consul-agent-ca.pem"
cert_file = "/etc/consul.d/certs/dc1-server-consul.pem"
key_file = "/etc/consul.d/certs/dc1-server-consul-key.pem"
ports {
grpc = 8502
}
connect {
enabled = true
}
dns_config {
allow_stale = true
node_ttl = "5s"
use_cache = true
cache_max_age = "5s"
}
Consul Server
datacenter = "dc1"
data_dir = "/opt/consul"
server = true
bootstrap_expect = 3
bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"
client_addr = "{{ GetPrivateInterfaces | exclude \"name\" \"docker.*\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}"
retry_join = ["<add all of your consul clients & server ip addresses>"]
ca_file = "/etc/consul.d/certs/consul-agent-ca.pem"
cert_file = "/etc/consul.d/certs/dc1-server-consul.pem"
key_file = "/etc/consul.d/certs/dc1-server-consul-key.pem"
ui_config {
enabled = true
}
acl {
enabled = true
default_policy = "allow"
enable_token_persistence = true
}
ports {
grpc = 8502
}
connect {
enabled = true
}
dns_config {
allow_stale = true
node_ttl = "5s"
use_cache = true
cache_max_age = "5s"
}
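The retry_join placeholder in both configs expects a list of quoted IP addresses. If you keep your node IPs in a script anyway, a small helper can print the line for you. This is just a sketch; retry_join_line is a hypothetical name and the addresses are example values.

```shell
#!/bin/sh
# Print a retry_join line for the given IP addresses.
retry_join_line() {
  list=""
  for ip in "$@"; do
    # Separate entries with ", " after the first one.
    list="${list:+$list, }\"$ip\""
  done
  echo "retry_join = [$list]"
}

# Example addresses only; substitute the IPs of your own Consul agents.
retry_join_line 10.1.10.1 10.1.10.2 10.1.10.3
```

Paste the printed line into consul.hcl in place of the placeholder.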
Cheat-Sheet
Here are some handy commands I commonly use for debugging.
# nomad cleanup allocation history & summary
nomad system gc
nomad system reconcile summaries
# stream Nomad agent logs
nomad monitor
# attach shell to job
nomad alloc exec -task=<task> <alloc> /bin/bash
# show open ports
lsof -i -P -n | grep LISTEN
ss -tulpn
# test tcp connection
nc -z -v -w 2 <host> <port>
# test consul dns
dig @127.0.0.1 -p 8600 _<service-name>._tcp.service.consul
# query container from inside
curl -H "Host: domain.tld" 10.1.10.1:21021
# query some endpoint
curl -H "Host: domain.tld" -X POST <host>/api
# list service instances "address:port"
curl -s http://127.0.0.1:8500/v1/catalog/service/<service-name>|jq -j '.[] | .ServiceAddress,":",.ServicePort,"\n"'
# consul filtering
curl --get http://127.0.0.1:8500/v1/agent/services --data-urlencode 'filter=Service == "<service-name>"'|jq -j '.[] | .Address,":",.Port,"\n"'