Tail-Hole: Deploying Pi-Hole Behind Tailscale on Fly.io

Utilizing Fly.io to automate my pi-hole (re-)deployment on my tailscale network

I had to tear down my homelab the other weekend and the thing that I have been missing the most is my pi-hole. For reasons I won’t get into, I won’t be redeploying my pi-hole to a raspberry pi this time around. I have been super excited about the edge computing that Fly.io offers, so I figured this would be a good project to get started with.

Project Goals

  • Get familiar with Fly.io: Learn how to deploy, configure, and troubleshoot real problems on Fly.io
  • Deploy a Pi-Hole that is accessible to my devices (and my devices only)

Code

If you want to dive straight into the code/configuration, here is the repo holding the project: https://gitlab.com/btaylor5/tail-hole

Solution

Using and Learning Fly.io

The best way to get started with Fly.io is right here:

https://fly.io/docs/hands-on/start/

If you aren’t ready to dive into the docs, know this: Fly.io takes docker images – technically OCI images (a standardized form of docker images) – and creates Firecracker VMs out of them. Firecracker is the super lightweight VM technology that powers AWS Lambda and AWS Fargate.

With that background, know that we will package our project as a Dockerfile, but through the deployment toolchain, our code will run on Fly.io as a Firecracker VM.
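For a feel of what that toolchain looks like in practice, here is the rough shape of the deploy loop. This is a minimal sketch; the exact flags and the generated fly.toml come from the repo, not from this snippet.

# Generate a fly.toml for the app from the Dockerfile, then build and boot it as a VM
fly launch --no-deploy
fly deploy

# Tail the VM's logs while it starts up
fly logs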

Securing the Network with Tailscale

I am already a huge fan of tailscale, so it was the obvious choice when approaching the problem of how to secure the network. With tailscale I can lock down access to only my devices. If you haven’t heard of tailscale, or haven’t given it a try yet, check it out: https://tailscale.com/. With tailscale I can hole punch through firewalls (utilizing WireGuard tunnels) to authenticate network connections between my devices. In other words, when I configured the Fly.io deployment I didn’t expose a single service/port to the internet. Instead I utilized the tailscale infrastructure to connect my devices. Only my authenticated devices can use and manage the Pi-Hole software.

Bootstrapping a device to join a tailscale network is super simple – and the tailscale docs explain exactly how to do this. I’ll reiterate those steps here.

Step 1

In the Dockerfile, install the tailscale binaries. While not required by any means, I also snuck in the installation of the jq package here, which should make our lives easier when interacting with JSON responses from the tailscale API.

# This tailscale setup assumes pihole is still based on debian bullseye
RUN curl -fsSL https://pkgs.tailscale.com/stable/debian/bullseye.noarmor.gpg | \
        sudo tee /usr/share/keyrings/tailscale-archive-keyring.gpg >/dev/null && \
    curl -fsSL https://pkgs.tailscale.com/stable/debian/bullseye.tailscale-keyring.list | \
        sudo tee /etc/apt/sources.list.d/tailscale.list

# install tailscale and the jq package used for interacting with the tailscale api
RUN apt-get update && apt-get install -y tailscale jq && rm -rf /var/lib/apt/lists/*

Step 2

Now that tailscale is installed in the image, we need to initialize it. In its simplest form, something like this is all it takes:

# Start up the tailscale network process (Needs to run in the background)
tailscaled --state=/var/lib/tailscale/tailscaled.state \
           --socket=/var/run/tailscale/tailscaled.sock &

# Join the device to the tailnet
tailscale up \
    --authkey=${TAILSCALE_AUTHKEY} \
    --hostname=${FLY_APP_NAME} \
    --advertise-exit-node

# --advertise-exit-node is included above, but it doesn't work yet.

We want to provide a tailscale auth_key that is “reusable, ephemeral, and pre-authorized” since we want the ability to automate (re-)deployments. So be sure to check those boxes when generating the auth key in the tailscale admin portal.

That’s really all it takes to get tailscale initialized; however, we can still improve the experience. When you deploy this code a couple of times, you end up trying to register a device with the same name multiple times. Tailscale handles this by appending a number to the device name to make it unique, so you start to accumulate “tail-hole”, “tail-hole-1”, “tail-hole-2”, etc.

New problem: we want to deploy under a consistent name so that our tailscale DNS entries stay stable between deployments.

The good news is that with a call to the tailscale API we can easily manage devices. When deploying a machine, before we join it to the tailnet we need to delete any existing machines that match our FLY_APP_NAME. For simplicity, we can use curl to interact with the API. I included some magic waits (I cringe seeing them too) to make sure we don’t hit the API too quickly, but the waits are likely overkill.

# Retrieve all devices in the tailnet
devices=$(curl -X GET "https://api.tailscale.com/api/v2/tailnet/${TAILNET}/devices" \
  -u "${TAILSCALE_API_KEY}:")
sleep 2s

# Filter down the list of devices to only include machine ids where the machine name contains the
# FLY_APP_NAME
ids_to_delete=$(echo "$devices" | jq -r ".devices[] | select(.name | test(\"${FLY_APP_NAME}\")) | .id")

# Loop over all ids corresponding to FLY_APP_NAME machines and delete them
while IFS= read -r line; do
    echo "Attempting to delete machine with id $line"
    curl -X DELETE "https://api.tailscale.com/api/v2/device/$line" \
        -u "${TAILSCALE_API_KEY}:" -v
    sleep 2s
done <<< "$ids_to_delete"
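If you want to sanity-check that jq filter without touching the API, you can feed it a hand-written response. The JSON below is illustrative and trimmed down to just the devices/name/id fields the filter actually uses.

# Stubbed, trimmed response shape; real responses contain many more fields per device
echo '{"devices":[{"id":"123","name":"tail-hole.example.ts.net"},{"id":"456","name":"laptop.example.ts.net"}]}' \
  | jq -r '.devices[] | select(.name | test("tail-hole")) | .id'
# prints: 123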

Integrating Tailscale and the Pi-Hole

Now that we know the general steps to deploy tailscale, how do we leverage it in a pi-hole deployment?

It really comes down to two settings, which we can set as environment variables in the Dockerfile. These settings tell the pi-hole to bind to the tailscale0 interface and respond only on that interface.

ENV INTERFACE=tailscale0
ENV DNSMASQ_LISTENING=single

With pi-hole running on the tailscale0 interface, we have everything we need to “run” pi-hole on tailscale. That raises the question: how do we actually use the pi-hole from our tailscale devices? Tailscale provides some powerful DNS settings. We can configure our pi-hole as the DNS server for our entire tailscale network (including overriding local DNS settings). We can’t automate the “override local DNS settings” toggle, but we can automate setting the DNS server to our tail-hole deployment through the tailscale API. The good news is that once you manually set “override local DNS settings”, it appears to stick through other DNS setting changes made through the API.

# Set this server as the DNS server for the tailnet
curl -X POST "https://api.tailscale.com/api/v2/tailnet/${TAILNET}/dns/nameservers" \
  -u "${TAILSCALE_API_KEY}:" \
  --data-binary "{\"dns\": [\"$(tailscale ip -4)\"]}"
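As a sanity check you can read the setting back. I’m assuming here that the same path supports GET as the matching read endpoint for the nameservers we just set; the response should include the address printed by tailscale ip -4.

# Confirm the tailnet's nameserver now points at this machine
curl -s "https://api.tailscale.com/api/v2/tailnet/${TAILNET}/dns/nameservers" \
  -u "${TAILSCALE_API_KEY}:"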

So now that we have everything we need to integrate the pi-hole and tailscale, we just need to wire things up so that the tailscale network is initialized before the pi-hole starts. Let’s take a look at the pi-hole docker image: https://github.com/pi-hole/docker-pi-hole/blob/master/Dockerfile

Relevant bits:

ENTRYPOINT [ "/s6-init" ]

COPY s6/debian-root /
COPY s6/service /usr/local/bin/service

So we know that s6 is used as the entrypoint (technically a script called s6-init), and digging into the repo we find pi-hole specific initialization code in the s6/debian-root/etc/cont-init.d directory. Based on the naming convention of the cont-init.d directory and the number-prefixed files it contains, we can assume that anything we want executed on startup can be dropped into this directory – and dependencies/ordering can be managed by choosing the right number prefix. In this case, we see that 20-start.sh is responsible for starting pi-hole, so our tailscale initialization can happen anywhere before that (e.g. a file named 19_start_tailscale.sh).

So if we copy the tailscale initialization code in start_tailscale.sh into that directory, we can manage the startup sequence. We can build that into the docker image using this Dockerfile snippet:

# We have pihole and tailscale in a container now, copy over our
# bootstrapping script that glues it all together on container start
COPY start_tailscale.sh /etc/cont-init.d/19_start_tailscale.sh
RUN chmod u+x /etc/cont-init.d/19_start_tailscale.sh
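Put together, the body of start_tailscale.sh boils down to the sequence below. This is a condensed sketch; the real script in the repo also includes the device-cleanup API calls and the sleeps from earlier.

# 1. Start the tailscale daemon in the background
tailscaled --state=/var/lib/tailscale/tailscaled.state \
           --socket=/var/run/tailscale/tailscaled.sock &

# 2. Delete any stale tailnet devices matching FLY_APP_NAME (API calls shown earlier)

# 3. Join the tailnet under a consistent hostname
tailscale up \
    --authkey=${TAILSCALE_AUTHKEY} \
    --hostname=${FLY_APP_NAME} \
    --advertise-exit-node

# 4. Point the tailnet's DNS at this machine
curl -X POST "https://api.tailscale.com/api/v2/tailnet/${TAILNET}/dns/nameservers" \
  -u "${TAILSCALE_API_KEY}:" \
  --data-binary "{\"dns\": [\"$(tailscale ip -4)\"]}"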

Wrapping Up – Deploy to Fly.io

This is the easiest part, and therefore the shortest. We can configure our secrets (API keys, auth keys, etc.) using flyctl secrets set ENV_VARIABLE=super-top-secret-value, and Fly.io will inject them into the VM as environment variables.
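For this project that means something like the following. The secret names mirror the environment variables used earlier in the post; the values are placeholders.

fly secrets set TAILSCALE_AUTHKEY=tskey-xxxxxxxx
fly secrets set TAILSCALE_API_KEY=xxxxxxxxxxxx
fly secrets set TAILNET=your-tailnet-name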

When running fly deploy and fly logs I ran into a crash loop with FTL. It all comes down to this issue, solved by the Fly.io community: https://community.fly.io/t/pihole-is-falling-with-failed-to-create-listening-socket/4054/5.

I was able to diagnose the issue with the following fly commands.

  • Check the logs on fly: fly logs
  • Check the pi-hole logs on fly: fly ssh console --command "pihole -d"

While I haven’t taken the time to figure out the correct permission script changes/hacks to get FTL running under the pihole user, setting the user to root certainly solves the problem. I don’t encourage running this process as root long term, but that’s a problem for another day!

# Docker images don't get translated perfectly to fly firecracker vms (permissions and capabilities)
# https://community.fly.io/t/pihole-is-falling-with-failed-to-create-listening-socket/4054/5
ENV DNSMASQ_USER=root
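With that change in place, a couple of quick checks confirm the deployment is healthy. These just run the standard pihole and tailscale CLI commands through fly ssh, as above.

# Confirm pi-hole's DNS service is up inside the VM
fly ssh console --command "pihole status"

# Confirm the VM joined the tailnet and has a tailscale address
fly ssh console --command "tailscale status"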