Category: Computing

Basic Umami Setup Guide

March 10, 2024 by Dustin Rue·0 Comments

Some time ago I removed Google Analytics to avoid the tracking that came along with it and it all being tied to Google. I also wasn’t overly concerned about how much traffic my site got. I write here and if it helps someone then great but I’m not out here to play SEO games. Recently, however, I heard of a new self hosted option called Umami that claims to respect user privacy and is GDPR compliant. In this post I will go through how I set it up on the site.

Umami supports both PostgreSQL and MySQL. The installation resource I used, discussed below, defaults to PostgreSQL as the datastore and I opted to stick with that. PostgreSQL is definitely not a strong skill of mine and I struggled to get things running initially. Although I have PostgreSQL installed on a VM already for my Mastodon instance, I had to take some additional steps to get PostreSQL ready for Umami. After some trial and error I was able to get Umami running.

My installation of PostreSQL is done using the official postgres.org resources which you can read about at https://www.postgresql.org. In addition to having PostgreSQL itself installed as a service I also needed to install postgresql15-contrib in order to add pgcrypto support. pgcrypto support wasn’t something I found documented in the Umami setup guide but the software failed to start successfully without it and an additional step detailed below. Below is how I setup my user for Umami with all commands run as the postgres user or in psql. Some info was changed to be very generic, you should change it to suit your environment:

cli: createdb umami
psql: CREATE ROLE umami WITH LOGIN PASSWORD 'password’;
psql: GRANT ALL PRIVILEGES ON DATABASE umami TO umami;
psql: \c umami to select the umami database
psql: CREATE EXTENSION IF NOT EXISTS pgcrypto;
psql: GRANT ALL PRIVILEGES ON SCHEMA public TO umami;

With the above steps taken care of you can continue on.

Since I am a big fan of using Kubernetes whenever I can, my Umami instance is installed into my k3s based Kubernetes cluster. For the installation of Umami I elected to use a Helm chart by Christian Huth which is available at https://github.com/christianhuth/helm-charts and worked quite well for my purposes. Follow Christian’s directions for adding the helm chart repository and read up on the available options. Below is the helm values I used for installation:

ingress:
  # -- Enable ingress record generation
  enabled: true
  # -- IngressClass that will be be used to implement the Ingress
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-production
  # -- Additional annotations for the Ingress resource
  hosts:
    - host: umami.dustinrue.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  # -- An array with the tls configuration
  tls:
    - secretName: umami-tls
      hosts:
        - umami.dustinrue.com

umami:
  # -- Disables users, teams, and websites settings page.
  cloudMode: ""
  # -- Disables the login page for the application
  disableLogin: ""
  # -- hostname under which Umami will be reached
  hostname: "0.0.0.0"

postgresql:
  # -- enable PostgreSQL™ subchart from Bitnami
  enabled: false

externalDatabase:
  type: postgresql

database:
  # -- Key in the existing secret containing the database url
  databaseUrlKey: "database-url"
  # -- use an existing secret containing the database url. If none given, we will generate the database url by using the other values. The password for the database has to be set using `.Values.postgresql.auth.password`, `.Values.mysql.auth.password` or `.Values.externalDatabase.auth.password`.
  existingSecret: "umami-database-url"

The notable changes I made from the default values provided is I enabled ingress and set my hostname for it as required. I also set cloudMode and diableLogin to empty so that these items were not disabled. Of particular note, leaving hostname at the default value is the correct option as setting it to my hostname broke the startup process. Next, I disabled the postgresql option. This disables the installation of PostgreSQL as a dependent chart since I already had PostreSQL running.

The last section is how I defined my database connection information. To do this, I created a secret using kubectl create secret generic umami-database-url -n umami and then edited the secret with kubectl edit secret umami-database-url -n umami. In the secret, I added a data section with base64 encoded string for “postgresql://umami:[email protected]:5432/umami”. The secret looks like this:

apiVersion: v1
data:
  database-url: cG9zdGdyZXNxbDovL3VtYW1pOnBhc3N3b3JkQDEwLjAuMC4xOjU0MzIvdW1hbWk=
kind: Secret
metadata:
  name: umami-database-url
  namespace: umami
type: Opaque

Umami was then installed into my cluster using helm install -f umami-values.yaml -n umami umami christianhuth/umami which brought it up. After a bit of effort on the part of Umami to initialize the database I was ready to login using the default username/password of admin/umami.

I setup a new site in Umami per the official directions and grabbed some information that is required for site setup from the tracking code page.

Configuring WordPress

Configuring WordPress to send data to Umami was very simple. I added the integrate-umami plugin to my installation, activated the plugin and then went to the settings page to input the information I grabbed earlier. My settings page looks like this:

Screenshot of Umami settings showing the correct values for Script Url and Website ID. These values come from the Umami settings screen for a website.

With this information saved, the tracking code is now inserted into all pages of the site and data is sent to Umami.

Setting up Umami was a bit cumbersome for me initially but that was mostly because I am unfamiliar with PostgreSQL in general and the inline documentation for the Helm chart is not very clear. After some trial and error I was able to get my installation working and I am now able to track at least some metrics for this site. In fact, Umami allows me to share a public URL for others to use. The stats for this site is available at https://umami.dustinrue.com/share/GadqqMiFCU8cSC7U/Blog.

Ansible Proxmox Dynamic Inventory

October 15, 2023 by Dustin Rue·2 Comments

One of the challenges or points of friction for me using Proxmox in my home lab has been integrating Ansible with it more cleanly. The issue is I have traditionally maintained my inventory file manually which is a bit of a hassle. Part of the issue is that Proxmox doesn’t really expose a lot of metadata about the VMs you have running to things like tagging don’t actually exist. Despite that I set out to get a basic, dynamically generated inventory system that will work against my Proxmox installation to make the process at least a bit smoother.

For some time, Ansible has supported the idea of dynamic inventory. This type of inventory will query a backend to build out an inventory that is compliant with Ansible. Proxmox, having an API, has a dynamic inventory plugin available from the community. In this post I will showcase how I got started with a basic Proxmox dynamic inventory.

When I set out I had a few requirements. First, I really don’t have a naming convention of my VMs that makes any sense in DNS. Some systems have a fully qualified domain but most do not. The ones that do have fully qualified domain name wouldn’t actually be available over ssh on the IP resolved for that domain. To get around this, I wanted to be able to map the host name in Proxmox to its internal IP address. By default, the dynamic inventory plugin will set ansible_host to the name of the VM. For this I had to provide a compose entry to set the ansible_host which you’ll see below. This feature is made possible because I always install the qemu guest agent.

The second requirement is that ssh connection info was dynamic as well because I use a number of different operating systems. Since all of my systems use cloud-init I am able to set the ssh username to the ciuser value thus ensuring I always know what the ssh user is regardless of the operating system used.

Here is my dynamic inventory file:

plugin: community.general.proxmox
validate_certs: false
want_facts: true
compose:
  ansible_host: proxmox_agent_interfaces[1]["ip-addresses"][0].split('/')[0]
  ansible_user: proxmox_ciuser

I placed this information into inventory/inventory.proxmox.yaml. Most of the entries are self-explanatory but I will go through what the compose section is doing.

The first item in the compose section is setting the ansible_host. When the inventory plugin gathers information from Proxmox it will gather the assigned IP addresses as determined using the Qemu Guest Agent. In all cases that I could see, the first IP address will be localhost and the second one will always be the primary interface in the system. With information known, I was able to create the jinja2 template to grab the correct IP address and strip the netmask off of it.

The next line is setting the ansible_user by just copying the proxmox_ciuser value. With these two variables set, Ansible will use that username when connecting to the host at its internal IP address. Since the systems were brought up using cloud-init, my ssh key is already present on all of the machines and the connection works without much fuss.

To support this configuration, here is my ansible.cfg:

[defaults]
inventory = ./inventory
fact_caching_connection = .cache
retry_files_enabled = False
host_key_checking = False
forks = 5
fact_caching = jsonfile

[inventory]
cache = True
cache_plugin = jsonfile

[ssh_connection]
pipelining = True
ssh_args = -F ssh_config

This configuration is setting a few options for me related to how to find the inventory, where to cache inventory information and where to cache facts about remote machines. Caching this info greatly speeds up your Ansible runs and I recommend it. The ssh_args value allows me to specify some additional ssh connection info.

In addition to the above configuration files, there are environment variables that are set on my system. These variables define where to find the Proxmox API, what user to connect with and the password. The environment variables are defined on the dynamic inventory plugin page but here is what my variables look like:

PROXMOX_PASSWORD=[redacted]
PROXMOX_URL=https://[redacted]:8006/
PROXMOX_INVALID_CERT=True
PROXMOX_USERNAME=root@pam
PROXMOX_USER=root@pam

The user/username value is duplicated because some other tools rely on PROXMOX_USERNAME instead of PROXMOX_USER.

And that’s it! With this configured I am able to target all of my running hosts by targeting “proxmox_all_running”. For example, ansible proxmox_all_running -m ping will ping all running machines across my Proxmox cluster.

WordPress ActivityPub and Cloudflare

September 16, 2023 by Dustin Rue·6 Comments

TLDR; The fix for this is to ensure you are forcing your CDN to properly handle “application/activity+json” in the Accept header vs anything else. In other words, you need to Vary on Accept, but it’s best to limit it to “application/activity+json” if you can.

With the release of ActivityPub 1.0.0 plugin for WordPress I hope we’ll see a surge in the number of WordPress sites that can be followed using your favorite ActivityPub based systems like Mastodon and others. However, if you are hosting your WordPress site on Cloudflare (and likely other CDNs) and you have activated full page caching you are going to have a difficult time integrating your blog with the greater Fediverse. This is because when an ActivityPub user on a service like Mastodon performs a search for your profile, that search will land on your WordPress author page looking for additional information in JSON format. If someone has visited your author page recently in a browser then there is the chance Mastodon will get HTML back instead resulting in a broken search. The reverse of this situation can happen too. If a Mastodon user has recently performed a search and later someone lands on your author page, they will see JSON instead of the expected results.

The cause of this is because Cloudflare doesn’t differentiate between a request looking for HTML or one looking for JSON, this information is not factored into how Cloudflare caches the page. Instead, it only sees the author page URL and determines that it is the same request and returns whatever it has. The good news is, with some effort, we can trick Cloudflare into considering what type of content the client is looking for while still allowing for full page caching. Luckily the ActivityPub has a nice undocumented feature to help work around this situation.

To fix this while keeping page caching you will need to use a Cloudflare worker to adjust the request if the Accept header contains “application/activity+json”. I assume you already have page caching in place and you do not have some other plugin on your site that would interfere with page caching, like batcache, WP SuperCache and more. For my site I use Cloudflare’s APO for WordPress and nothing else.

First, you will want to ensure that your “Caching Level” configuration is set to standard. Next, you will need to get setup for working with Cloudflare Workers. You can follow the official guide at https://developers.cloudflare.com/workers/. Next, create a new project, again using their documentation. Next, replace the index.js file contents with:

export default {
  async fetch(req) {
    const acceptHeader = req.headers.get('accept');
    const url = new URL(req.url);

    if (acceptHeader?.indexOf("application/activity+json") > -1) {
      url.searchParams.append("activitypub", "true");
    }

    return fetch(url.toString(), {
      cf: {
        // Always cache this fetch regardless of content type
        // for a max of 5 minutes before revalidating the resource
        cacheTtl: 300,
        cacheEverything: true,
      },
    });
  }
}

You can now publish this using wrangler publish. You can adjust the cacheTtl to something longer or shorter to suite your needs.

Last step is to associate the worker with the /author route of your WordPress site. For my setup I created a worker route of “*dustinrue/author*” and that was it. My site will now cache and return the correct content based on whether or not the Accept header contains “application/activity+json”.

Remember that Cloudflare Workers do cost money though I suspect a lot of small sites will easily fit into the free tier.

TIL: colima + k3s will use local Docker images

August 2, 2023 by Dustin Rue·0 Comments

When you create a k3s cluster using colima it will default to using Docker for the runtime. This means that any Docker image you build or pull will be available to k3s. This greatly simplifies testing locally built images being referenced by Helm charts or Kustomize (or whatever you are using).

This is not a feature unique to colima, but rather a feature of k3s if it is told to use the Docker runtime. You can read more at https://docs.k3s.io/advanced#using-docker-as-the-container-runtime.

TIL: This plugin for helm exists to fix left behind resources

July 31, 2023 by Dustin Rue·0 Comments

If you have an older Kubernetes cluster with older Helm based software installs and you aren’t paying attention, it can be easy to leave some resources in a state that are impossible to update or remove. This is because of APIs that have been deprecated and then removed. While facing this issue today, I found this Helm plugin exists which can help resolve the problem – https://github.com/helm/helm-mapkubeapis

Updated proxmox-packer project

July 29, 2023 by Dustin Rue·0 Comments

Pushed some updates to my proxmox packer project at https://github.com/dustinrue/proxmox-packer. The updates include effort to support Packer 1.9.2 and the most recent Proxmox plugin. The makefile has been updated to run the necessary init command to get the Proxmox plugin installed as well.

More importantly, the changes fix an issue that prevented the provisioner from running leading to broken or missing cloud-init support in the resulting templates.

Adventures in repairing my WireGuard VPN

July 22, 2023 by Dustin Rue·0 Comments

In this post I’m going to more or less drop some notes about how I went about debugging a WireGuard VPN issue I was having. I have a WireGuard based VPN running on Rocky Linux 9 which is basically a default minimal installation with WireGuard installed. The system ships with firewalld and nftables, which will be important later. firewalld and nftables are used on my installation of WireGuard to work properly and I have a number of PostUp and PostDown commands that are run to insert rules so that VPN clients are NAT’d properly. As an older Linux user, I am very comfortable with iptables but significantly less so with firewalld and especially nftables.

My adventure began after a system update that prevented data from passing through the connection. Due to how WireGuard works, it appears the connection is made but no data would flow. After confirming my IP address had not changed recently and I was indeed connecting to the system at all I still couldn’t get traffic to pass through VPN.

The first thing I set out to do was verify connectivity with the service. Starting with tcpdump -i any port 51820 I was surprised that I wasn’t seeing any traffic to the service. I was surprised by this because there were no rules present in iptables to suggest the port was blocked and yet I would also get messages stating the port was administratively closed. In an effort to confirm this I wanted to see if I could get WireGuard to log what it was doing. As it turns out, despite being in the kernel as a module, there is a way to make it output logs. The following will set the module’s debug mode to on:

echo 'module wireguard +p' | sudo tee /sys/kernel/debug/dynamic_debug/control

Unfortunately I cannot share these logs but it will provide information about peers coming and going and efforts to maintain the connection. After enabling debug I found…no entries. Very unusual. My next step was to remove firewalld, and be extension nftables. Once removed, there were no firewall rules on the system at all and it was wide open. At this point I was able to see debug messages from WireGuard showing that new peers were connecting. As expected, the VPN still didn’t work because the required NAT rules were missing but this did finally confirm that the problem was with firewalld.

At this point it hit me that firewalld works differently than I was thinking and it has a concept of services. These services won’t appear in an iptables listing. At this point I had a feeling I knew what I had to do. I reinstalled firewalld, enabled and started it and looked at the services it was set to allow:

firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens18
  sources:
  services: cockpit dhcpv6-client ssh
  ports:
  protocols:
  forward: yes
  masquerade: yes
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

Immediately it became obvious that wireguard was not included in the list. Simply running firewall-cmd --zone=public --add-service=wireguard --permanent added the service to my public zone and from here my VPN started working again.

I have provided my thoughts in other places (Mastodon) about how I dislike firewalld and nftables. I find them much more tedious than iptables and haven’t ever really taken the time to learn them. As distributions change and mature over time, a lot of the default settings I took for granted, like a system’s networking being otherwise wide open after an install, are no longer true. At this point I am being forced to learn these tools, which is actually a good thing.

TIL: a null storageClass is not the same as no storageClass

April 5, 2023 by Dustin Rue·0 Comments

Long title but a relatively quick TIL. In a helm chart, when specifying a storageClass in your templates, it is important that if the user does not set a storageClass that you do not output storageClass in your template for a persistent volume. That is say that this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  resources:
    requests:
      storage: 5Gi

Is not equivalent to this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

When installed the first time, both will result in Kubernetes selecting your default storage class. When this happens, your resource is actually updated to reflect the selected storageClass in the API. When you apply the same file again during an upgrade, an empty storageClassName will be seen as a change from the default storage to “”, which is not what you want. Instead, you should not send storageClassName at all if the user did not specify one.

TIL: systemd ExecStartPre/ExecStopPost permissions

March 31, 2023 by Dustin Rue·0 Comments

TIL that you can tell systemd to run any ExecStartPre and ExecStopPost scripts as root instead of the user the service is supposed to run under.

At the same time we can touch on how to create an override for a service. In my case, I wanted to override how Redis is started on a system to ensure hugepage was set correctly, per their documentation. Creating an override for a service is super simple:

systemctl edit <service>

In my case this means:

systemctl edit redis

Since I know that the name of the service I want to edit is redis. From here, I am presented with my favorite editor where I can input the following:

[Service]
PermissionsStartOnly=true
ExecStartPre=/usr/local/sbin/hugepage.sh

Here I am defining two things. First, I want the permissions to only apply to the ExecStart command, not the others. Next, I am specifying an ExecStartPre that calls a script. This script is simply outputting:

/bin/echo never > /sys/kernel/mm/transparent_hugepage/enabled

Once complete, save out the file and restart your service. Your changes will now take affect.

If you are running a newer release of systemd (231+) then you can use the following format as well:

ExecStartPre=+/usr/local/sbin/hugepage.sh

Avoiding stampeding Mastodons

February 9, 2023 by Dustin Rue·0 Comments

Yesterday I was reminded that when a URL is shared on Mastodon, every instance that has a user following you, that server will make a request to your site at least once in an effort to get some additional embed information. If your site is WordPress based, like this one, then you will likely see two requests. The first request to your site will request the URL that was added to the post while the second one follows any embed information WordPress is exposing in order to get some additional meta data. Since Mastodon is a federated system, every Mastodon server or instance will need to gather this data in order for it to be displayed to its users.

If you are a user that has a lot of followers then posting a link to your blog or site will likely result in a mini DDoS has hundreds of Mastodon instances request this information from your server. If you have not taken precautions this can potentially take down your site as it is overloaded with requests! Years ago this would have been referred to as being “slash dotted” (links on https://slashdot.org) or “fireballed” (links on https://daringfireball.net).

Fortunately you can very effectively deal with this situation on your own or by working with your hosting provider. In this post, I am going to describe how I handle the situation using Cloudflare, which is the CDN provider I have chosen to put my site behind. I am not going into full detail on how to implement all options and I am not selling Cloudflare or associated with them beyond being a customer. What I share here will be applicable to any CDN or will at least serve as inspiration for how to handle it in your configuration.

As I said previously, this site is using WordPress and is behind Cloudflare. To make it easy on myself I have also purchased their Automatic Platform Optimization for WordPress feature. I got into this option initially because I wanted to understand it better but have since kept it because it works well. The biggest feature of APO for WordPress is that it enables full page caching for your site. This is a must if you want to get the best possible experience for users globally. Using APO is absolutely not necessary, you can simply use Nginx micro caching instead or any other caching solution, but the key here is to have full page caching so that repeated requests to your site do not incur actual processing time by WordPress.

APO will, out of the box, cache full pages of your site but what it will not protect is the meta data URL used to provide additional information for embeds. To prevent Mastodon servers from crushing your site with embed meta data requests, there is one additional endpoint you need to force to be cached. Here is how I forced Cloudflare to cache the correct URL for me.

Login into Cloudflare and click on the domain for your site. Find the caching section of the menu and click on Cache Rules. Add a new rule and define what is shown in the screenshot

Screenshot showing a cloudflare configuration screen for caching an oembed request from Mastodon, or any system that would do this. Add a name for your rule, set the Field to URI Path contains the path /wp-json/oembed

From here, tell Cloudflare what to do with this match

Screenshot showing another Cloudflare configuration screen. Here you should set the Cache status to "Eligible for cache" and "Override origin" set to 2 hours. 2 hours is the minimum option on a free plan

Note that 2 hours is the lowest cache time I can specify on an otherwise free Cloudflare plan so that is what I set it to. With these options filled out you can click save and you are done. Anything looking for this URL will now either get a cached copy of the response or will cause the content to be cached for future requests.

Of course, you don’t need to use Cloudflare to make this work. Savvy users can also translate these URLs to Nginx or Apache configuration to perform the trick. The goal is to ensure your WordPress site is better able to handle when you have shared a link to Mastodon and there are many options. Using Cloudflare is one option that has worked well for me. I encourage everyone that hosts a blog, either self-hosted or through some managed provider, to ensure that page caching and the oembed URL for WordPress is cached.