Since about mid December 2022 I have been running my own private instance of Mastodon. I thought I would detail how I did it and what it has cost me so far.

When I first learned about Mastodon I was excited to understand it better, particularly how it is hosted and scaled. I decided right away that the best way to deepen my understanding was to host it myself, and to do so on my favorite platform, Kubernetes. I started by creating my own helm chart (https://github.com/dustinrue/mastodon-helm-chart) and installed the core software in my home lab, which runs k3s. The chart I created is based on the official helm chart (https://github.com/mastodon/chart). I created my own because, again, I wanted to learn about the moving pieces of a Mastodon installation, but also because I was unhappy with the official chart integrating Redis and PostgreSQL as dependencies. In addition, it doesn’t break out the Sidekiq processes in a way that makes sense…but more on that later.

Before we get too deep into what I did, we should probably first discuss some of the major components of a Mastodon instance or server. Mastodon is a collection of services working together to form a full solution, which includes:

  • A web service, which provides the user interface and also acts as the API server for all things Mastodon. In a full production setup it is important that this be highly available.
  • A streaming service, which feeds data to the web frontend as it arrives and is processed. This is important but not critical. In other words, you can survive a bit of downtime here, you’ll just have a less than great experience.
  • A number of Sidekiq queues, which are the heart of how data moves through a Mastodon instance. As of this writing these queues include scheduler, ingress, mailer, push, pull and default. Each queue has a specific purpose, and no single queue is absolutely critical to the availability of your Mastodon instance, which means you can temporarily take one down to deal with an issue. Just know that while a queue is down, nothing that queue is responsible for will be processed. The special scheduler queue is the exception: if it is not running, most other queues will likely do nothing at all.
  • Redis is the glue that keeps data flowing between processes. Keeping it running is critical because all of the other Mastodon processes expect it to be available and will fail to start without it, though losing the data within it, while not ideal, is ok. In a full production setup I recommend running it in a highly available fashion.
  • PostgreSQL is the last required piece of software when running Mastodon. Like Redis, it is what I would consider critical to your setup. If running a full production setup you will want to cluster it, with availability the primary concern and performance a secondary consideration.
  • You need some system for dealing with email. Mastodon needs to send email for account confirmation and some administrative or moderation work. For my system I am using Send In Blue (https://www.sendinblue.com) which has a free tier.

Mastodon also supports other, optional services which you can read about at https://docs.joinmastodon.org/admin/optional/.

As you can likely see, running Mastodon is not simple, yet it isn’t overwhelming either. I believe Mastodon can be run inexpensively, especially as a private instance, but to run it correctly in production there is definitely a base cost to consider so that you can remove as many failure points as possible. In addition, there are many other pieces you will likely want for a large installation, like monitoring metrics, keeping track of Sidekiq queue depth and processing times, and more.

Having spent some time on Mastodon during the great Twitter migration, I witnessed a number of instance admins struggle as their instances worked to meet the demands of new users and of users who had created accounts earlier but were suddenly active. I saw a few notable patterns emerge that contributed to their scaling woes, including:

  • Not using a CDN or object storage system initially
  • Not installing pgbouncer in front of PostgreSQL
  • Not breaking Sidekiq out into separate processes for each queue

There are some really excellent guides and references on how to scale Mastodon (https://hazelweakly.me/blog/scaling-mastodon/ to name but one), but many of their recommendations require that you have done one, if not all, of the steps mentioned above. Each of these items is disruptive enough that you do not want to be handling it in a panic while trying to get your instance running again. If you are running, or plan to run, a public instance where anyone can sign up, I highly recommend getting at least those three items out of the way from day one. Doing so will help ensure that scaling up from there is much, much easier, as most changes then amount to adding servers to run more Sidekiq processes or tuning parameters.
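To give a sense of what the pgbouncer item above looks like from Mastodon’s point of view, here is a minimal sketch of the relevant environment settings. The host and port are placeholders for wherever you run pgbouncer; the key detail, per the Mastodon scaling documentation, is that prepared statements have to be turned off when pooling connections through pgbouncer.

# .env.production sketch, the host and port are assumptions for your own pgbouncer deployment
DB_HOST=pgbouncer.mastodon.svc.cluster.local
DB_PORT=6432
PREPARED_STATEMENTS=false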

When I created my helm chart, I took these lessons and applied them as conscious design decisions. Though not at all necessary for a small or single-user instance, my chart breaks out all of the current Sidekiq queues into separate processes. This layout ensures the hard work of separating the processes is already done and the rest is a matter of scaling and tuning.
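Broken out, each queue ends up as its own Sidekiq process (in Kubernetes, its own Deployment) instead of one process serving every queue. A rough sketch of what those processes run, with concurrency values picked purely for illustration:

bundle exec sidekiq -c 25 -q default
bundle exec sidekiq -c 25 -q ingress
bundle exec sidekiq -c 25 -q push
bundle exec sidekiq -c 25 -q pull
bundle exec sidekiq -c 15 -q mailer
bundle exec sidekiq -c 5 -q scheduler   # only ever run a single scheduler process

From there, scaling is mostly a matter of adding replicas of the busy queues and tuning their concurrency.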

As of this writing, my helm chart also installs a weekly cronjob to clean up media files and, optionally, a cronjob for backing up the database to some shared storage in your Kubernetes cluster. Though it is ultimately incomplete, I feel the helm chart is a good start.

As for actually running Mastodon for myself, I created a subdomain for my instance to live at and then installed Mastodon, using my helm chart, into my k3s cluster. Ignoring the cost of my ISP and the computers I already have, the cost of running Mastodon is quite minimal. My home lab provides everything I need to make Mastodon work, including persistent storage using TrueNAS. For media storage, I created a Cloudflare R2 bucket with a URL for public access. Mastodon is configured to send media content to R2, which is then served from the CDN URL. This keeps all of the heavy storage separate from the rest of the system. My last bill for R2 was just $0.06, which covered the approximately 20GB of content I have stored there. I do expect my next bill to be more because the average amount of data stored in R2 will be higher.
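For reference, pointing Mastodon at R2 uses the same S3-style settings it uses for any object storage. A hedged sketch of the values involved; the endpoint, bucket name and public hostname below are placeholders, so check the Mastodon object storage docs and your R2 dashboard for the real values:

S3_ENABLED=true
S3_PROTOCOL=https
S3_ENDPOINT=https://<account-id>.r2.cloudflarestorage.com
S3_BUCKET=mastodon-media
S3_ALIAS_HOST=files.example.com   # the public R2/CDN hostname media is served from
AWS_ACCESS_KEY_ID=<r2-access-key>
AWS_SECRET_ACCESS_KEY=<r2-secret-key>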

Since my installation is just a private one, I installed PostgreSQL and Redis as single instances within my k3s cluster. Both are extremely basic Bitnami-based installs using their available helm charts. PostgreSQL is backed by persistent storage provided by TrueNAS. For email, my k3s cluster runs an installation of Postfix. Postfix is configured to send email through Send In Blue, and the services I run in my cluster are configured to talk to Postfix. This gives me a single mail relay whose configuration I need to maintain.
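As a sketch, those installs amount to something like the following. The release names, namespace and values here are my own choices for illustration, not requirements, and a real deployment deserves more thought in the values files:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install redis bitnami/redis -n mastodon --create-namespace \
  --set architecture=standalone
helm install postgresql bitnami/postgresql -n mastodon \
  --set auth.database=mastodon \
  --set primary.persistence.storageClass=truenas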

Ingress is provided by Cloudflare and cloudflared tunnels. A tunnel runs on a separate VM I have and is configured on the Cloudflare side to route traffic, with the correct hostname included, to the Kubernetes cluster.

All said, this setup has proven reliable for me since mid December. In a future post I’ll discuss how I got my private instance to feel a bit more included in the Fediverse by adding relays. Please leave a comment if you feel I missed something or got something wrong.

A quick tip on a rather specific situation I found myself in, though I believe it could come up for a lot of people using WordPress and trying to integrate with ActivityPub networks. If you are:

  • Running WordPress
  • Using a page caching solution, whether Cloudflare APO or something manually configured
  • Running an ActivityPub plugin and/or webfinger

Then you will likely run into an issue where your site is not reliably discoverable when searched for. Using Matthias Pfefferle’s ActivityPub, Webfinger and Nodeinfo plugins to expose your WordPress site as an ActivityPub server adds a few routes to your site. One of those routes is the WordPress author page, which lives at /author/<author username>. When hit with a browser this path returns HTML, but ActivityPub instances will be asking for a different content type, application/activity+json. Unfortunately, many caching layers will not Vary on the Accept header, which you need in order to return different data depending on what type of content the requester is looking for.
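A quick way to see whether this is biting you is to request the author page twice with different Accept headers; the domain and author slug below are placeholders:

curl -sI -H 'Accept: text/html' https://example.com/author/someauthor/ | grep -iE 'content-type|cf-cache-status'
curl -sI -H 'Accept: application/activity+json' https://example.com/author/someauthor/ | grep -iE 'content-type|cf-cache-status'

If both requests come back as the same cached text/html response, your caching layer is not varying on Accept and remote instances will fail to read the ActivityPub version of the page.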

To resolve this on my site, which uses Cloudflare for CDN, I added a page rule that disallows caching for my author page. This works because I am the only author on the site. A full “proper” solution would be to set a Vary on the Accept header for that path, which Cloudflare does not support.

You may want to be very specific about which paths use Vary headers and which request headers you actually allow the cache to vary on. Allowing a wide or unlimited range of values makes it easy for people to break the cache at the CDN and send requests straight to your origin servers.

Last year around this time I realized I wasn’t overly happy with…anything. Work was stressful and I was in a bad mood most of the time. I couldn’t really pinpoint the cause, so on a whim I checked the app store for an app I could use to track my mood. I wanted to check off a lot of the things I did in a day that I thought might affect me and then review it later.

I landed on Dailyio (I pay to unlock the full reporting) and went to work adding a few items I wanted to track, including whether or not I was frustrated with work, had a lot of meetings, or was able to focus and make progress on things. There are a lot of default options to start with, and I added anything else I noticed I did often, like drinking (or not drinking) water, watching TV/movies, playing games and so on.

Here is what I discovered somewhat quickly:

  • Drinking water and taking walks boosted my mood
  • Too much caffeine tanks my mood
  • Video games and TV do not boost my mood a lot but listening to music does
  • Making progress and being able to focus at work boost my mood while too many meetings and interruptions tank my day
  • Going out with my wife on a date is a booster but spending time with extended family quickly drains me (I love you all but I get overwhelmed/drained quickly. It’s not you it’s introvert problems)
  • Related to the previous one, not getting enough time to myself was among the biggest factors in determining how I felt day by day.
  • Cooking, cleaning (easily the biggest surprise for me), writing and travel are all boosters
  • Skipping breakfast is horrible
  • Even a small decrease in sleep quality affects my day. This is something I have always intuitively understood but having data to back it up is great

Aside from those quick lessons, I have now logged just over a year of data and it has become apparent that some patterns follow a yearly cycle. My average daily mood has gone down significantly as my end-of-year workload has gone up.

After gathering data for a while I made a few changes early on. I made sure to drink a bit more water than I usually would and to have a bit less caffeine (through coffee or soda). I also almost completely stopped drinking non-diet soda, as I found that a high sugar intake the previous day very consistently left me feeling terrible in the morning. Whenever I can, I sit and listen to music with no distractions. I take more walks when I can and don’t feel at all bad if I abandon a TV show.

Using Dailyio has been a super interesting experiment that I will continue so that I can keep tracking trends and doing what I can to even out my daily mood. Just using what I learned in the first few months, I was able to change habits and routines to place an emphasis on the things that truly led to better outcomes each day. There are things I assumed were beneficial to me that turned out not to be, and removing them from my life didn’t have the impact I thought it would.

If you are at all interested in this sort of thing I highly recommend finding an app, either Dailyio or something like it, that allows you to track and report on what affects your mood from day to day.

One of the requirements when setting up a Mastodon instance is that you are able to send outgoing email. If you are running a personal instance you can easily get away with running something like Mailhog, which will simply capture all emails being sent and present them to you in a nice web interface. While setting up my personal Mastodon instance, however, I decided to set up a real smarthost/relay for my k3s cluster. I did this using Postfix configured to route mail through a smarthost. Search the web for details on how to do this; there are a lot of how-tos out there explaining the process if you are not familiar with it.

In the past I would have used my Gmail account as my SMTP relay. Earlier in 2022 Gmail removed the ability to do this, so I needed to find a replacement. I ended up settling on https://www.sendinblue.com because they offer a free tier that allows for 300 emails per day. Since everything I do in my cluster is personal there is no way I’ll ever hit that limit, and even if I did I wouldn’t mind if email simply stopped working until the next day. Setup was easy: create an account (giving them a bit of information), visit the SMTP & API page, click SMTP and grab the credentials to put into Postfix.
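For anyone curious about the Postfix side, here is a rough sketch of relaying through an authenticated smarthost. The relay hostname, port and credential format are assumptions from memory, so verify them against your provider’s SMTP documentation:

postconf -e 'relayhost = [smtp-relay.sendinblue.com]:587'
postconf -e 'smtp_sasl_auth_enable = yes'
postconf -e 'smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd'
postconf -e 'smtp_sasl_security_options = noanonymous'
postconf -e 'smtp_tls_security_level = encrypt'
echo '[smtp-relay.sendinblue.com]:587 you@example.com:your-smtp-key' > /etc/postfix/sasl_passwd
postmap /etc/postfix/sasl_passwd
systemctl reload postfix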

SendInBlue SMTP/API interface

I am not affiliated with SendInBlue in any way, just sharing something I found that allows you to quickly set up an SMTP relay for free.

In this post I’m going to go over some of the things I have picked up about Mastodon over the past few weeks. Mastodon is better described right from the source but, in very basic terms, it is a sort of distributed, better known as “federated“, Twitter-like system comprised of any number of hosts communicating together using a standard set of protocols. Users of Mastodon exist on separate installations called instances, each of which has its own core topic or target audience. For example, I am currently on https://fosstodon.org which is focused on bringing together people interested in Free and Open Source Software. Being on an instance does not mean you are only able to interact with people on that instance or that you must stay on topic. Every instance (unless blocked by the instance admin, more on that later) is connected to other instances by way of users following each other. What this means is if I follow someone on a different instance, the instance I am on becomes aware of it and will begin to draw in posts from that instance. In a way, the more people cross-follow each other across instances, the stronger the bond and the more data flows across all of them.

Larger instances with a lot of users will have the most diverse feeds available to you as a user, so there is a certain advantage to being part of a large community: the pool of potential people you can interact with is naturally larger. Each instance has multiple feeds, consisting of your home feed made up of the people you have chosen to follow, the local feed consisting of everyone on your instance, and the federated feed containing everything the server has discovered from its users, so you can change your social media experience on the fly. In addition to the multiple feeds, you have powerful self-moderation tools, including the ones you would expect like muting or outright blocking users, but you can also block entire instances with a couple of clicks. Just not into people posting Waifu stuff? Block entire instances dedicated to that material. Even if it doesn’t violate any acceptable use policy, you are able to quickly block entire types of content from your feed.

The most recent release of Mastodon lets you follow hashtags, making it easy to follow topics you like so long as users tag their posts properly. This is an extra layer of awesomeness in the Mastodon system, which brings me to one of my last points about using Mastodon: there is no “algorithm” pushing content towards you. Gone are posts that rise to the top because someone paid money to put them there. No more toxic, hot take garbage from bots filling your feed and drowning out what you’re really looking for.

Owners of Mastodon instances are encouraged to commit to the Mastodon Server Covenant, an agreement to certain terms intended to give users confidence in the Mastodon network and the instances they join. An instance that commits to the covenant agrees to, among other things, moderate content, back up the service and give users at least three months’ warning before stopping service. Committing to the covenant is optional, but in exchange the instance is listed as having committed to it, giving it much better visibility to potential new users than instances that haven’t.

Moving on, the more technical, behind-the-scenes side of Mastodon is just as interesting. The most well known and visible piece is ActivityPub. ActivityPub is not unique to Mastodon; it can be implemented by any number of systems and indeed a number of federated services use it. ActivityPub is the portion that pushes content around between users and other instances. There is even a WordPress plugin, which this site uses, that allows people to subscribe to a feed of my blog posts. The additional services using ActivityPub are beyond the scope of this post.

Unlike Twitter, you can host your own Mastodon instance using the code available at https://github.com/mastodon/mastodon by following the directions at https://docs.joinmastodon.org/admin/prerequisites/. While I have no interest in hosting my own instance long term for a variety of reasons, I did want to understand how the software works as a whole and what it takes to host and scale it. As you can imagine, hosting a large instance is definitely a technical challenge, to say nothing of the challenges of cost and moderation.

In an effort to learn how to run Mastodon, I created my own Helm chart, available at https://github.com/dustinrue/mastodon-helm-chart. My helm chart is functional but really only for a specific setup and is not currently ready for wide use. However, it taught me a lot about the moving parts that make up a full Mastodon instance. Designed to be highly scalable, Mastodon requires just two external services (ignoring file asset storage): Redis and PostgreSQL. Redis is used for caching various pieces of data while PostgreSQL stores the data that must persist, including user accounts, their settings, posts and so on. You also provide storage, which can be local storage or object storage like S3. Mastodon itself ships with everything else you need, which consists of Sidekiq with its various queues, a web socket based streaming application and a web app. The web app is what users see when they access your instance, the streaming portion feeds data to your browser as it comes in, and Sidekiq handles a bunch of different background tasks including shipping your posts to subscribed instances, ingesting updates from other instances and getting images and videos ready (and pushed into S3 if you are using that). All of the Sidekiq queues can be broken out into individual processes (which is absolutely necessary for large installations) so that more work can be done in parallel. It is very clear that the designers of Mastodon have considered each aspect of hosting a large installation and taken care to ensure it is possible. In fact, there is a sort of “brain dump” on what it takes to scale Mastodon from one of the primary stakeholders of the project at https://gist.github.com/Gargron/aa9341a49dc91d5a721019d9e0c9fd11.

In my short time with Mastodon, it has become one of the top three most exciting pieces of technology I have interacted with in my career, the other two being Linux and Kubernetes. The fact that I can run Mastodon on the other two makes it even better. Mastodon, and the protocols that power it, are the real “web 3.0” because they give power back to the users. Come to think of it, Mastodon is more like web 1.0, where its users hold the power and are in control. It was because of the Twitter shakeup that I discovered Mastodon, and I hope that the current social media climate continues to bring greater awareness to the Mastodon network so it is able to continue growing. You can find me on fosstodon.org as @ruedu@fosstodon.org.

For a number of years I have self-hosted a few things out of my house. This has always meant that I need a way to allow public access to resources hosted on my home lab. In the past this has meant having a single system on my network that could act as a sort of gateway to everything else. This system runs Nginx and a number of vhost configurations to route traffic to the correct VM running whatever it is I’m looking for.

For this to work, I have to insert port forwards into my router so that traffic destined for port 80 or 443 is forwarded to the Nginx proxy. In addition to this, I have to run a script on the proxy that checks my current public IP address and if it has changed, update a DNS record within Cloudflare. This has served me well for years but I have always wondered if there was another option.

As it turns out, there is and it is called Cloudflare Tunnels. As someone who has used Cloudflare for years as a free CDN in front of my blog, I was super happy when Cloudflare made most of their Zero Trust functionality free for small users. Their Zero Trust platform consists of a number of elements but today I’m going to concentrate on just the Tunnel piece.

With the sudden increase in interest in Mastodon as a social media platform I decided to play around with the software and try running it myself. Even though I don’t plan on running it on my own any time soon and am totally happy throwing money at some instance admins (which you should consider doing: https://hub.fosstodon.org/support/), I thought it would be a fun exercise. Anyway, once I had the software running I needed a way to make it accessible to the public. This is where I turned to Cloudflare Tunnels.

Of course, for any of this to work you will need a Cloudflare account with a domain configured (which is free…and I am in no way connected with Cloudflare other than being a customer). A Cloudflare tunnel works using an agent running on a host inside your internal network that is then authenticated with your Cloudflare account. Creating the tunnel is simple:

  • Sign into your Cloudflare account
  • Click Zero Trust
  • Click Access and then Tunnels
  • Click the Create Tunnel button
  • Give the tunnel a name and click Save
  • Choose your operating system from the list and then paste in the commands you see (a rough sketch of what they look like follows this list). If you are using a Raspberry Pi you may need to get the latest release from https://github.com/cloudflare/cloudflared/releases. Once installed, use the second command shown, which is meant for an existing installation of cloudflared. Advanced users can simply add the tunnel themselves.
  • Once cloudflared is running on your system it will show up as connected
  • Click Next
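The dashboard generates the exact commands with your tunnel token embedded, but they boil down to something like this (the token is a placeholder):

# install and run cloudflared as a service using the token from the dashboard
sudo cloudflared service install <TUNNEL_TOKEN>
# or, if cloudflared is already installed, just run the connector directly
cloudflared tunnel run --token <TUNNEL_TOKEN>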

On the next screen you will be given the option to connect a hostname to the tunnel. For this post I will describe how I set up access to my mastodon.dustinrue.com instance. You will be presented with the following screen:

For my mastodon.dustinrue.com instance, I entered mastodon as the subdomain and selected my dustinrue.com domain. In the service section I selected https and gave it the IP address of my internal k3s system. My final config looked similar to this:
I use HTTPS because my internal k3s system’s ingress controller uses cert-manager to create certificates. From here I needed to add some additional application settings to ensure everything worked correctly. Note: you don’t want to use an existing subdomain here (unless you really know what you are doing) because Cloudflare is going to want to insert a new DNS record to make the tunnel work.

Since I am using an SSL service type, I need to specify the name to send as part of the request so that the ingress controller knows what resource is being accessed. Also, although I do use cert-manager I didn’t set up a TLS record for cert-manager to use, so it just applied a self-signed cert. This is fine in this case because the public will see the certificate that Cloudflare provides, not what my k3s cluster has. For these reasons, my TLS section is configured like this:

The Origin Server Name and No TLS Verify are the important options.

In addition to the TLS settings, the regular HTTP section also needs some attention. For this, my settings look like this:

From here, click Save. Cloudflare will insert the correct DNS record for your domain and the connection from the public to your resource will be made. If everything is configured correctly you will be able to access your resource at the subdomain you provided. Remember that all of the same rules apply here: everything in the path needs to be aware of whatever vhost you are using. The trick is that there are no port forwards in your router and there is a protected path from Cloudflare into your home network; your home network remains unexposed to the public.

I hope this quick guide is enough to get you going on using Cloudflare tunnels. In a future post I will describe how to use the same system but for SSH connections. This is a powerful tool when combined with Cloudflare Zero Trust because it allows you to define who is allowed to access systems. Zero Trust can also be used to protect specific routes on a site (or the entire site) if you want. For my blog, I use Zero Trust to protect the admin and login pages.

In a previous post I quickly mentioned that this site now has pfefferle’s ActivityPub plugin installed. This plugin implements enough of the ActivityPub and associated protocols to allow a WordPress site to look and behave a bit like a user on an ActivityPub compatible platform, including Mastodon and more. Once a site has the plugin installed, you can search for one of its authors and follow them so you see a stream of their content whenever they post. From there you can comment on a post and interact with it from within your favorite Fediverse platform.

In this post I quickly describe how to get started with the plugin. To start, install the plugin (https://wordpress.org/plugins/activitypub/) using your usual method for installing plugins. For me, that means adding it to a composer.json file; for you it might mean simply searching for the plugin in your WordPress admin -> Plugins screen. Once installed, activate the plugin. That’s it! Your site is now ready to be followed by anyone within the Fediverse network.
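If you happen to go the composer route like I do, a minimal sketch looks like the following, assuming you pull WordPress plugins through the wpackagist mirror and have WP-CLI available; neither is required and the Plugins screen works just as well:

composer require wpackagist-plugin/activitypub
wp plugin activate activitypub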

The plugin implements the bare minimum required, it seems, so it can be a bit confusing when there are no immediately obvious visual changes to anything. Most of what the plugin does is in the background, inserting routes that are necessary to make the webfinger and ActivityPub protocols work on your site. Don’t worry though, the plugin is working!
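One way to convince yourself it is working is to hit the WebFinger route the plugin adds. The domain and username here are placeholders for your own site and author account:

curl -s 'https://example.com/.well-known/webfinger?resource=acct:someauthor@example.com' | python3 -m json.tool

A JSON document with subject and links entries means the plugin is answering WebFinger lookups and your author account can be found from other instances.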

To get the search string people need to use to follow your blog posts or pages, visit your user profile page. You should see something similar to this near the bottom of the page:

These profile identifiers can be pasted into the search bar of an instance and from there you can follow the author. Simply take the @ruedu portion that you see and paste it into the search bar of your Mastodon instance. For simplicity, I put this into my Fosstodon profile https://fosstodon.org/@ruedu. This allows people to easily see my profile.

This final step was not immediately clear to me, but once I found this on my profile it was super easy to follow myself. Your followers will appear in Users -> Followers. Anyone that replies to a post on the Fediverse will be added as a comment on your site. On my setup, which uses Akismet, incoming comments were put into spam for some reason.

I have installed this plugin (https://wordpress.org/plugins/activitypub/) and activated it, which means you can now follow my blog as if it were any other user on the Fediverse network. Pasting @ruedu into the search bar of your favorite Mastodon (and more) instance will allow you to see posts when I publish them. I will soon stop posting directly on my profile whenever I publish content, in favor of letting people choose when to get spammed with my content.

Sometimes you need to run or build containers for a different architecture than you are using natively. While you can tap into buildx for building containers, running containers built for an architecture other than yours requires either Docker Desktop and its magic, or an image that was built in a specific way, which rarely happens.

Using Colima’s built in CPU architecture emulation it is possible to create a Colima instance, or profile, for either arm64 (aarch64) or amd64 (x86_64) on both types of Mac, the M series or an Intel series. This means M series Macs can run x86_64 containers and Intel Macs can run arm64 based images and the containers won’t be aware that they aren’t running on native hardware. Containers running under emulation will run more slowly than they would if run on native hardware but having the ability to run them at all is really useful at times.

Here is what you do to set up a Colima profile running a different CPU architecture. I’m starting with an M1 based system with no Colima profiles created. You can see the current profiles by running colima list.

colima list
WARN[0000] No instance found. Run colima start to create an instance.
PROFILE STATUS ARCH CPUS MEMORY DISK RUNTIME ADDRESS

From here I can create a new profile and tell it to emulate x86_64 using colima start --profile amd64 -a x86_64 -c 4 -m 6. This command will create a Colima profile called “amd64” using architecture x86_64, 4 CPU cores and 6GB of memory. This Colima profile will take some time to start and will not have Kubernetes enabled. Give this some time to start up and then check your available Docker contexts using docker context ls. You will get output similar to this:

docker context ls
NAME TYPE DESCRIPTION DOCKER ENDPOINT KUBERNETES ENDPOINT ORCHESTRATOR
colima-amd64 * moby colima [profile=amd64] unix:///Users/dustin/.colima/amd64/docker.sock
default moby Current DOCKER_HOST based configuration unix:///var/run/docker.sock

From here I can run a container. I’ll start with an alpine container, running docker run --rm -ti alpine uname -a to check what architecture it is running under. You should get this in return if you are on an M series Mac – Linux 526bf44161d6 5.15.68-0-virt #1-Alpine SMP Fri, 16 Sep 2022 06:29:31 +0000 x86_64 Linux. Of course, you can run any container you need that is perhaps x86_64 only.

Next I am going to run Nginx as an x86_64 container by running docker run --rm -tid -p 80:80 --name nginx_amd64 nginx. Once Nginx is running I will demonstrate how to connect to it from another container running on a different architecture. This is super useful if you are testing different pieces of software together but one isn’t available natively for your platform.

Now I’ll create a native instance of Colima using colima start --profile arm64 -c 4 -m 6. When this completes, docker context ls will now show a new context that it has switched to. Running docker ps will also show there is nothing running. You can switch between contexts using docker context use followed by the name of the context you want to use.

With the new context available I will start a copy of Alpine Linux again and add the curl package using apk add curl. With curl available, running curl host.docker.internal will show a response from Nginx!
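As a sketch, the whole check looks like this, assuming the arm64 context is the active one:

docker run --rm -ti alpine sh
# inside the container:
apk add curl
curl -s http://host.docker.internal
# the default "Welcome to nginx!" page comes back from the amd64 container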

Now that I am done testing I can remove the emulated profile using colima delete amd64 and the profile will be removed and cleaned up. Easy.