TLDR; The fix for this is to ensure you are forcing your CDN to properly handle “application/activity+json” in the Accept header vs anything else. In other words, you need to Vary on Accept, but it’s best to limit it to “application/activity+json” if you can.
With the release of ActivityPub 1.0.0 plugin for WordPress I hope we’ll see a surge in the number of WordPress sites that can be followed using your favorite ActivityPub based systems like Mastodon and others. However, if you are hosting your WordPress site on Cloudflare (and likely other CDNs) and you have activated full page caching you are going to have a difficult time integrating your blog with the greater Fediverse. This is because when an ActivityPub user on a service like Mastodon performs a search for your profile, that search will land on your WordPress author page looking for additional information in JSON format. If someone has visited your author page recently in a browser then there is the chance Mastodon will get HTML back instead resulting in a broken search. The reverse of this situation can happen too. If a Mastodon user has recently performed a search and later someone lands on your author page, they will see JSON instead of the expected results.
The cause of this is because Cloudflare doesn’t differentiate between a request looking for HTML or one looking for JSON, this information is not factored into how Cloudflare caches the page. Instead, it only sees the author page URL and determines that it is the same request and returns whatever it has. The good news is, with some effort, we can trick Cloudflare into considering what type of content the client is looking for while still allowing for full page caching. Luckily the ActivityPub has a nice undocumented feature to help work around this situation.
To fix this while keeping page caching you will need to use a Cloudflare worker to adjust the request if the Accept header contains “application/activity+json”. I assume you already have page caching in place and you do not have some other plugin on your site that would interfere with page caching, like batcache, WP SuperCache and more. For my site I use Cloudflare’s APO for WordPress and nothing else.
First, you will want to ensure that your “Caching Level” configuration is set to standard. Next, you will need to get setup for working with Cloudflare Workers. You can follow the official guide at https://developers.cloudflare.com/workers/. Next, create a new project, again using their documentation. Next, replace the index.js file contents with:
export default {
async fetch(req) {
const acceptHeader = req.headers.get('accept');
const url = new URL(req.url);
if (acceptHeader?.indexOf("application/activity+json") > -1) {
url.searchParams.append("activitypub", "true");
}
return fetch(url.toString(), {
cf: {
// Always cache this fetch regardless of content type
// for a max of 5 minutes before revalidating the resource
cacheTtl: 300,
cacheEverything: true,
},
});
}
}
You can now publish this using wrangler publish. You can adjust the cacheTtl to something longer or shorter to suite your needs.
Last step is to associate the worker with the /author route of your WordPress site. For my setup I created a worker route of “*dustinrue/author*” and that was it. My site will now cache and return the correct content based on whether or not the Accept header contains “application/activity+json”.
Remember that Cloudflare Workers do cost money though I suspect a lot of small sites will easily fit into the free tier.