Orchestrating Metadata: Building an Agent-Friendly SEO Engine

By Natalie Walls

I'm in the early stages of building my personal blog. So far I've chipped away at an automated RSS feed, an automated sitemap, and manual work (or brute-forcing with AI) for SEO metadata.

The goal is simply to write .mdx files and commit them to the content/blog directory. It was easy enough to make (or tell agents to make) scripts that generate the sitemap.xml and rss.xml files.

I recently learned about the JSON-LD spec and how it can help with SEO, and I figured I'd try including it in my build process as well. It's apparently "what all the cool kids in SEO [were] using" in 2024, and it's picked up by Google, which is recent enough for me.

Since I'm going from two dynamic metadata builds to three, it's probably time to replace npx scripts/rss.ts && npx scripts/sitemap.ts with something like a scripts/generate-metadata.ts.

scripts/generate-metadata.ts

1. Remove redundant file scans

The orchestrator starts by fetching all blog post metadata exactly once. This single array of data becomes the source of truth for everything that follows.
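The single-scan idea can be sketched like this. This is a minimal, dependency-free sketch, not the blog's actual code: the `PostMeta` field names and the `postsFromListing` helper are my assumptions. The real orchestrator would call `fs.readdirSync` on content/blog exactly once and pass the listing to every generator.

```typescript
import path from "node:path";

// Hypothetical metadata shape; field names are assumptions, not the
// blog's actual front-matter schema.
interface PostMeta {
  slug: string;
  title: string;
}

// Turn one directory listing of content/blog into post entries.
// Every downstream generator consumes this array, so no later step
// has to re-scan the filesystem.
function postsFromListing(files: string[]): PostMeta[] {
  return files
    .filter((file) => file.endsWith(".mdx"))
    .map((file) => {
      const slug = path.basename(file, ".mdx");
      return { slug, title: slug };
    });
}
```

The key property is that filesystem I/O happens once, at the top, and everything after operates on plain data.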

2. Unified Generation

Instead of separate scripts, the orchestrator handles three distinct outputs:

  • RSS Feed: Transforms the post array into a valid XML feed.
  • Sitemap: Combines static routes and blog posts into a comprehensive XML sitemap.
  • JSON-LD Provisioning: This was the most important addition. Instead of generating individual files, it produces a public/metadata.json file containing structured BlogPosting schema for every post.
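The JSON-LD provisioning step might look something like the sketch below: map each post's metadata to a schema.org BlogPosting object and collect them into the single payload written to public/metadata.json. The `PostMeta` field names here are my assumptions about the front-matter schema, not the actual code.

```typescript
// Hypothetical front-matter shape for the sketch.
interface PostMeta {
  slug: string;
  title: string;
  description: string;
  date: string;
}

// Build the metadata.json payload: one schema.org BlogPosting
// object per post, all collected under a single "posts" key.
function buildJsonLd(posts: PostMeta[]) {
  return {
    posts: posts.map((post) => ({
      "@context": "https://schema.org",
      "@type": "BlogPosting",
      headline: post.title,
      description: post.description,
      datePublished: post.date,
    })),
  };
}
```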

3. Agent-Friendly Design

Doing things automatically, with clear feedback, is great for agents.

A note on my dev workflow: I'm trying to build a lot of things quickly and test different agentic workflows. I've been using a workflow copied from the Claude Code plugin Superpowers with OpenClaw, but my claw instance is currently nuked. Today I'm trying it with Gemma 4 and local AI on my RTX 3090. It took a couple hours of fiddling, but I have a pretty capable model running in memory with a 262k context window. Unfortunately, the OEM A4B Q4 quantization with on-GPU KV cache is just a little too big for my RTX 3090 and 32GB of RAM, so I'm using a quantized version from Unsloth.

I'm using Cline in VS Code, which is very easy to point at my LM Studio localhost. I also tried overriding ANTHROPIC_BASE_URL to use Claude Code (CC), but CC spends so much time saving and reloading cached prompts that Cline gives better results much more quickly.

Implementation Details

Building this for me looked like:

  1. Got my local AI working; tested different setups with the prompt "Read all the files in the project, explain how it's set up, how to contribute, and give friendly expert feedback. Put the feedback in a new file FEEDBACK.md."
  2. Like I do with OpenClaw, I tried just asking it to "use the superpowers flow" to design the JSON-LD idea from the feedback it gave. I made sure to direct it to integrate with and refactor my existing metadata automation.
  3. The resulting plan was similar in quality to what I've gotten from OpenClaw or the GitHub Specify workflow, and it didn't use much of the 262k context window, so I told it to continue.
  4. It implemented failing tests (following RED YELLOW GREEN approach specified in Superpowers)
  5. After thinking a bit, it implemented most of the new logic in one shot. However, it failed to fix a typo several times, rewriting the entire file including the typo it was specifically trying to fix 🙃. I'm pretty impressed by the quality of responses so far from Gemma 4, so I'm inclined to think this is because I'm using the more aggressive Q3_K_M quantization of unsloth/gemma-4-26B-A4B-it-GGUF.
  6. I fixed the triple ''' that was breaking my Clanker, told it the file lints successfully, and let it continue. It finished integrating with the tests, which pass.
  7. I sanity-check the git diff, make sure my existing tests for the sitemap and RSS actually pass, and confirm it's actually writing a metadata.json.
  8. I ask it to write a skeleton for a devlog blog post about building the JSON-LD feature and the refactor. I'm editing it right now. 😉
  9. Commit with husky hooks and ship it 🚢!

For the actual implementation details:

The core of the orchestrator uses gray-matter for parsing front matter and zod for ensuring that the data we're about to turn into XML/JSON is actually valid.
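To show the shape of that parse-then-validate step without pulling in either package, here is a dependency-free stand-in. This is a sketch under assumptions: the real code uses gray-matter and a zod schema, and the required field names (`title`, `description`, `date`) are my guess at the front-matter contract.

```typescript
interface FrontMatter {
  title: string;
  description: string;
  date: string;
}

// Minimal stand-in for gray-matter: pull "key: value" pairs out of a
// leading --- delimited front-matter block.
function parseFrontMatter(source: string): Record<string, string> {
  const match = /^---\n([\s\S]*?)\n---/.exec(source);
  if (!match) throw new Error("missing front-matter block");
  const fields: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const sep = line.indexOf(":");
    if (sep > -1) {
      fields[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
    }
  }
  return fields;
}

// Minimal stand-in for the zod schema: fail the build loudly if a post
// is missing a required field, instead of emitting broken XML/JSON.
function validateFrontMatter(fields: Record<string, string>): FrontMatter {
  for (const key of ["title", "description", "date"] as const) {
    if (!fields[key]) throw new Error(`front matter missing required field: ${key}`);
  }
  return { title: fields.title, description: fields.description, date: fields.date };
}
```

The design point is the same either way: validation happens once, right after the single scan, so every generator downstream can trust its input.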

One of the most powerful parts of this setup is how it handles JSON-LD. By generating a single metadata.json at build time, my Next.js application can import this file and inject the appropriate <script type="application/ld+json"> block into the <head> of each page. This gives search engines structured data without any runtime overhead. This was the bot's idea... but I asked it to come up with it! 😅😓

```jsonc
// The orchestrator produces this:
{
  "posts": [
    {
      "@context": "https://schema.org",
      "@type": "BlogPosting",
      "headline": "Orchestrating Metadata...",
      "description": "How I unified RSS, Sitemaps...",
      // ...
    }
  ]
}
```
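On the consuming side, the injection step can be sketched as rendering that BlogPosting object into a script tag. In the real Next.js app this would be a <script type="application/ld+json"> element in the page head populated from the imported metadata.json; the string-rendering helper below is my illustration, not the actual component.

```typescript
// Mirrors the BlogPosting fields in the metadata.json example.
interface BlogPosting {
  "@context": string;
  "@type": string;
  headline: string;
  description: string;
}

// Render the JSON-LD <script> tag for one post.
function jsonLdScript(post: BlogPosting): string {
  // Escape "<" so the embedded JSON can never close the script tag early.
  const json = JSON.stringify(post).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

Because the JSON is produced at build time, this is pure string interpolation at render: no fetching, no recomputation per request.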

The Result

The benefits were immediate:

  • Build Speed: Reduced redundant filesystem operations.
  • Reliability: No more "drift" between the sitemap and the actual content.
  • SEO Boost: Automated, type-safe JSON-LD injection for every post. Having structured data where competitors don't can be the difference between merely being listed and being recommended by a customer's LLM.
  • Workflow Harmony: A single point of control that works perfectly for both me and my AI agents.

The "Free Token Abundance" is over. Take back ownership of your hardware. Embrace local.


Published on April 6, 2026