How a $35 Raspberry Pi Runs Our Entire AI Content Pipeline
- David Retzer
- Behind the Scenes
- 24 Mar, 2026
My AI agent runs 24/7 on a Raspberry Pi 4. Four gigabytes of RAM. No GPU. No cloud server. The hardware cost $35.
That agent, named Coon, built 38 n8n workflows, manages content across a site scaling to 168 pages, reviews its own pull requests, and optimizes its own prompts while I sleep. The product it runs: workay.io, a workation discovery platform that helps remote workers find cities, coworking spaces, and accommodation worldwide.
Seven cron jobs, five MCP servers, one bash script that once killed the orchestrator it was supposed to protect.
The Stack
A Raspberry Pi 4 (ARM64, 4GB RAM) sits behind my couch. It runs OpenClaw, an open-source agent orchestration framework. OpenClaw provides the gateway. Coon is my agent instance, the orchestrator that spawns Claude Code agents through ACP (Agent Communication Protocol) to do the coding work.
The site itself runs on Astro 5, a static site generator. That choice matters for this workflow. Astro outputs plain HTML at build time. No server, no database, no runtime to maintain. Content lives as JSON and MDX files in a git repo, validated by Zod schemas. An AI agent can write a JSON file, run astro build, and know immediately whether the output is valid. No staging server, no migration scripts, no ORM. The agent commits a file, CI builds the site, Cloudflare Pages deploys the static output. The simplest possible deployment target for an autonomous pipeline.
The rest of the stack: n8n handles workflow automation (38 workflows at last count). GitHub stores the repos. CodeRabbit reviews PRs. Cloudflare Pages deploys to production. I talk to Coon through Telegram.
A typical exchange looks like this:
Me: “Build a workflow that pulls air quality data for all cities and writes it to the city JSON.”
Coon: “RAM: 1.8GB free. No agents running. I’ll spawn a Claude Code agent in n8n-agent-devkit with task type n8n. Prompt assembled via build-prompt.sh (includes deploy mandate). Agent spawned, session acp-wf22-air-quality. I’ll notify you when it completes.”
(43 minutes later)
Coon: “Agent done. WF#22 deployed to n8n, activated, test passed (AQI 24 for Lisbon). PR #16 created. CI green. Ready to merge.”
Total running cost: $35 one-time for the Pi, roughly $100 per month in API calls to Anthropic.
┌──────────────────┐
│ David (Telegram) │
└────────┬─────────┘
│
▼
┌──────────────────────────────────────────┐
│ Coon (OpenClaw on Pi) │
│ │
│ ┌─────────────┐ ┌──────────────────┐ │
│ │ Claude Code │ │ n8n Workflows │ │
│ │ Agents │ │ (38 total) │ │
│ └──────┬──────┘ └────────┬─────────┘ │
│ │ │ │
│ ┌──────┴──────────────────┘ │
│ │ │
│ ▼ │
│ GitHub PRs ──► CodeRabbit Review │
│ │ │
│ ▼ │
│ CI (Astro Build + Lighthouse) │
│ │ │
│ ▼ │
│ Cloudflare Pages (live in 3-5 min) │
│ │
│ ┌─────────────┐ ┌──────────────────┐ │
│ │ Guardian │ │ Autoresearch │ │
│ │ (5 min) │ │ Optimizer (1h) │ │
│ └─────────────┘ └──────────────────┘ │
└──────────────────────────────────────────┘
The Content Pipeline
A page on workay.io goes from idea to production without me touching a keyboard.
n8n triggers the workflow. A schedule fires, or Umami analytics data shows a gap in coverage. Workflow #38 (the Content Feedback Loop) pulls traffic data from Umami’s API, identifies which city pages attract visitors, and queues more content of that type. Measure, generate, measure again.
Coon spawns a Claude Code agent in an isolated git worktree. The agent writes the content, validates it against our Zod schemas, and creates a PR with a conventional commit message.
Workflow #37, the Stop-Slop Quality Gate, scores every draft on 5 dimensions: directness, rhythm, trust, authenticity, density. Below 35 out of 50, the workflow rewrites the flagged sections and re-scores.
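The gate's control flow is simple enough to sketch. The five dimensions and the 35/50 threshold are the real ones; `score_dimension` and `rewrite_flagged` are stand-ins for the LLM calls the workflow makes, and the per-dimension cutoff of 7 and the three-pass cap are hypothetical:

```python
# Sketch of the Stop-Slop gate loop. Dimensions and threshold are real;
# the scorer, rewriter, weak-dimension cutoff, and pass cap are assumptions.
DIMENSIONS = ["directness", "rhythm", "trust", "authenticity", "density"]
THRESHOLD = 35   # out of 50: five dimensions, 0-10 each
MAX_PASSES = 3   # hypothetical: give up after a few rewrite rounds

def gate(draft, score_dimension, rewrite_flagged):
    """Score a draft; rewrite the weak sections and re-score until it clears the bar."""
    for _ in range(MAX_PASSES):
        scores = {d: score_dimension(draft, d) for d in DIMENSIONS}
        total = sum(scores.values())
        if total >= THRESHOLD:
            return draft, total, "pass"
        # Rewrite only the dimensions that dragged the score down
        weak = [d for d, s in scores.items() if s < 7]
        draft = rewrite_flagged(draft, weak)
    return draft, total, "fail"
```

The interesting property is that the gate is a loop, not a one-shot filter: a draft that fails gets a targeted rewrite, not a blanket regeneration.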
A separate review agent with fresh context double-checks the code via CodeRabbit. CI runs: Astro build, TypeScript check, Lighthouse audit. I click merge. Cloudflare Pages deploys within 3 to 5 minutes.
The pipeline produces 21 city listing pages today. We are scaling to 168 through programmatic SEO (7 sub-page types per city).
Concrete example: How a city page gets created
Take Lisbon. The Content Freshness Orchestrator (WF#32) fires on schedule and triggers 6 enrichment workflows in sequence:
- WF#16 pulls climate data from Open-Meteo: monthly temperatures, rain days, sunshine hours. Writes to `src/content/cities/lisbon.json` under the `climate` key.
- WF#18 fetches the EUR exchange rate (skipped for Lisbon, which already uses EUR). For Bangkok, it writes the THB→EUR conversion.
- WF#22 grabs air quality data. Lisbon scores AQI 24 (good). Mapped to a health score badge.
- WF#17 enriches country data via the RestCountries API: flag SVG, languages, timezone, region.
- WF#23 checks trending signals from Wikipedia page views and Hacker News mentions.
- WF#30 pulls review sentiment from Foursquare (when the credential works).
Each workflow writes its results to the city JSON via the GitHub Contents API. That triggers our content/ branch GitHub Action, which validates all JSON against Zod schemas and auto-merges to main if valid. Cloudflare Pages picks up the merge and rebuilds.
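A minimal sketch of the payload such a write sends, assuming the standard Contents API shape (`PUT /repos/{owner}/{repo}/contents/{path}`); the commit message and the `content` branch default here are illustrative, not the workflows' actual values:

```python
import base64
import json

def contents_payload(city_data, sha=None, branch="content"):
    """Build the JSON body for a GitHub Contents API file write.
    The API requires base64-encoded content and, when updating an
    existing file, the current blob SHA. Message format and branch
    default are assumptions for illustration."""
    raw = json.dumps(city_data, indent=2, ensure_ascii=False).encode("utf-8")
    body = {
        "message": "chore(enrichment): update city data",  # hypothetical format
        "content": base64.b64encode(raw).decode("ascii"),
        "branch": branch,
    }
    if sha:
        body["sha"] = sha  # omit on create, required on update
    return body
```

The SHA requirement is what makes concurrent workflow writes safe-ish: a stale SHA gets a 409 instead of silently clobbering another workflow's enrichment.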
The city page at /cities/lisbon renders the enriched data: climate chart, air quality badge, cost breakdown, visa info. Every data point from an automated API call.
When WF#38 (Content Feedback Loop) sees that Lisbon’s page gets above-average traffic, it flags Lisbon for deeper content: a blog post, additional sub-pages, or updated listings. Lisbon enters the generation queue for sub-pages.
38 Workflows: What They Do and How They Get Built
The n8n instance runs 38 workflows. Some examples:
| Workflow | What it does |
|---|---|
| WF#16 Climate Enrichment | Pulls monthly temperature, rain days, sunshine hours from Open-Meteo API for each city |
| WF#18 Currency Comparison | Fetches exchange rates, skips same-currency pairs, writes to city JSON |
| WF#22 Air Quality | AQI and PM2.5 data from Open-Meteo, mapped to a health score |
| WF#32 Content Freshness | Orchestrator that triggers all enrichment workflows, merges results into city data |
| WF#37 Stop-Slop Gate | Scores AI-generated text on 5 dimensions, rewrites sections that score below 35/50 |
| WF#38 Content Feedback | Pulls Umami analytics, identifies top-performing page types, feeds back into generation |
A Claude Code agent built every workflow, not a human. The process works like this:
- I tell Coon what the workflow should do (in plain language on Telegram)
- Coon writes a task prompt and runs it through `build-prompt.sh` with task type `n8n`
- The script auto-injects `n8n-deploy-mandate.md`, which forces the agent to deploy via the n8n API, activate the workflow, and test it with real data before creating a PR
- The Claude Code agent spawns inside `~/n8n-agent-devkit` with 9 n8n-specific skills loaded (node configuration, validation patterns, JavaScript code patterns, MCP tool usage)
- The agent uses the n8n MCP server to search for templates, validate nodes, deploy, and test
- Once the test passes, the agent exports the workflow JSON, commits it, and creates a PR
- The PR lands in the `n8n-agent-devkit` repo with the workflow JSON, a requirements doc, and an updated registry
The n8n-deploy-mandate.md block earned its place. Before we added it, agents kept creating workflow JSON and opening PRs without ever deploying to n8n. The workflow looked valid on paper but never ran. We documented the rule; agents ignored it after compaction. We moved the rule into the script. Problem solved.
The Prompt Assembler
Coon learned something the hard way: documentation does not enforce itself.
We wrote rules in AGENTS.md. The agent followed them for a while. Then Claude’s context compaction kicked in, the older instructions fell out of the window, and the agent started ignoring conventions. The fix: build-prompt.sh, a mandatory prompt assembler that runs before every agent spawn.
# Real usage — Coon runs this before every sessions_spawn
bash scripts/build-prompt.sh astro /tmp/task.txt ~/workation_astro > /tmp/assembled.txt
The script reads the task type (n8n, astro, or generic), grabs the task-specific prompt, then injects four blocks automatically:
- A chain-of-thought mandate (forces the agent to plan before coding)
- Task-type conventions (Astro gets `astro-conventions.md`, n8n gets `n8n-deploy-mandate.md`)
- The full `lessons-learned.md` file (every past mistake, every pattern we discovered)
- The target repo’s `CLAUDE.md` (project-specific rules)
The output includes a prompt_version hash tied to the git commit of the workspace. The autoresearch optimizer uses that hash to track which prompt version produced which results.
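One plausible recipe for that hash, sketched in Python: feed the workspace's HEAD commit (from `git -C <workspace> rev-parse HEAD`) plus the assembled prompt text into a digest, so identical prompts at the same commit always map to the same version. The exact recipe inside build-prompt.sh may differ; this shows the idea.

```python
import hashlib

def prompt_version(assembled_prompt, commit):
    """Derive a short, deterministic version id from the assembled
    prompt plus the workspace's git commit. Same prompt at the same
    commit -> same id, so the optimizer can attribute results to the
    exact prompt that produced them."""
    digest = hashlib.sha256(f"{commit}\n{assembled_prompt}".encode()).hexdigest()
    return digest[:12]  # short enough for logs and the task registry
```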
Scripts do not forget after compaction.
4GB: The Constraint That Shaped Everything
Four gigabytes sounds restrictive. It is. The honest memory budget:
| Process | RAM |
|---|---|
| OpenClaw Gateway | ~520 MB |
| Coon’s main session | ~780 MB |
| MCP servers (n8n, context7) | ~200 MB |
| OS and services | ~400 MB |
| Available for agents | ~1.9 GB |
| One Claude Code agent | 300-800 MB |
| One npm build (Astro) | ~500 MB |
| Free after agent + build | ~100-600 MB |
One agent plus one Astro build pushes the Pi to its limit. Two agents running in parallel would trigger an out-of-memory kill. We enforce a strict one-agent rule: AGENTS.md mandates a pre-spawn RAM check, and Coon never launches a second agent while one runs.
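The pre-spawn check itself is a few lines. A sketch of the gate, assuming it reads `/proc/meminfo` and that 1000 MB free is the floor (the floor value and function shape are assumptions; the real check lives in Coon's spawn routine):

```python
def can_spawn(meminfo, running_agents, min_free_mb=1000):
    """Pre-spawn gate: strict one-agent rule plus a RAM floor.
    meminfo is the raw text of /proc/meminfo. The 1000 MB floor is an
    assumption (one agent needs 300-800 MB, plus build headroom)."""
    if running_agents > 0:
        return False  # never run two agents in parallel on 4 GB
    for line in meminfo.splitlines():
        if line.startswith("MemAvailable:"):
            free_mb = int(line.split()[1]) // 1024  # value is in kB
            return free_mb >= min_free_mb
    return False  # could not verify the budget, so refuse to spawn
```

In use it would be called as `can_spawn(open("/proc/meminfo").read(), n_running)` before every sessions_spawn. Note the fail-closed default: if the memory state can't be read, nothing spawns.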
The Guardian watchdog backs up that rule with force.
The Guardian
Guardian is a bash script. It runs every 5 minutes via system crontab. It consumes about 2 MB of RAM and zero API tokens. Its job: keep the Pi alive.
The RAM check is the critical section:
# From guardian.sh — the RAM check that prevents OOM kills
RAM_FREE=$(free -m | awk 'NR==2{print $7}')
if [[ "$RAM_FREE" -lt 300 ]]; then
add_alert "CRITICAL" "ram" "OOM imminent: ${RAM_FREE}MB free" "killing_largest_agent"
# Kill the largest non-essential claude process
LARGEST_PID=$(ps -eo pid,rss,cmd --sort=-rss \
| grep "claude" | grep -v "openclaw" \
| head -1 | awk '{print $1}')
if [[ -n "$LARGEST_PID" ]]; then
# Safety: never kill our own ancestor process
# (learned this one the hard way)
MY_PID=$$
PID=$MY_PID
SAFE=false
while [[ "$PID" -gt 1 ]]; do
[[ "$PID" == "$LARGEST_PID" ]] && SAFE=true && break
PID=$(ps -o ppid= -p "$PID" 2>/dev/null | tr -d ' ') || break
done
if [[ "$SAFE" == "false" ]]; then
kill "$LARGEST_PID" 2>/dev/null || true
fi
fi
elif [[ "$RAM_FREE" -lt 500 ]]; then
add_alert "WARNING" "ram" "Low RAM: ${RAM_FREE}MB free"
fi
Guardian also tracks agent runtime. Any spawned agent running longer than 90 minutes gets a warning. Past 120 minutes, Guardian kills it. Session log files older than 72 hours get pruned (they once accumulated to 34 MB and ate into our headroom).
The ancestor-check exists because of a real bug: Guardian once killed Coon’s own session. The watchdog checking processes climbed the process tree, found its own parent, and sent SIGTERM to the orchestrator. We added the SAFE flag that same afternoon.
The Self-Improving Agent
Andrej Karpathy published a pattern called “autoresearch” for self-improving systems. We adapted it for agent orchestration.
The core principle: the agent that writes the prompts should not evaluate the prompts. Coon runs the optimizer in a separate session with fresh context. No shared memory, no confirmation bias.
The cycle runs hourly:
- `eval-prompt.sh` computes metrics from the task history (zero tokens, pure bash and Python)
- The optimizer reads the metrics, picks ONE change to a prompt block, and applies it
- The next hour’s tasks run with the updated prompt
- `eval-prompt.sh` measures again. Keep the change if metrics improved. Revert if they dropped.
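The zero-token measurement is plain arithmetic over the task registry. A simplified sketch of the computation (the real eval-prompt.sh reads more fields; task records here carry just a status and a Ralph Loop retry count):

```python
def eval_metrics(tasks):
    """Headline metrics from the task history. Each task dict has
    'status' ('done'/'failed'/'pending') and 'retries' (Ralph Loop
    count); the real registry carries more fields than this."""
    done = [t for t in tasks if t["status"] == "done"]
    # A task finished without any Ralph Loop retry is a first-attempt success
    first = [t for t in done if t.get("retries", 0) == 0]
    total = len(tasks)
    return {
        "total_tasks": total,
        "done": len(done),
        "first_attempt_success": len(first),
        "first_attempt_rate": round(len(first) / total, 3) if total else 0.0,
        "completion_rate": round(len(done) / total, 3) if total else 0.0,
    }
```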
The eval script outputs something like this:
{
"timestamp": "2026-03-23T14:00:00",
"window_days": 7,
"total_tasks": 12,
"done": 10,
"failed": 1,
"pending": 1,
"first_attempt_success": 8,
"first_attempt_rate": 0.667,
"completion_rate": 0.833,
"avg_runtime_min": 23.4,
"recurring_errors": 2,
"task_types": {"astro": 7, "n8n": 4, "generic": 1}
}
The primary metric is first_attempt_rate. If an agent completes a task without needing a Ralph Loop (our retry mechanism: analyze failure, fix prompt, retry up to 2 times), that counts as a first-attempt success.
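A Ralph Loop reduces to a small control loop. Sketched here with stand-in callables for the agent run and the failure analysis (the real mechanism lives in the agent-supervisor skill; the names and return shape are illustrative):

```python
def ralph_loop(run_task, analyze_and_fix, prompt, max_retries=2):
    """Run the task; on failure, analyze the error, patch the prompt,
    and retry up to max_retries times. run_task returns (ok, result);
    both callables stand in for the real agent and analysis calls."""
    attempts = 0
    while True:
        ok, result = run_task(prompt)
        if ok:
            # retries == 0 is what counts as a first-attempt success
            return {"status": "done", "retries": attempts, "result": result}
        if attempts >= max_retries:
            return {"status": "failed", "retries": attempts, "result": result}
        prompt = analyze_and_fix(prompt, result)  # fold the failure back in
        attempts += 1
```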
Three optimization runs have completed so far:
- Run 1: The optimizer added a “Definition of Done” checklist to `astro-conventions.md`. Agents must verify the build passes and TypeScript errors are gone before marking a task complete. Verdict: KEEP.
- Run 2: A “Pre-flight Check” requiring agents to verify the baseline build before making changes. Verdict: KEEP.
- Run 3: The optimizer found no actionable change. Verdict: NO_CHANGE_NEEDED. A system that knows when to do nothing is a system that works.
Each run costs about 3,000 tokens. Roughly $0.05 per hour. The optimizer pays for itself if it prevents even one failed task per week.
The Bookmark Pipeline
The optimizer tunes prompts based on metrics. But where do new ideas come from?
I scroll X, bookmark posts about OpenClaw, agent architecture, and prompt engineering. Coon reads those bookmarks and turns them into improvements.
The tool is bird, a CLI for reading X data. It runs on the Pi. Read-only. It never posts.
The flow:
- I bookmark interesting threads on X throughout the week
- `bird bookmarks --all --json` fetches every bookmark as structured data
- An agent filters for relevance: keywords like “agent”, “memory”, “skills”, “autoresearch”, “context window”
- Sub-agents read the full threads via `bird thread <url>` and extract actionable items
- Coon creates GitHub issues from the findings
- The issues enter the roadmap and get picked up by coding agents
In one session, this pipeline processed 764 bookmarks from 2026. It filtered down to 329 relevant posts, extracted 15 new ideas, and implemented 6 of them the same day: session file pruning, bash pre-checks for cron jobs, chain-of-thought mandates, a successful-patterns registry, and more.
I bookmark on my phone. The agent builds from those bookmarks. Autoresearch optimizes what exists. The bookmark pipeline discovers what to build next.
62 Skills
Coon and the Claude Code agents draw on 62 installed skills across 4 scopes:
Claude Code global (26 skills): These come from the community. The obra/superpowers collection handles advanced coding patterns. Vercel provides Next.js-specific skills. Anthropic’s own skills cover code review and testing.
OpenClaw Coon (21 skills): Five of these we built ourselves. error-logger tracks mistakes and promotes recurring ones to prompt-block rules after 3 repetitions. morning-brief sends a daily Telegram briefing at 8 AM with overnight metrics. self-review triggers a weekly self-assessment every Sunday at 8 PM. agent-supervisor manages Ralph Loops and timeout enforcement. memory-flush saves critical context before compaction wipes it.
Per-project skills: 4 for the Astro repo (frontend design, testing, PR review, data visualization) and 9 for the n8n agent devkit.
Each skill improves the next task. When an error recurs three times, we promote it to a rule. Rules feed into prompt blocks. build-prompt.sh injects those blocks into every future session.
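The promotion step is mechanical enough to sketch. The three-repetition threshold matches the error-logger behavior described above; the counter shape and the rule-text format are made up for illustration:

```python
PROMOTION_THRESHOLD = 3  # recurrences before an error becomes a rule

def record_error(counts, signature):
    """Bump an error's recurrence count. Return a rule line exactly
    once, when the count crosses the threshold, so the same rule is
    never promoted twice. Rule text format is hypothetical."""
    counts[signature] = counts.get(signature, 0) + 1
    if counts[signature] == PROMOTION_THRESHOLD:
        return f"RULE: avoid '{signature}' (seen {PROMOTION_THRESHOLD}x)"
    return None
```

The `==` rather than `>=` is the important detail: repetitions four and onward stay in the count but don't re-emit the rule.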
The .learnings System
Coon maintains a .learnings/ directory with five files:
- ERRORS.md tracks every failure with a recurrence count. When an error hits 3 repetitions, Guardian flags it for promotion to a prompt block.
- PATTERNS.md records successful approaches. If an agent finds a good way to handle Astro content collections or n8n node configuration, it goes here.
- DECISIONS.md logs architectural choices with reasoning. Six months from now, when someone asks “why did you use JSON files instead of a CMS?”, the answer lives in this file.
- PROMOTIONS.md tracks which error fixes got promoted into permanent rules.
- OPTIMIZATION-LOG.md records every autoresearch change, the hypothesis behind it, and whether it was kept or reverted.
The error-logger skill writes to these files. build-prompt.sh reads from them. The optimizer evaluates whether the accumulated knowledge improves task success rates. A tight loop running on bash scripts and flat files. No database, no vector store, no embeddings.
Seven Cron Jobs
The Pi runs seven scheduled jobs. Five at the system level (they run even if Coon is down), two inside OpenClaw.
| Schedule | Job | Token cost |
|---|---|---|
| Every 5 min | Guardian watchdog (RAM, processes, disk, gateway) | 0 |
| Hourly at :03 | Autoresearch optimizer (Sonnet session) | ~3K |
| Daily at 03:17 | Git backup (auto-commit workspace to remote) | 0 |
| Every 72h at 04:43 | Session file pruning (delete .jsonl > 3 days) | 0 |
| Monday 09:15 | Content feedback fetch (curl Umami webhook) | 0 |
| Weekdays 08:00 | Morning brief to Telegram | ~2K |
| Sunday 20:00 | Self-review + memory hygiene | ~5K |
The daily git backup exists because Raspberry Pi OS runs on an SD card. SD cards corrupt. Our entire memory system, task history, and config live in ~/.openclaw/workspace/. One corrupted sector could wipe weeks of agent learning. The backup pushes to GitHub every night at 3:17 AM. If the card dies, we clone and continue.
Total token cost for all seven crons: roughly 10K tokens per day. Under $1.
Five MCP Servers
Claude Code agents connect to five MCP (Model Context Protocol) servers. Each server gives agents access to tools they cannot reach through bash alone.
| Server | What it does |
|---|---|
| n8n-mcp | Create, validate, deploy, and test n8n workflows without touching the web UI |
| playwright | Browser automation for end-to-end testing |
| context7 | Injects current library documentation into agent context. Stops agents from hallucinating outdated API signatures |
| codebase-memory | A Go binary (ARM64 native) that indexes the repo into a queryable knowledge graph. Reduces codebase exploration from 400K tokens to 3K |
| tavily | Structured web search and page extraction. 1,000 free API calls per month |
The agents pick which MCP tools to use based on the task. An n8n workflow task reaches for n8n-mcp. A content research task uses tavily. The agent decides. Coon does not micromanage tool selection.
Safety Hooks
Three hooks fire in every Claude Code session on this Pi.
block-dangerous.sh intercepts every bash command before execution. It blocks rm -rf /, git push --force to main, and direct edits to openclaw.json via sed or echo. The hook reads the command from stdin, checks it against a pattern list, and returns exit code 2 to abort if it matches. Exit 0 means proceed.
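The real hook is a bash script; the same check is easy to picture in Python, with the pattern list abbreviated and approximate (the actual blocklist is longer and the regexes here are illustrative):

```python
import re

# Deny-patterns for the safety hook. Exit code semantics match the
# hook contract described above: 2 aborts the command, 0 lets it run.
BLOCKLIST = [
    r"rm\s+-rf\s+/(\s|$)",                        # wipe the filesystem root
    r"git\s+push\s+--force.*\b(main|master)\b",   # force-push to main
    r"(sed|echo)\b.*openclaw\.json",              # raw edits to the gateway config
]

def verdict(command):
    """Return the hook's exit code for a candidate bash command."""
    for pattern in BLOCKLIST:
        if re.search(pattern, command):
            return 2  # abort
    return 0  # proceed
```

Wired up as a hook, the script would read the command from stdin and `sys.exit(verdict(...))`. Note the root-wipe pattern only fires on `/` followed by whitespace or end-of-command, so `rm -rf ./node_modules` still works.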
This hook exists because an agent once tried to kill a process and matched its own parent PID. The ancestor-walk in Guardian caught the same class of problem. Defense in depth.
on-session-start.sh runs when a Claude Code session begins. It counts running agents, pending tasks, and available RAM, then injects a one-line summary into the session context. Every agent starts informed about the Pi’s current state.
on-stop.sh runs asynchronously when a session ends. It creates the daily log file if one does not exist yet.
Worktree Isolation
Every coding agent works in its own git worktree. The Pi’s main repo checkout stays clean. The agent gets an isolated copy with its own branch.
bash scripts/setup-worktree.sh fix/issue-42 "Fix listing schema" ~/workation_astro
# Creates ~/worktrees/fix-issue-42 with a fresh branch from origin/main
# Runs npm install, updates the task registry
The agent cannot accidentally break main. It commits to its branch, pushes, creates a PR. If the agent fails completely, we delete the worktree and lose nothing. After a successful merge, git worktree remove cleans up. Disk space is finite on an SD card; orphaned worktrees, each with its own node_modules, add up.
SQLite for History
The task registry lives in active-tasks.json. That works for Coon’s runtime reads. For analysis, we mirror the data into SQLite.
sqlite3 memory/tasks.db "SELECT status, COUNT(*) FROM tasks GROUP BY status"
# done|23
# stale|1
The migrate-tasks-to-sqlite.py script creates three tables: tasks (with task_type, retries, prompt_version), errors (parsed from ERRORS.md), and metrics (from the autoresearch eval script). The optimizer queries this database to correlate prompt versions with success rates. JSON stays as the runtime format. SQLite serves analytics.
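The correlation the optimizer runs is essentially one GROUP BY. A sketch against the mirrored database (column names follow the migration script described above; the exact schema and query are simplified):

```python
import sqlite3

def prompt_version_success(db_path):
    """Correlate prompt versions with first-attempt success rate,
    the kind of query the optimizer runs against the mirrored
    registry. Simplified: the real tables carry more columns."""
    con = sqlite3.connect(db_path)
    rows = con.execute("""
        SELECT prompt_version,
               COUNT(*) AS tasks,
               AVG(CASE WHEN status = 'done' AND retries = 0
                        THEN 1.0 ELSE 0.0 END) AS first_attempt_rate
        FROM tasks
        GROUP BY prompt_version
        ORDER BY first_attempt_rate DESC
    """).fetchall()
    con.close()
    return rows
```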
Lessons From Our Failures
The provenance key that killed Coon. We added "provenance": "meta+receipt" to openclaw.json to track agent lineage. The gateway tried to parse the new field, failed, and crash-looped. Every restart attempt hit the same bad config. Coon stayed dead for 40 minutes until I SSH’d in and reverted the change. The lesson: test config changes on a branch. We now have a pre-deploy validation step for all OpenClaw configuration changes.
Guardian killed its own boss. The early version of guardian.sh walked the process tree looking for runaway Claude processes. It found one: Coon’s own session, which had been running for 6 hours (because it is supposed to run forever). Guardian sent SIGTERM. The orchestrator died. All active tasks died with it. We added the ancestor-check loop that afternoon. Now Guardian walks up the tree from its own PID and skips anything in its lineage.
Compaction amnesia. Claude Code agents have a context window. When the conversation grows too long, the system compacts older messages. Rules written early in the session can vanish during compaction. We lost a full afternoon to an agent that stopped following our n8n deployment conventions mid-task. The prompt had the rules at the top, compaction dropped them, and the agent reverted to default behavior. build-prompt.sh solved this by re-injecting rules at the start of every spawn. The rules live in files, not in conversation history.
Looking Forward
The system works. But “works” and “finished” are different words.
Scaling to 168 pages. We have 21 city listings. Each city will grow to 7 sub-pages (coworking, cafes, accommodation, internet, cost breakdown, weather, visa info) through programmatic SEO. The pipeline can handle it. The Pi’s RAM cannot run the generation in parallel, so we queue the pages and generate one at a time. At 23 minutes average per page, 147 new pages will take about 56 hours of agent compute time spread across a week.
Browser automation. Google Search Console and Cloudflare analytics both require a browser session. We plan to launch headless Chromium on demand (190 MB RAM), scrape the data, and kill the process immediately. Chrome cannot stay resident on 4 GB.
Deeper autoresearch. The optimizer currently tunes prompt blocks. We want it to evaluate and improve skill descriptions too, so agents pick the right skill for each sub-task more often.
The Economics
The Pi cost $35. A 128 GB SD card cost $15. Power draw sits at about 5 watts, pennies per month. The API bill runs roughly $100 per month to Anthropic for Claude tokens. n8n runs on a Hetzner VPS (self-hosted, low cost). GitHub free tier handles our repos. Cloudflare Pages free tier handles deployment (500 builds per month, and we use around 90).
A comparable setup on a cloud VM (4 GB RAM, always-on) would run $20 to $40 per month for the compute alone. The Pi pays for itself in the first month and costs nothing after that.
The $100 API spend replaces what would take a content writer, a DevOps engineer, and half a product manager to accomplish manually. Whether that trade-off makes sense depends on your tolerance for debugging bash scripts at 2 AM when Guardian alerts you that RAM dropped below 300 MB.
I find it worth it. Your calculus may differ.
The architecture runs on any Pi 4 or newer. The constraint is the RAM, not the software.
The pSEO rollout starts this week: 147 new pages, generated one at a time, queued through the same pipeline. I will publish the real numbers on X as they come in: traffic per page, cost per page, first-attempt success rates.
If you run an AI agent on constrained hardware, I want to hear about it. Find me on X (@towlyy00) or check the workay.io blog for updates.
Parts of this content were created with AI assistance and editorially reviewed.