Growing Exoskeletons Part 3: Clawperators Notes
Operational notes from running two AI agents. Skills, tools, workflows, secrets, failure modes, and what we'd do differently.

Parts 1 and 2 built the philosophy and infrastructure. Part 2.5 hardened the box. The exoskeleton is grown. This is what it’s actually like to wear it.
Everything here has an expiration date. OpenClaw ships updates weekly. The skill ecosystem changes daily. If you’re reading this six months from now, the specifics will be wrong but the patterns should hold. We’re documenting the workflow as it exists in February 2026, not prescribing what it should be.
The setup: one human (me), Claude Code running locally on my laptop, and OpenClaw running on a hardened Ubuntu box behind Tailscale. Two AI agents, one human. Each needs different access, different secrets, different guardrails.
Yes, there are other AI coding tools. Cursor, Windsurf, Devin, Codex. We use Claude Code on localhost and OpenClaw on a remote box. That’s the stack. Everything here is specific to that pairing.
If you’re here for the war stories, skip to Section 6: Failure Modes. Everything else is setup. That’s where the lessons live.
What’s essential vs what’s extra: Sections 1-5 are the core. Skills, tools, workflows, collaboration, secrets. Get those right and you have a working setup. Sections 6-10 are lessons learned after running it for a few weeks. Greptile, commit signing, branch protection automation, and Vault are all nice-to-haves. You can skip them on day one and add them when the pain arrives.

1. Skills and Skill Management
This section assumes you know what skills are and how to manage your own workspace. If not, start there. We’re covering the operational side: what to install, how to vet it, and what we run.
We’re running on a Codex subscription for the AI provider, which affects model availability and cost.
Evaluating New Skills (Before Installing Anything)
After ClawHavoc, the evaluation process is non-negotiable. In late January 2026, a single attacker (“hightower6eu”) uploaded 354 malicious skills to ClawHub in an automated blitz. The eventual count hit 1,184 confirmed malicious packages. Most masqueraded as crypto trading tools. Payloads included Atomic macOS Stealer, credential exfiltration, and reverse shells.
Snyk’s ToxicSkills study scanned 3,984 ClawHub skills and found 36.8% had at least one security flaw. 13.4% had critical issues. 7.1% leaked credentials.
Our evaluation checklist:
- Read the SKILL.md first. Not the README, the actual SKILL.md. Look for instructions that ask you to run external download commands or copy-paste terminal commands. That’s the primary attack vector.
- Check the publisher. New accounts uploading many packages in a short time is the red flag that caught ClawHavoc.
- Run SkillGuard. It’s free, open source, and catches the obvious stuff.
- Check the
scripts/directory. Any shell scripts or binaries get manual review. - Watch for typosquats. Subtle misspellings of popular skill names are common.
- Be extra cautious with crypto skills. 54% of malicious skills were crypto-focused per Bitdefender’s analysis.
- Check for prompt injection in the SKILL.md itself. A skill doesn’t need
execorevalto be dangerous. It can contain instructions that trick the agent into usingweb_fetchto send workspace data to an attacker-controlled URL. AST scanning catches code-level threats. It doesn’t catch adversarial prompts embedded in the skill’s natural language instructions. Read the SKILL.md as if it were untrusted input, because it is.
SkillGuard is the primary scanner. It’s free, open source, and uses AST-based detection across JavaScript, Python, Go, and Rust. We run it as a skill itself (installed from openclaw/skills/c-goro/skillguard), so the agent can scan new skills before installing them. You can also run it standalone via CLI:
# Scan a skill directory before installing
node skillguard/src/cli.js scan ./path-to-skill --compact
# JSON output for scripting
node skillguard/src/cli.js scan ./path-to-skill --json
# Set a risk threshold (default varies)
node skillguard/src/cli.js scan ./path-to-skill --threshold 50
It catches shell injection (exec, spawn), dynamic code execution (eval, Function), unsafe deserialization, file system tampering, suspicious network requests, and known malicious npm packages.
Other free scanners worth knowing about:
| Scanner | Type | Cost | Notes |
|---|---|---|---|
| SkillGuard | AST-based, open source | Free | Our primary. GitHub: bossondehiggs/skillguard |
| Bitdefender AI Skills Checker | Pattern-based, web tool | Free | Found ~900 malicious skills in initial scans |
| SecureClaw | Security plugin + skill | Free, open source | Drift detection, live recommendations |
| ClawHub VirusTotal | Marketplace integration | Free | Added post-ClawHavoc. Scans on upload. |
| OpenClaw security audit | Built-in CLI | Free | openclaw security audit --deep |
ClawHub added VirusTotal scanning after ClawHavoc, which helps with known malware signatures but does nothing for novel attacks or subtle data exfiltration. SkillGuard catches the structural patterns that signature-based scanning misses.
Example: Writing Our Own GitHub Skill
Rather than installing steipete/github from ClawHub, we wrote our own. It’s 60 lines. Took five minutes. The advantage: it encodes our specific conventions instead of generic patterns.
Our agent is named OpenClaw. The skill enforces:
- Branch naming: All branches prefixed with
openclaw/(e.g.,openclaw/fix-layout-bug,openclaw/linear-PROJ-123-add-feature) - Co-authorship: Every commit includes both the human and the agent as co-authors
- Hard rules: Never push to main, never merge your own PRs, never force-push
Here’s the core of the SKILL.md:
---
name: github
description: "Interact with GitHub repositories using the gh CLI.
Create branches, push commits, open PRs, check CI, and manage
issues. All branches use the openclaw/ prefix. All commits are
co-authored with the human."
---
The body teaches commit formatting:
git commit -m "Description of change
Co-Authored-By: Ryan LaBouve <ryan@labouve.com>
Co-Authored-By: OpenClaw <openclaw@openclaw.local>"
And PR creation, CI checks, issue management. All using gh CLI with --repo owner/repo since the agent isn’t always inside the git directory.
Why write your own instead of installing one? Three reasons:
- You control the conventions. Branch naming, co-authorship, commit message format. These are your team’s rules, not generic defaults.
- You can audit every line. It’s your code. No SkillGuard needed.
- It’s trivial. A SKILL.md is just markdown instructions. If you can write a README, you can write a skill.
The skill lives on the OpenClaw box in the workspace skills directory (<workspace>/skills/github/SKILL.md). Workspace-level skills take highest precedence, so this shadows any ClawHub skill with the same name.
Prerequisite: getting gh into the container.
Since we run OpenClaw in Docker, gh needs to be inside the container. Three options:
- Add it to the Dockerfile (what we do). Since we already build from source (Part 2.5), we added
ghto the image:
# Add to your OpenClaw Dockerfile
RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \
| dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
| tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
&& apt-get update && apt-get install -y gh
-
Volume-mount the binary from the host. Avoids rebuilding but fragile across architecture mismatches.
-
Skip
gh, usecurlagainst the GitHub API. Less ergonomic but zero dependency.
We went with option 1. Auth uses the fine-grained PAT from Section 4, injected via the secrets workflow from Section 5.
Remember: the skill is just a textbook. It doesn’t grant the agent the ability to run gh. The exec tool does that. If exec is denied or gh isn’t in the container, the skill is instructions for a tool the agent doesn’t have.
Other Skills We Use (and One We Don’t)
arun-8687/tavily-search - AI-optimized web search via Tavily API. Returns concise, relevant results structured for agent consumption. We use this alongside the built-in Brave search. Tavily’s free tier gives 1,000 credits/month. The author name “arun-8687” is a numbered username, which isn’t automatically a red flag but warrants a SkillGuard scan before installing. The skill itself is clean: it wraps API calls to Tavily’s endpoints, no shell execution, no file system access beyond what you’d expect.
ThomasLWang/moltguard - Security plugin for local prompt sanitization and injection detection. Version 6.0.2 (actively maintained). Open source at github.com/openguardrails/moltguard. This is our second layer of injection defense alongside the ACIP framework from Part 2.5. The fact that it’s open source with a visible GitHub repo and multiple versions is a good trust signal. We can read every line of what it does.
zats/perplexity - “Search the web with AI-powered answers via Perplexity API.” We don’t use this one. Here’s why.
When we first looked at it, something felt off. Unknown author “zats.” Version 1.0.0. Wrapping a branded service name (Perplexity). No obvious connection to Perplexity the company. This is the exact pattern that should trigger your evaluation checklist:
- Is this an official skill from the service provider? No indication it is.
- Does the author have other skills or a track record? Not clear.
- Does a skill wrapping a third-party API need access to your environment? It shouldn’t, but what’s actually in the SKILL.md?
- Could this be typosquatting? A skill named “perplexity” from an unknown author could easily impersonate an official integration.
We passed on it. Maybe it’s perfectly fine. Maybe zats is a legitimate developer who built a useful Perplexity wrapper. But “maybe fine” isn’t the bar. The bar is “we reviewed it, ran SkillGuard, checked the author, and feel confident.”
We ended up going with aligurelli/perplexity-web-search instead. Version 1.5.0 (multiple iterations, not a one-and-done upload). More specific name (“perplexity-web-search” vs just “perplexity”). Uses Perplexity Sonar Pro for web search and URL fetching. Still ran it through SkillGuard before installing. The point isn’t that one author is trustworthy and another isn’t. The point is that you do the evaluation every time and pick the option that clears the bar.
The broader point about branded skills: When you see a skill named after a well-known service (perplexity, github, slack, notion), check whether it’s from the service provider, from the OpenClaw team (like steipete/github), or from a random third party. The first two are fine. The third needs scrutiny. This is the same logic as npm package provenance: @perplexity/search is different from perplexity-search-helper-v2.
Free vs Paid Skills
OpenClaw itself is free and open source. Most of ClawHub’s 5,700+ skills are free. The cost of running OpenClaw comes from the AI model API usage (Anthropic, OpenAI, etc.), not from the platform or skills.
A few premium skills exist in the $10-$200 range. Common pattern: free basic version to build downloads, paid version with full features. We haven’t needed a paid skill yet. The bundled skills plus a handful of free community skills cover our workflow.
2. Key Tools
Understanding the Distinction: Tools vs Skills
This is the most important architectural concept in OpenClaw and it’s easy to miss.
Tools are organs. Skills are textbooks.
Tools determine whether OpenClaw can do something. Skills teach it how to combine tools. Installing a skill does NOT grant new permissions. Example: the obsidian skill teaches OpenClaw how to organize notes, but without the write tool enabled, it can’t write anything. The skill is useless without the tool.
Core built-in tools:
| Category | Tools | Notes |
|---|---|---|
| File ops | read, write, edit, apply_patch, grep, find, ls | The basics. write has stricter path restrictions than exec. |
| Execution | exec, process | exec runs shell commands. process manages background sessions. |
| Browser | browser | Chrome DevTools Protocol. Denied by default in sandbox. |
| Scheduling | cron | Scheduled jobs and wakeups. |
| Sessions | sessions_list, sessions_history, sessions_send, sessions_spawn | Multi-agent coordination. |
| Messaging | Discord/Slack actions | Platform-specific integrations. |
The sandbox defaults are deliberate:
- Allowed:
bash,process,read,write,edit,sessions_* - Denied:
browser,canvas,nodes,cron,discord,gateway
We haven’t changed these defaults. The denied tools represent attack surface we don’t need for our workflow.
Web Browsing and Web Access
The agent needs to read web pages and search the internet. There are more options than you’d think, ranging from zero-setup free tools to full browser automation. Here’s the landscape from simplest to most complex.
Layer 1: Built-in web_fetch (free, out of the box)
OpenClaw ships with web_fetch enabled by default. It performs an HTTP GET, extracts readable content, and converts HTML to markdown. No JavaScript execution. Results are cached for 15 minutes. Good enough for static pages, articles, docs, and APIs.
This is where most agent web access should start. No config, no API keys, no extra skills.
Layer 2: URL-to-markdown services (free, zero setup)
For pages where web_fetch doesn’t produce clean results, you can route through a reader service. The simplest is Jina Reader: just prepend https://r.jina.ai/ to any URL and you get clean markdown back.
https://r.jina.ai/https://example.com/some-article
No API key needed for basic use (20 req/min). With a free key you get 200 RPM and 10 million free tokens. Watch the rate limits: an agent in a research loop can burn through 20 req/min quickly, and the failure mode is silent (returns an error page that looks like content). Get the free key. Jina handles JavaScript rendering server-side, reads PDFs natively, and auto-captions images using a vision model. No skill install required. The agent already has web_fetch. Just point it at the r.jina.ai URL.
Markdowner is the fully self-hosted alternative. Open source, runs on Cloudflare Workers, no paid tiers. Fewer features (no image captioning, no PDF) but completely free to run yourself.
Layer 3: Web search (free tier available)
web_fetch reads a URL you already have. For finding URLs, you need search.
OpenClaw’s built-in web_search uses the Brave Search API as its default provider. Brave gives you $5/month in free credits (~1,000 searches). A credit card is required for anti-fraud but isn’t charged for free usage. Configure with openclaw configure --section web.
For a truly free, self-hosted alternative: SearXNG. It’s a privacy-respecting metasearch engine that aggregates results from Google, Bing, DuckDuckGo, and 200+ other sources. Run it in Docker, point an MCP server at it, done. Multiple MCP implementations exist, including ihor-sokoliuk/mcp-searxng (36k+ downloads). Zero cost, full privacy, no third-party API keys.
Tavily is the AI-optimized search option. Free tier: 1,000 credits/month. Results are structured for RAG/agent use. Official MCP server available. Better result quality for agent tasks than raw Brave results, but credits don’t roll over.
Layer 4: Full browser automation (free, more complex)
When you actually need a browser (JavaScript-heavy sites, login flows, form filling), OpenClaw has three modes:
- Managed Headless - Standalone Chromium instance. No cookies, no sessions. Best for pure automation. This is what we use.
- Extension Relay - Chrome extension that preserves your session cookies and auth state. The agent sees everything your browser sees. We leave this off. Too much blast radius.
- Remote CDP - Connects to Chromium running elsewhere. Useful for Docker deployments.
For most web reading tasks, you never need to reach for the browser. web_fetch + Jina Reader covers 90% of it.
Layer 5: Paid scraping services (when free isn’t enough)
Firecrawl is the paid option for serious web scraping. Handles JavaScript rendering, anti-bot protections, structured data extraction, and full site crawling. Free tier: 500 pages (no credit card). Hobby: $16/month for 3,000 pages. Has an official MCP server with 83% accuracy in benchmarks.
Worth it when you need to crawl entire sites or scrape dynamic SPAs. Not worth it for reading individual articles.
Crawl4AI is the open-source alternative (58k+ GitHub stars). Fully free. Outputs clean markdown. Has its own browser mode for JS-heavy sites. Runs locally. The trade-off is you host and maintain it yourself, but for an agent on a server that’s not a dealbreaker.
What we actually use:
| Need | Our Tool | Cost |
|---|---|---|
| Read a web page | web_fetch (built-in) | Free |
| Clean markdown from messy pages | Jina Reader (r.jina.ai) | Free |
| Web search | Brave Search (built-in) | Free ($5/mo credit) |
| JS-heavy sites | Managed headless browser | Free |
| Full site crawl | Haven’t needed it yet | - |
We haven’t needed Firecrawl or the Extension Relay. If you’re doing things the way we described in Parts 2 and 2.5, the agent doesn’t need access to your authenticated sessions. Keep that capability off until you have a specific reason to turn it on.
File Handling and the Exec Loophole
There’s a documented inconsistency (GitHub issue #9348) where the write tool respects workspaceOnly restrictions but exec does not. An agent with exec access can write anywhere using shell commands.
In practice, this matters less than it sounds if you followed Part 2.5. Our container runs with read_only: true and only mounts specific volumes (workspace, config, logs, credentials). The container IS the filesystem boundary. The agent can’t write to /etc/passwd via exec because the root filesystem is read-only. The workspaceOnly flag is belt-on-suspenders at that point. The real containment is Docker, not per-tool path restrictions.
To be direct about it: exec is the universal escape hatch. Any tool restriction (write, read, browser) can be bypassed if exec is allowed, because exec runs arbitrary shell commands. The GitHub skill teaches the agent to use gh through exec. The web tools use curl through exec. Everything flows through exec. The only real wall is the container. If you’re not running in a container, exec can do anything your user account can do.
If you’re running OpenClaw directly on the host (not in Docker), the exec loophole matters a lot more. That’s one of the reasons Part 2.5 insisted on containerization.
MCP Integrations
Model Context Protocol servers extend OpenClaw with external tool access. Over 1,000 community MCP servers exist covering Google Drive, Slack, databases, and more. As of 2026, over 65% of active skills are wrappers around MCP servers.
MCP governance moved to the Agentic AI Foundation (AAIF) under the Linux Foundation in late 2025, co-founded by Anthropic, Block, and OpenAI. This is relevant because MCP is becoming the standard protocol for how agents connect to external services. Building on MCP today means your integrations are more likely to survive the next platform shift.
Tools We Don’t Use (and Why)
| Tool | Why Not |
|---|---|
browser (Extension Relay) | Grants access to authenticated sessions. Too much blast radius for our current needs. |
canvas | UI rendering. Not needed for our workflow. |
nodes | Camera, screen record, location, notifications. Hardware access we don’t need. |
cron | Autonomous scheduling. We prefer human-triggered workflows for now. |
discord/gateway | Direct platform access. We route through Telegram only. |
The principle from Part 2 holds: start with everything off, enable as you earn confidence. We’ve been running for three weeks and haven’t needed to turn on anything beyond the defaults.
3. Core Workflows
The Pipeline: Linear to Telegram to Claw to PR
The flow:
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Linear │────>│ Telegram │────>│ OpenClaw │────>│ GitHub │
│ ticket │ │ notify │ │ picks │ │ PR │
│ created │ │ to Claw │ │ up work │ │ created │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
│
Human reviews
│
Merge to main
Linear is the source of truth for what needs doing. When a ticket moves to “Ready for Agent” (a custom status), a webhook fires. The webhook sends a formatted message to our Telegram bot. OpenClaw picks it up from Telegram, reads the ticket details, creates a branch, does the work, and opens a PR.
The human reviews the PR in GitHub. That’s the checkpoint. Neither agent merges without a human approval.
What we actually configured:
The Linear webhook was the part nobody documents well. Linear’s outgoing webhooks send JSON to a URL. We point it at a small relay that reformats the payload and forwards to Telegram’s bot API. The relay is a 40-line Node script running on the same hardened box.
The non-obvious config: Linear webhook payloads don’t include the full ticket description by default. You get the ticket ID and a few fields. The relay needs to call Linear’s API to fetch the full description before forwarding to Telegram. This means the relay needs a Linear API key, which means another secret to manage (see Section 5).
Branch naming convention:
openclaw/linear-PROJ-123-brief-description
claude/fix-layout-bug
human/manual-changes
Prefix tells you at a glance which actor created the branch. OpenClaw is our OpenClaw agent. Claude Code uses claude/. I use human/ for the rare manual branch. This matters more than you’d think when you have two agents creating PRs.
Co-authorship on every commit. Both agents include co-author trailers. OpenClaw adds both itself and me. Claude Code adds the standard Co-Authored-By: Claude trailer. Git blame stays honest. If something breaks, you know who (or what) wrote it.
What Breaks
Token context limits. When OpenClaw picks up a complex ticket, it sometimes runs out of context window before finishing the work. The PR ends up half-done. Our workaround: break large tickets into smaller ones in Linear. If a ticket can’t be described in a paragraph, it’s probably too big for a single agent session.
Stale branches. Both agents create branches but don’t always clean them up. We run a weekly script to delete branches that have been merged or abandoned.
Webhook reliability. Linear’s webhooks occasionally fail silently. We haven’t solved this cleanly. Current workaround: a daily “check for stuck tickets” sweep where OpenClaw queries Linear directly for tickets in “Ready for Agent” that don’t have corresponding branches.
Conflicting work. Claude Code and OpenClaw sometimes both start working on related files. Git handles the merge conflicts, but the wasted effort is annoying. The branch naming convention helps us spot this early, but the real fix is better ticket assignment discipline.

4. Collaborating Securely
The Three-Body Problem: Human + Claude Code + OpenClaw
Each actor has different trust levels and needs different access:
| Actor | Location | Trust Level | Access Pattern |
|---|---|---|---|
| Human (me) | Laptop | Full | Full repo access, merge rights, admin |
| Claude Code | Laptop (local) | High | Runs under my supervision, uses my gh auth |
| OpenClaw (OpenClaw) | Remote server | Medium | Autonomous, needs own credentials, constrained |
Claude Code runs locally, in my terminal, under my direct observation. It uses my GitHub session via gh. This is fine for the same reason it’s fine to give your laptop access to your repos. You’re watching.
OpenClaw runs unsupervised on a remote box. It has its own GitHub account, its own fine-grained token, and shows up as a distinct bot in commits and PRs. This is important: when you look at git blame or PR history, you can tell what the bot wrote vs what the human wrote. If it’s compromised, the blast radius is contained to what its token can do.
GitHub Branch Protections
The goal: neither agent can push directly to main. All changes go through PRs with human approval.
Settings for main branch:
Require pull request before merging: YES
Required approving reviews: 1
Dismiss stale reviews on new commits: YES
Require review from CODEOWNERS: YES
Require status checks to pass: YES
Require branches to be up to date: YES
Do not allow bypassing: YES
Allow force pushes: NO
Allow deletions: NO
This means both Claude Code and OpenClaw must create PRs. Only I can approve and merge.
CODEOWNERS (.github/CODEOWNERS):
# Human reviews everything
* @ryanlabouve
Simple. Every file in the repo requires my approval. You can get more granular (different owners for different paths), but for a solo developer with two agents, one owner for everything is the right call.
Automating Branch Protections
Clicking through the GitHub UI for every new repo gets old fast. The good news: this is fully scriptable.
The script we use (repo-level rulesets via gh api):
#!/usr/bin/env bash
set -euo pipefail
# Usage: ./protect-repo.sh owner/repo [branch]
REPO="${1:?Usage: protect-repo.sh owner/repo [branch]}"
BRANCH="${2:-main}"
ADMIN_USER="ryanlabouve"
echo "Applying branch protection to ${REPO} (${BRANCH})..."
ADMIN_ID=$(gh api "/users/${ADMIN_USER}" --jq '.id')
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
"/repos/${REPO}/rulesets" \
--input - <<EOF
{
"name": "Protect ${BRANCH}",
"target": "branch",
"enforcement": "active",
"conditions": {
"ref_name": {
"include": ["refs/heads/${BRANCH}"],
"exclude": []
}
},
"bypass_actors": [
{
"actor_id": ${ADMIN_ID},
"actor_type": "User",
"bypass_mode": "pull_request"
}
],
"rules": [
{
"type": "pull_request",
"parameters": {
"dismiss_stale_reviews_on_push": true,
"require_code_owner_review": false,
"required_approving_review_count": 1,
"required_review_thread_resolution": false
}
},
{
"type": "non_fast_forward"
},
{
"type": "deletion"
},
{
"type": "required_signatures"
}
]
}
EOF
echo "Done. Verify: gh api /repos/${REPO}/rulesets --jq '.[].name'"
Run it once per repo: ./protect-repo.sh ryanlabouve/my-project. The bypass_mode: "pull_request" means I can merge PRs but still can’t push directly to main. OpenClaw has no bypass entry, so it’s fully blocked from main. The required_signatures rule rejects unsigned commits at push time, turning commit signing from detection into enforcement.
Batch version for all repos:
#!/usr/bin/env bash
OWNER="${1:?Usage: protect-all-repos.sh owner}"
for repo in $(gh repo list "$OWNER" --json nameWithOwner \
--jq '.[].nameWithOwner' --limit 200); do
echo "Protecting: $repo"
./protect-repo.sh "$repo" main
done
Why rulesets instead of classic branch protection? The newer Rulesets API works on Free plans at the repo level. Classic branch protection on private repos requires GitHub Pro. If you’re paying for Pro anyway, either API works. If you’re not, rulesets are the only scriptable option for private repos.
The “set and forget” options:
| Approach | Cost | Auto-applies to new repos? | Complexity |
|---|---|---|---|
| Script per repo (above) | Free | No, run manually | Low |
| Org-level rulesets | $4/user/mo (GitHub Team) | Yes, pattern-matched | Low |
| GitHub Actions cron | Free | Yes, every few hours | Medium |
| Terraform/Pulumi | Free | No, but detects drift | Medium-High |
Org-level rulesets are the cleanest long-term answer. Create a GitHub org, upgrade to Team ($4/user/mo), and define one ruleset that targets all repos. New repos inherit the rules automatically. The gh api call is the same shape but hits /orgs/YOUR_ORG/rulesets and adds a repository_name condition:
gh api --method POST \
-H "Accept: application/vnd.github+json" \
/orgs/YOUR_ORG/rulesets \
--input - <<'EOF'
{
"name": "Protect main (all repos)",
"target": "branch",
"enforcement": "active",
"conditions": {
"ref_name": { "include": ["refs/heads/main"], "exclude": [] },
"repository_name": { "include": ["*"], "exclude": [] }
},
"bypass_actors": [
{ "actor_id": 1234, "actor_type": "User", "bypass_mode": "pull_request" }
],
"rules": [
{ "type": "pull_request", "parameters": {
"dismiss_stale_reviews_on_push": true,
"required_approving_review_count": 1
}},
{ "type": "non_fast_forward" },
{ "type": "deletion" }
]
}
EOF
Free Tier Reality
This matters because not everyone wants to pay for GitHub Pro.
| Feature | Free (Public) | Free (Private) | Pro ($4/mo, Private) |
|---|---|---|---|
| Require PR before merge | Yes | Yes | Yes |
| Required reviews | Yes | Yes | Yes |
| Required status checks | Yes | Yes | Yes |
| Dismiss stale reviews | Yes | Yes | Yes |
| CODEOWNERS enforcement | Yes | Limited | Yes |
| Repo-level Rulesets | Yes | Yes | Yes |
| Org-level Rulesets | N/A | N/A | Yes (Team) |
| Restrict push access | Yes | Limited | Yes |
If your repo is public, GitHub Free gives you everything. If it’s private, repo-level rulesets and basic protections (required PRs, reviews, status checks) work on Free. Org-level rulesets (auto-apply to new repos) need the Team plan. CODEOWNERS enforcement on private repos needs Pro at $4/month. For our setup, Pro is worth it. The CODEOWNERS enforcement alone justifies the cost when you have autonomous agents creating PRs.
Scoped Tokens
Fine-grained Personal Access Tokens are the right choice for OpenClaw. Classic PATs give repo-level access (all repos, read and write). Fine-grained PATs scope to specific repos with specific permissions.
OpenClaw’s token:
Token name: openclaw-agent
Expiration: 90 days
Repository access: Only ryanlabouve.com
Permissions:
Contents: Read and Write (push to branches)
Pull requests: Read and Write (create PRs)
Metadata: Read (mandatory)
Issues: Read and Write (optional)
Everything else: No access
Important limitation: Fine-grained PATs cannot restrict which branches the token pushes to. That’s what branch protection rules are for. The token says “you can write to this repo.” Branch protection says “but not to main.” Layered security.
The residual risk: If the agent is compromised, it can push malicious code to any non-protected branch (like openclaw/backdoor) and open a PR. Branch protection stops it from landing on main, but the PR is the social engineering vector. This is why human review is the critical control, not just branch protection. Every PR from an agent should be reviewed as if the agent might be compromised, because one day it might be.
Claude Code uses my standard gh auth since it runs under my direct supervision. If you want stricter isolation, you could give Claude Code its own fine-grained PAT with the same scope as OpenClaw’s.
For even better isolation: GitHub Apps. A GitHub App gives OpenClaw its own identity. Commits show as openclaw-bot[bot] instead of your name. Tokens are short-lived (1 hour). Rate limits are separate from your personal account. The tradeoff is more complex setup (register the app, manage a private key, implement token generation). Worth it for production setups. Overkill if you’re just experimenting.
Commit Signing
Scoped tokens control what the bot can do. Signed commits prove the bot actually did it. Every commit from OpenClaw should carry a “Verified” badge on GitHub. If it doesn’t, something is wrong.
SSH signing is the right choice for a bot in Docker. GPG requires a keyring database, gpg-agent, pinentry-mode hacks, and writable socket directories. SSH signing needs one key file and four git config lines. No daemon. No agent.
Step 1: Generate a signing key (on the host).
mkdir -p ./secrets
ssh-keygen -t ed25519 -C "openclaw-bot@yourdomain.com" \
-f ./secrets/openclaw-signing-key -N ""
chmod 600 ./secrets/openclaw-signing-key
Step 2: Upload to GitHub as a signing key.
The distinction matters. GitHub tracks authentication keys and signing keys separately. The key must be added as a signing key or commits won’t verify.
# Upload via API (note: /user/ssh_signing_keys, not /user/keys)
gh api --method POST /user/ssh_signing_keys \
-f "key=$(cat ./secrets/openclaw-signing-key.pub)" \
-f "title=OpenClaw Signing Key"
Then enable vigilant mode in the bot’s GitHub settings (Settings > SSH and GPG keys > “Flag unsigned commits as unverified”). Without this, unsigned commits show no badge at all. With it, unsigned commits show an “Unverified” warning. If someone pushes a commit spoofing the bot’s email, it’s immediately visible.
Step 3: Bake git config into the Docker image.
Since our container runs read_only: true, the gitconfig needs to exist at build time:
# Add to your OpenClaw Dockerfile
RUN cat > /home/openclaw/.gitconfig <<'GITCONF'
[user]
name = OpenClaw Bot
email = openclaw-bot@yourdomain.com
signingkey = /home/openclaw/.ssh/signing_key
[gpg]
format = ssh
[commit]
gpgsign = true
[tag]
gpgsign = true
GITCONF
Step 4: Mount the key in docker-compose.
volumes:
- ./secrets/openclaw-signing-key:/home/openclaw/.ssh/signing_key:ro
That’s it. Every git commit from the container is now signed. No gpg-agent, no pinentry, no writable .gnupg directory.
Gotchas we hit:
- File permissions. SSH refuses to use a private key that isn’t
600or400. Docker bind mounts inherit host permissions.chmod 600on the host before mounting. - Key path must be absolute.
user.signingkeytakes the full path inside the container, not relative. - Email must match. The
user.emailin gitconfig must match an email on the bot’s GitHub account. If they don’t match, the commit shows “Unverified” even though the signature is valid. - No ssh-agent needed. SSH signing reads the key file directly. No
ssh-agent, nossh-add. This is the main reason it’s simpler than GPG in containers.
Signing can be detection or prevention. Vigilant mode and the “Verified” badge are visual indicators (detection). But you can also add required_signatures to your branch protection ruleset, which rejects unsigned commits at push time (prevention). Our protect-repo.sh script includes this rule. With it enabled, a compromised token can’t push unsigned code. The attacker would also need the signing key, which is a separate secret stored separately.
What about co-authored commits? Signing applies to the committer, not co-authors. When OpenClaw signs a commit with Co-Authored-By: Ryan LaBouve, the “Verified” badge is based on OpenClaw’s signature. Co-authors get credited on the commit page (avatars appear) but aren’t cryptographically attested. This is fine. The signature proves the bot made the commit. The co-author trailer is an attribution convention, not a cryptographic claim.
Verify it works:
# Inside the container
git log --show-signature -1
# Should show: Good "git" signature for openclaw-bot@yourdomain.com
On GitHub, the commit shows a green “Verified” badge. Click it and you see the SSH key fingerprint, the bot’s account, and the signing method.
PR Review Flow: The Greptile Loop
The old flow was: agent opens PR, human reviews, agent fixes, human re-reviews. That puts me in every iteration. The new flow puts Greptile in the first-pass seat. The agent and Greptile go back and forth until the PR stabilizes. I only get tagged when it’s actually ready.
1. Agent creates feature branch (openclaw/*, claude/*)
2. Agent pushes co-authored commits
3. Agent opens PR against main
4. CI runs (GitHub Actions: build, test, lint)
5. Greptile auto-reviews (inline comments + summary + status check)
6. Agent reads comments, pushes fixes
7. Greptile re-reviews (triggerOnUpdates)
8. Loop repeats until Greptile passes
9. Human gets tagged
10. Human reviews (final pass)
11. Human merges
What Greptile does: It’s a GitHub App that auto-reviews PRs with full codebase context. Not just the diff. It follows function calls, checks git history, finds related patterns across the repo. Leaves inline comments on specific lines with suggested fixes, a summary comment, and a confidence score (1-5). Optionally sets a GitHub status check that can block merging.
Setup: Install the Greptile GitHub App, authorize it on your repos, and add a greptile.json to your repo root:
{
"triggerOnUpdates": true,
"includeAuthors": ["openclaw-bot"],
"includeBranches": ["openclaw/*", "claude/*"],
"excludeBranches": ["main"],
"strictness": 1,
"statusCheck": true,
"commentTypes": ["logic", "syntax", "style"],
"customContext": {
"rules": [
{
"description": "Use CSS design tokens, never Tailwind utility classes",
"scope": ["src/**/*.astro", "src/**/*.css"]
},
{
"description": "No em dashes in prose content",
"scope": ["src/content/writing/**"]
}
],
"files": [
{
"path": "CLAUDE.md",
"description": "Project conventions and architecture"
}
]
}
}
The key settings: triggerOnUpdates: true means every push to the PR branch triggers a fresh re-review (this is what makes the loop work). includeAuthors scopes reviews to the bot’s PRs. statusCheck: true creates a pass/fail gate. customContext teaches Greptile our project conventions so it enforces them automatically.
The loop mechanism: Greptile ships official skills for Claude Code. The /greploop skill does exactly what we need: triggers a Greptile review, reads the comments, applies fixes, pushes, waits for re-review, and repeats until confidence is 5/5 with zero unresolved comments. Install with:
git clone https://github.com/greptileai/skills.git ~/.claude/skills/greptile
For OpenClaw, the same loop can be wired via the Greptile MCP server or by having the agent read PR comments via gh api and iterate.
Injection surface to know about: The agent reads PR comments and acts on them. On a public repo, anyone can leave a comment between Greptile’s review and the agent’s fix pass. A malicious comment could contain prompt injection that the agent treats as review feedback. Mitigations: filter to only Greptile-authored comments (check the user.login field), keep repos private during active agent work, or use includeAuthors filtering in the Greptile skill to ignore non-Greptile comments.
Cost: Free for open-source repos (MIT/Apache/GPL). $30/active developer/month otherwise, with a 14-day free trial. Since OpenClaw is OSS, the repos it works on may qualify.
We also auto-label PRs by agent:
# .github/workflows/label-agent-prs.yml
name: Label Agent PRs
on:
pull_request:
types: [opened]
jobs:
label:
runs-on: ubuntu-latest
steps:
- uses: actions/github-script@v7
with:
script: |
const ref = context.payload.pull_request.head.ref;
const labels = ['needs-bot-review'];
if (ref.startsWith('openclaw/')) labels.push('agent:openclaw');
if (ref.startsWith('claude/')) labels.push('agent:claude-code');
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.payload.pull_request.number,
labels
});
New PRs start with needs-bot-review. When Greptile passes, the label swaps to ready-for-human. I filter my PR list by that label. If something is still in needs-bot-review, I don’t even open it.
5. Secret Sharing
The Problem
Both agents need secrets. API keys, tokens, credentials. Neither should have access to everything. Secrets shouldn’t live in environment variables (they leak into process listings, crash reports, child processes), dotfiles, or git.
The question is where they live and how they flow to each agent without leaking.
The Free Path: age + sops
age is a simple encryption tool. sops (Secrets OPerationS) encrypts specific values within YAML/JSON files while leaving keys in plaintext. This means you can commit encrypted secrets to git, see the structure in diffs, and decrypt at runtime.
Setup:
# Install
brew install age sops
# Generate a key
age-keygen -o ~/.config/sops/age/keys.txt
# Save the public key: age1xxxxxxxxx...
# Create a config
cat > .sops.yaml <<EOF
creation_rules:
- path_regex: secrets/.*\.yaml$
age: >-
age1xxxxxxxxx...
EOF
# Create/edit an encrypted secrets file
sops secrets/agent.yaml
# Your editor opens. Add secrets as plain YAML.
# On save, values are encrypted, keys stay plaintext.
What the encrypted file looks like in git:
api_key: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]
db_url: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]
Keys visible, values encrypted. Diffs are readable. Code review works.
Using secrets at runtime:
# Decrypt and inject into environment
export API_KEY=$(sops -d --extract '["api_key"]' secrets/agent.yaml)
# Or decrypt the whole file
sops -d secrets/agent.yaml > /tmp/secrets.yaml
# Use it, then clean up
rm /tmp/secrets.yaml
Why this works for us: Zero cost. Zero cloud dependency. Works offline. Secrets live in git (encrypted). The age private key is the only thing to protect. If you lose it, re-encrypt from the plaintext values.
Where it breaks down: No audit trail of who accessed what. No dynamic secret generation. Rotation means re-encrypting and redeploying. Fine for a solo developer with two agents. Not fine for a team of ten.
Our Setup: 1Password
We also use 1Password because we already pay for it. The relevant feature is Service Accounts: machine-to-machine access scoped to specific vaults.
The approach:
- Create a dedicated vault (
agent-secrets) in 1Password - Create a Service Account with access to only that vault
- The agent uses the
opCLI with the service account token
# Read a single secret by reference
op read "op://agent-secrets/DATABASE/password"
# Inject secrets into a template
op inject -i config.tpl -o config.env
# Run a command with secrets injected
op run --env-file=.env.tpl -- ./my-agent.sh
The op://vault/item/field URI scheme is clean. The agent never sees raw secrets in code or config. Audit logs show exactly which secrets were accessed and when. Revocation is instant (delete the service account token).
1Password also ships an MCP server that lets agents interact with vaults via the Model Context Protocol directly. This is the most seamless agent integration available today.
The catch: Service Accounts require 1Password Teams ($3.99/user/month). The Individual plan doesn’t support them. If you’re already paying for 1Password, this is the path of least resistance.
The Power-User Alternative: Self-Hosted Vault
A friend self-hosts HashiCorp Vault and swears by it. The appeal: complete control, no cloud dependency, extremely granular policies, and dynamic secrets (Vault can generate short-lived database credentials on demand).
HashiCorp switched Vault to BSL (Business Source License) in 2023, but self-hosting for your own infrastructure is still free. If the license change bothers you, OpenBao is the community fork under the Linux Foundation that keeps the original open-source license.
Quick Docker deployment:
# Development mode (NOT for production)
docker run --cap-add=IPC_LOCK -d \
--name=vault \
-p 8200:8200 \
-e 'VAULT_DEV_ROOT_TOKEN_ID=myroot' \
hashicorp/vault server -dev
AppRole auth for AI agents:
# Create a policy scoped to agent secrets
vault policy write agent-policy - <<EOF
path "secret/data/agent/*" {
capabilities = ["read", "list"]
}
EOF
# Create the role
vault write auth/approle/role/ai-agent \
token_policies="agent-policy" \
token_ttl=1h \
token_max_ttl=4h
# Agent authenticates with RoleID + SecretID, gets a 1-hour token
vault write auth/approle/login \
role_id="$ROLE_ID" \
secret_id="$SECRET_ID"
The transit secrets engine is the unique killer feature. Vault handles encryption/decryption without the agent ever possessing the key:
vault write transit/encrypt/agent-data-key \
plaintext=$(echo "sensitive data" | base64)
# Returns ciphertext. Agent never sees the key.
The catch: Significant operational overhead. Unsealing after restarts. Backup strategy. HA if you care about uptime. Overkill for a solo developer with a handful of secrets. Worth it if you need dynamic secrets or have compliance requirements.
Comparison
| Approach | Cost | Setup | Audit Trail | Best For |
|---|---|---|---|---|
| age/sops | Free | 5 min | None | Solo dev, fewer than 20 secrets, git-centric |
| pass | Free | 5 min | None | Unix purists, very few secrets |
| Doppler free | Free | 10 min | Yes | Many secrets, multi-environment |
| 1Password + SA | ~$4/mo | 15 min | Yes | Already a 1Password user |
| Vault (self-hosted) | Free (ops cost) | Hours | Yes | Dynamic secrets, compliance |
| Bitwarden (Vaultwarden) | Free | 30 min | Limited | Budget-conscious, self-hosted preference |
Our recommendation: Start with age/sops. It’s free, it works, and it gets secrets out of plaintext today. If you already pay for 1Password, use Service Accounts. Graduate to Vault only if you need dynamic secrets or your team grows past a few people.

6. Failure Modes We Actually Hit
This is the section that saves you time. Every one of these cost us at least an hour.
Telegram Group ID Confusion
Telegram has two ID formats: regular chat IDs and “supergroup” IDs prefixed with -100. If you create a group for your bot, the ID changes when Telegram auto-migrates it to a supergroup. Your webhook relay breaks silently. The bot stops receiving messages and there’s no error.
The fix: always use the -100-prefixed ID. Query getUpdates after any group change to confirm the current chat ID.
IPv6 Media Fetch Failures
When OpenClaw tries to fetch media (images from URLs, file downloads), Node.js sometimes resolves to an IPv6 address on dual-stack hosts. If your network or Docker bridge doesn’t route IPv6 properly, fetches hang and then timeout.
The fix: set autoSelectFamily: true in Node’s HTTP agent options, or force IPv4 with --dns-result-order=ipv4first. In Docker, you can also disable IPv6 on the bridge network entirely.
Fine-Grained PAT Pain
GitHub’s fine-grained PATs are better than classic tokens, but the permission model has sharp edges:
- Metadata: Read is mandatory but not obvious from the UI. Without it, the token fails with opaque errors.
- Permissions interact with branch protection in non-obvious ways. A token with
Contents: Writecan push to branches but branch protection still blocks main. This is correct behavior but confusing when debugging “why can’t the agent push?” - Expiration is mandatory. You will forget to rotate. Calendar the rotation date when you create the token.
- Organization repos require admin approval. If the repo is in an org, the token sits in “pending” until an admin approves it. For a solo dev this means approving your own token request. Easy to miss.
Queueing and Concurrency Surprises
When both Claude Code and OpenClaw are active on the same repo, you get race conditions:
- Both agents start work on related files, creating parallel branches. When the first PR merges, the second has conflicts.
- OpenClaw sometimes spawns multiple sessions for a single task if the first session stalls. Now you have two branches doing the same work.
- Telegram message ordering isn’t guaranteed. If you send two quick messages to the bot, the agent might process them out of order.
No clean fix for any of these yet. The branch naming convention helps detection. For critical work, we pause one agent while the other finishes.
Model Updates Break Things Silently
When OpenClaw updates its default model or Anthropic ships a new Claude version, the agent can behave differently with the exact same skills and configuration. A skill that worked perfectly with one model version might produce different outputs, miss instructions, or interpret ambiguity differently with the next. We’ve seen this with formatting changes (the agent stopped including co-author trailers until we made the skill instructions more explicit) and with tool use patterns (a model update changed how aggressively the agent used exec vs write).
There’s no clean fix. Pin your model version when you can. When you can’t, test your critical skills after every update. The branch naming convention helps here: if the agent suddenly starts producing weird PRs, you can see it immediately in the branch list.
Webhook Delivery Failures
Linear webhooks occasionally fail silently. No retry. No error notification. The ticket sits in “Ready for Agent” and nothing happens.
Our workaround: a cron job that runs every 30 minutes, queries Linear for tickets in “Ready for Agent” older than 1 hour, and re-sends the notification to Telegram. Belt and suspenders.
7. Observability and Recovery
What We Check First When Something Goes Weird
In order:
- Docker logs.
docker logs --tail 100 openclaw- Is the container even running? Any crash loops? - Telegram. Did the bot receive the message? Did it respond? Check the bot’s message history.
- GitHub. Was a branch created? Was a PR opened? Check the agent’s recent activity.
- Audit logs.
sudo ausearch -k openclaw_exec -ts recent- What commands did the agent actually run? - Network.
docker stats openclaw- Is it consuming unusual bandwidth?
Health Checks
A simple health check script that runs every 5 minutes:
#!/bin/bash
# Is the container running?
if ! docker ps --format '{{.Names}}' | grep -q openclaw; then
tg-alert "OpenClaw container is DOWN"
exit 1
fi
# Is the gateway responding?
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:18789/health)
if [ "$HTTP_CODE" != "200" ]; then
tg-alert "OpenClaw gateway not responding (HTTP $HTTP_CODE)"
fi
# Is memory usage reasonable?
MEM_USAGE=$(docker stats --no-stream --format "{{.MemPerc}}" openclaw | tr -d '%')
if (( $(echo "$MEM_USAGE > 90" | bc -l) )); then
tg-alert "OpenClaw memory usage at ${MEM_USAGE}%"
fi
Dead Letter Handling
When the pipeline fails (webhook doesn’t fire, bot doesn’t respond, PR creation fails), the work item just… sits there. No retry, no notification.
Our approach: every failed pipeline step should leave a trace. The Linear webhook relay logs every incoming webhook and every outbound Telegram message. If a message to Telegram fails, it writes to a dead-letter file:
# In the relay script
if ! send_to_telegram "$payload"; then
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $payload" >> /var/log/openclaw/dead-letters.log
tg-alert "Dead letter: failed to forward Linear ticket"
fi
The daily sweep script processes the dead-letter file and retries.
Rollback
When an agent PR breaks something:
- Don’t merge the PR (obvious, but worth stating)
- If already merged:
git revert <merge-commit>creates a clean revert commit - If the agent created infrastructure changes (rare in our setup): check the audit log for what changed and manually reverse
We keep the revert simple. No force-pushing main. No history rewriting. A revert commit is honest and traceable.
8. Governance and Change Control
What Requires Human Approval
Non-negotiable, regardless of how mature the setup gets:
- Any new skill installation
- Any tool enablement (turning on browser, cron, etc.)
- Any credential change or rotation
- Any PR merge to main
- Any infrastructure change (Docker config, UFW rules, Tailscale ACLs)
- Any message sent on my behalf to external parties
What the Agents Do Autonomously
- Create branches and push code
- Open PRs with descriptions
- Run tests and linting
- Respond to review comments (push fix commits)
- Query Linear for ticket details
- Send Telegram messages to me (not to others)
Rollout Policy
For new skills or config changes:
- Test locally first. Install the skill in a workspace-level
skills/directory, not globally. Test against non-critical tasks. - Dry run. Many skills support
--dry-runor similar. Use it. - Canary. Run the change for a few days on low-priority work before applying to everything.
- Monitor. Watch the Telegram alerts and audit logs for a week after any change.
We don’t have a formal dev/canary/prod pipeline for agent configuration yet. The workspace-level skill override (which shadows the global version) is the closest thing.
When to Dry-Run
- Installing a new skill: always
- Changing TOOLS.md allowlist: always
- Updating SOUL.md boundaries: always (test with a few messages before relying on it)
- Rotating credentials: no dry-run, but verify immediately after rotation
9. Cost, Latency, and Complexity Tradeoffs
What Gave the Best ROI
| Investment | ROI | Notes |
|---|---|---|
| Branch protection + CODEOWNERS | Very high | 15 minutes of setup, permanent safety net |
| protect-repo.sh script | Very high | Write once, run on every new repo. Never click the UI again. |
| age/sops for secrets | High | 5 minutes, zero cost, secrets out of plaintext immediately |
| Branch naming convention | High | Zero cost, huge improvement in PR readability |
| SSH commit signing | High | 10 minutes, cryptographic proof of who wrote what |
| Greptile review loop | High | Agent fixes its own issues before you even look at the PR |
| Telegram alerts for config changes | High | Catches problems within minutes instead of days |
| SkillGuard | Medium | Peace of mind, but we’ve never caught a real threat (because we’re cautious about what we install) |
What Added Overhead Without Proportional Benefit
| Investment | Issue | Lesson |
|---|---|---|
| Fine-grained PATs (vs classic) | Debugging permission issues took hours. The security improvement is real but the DX is rough. | Still worth it, but budget time for debugging. |
| Full Vault setup (attempted) | Spent a weekend configuring it. Unsealing after every reboot was annoying. Switched to age/sops for our scale. | Don’t reach for Vault until you actually need dynamic secrets. |
| Complex webhook relay | Over-engineered the Linear-to-Telegram relay with retry queues and dead-letter handling before we needed it. | Start with a 40-line script. Add complexity when it breaks. |
Where “Clever” Hurt Reliability
- Tried to auto-assign tickets to agents based on labels. The label taxonomy got complex, agents picked up wrong tickets, and we spent more time fixing mis-assignments than we saved.
- Attempted to have agents review each other’s PRs. Sounded smart. In practice, one agent would approve the other’s PR without meaningful review. Rubber-stamp risk. Killed it after two days.
- Experimented with cron-based autonomous tasks (agent checks Linear every hour without being prompted). The agent would sometimes pick up tickets that weren’t ready, or start work during a deploy. Human-triggered is better than autonomous for our scale.

10. What We’d Do Differently From Day One
If we were starting over today:
-
age/sops before anything else. We had secrets in environment variables for the first week. That was wrong. Encrypted secrets in git from minute one.
-
Branch protection before the first agent commit. We set up branch protection after OpenClaw had already pushed directly to main twice. No damage done, but it could have been bad.
-
One Telegram bot, not two. We briefly had separate bots for alerts and conversations. Consolidate. One channel. Less confusion.
-
Smaller tickets, always. Every ticket that was “too big” resulted in a half-done PR or a context window overflow. The rule: if the ticket description is more than a paragraph, break it up.
-
Skip the complex webhook relay. The 40-line Node script works. The retry queue and dead-letter handling we added later? Needed eventually, but not on day one.
-
Don’t bother with GitHub Apps until you have a reason. Fine-grained PATs are good enough for a solo developer. The GitHub App setup is worth it for teams, not for experimentation.
-
Log everything from the start. We turned on auditd and Docker logging in Part 2.5, but didn’t start reviewing the logs consistently until we hit our first weird failure. The logs were there. We just weren’t looking.
-
Commit signing from commit one. SSH signing takes 10 minutes to set up. We added it weeks in. Every unsigned commit from before that is now a gap in the provenance chain.
-
Automate branch protection. We clicked through the GitHub UI three times before writing the script. Write
protect-repo.shfirst, run it on every new repo. The script is 30 lines. The UI clicks take longer. -
Set up Greptile early. The agent-Greptile review loop catches real issues before you see the PR. Every PR you reviewed manually before setting this up was time you didn’t need to spend.
References
Skills and Security
- SkillGuard - Free, open-source skill scanner
- Bitdefender AI Skills Checker - Free pattern-based scanner
- SecureClaw - Open-source security plugin
- Snyk ToxicSkills Study - 36.8% of skills had security flaws
- ClawHavoc Report (Koi Security)
- OpenClaw Skills Docs
- Semgrep Security Cheat Sheet
Tools and Browser
- OpenClaw Browser Docs
- Unbrowse for OpenClaw - API capture alternative to browser automation
- OpenClaw Tools Docs
- DeepWiki: Tools and Skills
- Exec vs Write restriction inconsistency (Issue #9348)
Secret Management
- age encryption - Simple, modern encryption
- sops - Encrypted secrets in git
- 1Password Service Accounts
- HashiCorp Vault - Self-hosted secrets management
- OpenBao - Open-source Vault fork (MPL 2.0)
- Doppler - Cloud secrets management with free tier
GitHub Security
- Fine-grained PATs
- Branch protection rules
- Repository Rulesets
- CODEOWNERS
- SSH commit signing
- Vigilant mode
Code Review
- Greptile - AI code review with full codebase context
- Greptile greptile.json config
- Greptile Custom Context
- Greptile Skills for Claude Code
- Greptile MCP Server