geofrey.ai — Claude Code with a Local Safety Layer

Your AI agent just ran
rm -rf /
— and you weren't asked.

That's life with unguarded AI agents. geofrey.ai wraps Claude Code in a local safety layer, so nothing dangerous executes without your explicit approval.

$0 · safety layer / month
90% · classified locally
L0–L3 · risk tiers
6 · messaging platforms

Way better than OpenClaw — wait, sorry, BoltBot — no wait, what are they called now… Claudebot? Whatever. We just ship.

Why not OpenClaw?

Three rebrandings later and still the same problems: critical CVEs, fire-and-forget approvals, and $200–600/month in API bills. We built something fundamentally different.

Attack vector            | OpenClaw                         | geofrey.ai
Monthly cost             | $200–600                         | $0 orchestrator
Network exposure         | 42,000+ exposed instances        | 0 exposed ports
RCE vulnerabilities      | CVE-2026-25253 (CVSS 8.8)        | No web UI = no attack surface
Command injection        | CVE-2026-25157, CVE-2026-24763   | 4-layer defense + shlex decomposition
Approval mechanism       | Fire-and-forget (Issue #2402)    | Structural blocking (Promise)
Marketplace security     | 7.1% of skills leak credentials  | MCP with allowlist, no marketplace
Prompt injection defense | None                             | 3-layer + MCP sanitization
Secret handling          | Plaintext in local files         | Env-only, Zod-validated, no logging
Image metadata defense   | None                             | EXIF/XMP/IPTC stripping + injection scan
Audit trail              | Basic plaintext logs             | SHA-256 hash-chained JSONL
Data anonymization       | None — all data sent unfiltered  | Privacy rules + PII detection + email anonymization

Four-tier risk classification

Every action is classified before execution. 90% handled instantly by deterministic patterns. No single point of failure.

L0 — Auto-Approve
Execute immediately
Safe read-only operations. Zero latency, zero cost. The agent proceeds without interrupting you.
read_file · git status · ls · cat
L1 — Notify
Execute + inform
Safe write operations. Executed immediately, but you get a notification about what happened.
write_file · git add · git branch
L2 — Require Approval
Block until approved
Dangerous operations. The agent is structurally suspended until you tap Approve or Deny.
delete_file · git commit · npm install · shell_exec
L3 — Block Always
Refuse & log
Destructive commands. Always blocked, always logged. No override, no bypass mode, no exceptions.
rm -rf · sudo · curl | sh · push --force
1. Command decomposition

Shlex-style split on &&, ||, ;, |, \n — each segment classified individually.
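The decomposition step described above can be sketched in a few lines of TypeScript. This is illustrative only; the real implementation presumably handles escaping, subshells, and here-docs more carefully than this quote-aware split:

```typescript
// Splits a shell command on the chaining operators &&, ||, ;, | and
// newlines (but not inside quotes), so each segment can be
// risk-classified on its own.
function decompose(command: string): string[] {
  const segments: string[] = [];
  let current = "";
  let quote: '"' | "'" | null = null;

  for (let i = 0; i < command.length; i++) {
    const ch = command[i];
    if (quote) {
      current += ch;
      if (ch === quote) quote = null; // closing quote
      continue;
    }
    if (ch === '"' || ch === "'") {
      quote = ch;
      current += ch;
      continue;
    }
    // Two-character operators first, then single-character ones.
    if (command.slice(i, i + 2) === "&&" || command.slice(i, i + 2) === "||") {
      segments.push(current.trim());
      current = "";
      i++; // skip the second operator character
      continue;
    }
    if (ch === ";" || ch === "|" || ch === "\n") {
      segments.push(current.trim());
      current = "";
      continue;
    }
    current += ch;
  }
  if (current.trim()) segments.push(current.trim());
  return segments.filter((s) => s.length > 0);
}
```

Note that `echo "a && b"` stays one segment: the `&&` inside quotes is data, not an operator, and must not create a separately classified segment.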

2. Deterministic classifier

Regex patterns block known dangerous commands in <1ms. Handles ~90% of all classifications.
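A deterministic classifier of this kind can be an ordered rule list checked top-down. The patterns below are illustrative examples, not geofrey.ai's actual 53-pattern set:

```typescript
// Risk tiers as described above; "LLM" means no deterministic match,
// so the segment falls through to the local LLM classifier (~10% of cases).
type Tier = "L0" | "L1" | "L2" | "L3";

// Ordered: L3 (block) patterns are checked before the permissive ones,
// so "sudo ls" is blocked rather than auto-approved.
const RULES: Array<{ tier: Tier; pattern: RegExp }> = [
  { tier: "L3", pattern: /\brm\s+-[a-z]*r[a-z]*f\b|\bsudo\b|curl[^|]*\|\s*sh\b|push\s+--force/ },
  { tier: "L0", pattern: /^(ls|cat|pwd|git\s+(status|log|diff))\b/ },
  { tier: "L1", pattern: /^git\s+(add|branch)\b/ },
];

function classify(segment: string): Tier | "LLM" {
  for (const { tier, pattern } of RULES) {
    if (pattern.test(segment.trim())) return tier;
  }
  return "LLM"; // ambiguous: defer to the local model
}
```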

3. LLM classifier

Qwen3 8B evaluates ambiguous commands (~10%). XML output format, JSON fallback.

4. Structural approval gate

Promise-based blocking. The agent is suspended — not polling, not timing out. No code path from "pending" to "execute" without the Promise resolving.

Structural blocking, not policy checking

OpenClaw's approval is fire-and-forget — the tool returns before the user approves (Issue #2402). geofrey.ai uses a JavaScript Promise that structurally suspends the agent loop. Not a policy that can be overridden — a property of the execution flow.

// OpenClaw: fire-and-forget (broken)
void (async () => { /* returns in ~16ms */ })();

// geofrey.ai: structural blocking
const { nonce, promise } = createApproval(tool, args);
const approved = await promise; // suspended here
if (!approved) throw new Error('Denied');
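One plausible implementation of createApproval, sketched under assumptions (the resolver registry and the resolveApproval name are hypothetical, not geofrey.ai's real API): the Promise's resolver is parked in a map keyed by nonce, and nothing resolves it until the messaging adapter reports the user's tap.

```typescript
import { randomUUID } from "node:crypto";

// Nonce -> parked resolver. No timeout: the agent stays suspended
// until a human decides.
const pending = new Map<string, (approved: boolean) => void>();

function createApproval(tool: string, args: unknown) {
  const nonce = randomUUID();
  const promise = new Promise<boolean>((resolve) => {
    pending.set(nonce, resolve);
  });
  // In the real system, an approval prompt describing `tool` and `args`
  // would be sent to the messaging platform here, carrying the nonce.
  return { nonce, promise };
}

// Called by the messaging adapter when an Approve/Deny button is tapped.
function resolveApproval(nonce: string, approved: boolean): void {
  const resolve = pending.get(nonce);
  if (!resolve) return; // unknown or already-resolved nonce
  pending.delete(nonce);
  resolve(approved);
}
```

The point is that there is no code path from "pending" to "execute" except through resolveApproval: the awaiting agent loop is suspended by the runtime itself, not by a flag a prompt injection could flip.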

Built for control

Claude Code does the heavy lifting. Your local LLM guards every action. You approve via your messaging app.

</> Claude Code integration

Complex coding tasks delegated to Claude Code CLI with risk-scoped tool profiles. Live streaming to your messaging app.

#_ Multi-platform messaging

Telegram, WhatsApp, Signal, Slack, Discord, WebChat. Approve or deny from the app you already use. No web UI to expose.

MCP ecosystem

10,000+ community tool servers via Model Context Protocol. Every call wrapped by risk classifier. Explicit allowlist.

🔗 Hash-chained audit

Every action logged with SHA-256 hash chain. Tamper-evident: one modified entry breaks the entire chain.
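The chain invariant can be sketched in a few lines; the entry fields here are illustrative, not the actual JSONL schema:

```typescript
import { createHash } from "node:crypto";

interface AuditEntry {
  prevHash: string;  // hash of the previous entry ("genesis" for the first)
  action: string;
  timestamp: string;
  hash: string;      // SHA-256 over prevHash + this entry's fields
}

function hashEntry(prevHash: string, action: string, timestamp: string): string {
  return createHash("sha256").update(`${prevHash}|${action}|${timestamp}`).digest("hex");
}

function append(chain: AuditEntry[], action: string, timestamp: string): void {
  const prevHash = chain.length ? chain[chain.length - 1].hash : "genesis";
  chain.push({ prevHash, action, timestamp, hash: hashEntry(prevHash, action, timestamp) });
}

// Recompute every hash from the start: editing any entry breaks
// verification for that entry and everything after it.
function verify(chain: AuditEntry[]): boolean {
  let prev = "genesis";
  for (const e of chain) {
    if (e.prevHash !== prev || e.hash !== hashEntry(prev, e.action, e.timestamp)) return false;
    prev = e.hash;
  }
  return true;
}
```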

🧠 Hybrid classification

Deterministic regex handles 90% of classifications in <1ms. LLM fallback only for edge cases.

🛡 Prompt injection defense

3-layer isolation: user input, tool output, model response. MCP responses Zod-validated and instruction-filtered.
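A minimal sketch of the instruction-filtering idea for tool output. The patterns are examples only; a real filter would be much broader, and this shows the quarantine approach rather than geofrey.ai's actual rule set:

```typescript
// Signatures of text that tries to address the model rather than the user.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all|any|previous|prior) instructions/i,
  /you are now\b/i,
  /\bsystem prompt\b/i,
  /disregard (the )?(above|earlier)/i,
];

function containsInjection(toolOutput: string): boolean {
  return INJECTION_PATTERNS.some((p) => p.test(toolOutput));
}

function sanitizeToolOutput(toolOutput: string): string {
  // Quarantine rather than silently pass through: the model sees a
  // placeholder, and the raw output stays local for a human to inspect.
  return containsInjection(toolOutput)
    ? "[tool output withheld: possible prompt injection]"
    : toolOutput;
}
```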

🔒 Secret isolation

All credentials from env vars only. No token logging. Sensitive paths (.env, .ssh) are L3-blocked.
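The env-only rule can be sketched without any library. geofrey.ai uses Zod for the real schema validation; the dependency-free shape below just illustrates the idea, and the variable names are examples:

```typescript
type Env = Record<string, string | undefined>;

// Fail fast if a secret is missing, and report only the variable
// name, never the value, so nothing sensitive can end up in logs.
function requireEnv(name: string, env: Env = process.env): string {
  const value = env[name];
  if (!value || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Illustrative config shape; variable names are assumptions.
function loadConfig(env: Env = process.env) {
  return {
    telegramToken: requireEnv("TELEGRAM_BOT_TOKEN", env),
    anthropicKey: requireEnv("ANTHROPIC_API_KEY", env),
  };
}
```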

📁 Filesystem confinement

All file operations pass through confine() — paths outside the project directory are rejected.
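A plausible confine() sketch; the function name mirrors the one described above, but this implementation is an assumption:

```typescript
import { resolve, sep } from "node:path";

// Resolve the candidate against the project root and reject anything
// that escapes it, whether via ../ traversal or an absolute path.
function confine(projectRoot: string, candidate: string): string {
  const root = resolve(projectRoot);
  const target = resolve(root, candidate);
  // The trailing separator matters: without it, "/proj2/x" would
  // wrongly pass a prefix check against root "/proj".
  if (target !== root && !target.startsWith(root + sep)) {
    throw new Error(`Path escapes project directory: ${candidate}`);
  }
  return target;
}
```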

🖼 Image metadata defense

EXIF/XMP/IPTC stripped before images reach the LLM. Metadata scanned for prompt injection patterns.
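For JPEG specifically, stripping amounts to dropping the APP segments that carry metadata. A minimal sketch of the approach; real handling also covers PNG/WebP, embedded thumbnails, and malformed files:

```typescript
// Walk the JPEG marker segments and drop APP1 (0xE1, where EXIF/XMP
// live) and APP13 (0xED, IPTC) before the image is passed to the LLM.
function stripJpegMetadata(jpeg: Buffer): Buffer {
  if (jpeg.length < 2 || jpeg[0] !== 0xff || jpeg[1] !== 0xd8) {
    throw new Error("Not a JPEG (missing SOI marker)");
  }
  const kept: Buffer[] = [jpeg.subarray(0, 2)]; // keep SOI
  let i = 2;
  while (i + 4 <= jpeg.length && jpeg[i] === 0xff) {
    const marker = jpeg[i + 1];
    if (marker === 0xda) {
      // Start-of-scan: the rest is entropy-coded image data, keep it all.
      kept.push(jpeg.subarray(i));
      break;
    }
    const length = jpeg.readUInt16BE(i + 2); // segment length incl. these 2 bytes
    const segment = jpeg.subarray(i, i + 2 + length);
    if (marker !== 0xe1 && marker !== 0xed) kept.push(segment);
    i += 2 + length;
  }
  return Buffer.concat(kept);
}
```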

🌐 Privacy layer

Privacy rules DB with per-entity allow/anonymize/block decisions. Local vision model classifies images (faces → block). Emails anonymized before cloud APIs. Output filter catches leaked credentials.
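The email-anonymization step can be sketched as a stable placeholder substitution, with the mapping kept locally so replies can be de-anonymized. Illustrative only; the real privacy layer is rule-driven and broader:

```typescript
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace each address with a stable placeholder; the same address
// always maps to the same token, so references stay consistent.
function anonymizeEmails(text: string, mapping: Map<string, string>): string {
  return text.replace(EMAIL_RE, (email) => {
    let placeholder = mapping.get(email);
    if (!placeholder) {
      placeholder = `<email_${mapping.size + 1}>`;
      mapping.set(email, placeholder);
    }
    return placeholder;
  });
}

// Restore the originals in a cloud response before showing it locally.
function deanonymize(text: string, mapping: Map<string, string>): string {
  let result = text;
  for (const [email, placeholder] of mapping) {
    result = result.split(placeholder).join(email);
  }
  return result;
}
```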

💰 Cost transparency

Per-request cost display. Budget alerts. Every API call tracked with cloud vs. local token breakdown.

🌍 i18n support

German + English with typed translation keys. Setup wizard, approvals, and errors in your language.

🔧 Auto-tooling

Detects capability gaps and builds standalone programs in Docker-isolated Claude Code. Registers as cron job or background process automatically.

Proactive agent

Morning briefings, calendar reminders, email monitoring. All privacy-filtered through the local orchestrator. Runs on your schedule.

💻 20 local-ops tools

File, directory, text, system, and archive operations handled natively. Zero cloud tokens, instant execution, no API cost.

Stop paying for orchestration

The local LLM handles intent classification, risk assessment, and communication. Cloud APIs only for complex coding tasks.

OpenClaw
$200–600
per month (moderate use)
  • 10K token system prompt resent every API call
  • 4,320+ background API calls/month (monitoring)
  • Every classification = paid cloud API roundtrip
  • Power users report up to $3,600/month
geofrey.ai
$0–30
per month (same workload)
  • Orchestrator runs locally (Qwen3 8B, loaded once)
  • Zero background API calls (event-driven)
  • 90% of classifications handled by regex (<1ms, free)
  • Cloud API only for complex coding tasks

Runs on your machine

One tested default today, configurable via ORCHESTRATOR_MODEL. Fits on an M-series MacBook.

Coming Soon: Power tier (64GB+ RAM)
Qwen3 8B + Qwen3-Coder-Next · $0 / month
Tiered routing: simple code tasks handled locally, complex tasks sent to the Claude API. Saves ~30–40% in API costs.

Got a machine with 64GB+ RAM? Qwen3-Coder-Next is an 80B MoE model with only 3B active parameters, achieving 70.6% on SWE-Bench Verified at near-3B cost (~52GB Q4). Zero API costs for simple coding tasks.

Up and running in 5 minutes

Interactive setup wizard handles prerequisites, credentials, and platform configuration.

1. Clone & install

Clone the repository and install dependencies with pnpm.

2. Pull the model

Download Qwen3 8B via Ollama (~5GB, one-time download).

3. Run setup wizard

pnpm setup — auto-detects prerequisites, validates credentials, configures your messaging platform.

4. Start the agent

pnpm dev for development or pnpm build && pnpm start for production.

terminal
# Clone & install
$ git clone https://github.com/slavko-at-klincov-it/geofrey.ai.git
$ cd geofrey.ai && pnpm install

# Pull orchestrator model
$ ollama pull qwen3:8b
pulling manifest... done
pulling model... 100% ████████████ 5.0GB

# Interactive setup wizard
$ pnpm setup
✔ Node.js 22.4.0 detected
✔ Ollama running (qwen3:8b loaded)
✔ Claude Code CLI authenticated
? Platform: Telegram
✔ Bot token validated
✔ .env generated — run pnpm dev to start

# Start the agent
$ pnpm dev
geofrey.ai v1.0.0 — listening on Telegram
risk classifier: loaded (53 patterns)
audit log: ./data/audit/2026-02-14.jsonl

Take back control

geofrey.ai is MIT-licensed. Claude Code does the work. A local LLM makes sure nothing goes wrong. Read the code, verify the claims, run it on your machine.