Operations · May 20, 2026 · Last updated 2026-05-21 · 15 min read
Persistent Agent Workflows: Monitoring, Approvals, and Recovery

Questions this page answers
- What should be in an AI agent runbook?
- How do I monitor a background AI agent?
- Which tasks need human approval before an agent ships work?
- Why does production agent reliability depend on persistent host state?
Minimum viable ops
The Minimum AI Agent Runbook
| Runbook part | Question it answers | First implementation |
|---|---|---|
| Owner | Who gets interrupted when this agent behaves badly? | One human owner and one backup. |
| Scope | What can the agent touch? | Allowed repos, apps, accounts, folders, and APIs. |
| Heartbeat | Is it alive, stuck, or waiting? | Timestamped status file plus process monitor. |
| Approval | Which actions need a human? | Deployments, billing, credentials, deletes, external messages. |
| Rollback | How do we undo bad work? | Git branch, snapshot, backup, and last-known-good release. |
| Kill switch | How do we stop it now? | Documented command and host-level access. |
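
One way to keep the table above honest is to store the runbook as structured data and check that no part is left blank. The sketch below is only an illustration: the `Runbook` class, its field names, and the `missing_parts` helper are hypothetical and not part of any particular tool.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Runbook:
    """Hypothetical container for the six runbook parts in the table above."""
    owner: str = ""
    backup_owner: str = ""
    scope: list[str] = field(default_factory=list)             # allowed repos, apps, APIs
    heartbeat_path: str = ""                                   # timestamped status file
    approval_actions: list[str] = field(default_factory=list)  # actions that need a human
    rollback: str = ""                                         # e.g. branch or snapshot name
    kill_command: str = ""                                     # documented stop command


def missing_parts(runbook: Runbook) -> list[str]:
    """Return the names of runbook fields that are still empty."""
    return [f.name for f in fields(runbook) if not getattr(runbook, f.name)]


# Placeholder values for illustration only.
draft = Runbook(owner="alice", scope=["repo:example/service"], kill_command="(placeholder)")
print(missing_parts(draft))
# ['backup_owner', 'heartbeat_path', 'approval_actions', 'rollback']
```

A check like this could run before the agent is granted any credentials, so an incomplete runbook blocks the rollout instead of surfacing during an incident.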
Monitor The Boring Signals First
Agent observability does not need to start with a data warehouse. Start with the signals that tell a human whether the agent is alive, useful, expensive, or dangerous.
- Heartbeat age and last completed task.
- Current task, queue age, and blocked reason.
- Model and tool errors by type.
- Token spend, API errors, and retry count.
- Files changed, commands run, and external accounts touched.
- Process restarts, host uptime, disk usage, and network reachability.
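
Heartbeat age, the first signal in the list, can be derived from a single UTC timestamp. The following is a minimal sketch, assuming the agent records an ISO 8601 timestamp ending in `Z`; the function names and the 300-second staleness threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timezone


def heartbeat_age_seconds(last_heartbeat: str, now: datetime) -> float:
    """Seconds elapsed since an ISO 8601 UTC timestamp such as 2026-05-20T09:41:12Z."""
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # also works on Python versions before 3.11.
    beat = datetime.fromisoformat(last_heartbeat.replace("Z", "+00:00"))
    return (now - beat).total_seconds()


def classify(last_heartbeat: str, now: datetime, stale_after: float = 300.0) -> str:
    """Return "alive" or "stale"; the 300-second threshold is an assumed default."""
    age = heartbeat_age_seconds(last_heartbeat, now)
    return "stale" if age > stale_after else "alive"


now = datetime(2026, 5, 20, 9, 50, 0, tzinfo=timezone.utc)
print(heartbeat_age_seconds("2026-05-20T09:41:12Z", now))  # 528.0
print(classify("2026-05-20T09:41:12Z", now))               # stale
```

In practice `now` would come from `datetime.now(timezone.utc)`; passing it in explicitly keeps the check easy to test.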
agent-status.json
{
  "agent": "repo-maintainer",
  "state": "waiting_for_approval",
  "task": "open cleanup PR",
  "last_heartbeat": "2026-05-20T09:41:12Z",
  "files_changed": 8,
  "tests": "passing",
  "approval_required": "merge PR"
}
Set Approval Boundaries Before The Agent Has Power
| Action | Default policy | Reason |
|---|---|---|
| Read repo, run tests, write draft PR | Allow | Low-risk and easy to inspect. |
| Install dependencies | Allow with logging | Can change build behavior or introduce supply-chain risk. |
| Deploy production | Require approval | User-visible and hard to undo casually. |
| Modify billing, auth, or secrets | Require approval | High blast radius. |
| Send external messages | Require approval | Reputation and privacy risk. |
| Delete data or rotate credentials | Block by default | Needs an explicit incident process. |
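
The policy table above can be encoded as a lookup that the agent's tool layer consults before acting. This is a sketch under stated assumptions: the action names are hypothetical labels for the table rows, and treating unknown actions as requiring approval is an added fail-closed default, not something the table specifies.

```python
from enum import Enum


class Policy(Enum):
    ALLOW = "allow"
    ALLOW_WITH_LOGGING = "allow_with_logging"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


# Action names are hypothetical labels for the rows in the table above.
POLICIES = {
    "read_repo": Policy.ALLOW,
    "run_tests": Policy.ALLOW,
    "write_draft_pr": Policy.ALLOW,
    "install_dependencies": Policy.ALLOW_WITH_LOGGING,
    "deploy_production": Policy.REQUIRE_APPROVAL,
    "modify_billing": Policy.REQUIRE_APPROVAL,
    "modify_auth": Policy.REQUIRE_APPROVAL,
    "modify_secrets": Policy.REQUIRE_APPROVAL,
    "send_external_message": Policy.REQUIRE_APPROVAL,
    "delete_data": Policy.BLOCK,
    "rotate_credentials": Policy.BLOCK,
}


def policy_for(action: str) -> Policy:
    """Look up an action; unknown actions need approval (assumed fail-closed default)."""
    return POLICIES.get(action, Policy.REQUIRE_APPROVAL)


print(policy_for("run_tests").value)          # allow
print(policy_for("deploy_production").value)  # require_approval
print(policy_for("rename_database").value)    # require_approval (not in the table)
```

Keeping the mapping in one place means the boundaries can be reviewed and versioned like any other configuration, before the agent has the credentials to cross them.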
The Host Is Part Of The Runbook
If the machine sleeps, loses its browser profile, or reboots without restarting the agent, the runbook is fiction. Production agent hosting means the runtime has to preserve state and report its own health.
- Use a persistent workspace path for repos, logs, and caches.
- Run background jobs under launchd, systemd, or a supervised process manager.
- Write logs to disk before streaming them elsewhere.
- Store credentials in the host keychain or secret manager, not prompts.
- Keep a recovery path that does not depend on the agent being healthy.
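
State only survives a crash or reboot if it reaches disk in a consistent form. As one possible approach, the sketch below writes a status file atomically by writing to a temporary file and renaming it into place. The `write_status` and `read_status` helpers are hypothetical, and the temporary directory is only a stand-in for a real persistent workspace path.

```python
import json
import os
import tempfile
from pathlib import Path


def write_status(path: Path, status: dict) -> None:
    """Write status JSON atomically so a crash never leaves a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(status, tmp, indent=2)
        tmp.flush()
        os.fsync(tmp.fileno())  # push the bytes to disk before the rename
    os.replace(tmp_name, path)  # replaces the old file in a single step


def read_status(path: Path) -> dict:
    """Load the last status that was fully written."""
    return json.loads(path.read_text())


workspace = Path(tempfile.mkdtemp())  # stand-in for a persistent workspace path
status_file = workspace / "agent-status.json"
write_status(status_file, {"agent": "repo-maintainer", "state": "waiting_for_approval"})
print(read_status(status_file)["state"])  # waiting_for_approval
```

A reader that opens the file mid-update sees either the previous status or the new one, which is what a recovery path that does not depend on the agent needs.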
Run Small Incident Drills
- Kill the agent process and verify it restarts or reports stopped.
- Break a test and verify the agent stops before opening a false-success PR.
- Remove network access and verify the runbook records the failure.
- Ask the agent to touch a protected file and verify approval triggers.
- Reboot the host and verify logs, repo state, and task state survive.
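
The first drill can be rehearsed safely against a stand-in process before it is run against a real agent. The sketch below assumes the agent is an ordinary child process; a sleeping Python subprocess plays that role here, and `kill_drill` is a hypothetical helper. It covers only the "reports stopped" half of the drill, since restart behavior depends on the supervisor in use.

```python
import subprocess
import sys


def is_running(proc: subprocess.Popen) -> bool:
    """poll() returns None while the child process is still alive."""
    return proc.poll() is None


def kill_drill() -> dict:
    """Start a stand-in agent, kill it, and record what the monitor observes."""
    # A sleeping Python child stands in for the real agent process.
    agent = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    before = is_running(agent)
    agent.kill()
    agent.wait(timeout=10)  # reap the child so its exit status is recorded
    after = is_running(agent)
    return {"running_before_kill": before, "running_after_kill": after}


result = kill_drill()
print(result)
```

If `running_after_kill` ever comes back `True`, the monitor is reporting a dead agent as alive, which is the failure this drill exists to catch.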
Where Hyperbox Fits
Hyperbox gives the runbook a stable physical place to execute: persistent macOS, SSH, VNC, desktop permissions, logs, and enough isolation that your agents do not need to live on your personal laptop.
Frequently asked questions
What is an AI agent runbook?
It is the operational contract for an agent: what it can do, how it proves work, what it logs, when it asks for approval, how it recovers, and when a human stops it.
What should I monitor first?
Start with heartbeats, task status, model/tool errors, token spend, process restarts, disk usage, queue age, and whether the agent has touched sensitive files or accounts.
Can I run production agents from a laptop?
Use a laptop for experiments. Production background agents need a machine that stays awake, preserves state, exposes logs, and can recover after failures.