Watchdog¶
The AgentWeave watchdog is a background process that monitors the collaboration state and triggers agents when new messages or tasks arrive.
Overview¶
The watchdog (agentweave-watch) runs as a detached daemon process, continuously:
- Monitoring for new messages and tasks
- Triggering agents via their configured runners
- Managing retry logic for failed triggers
- Tracking agent session state
Starting and Stopping¶
# Start the watchdog
agentweave start
# Start with custom retry interval (default: 600 seconds = 10 minutes)
agentweave start --retry-after 300
# Stop the watchdog
agentweave stop
The watchdog PID is stored in .agentweave/watchdog.pid and logs to .agentweave/logs/watchdog.log.
How It Works¶
1. Monitoring Loop¶
The watchdog runs a continuous loop with a 5-second poll interval:
while running:
# Check each agent for pending work
for agent in session_agents:
if has_pending_work(agent):
trigger_agent(agent)
sleep(5)
2. Transport-Specific Monitoring¶
LocalTransport:
- Watches .agentweave/messages/pending/ directory
- Scans for new JSON message files
GitTransport:
- Polls the remote orphan branch
- Fetches new messages via git fetch
- Processes files not yet seen locally
HttpTransport: - Polls Hub API for pending messages - Receives real-time updates via SSE when connected - Pushes agent output back to Hub
3. Agent Triggering¶
When work is detected, the watchdog triggers the agent based on its runner configuration:
| Runner | Trigger Mechanism |
|---|---|
native/claude |
Launches claude subprocess with relay prompt |
claude_proxy |
Launches with ANTHROPIC_BASE_URL override |
kimi |
Launches kimi subprocess with context |
manual |
Logs notification (requires human action) |
4. Auto-Ping and Retry Logic¶
If an agent doesn't respond to a message within the retry window (default 10 minutes), the watchdog re-triggers:
This ensures messages aren't lost if an agent crashes or the CLI isn't running.
5. Stale Process Cleanup¶
When starting, the watchdog:
- Checks for existing watchdog processes
- Sends SIGTERM to any stale daemons
- Removes stale PID files
- Starts fresh
This prevents multiple watchdogs from conflicting.
Pilot Mode Handling¶
When an agent is in pilot mode, the watchdog:
- Skips auto-triggering
- Still tracks pending work
- Logs that manual intervention is required
The agent can be triggered manually via the Hub UI or CLI.
Session Routing¶
New Sessions¶
By default, triggered agents start fresh sessions. The watchdog:
- Generates a relay prompt with pending work
- Launches the CLI without
--resume - The agent reads context files fresh
Resuming Sessions¶
For claude_proxy agents with saved session IDs:
- Retrieves session ID from
.agentweave/agents/{agent}-session.json - Launches with
--resume <session_id> - Conversation context is preserved
For jobs with session_mode: resume:
- Uses the last_session_id from the job record
- Falls back to new session if none exists
Output Streaming (HTTP Transport)¶
When using the Hub with HTTP transport:
- Watchdog captures agent stdout/stderr
- Parses JSONL stream from Claude CLI
- Extracts content and thinking blocks
- Pushes to Hub via
POST /api/v1/agents/{name}/output - Dashboard displays real-time output
Process Architecture¶
┌─────────────────┐
│ Watchdog │ (daemon process)
│ (agentweave- │
│ watch) │
└────────┬────────┘
│
┌────┴────┬────────────┬──────────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌───────┐ ┌───────┐ ┌──────────┐ ┌─────────────┐
│ Local │ │ Git │ │ Hub │ │ Agent │
│ Files │ │Remote │ │ API │ │ Processes │
└───────┘ └───────┘ └──────────┘ └─────────────┘
Configuration¶
The watchdog reads configuration from:
.agentweave/session.json— agent list and runner configs.agentweave/transport.json— transport settings- CLI flags (
--retry-after) — retry behavior
Troubleshooting¶
Watchdog not starting¶
Check if .agentweave/ directory exists:
Stale PID file¶
If the watchdog crashed, remove the PID file manually:
Agents not being triggered¶
-
Check watchdog is running:
-
Verify agent runner configuration:
-
Check watchdog logs:
Multiple watchdogs running¶
The watchdog automatically kills stale processes on startup. If issues persist:
Then restart: