Building AgentHive

The problem isn't that AI agents are bad at tasks. It's that the interfaces we use to work with them were designed for something else.

General-purpose chat gives you one human talking to one model in an ephemeral conversation. Messaging bots drop agents into platforms built for humans — rate-limited, text-only, second-class. Developer tools like LangGraph Studio are debuggers, not products. None of them are built for operating a team of agents across ongoing work.

What's missing is a purpose-built surface. A control plane, not a chat window.

That's what I built AgentHive to be.

What AgentHive Is

The core idea: rooms, not conversations.

A room is a persistent workspace. A "production infrastructure" room accumulates context across months — agents that monitor, alert, diagnose, and remediate, with a human who reviews and approves. A "shipping this feature" room captures decisions, drafts, tool calls, and outputs for the lifetime of a project. The room persists. The agents and humans come and go.

Inside a room, interaction isn't just text. AgentHive has a set of interaction primitives:

Message — text or markdown, the baseline
Action Card — structured card with data and buttons the human can act on
Approval Gate — a blocking checkpoint; the agent pauses until the human approves or rejects
Agent Status — a non-interactive update showing what an agent is currently doing
Hive App — an inline sandboxed application the agent generates (a chart, a form, a dashboard)

The difference from chat: the human makes decisions on structured surfaces, not by parsing paragraphs. The agent doesn't just reply — it presents, waits, continues.

The five interaction primitives

Each one is what the agent posts; each one is what the human acts on.

Blocking checkpoint. Agent pauses. Human approves or rejects. Session resumes.

production-infra · room feed

Permission requested · Bash

Run terraform apply -auto-approve against stg/eu-west-1?

ApproveDenyid: tkqra

What AgentHive is not: it's not an agent framework (not competing with LangChain), not an observability tool (not competing with LangSmith), not a messaging app (not competing with Slack). It's the interface layer between humans and their deployed agent teams.

Phase 1: Building the Platform

The first thing to get right was the approval gate. Everything else depends on it — the gate is what makes agents safe to run autonomously. When an agent wants to do something consequential, it emits a gate. A card appears in the room. The human decides. The agent waits.

Getting the approval primitive right meant thinking carefully about the protocol. I wanted AHP (AgentHive Protocol) to be a first-class deliverable — if the protocol is right, clients can be built for any platform. So the backend speaks WebSocket + AHP, and the web client is just one implementation of that.

Phase 1 stack: FastAPI + Python, LangGraph for agent orchestration, Next.js front-end, Postgres for persistence. In a few days: rooms with persistent feeds, streaming agent responses, action cards, approval gates, hive apps.

Two conductor types shipped in Phase 1:

Anthropic conductor — streaming Claude responses, direct model interaction
LangGraph conductor — a ReAct agent that uses MCP tools, observable step by step

The LangGraph conductor was where the "visible autonomy" promise started to feel real. You could watch the planner decide what to do, see the analyst query data, follow the executor running tools. Each step showed up in the feed. You could see where the agent was and intervene at any point.

Phase 2: Connecting Claude Code

Once the platform existed, I wanted to connect a real agent to it. The most natural choice: Claude Code, because it's what I use all day.

Claude Code isn't an agent you instantiate — it's a process running in your terminal with its own permission system, its own tool loop, its own session state. I needed a bridge.

Claude Code has a channels API: connect an MCP server with the claude/channel capability and it forwards notifications into the session and receives messages back through it. Two components make the bridge:

channel.ts — An MCP stdio server that joins an AgentHive room on startup. It receives messages from the room and forwards them to Claude Code as channel notifications. When CC replies, it calls a reply tool that posts the response back to the feed. Bidirectional.

hook.py — A Claude Code hook that intercepts PermissionRequest events. When CC needs to use a tool, instead of blocking at the terminal it emits an approval gate to AgentHive. The gate card appears in the room feed. The human approves or rejects from wherever they are. The verdict returns to CC and the session continues.

The result: a Claude Code session running in a terminal, fully observable and interactive from a browser. The terminal stays unblocked. You can be on your phone via Tailscale and still approve tool calls.

The most recent addition is the native permission relay from CC v2.1.81. Instead of the hook intercepting and re-emitting, CC sends permission_request notifications directly through the channel. The relay in channel.ts receives them and shows the approval card natively. The hook path still exists as a fallback — if you're running the channel without the hook, a 1.5s timer creates its own gate so nothing hangs. If you have both, native relay takes precedence and deduplicates.

// 2.1.81+: CC ships a notification straight through
// the channel. channel.ts forwards it as an approval card.

server.on(
"notifications/claude/channel/permission_request",
async (msg) => {
  const card = await room.openApprovalGate({
    tool: msg.params.tool_name,
    description: msg.params.description,
    input_preview: msg.params.input_preview,
  });

  await server.notification({
    method: "notifications/claude/channel/permission",
    params: {
      request_id: msg.params.request_id,
      behavior: card.verdict, // "allow" | "deny"
    },
  });
},
);

What It Looks Like

The sidebar shows all rooms grouped by type — CC Channels, AI Rooms, mock sessions. Each room shows a live status indicator: idle, working, or gated. The notification bell shows pending gates across all rooms.

Inside a room, the feed mixes messages and gate cards. Human messages on the right, agent responses on the left, approval cards in the middle. A pending gate is highlighted. When resolved, it shows the verdict and fades into the history. The room header shows the active CC session status in real time — a pulsing dot when CC is working, the current tool when a permission request is in flight.

What's Next

The current setup is personal — running locally on my own projects. A few things I want to validate:

AgentHive is not open source yet. Following along here or on X while I keep shipping.

Related: The Messaging-Native Agent Moment — the broader context for why this exists. Claude Code Channels — the API that powers the CC bridge.