OpenClaw Architecture: How One AI Agent Powers All Messaging Channels

2025-01-10

The Real Problem

Once you’ve built an AI Agent, you need to connect it to a conversation channel.

The simplest approach is a single Telegram bot. But once you need Feishu, Discord, and Slack at the same time, and want the Agent to keep consistent state and memory across channels, things get complicated.

Every channel has a different API: Telegram Bot API, Discord webhooks, WhatsApp via Baileys, Feishu via HTTP API. Error handling, reconnection logic, message formats—all different.

Your agent core is written once, but you need a separate adapter for every channel.

OpenClaw’s Solution

OpenClaw architecture:

User Message → Channel Layer → Gateway (WebSocket) → Agent Runtime → Your Agent
                    ↑                                      ↓
                    └──────── Response ←───────────────────┘

The core is a Gateway daemon that:

Manages all channel connections (WhatsApp via Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, etc.)
Exposes a WebSocket API to the agent runtime
Normalizes incoming messages from any channel into a unified message protocol before they reach your Agent

Your Agent only needs to implement the Gateway WebSocket API. It does not need to care whether a message came in through Telegram or Feishu.
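To make the normalization idea concrete, here is a minimal sketch in Python. The NormalizedMessage shape and the function names are assumptions made up for illustration, not OpenClaw's actual schema. Only the input field names follow the public Telegram and Discord message objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedMessage:
    """Hypothetical channel-agnostic message shape (not OpenClaw's real schema)."""
    channel: str
    conversation_id: str
    sender_id: str
    text: str


def from_telegram(update: dict) -> NormalizedMessage:
    # Telegram Bot API updates nest the message under update["message"].
    msg = update["message"]
    return NormalizedMessage(
        channel="telegram",
        conversation_id=str(msg["chat"]["id"]),
        sender_id=str(msg["from"]["id"]),
        text=msg.get("text", ""),
    )


def from_discord(message: dict) -> NormalizedMessage:
    # Discord message objects carry the text in "content".
    return NormalizedMessage(
        channel="discord",
        conversation_id=str(message["channel_id"]),
        sender_id=str(message["author"]["id"]),
        text=message.get("content", ""),
    )


def agent_reply(msg: NormalizedMessage) -> str:
    # The agent sees only the normalized shape, never channel-specific fields.
    return f"echo: {msg.text}"
```

With adapters like these, agent_reply stays unchanged when a new channel is added. Only a new from_* function is needed.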

Core Architecture

Per OpenClaw docs, Gateway structure:

Components

Gateway (daemon): Main process, maintains all provider connections
Clients: macOS app, CLI, web admin UI—control plane clients
Nodes: macOS/iOS/Android/headless nodes exposing device capabilities (canvas, camera, screen record, location)

Connection Flow

sequenceDiagram
    participant Client
    participant Gateway
    Client->>Gateway: req:connect
    Gateway-->>Client: res (ok)
    Note right of Gateway: or res error + close
    Gateway-->>Client: event:presence
    Gateway-->>Client: event:tick
    Client->>Gateway: req:agent
    Gateway-->>Client: res:agent
    Gateway-->>Client: event:agent (streaming)
    Gateway-->>Client: res:agent (final)

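The client opens with a connect request. The Gateway replies ok (or replies with an error and closes the connection), then starts sending presence and tick events. Agent calls go out as req:agent: the Gateway acknowledges with res:agent, streams output as event:agent, and finishes with a final res:agent.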
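The agent part of this flow can be sketched as a small frame consumer. This is an illustration under stated assumptions, not OpenClaw client code: it assumes the first res carrying the request id is the acknowledgement and the second is the final result, and it treats payloads as opaque because their schema is not shown here. The function name collect_agent_run is hypothetical.

```python
def collect_agent_run(frames, request_id):
    """Consume frames for one req:agent call, following the diagram's order.

    Assumption: the first "res" with our id is the acknowledgement and the
    second is the final result; "agent" events in between are streamed output.
    """
    ack = None
    streamed = []
    for frame in frames:
        kind = frame.get("type")
        if kind == "event":
            # presence/tick events are unrelated to this run, so skip them.
            # Agent events are only collected once the request is acknowledged.
            if frame.get("event") == "agent" and ack is not None:
                streamed.append(frame.get("payload"))
        elif kind == "res" and frame.get("id") == request_id:
            if not frame.get("ok", False):
                raise RuntimeError(f"agent request {request_id} failed")
            if ack is None:
                ack = frame
            else:
                return streamed, frame  # second res: final result
    raise ConnectionError("stream ended before the final res:agent frame")
```

In a real client, frames would be an iterator over decoded WebSocket messages. Here any iterable of dicts works, which keeps the ordering logic easy to test.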

Message Protocol

// Request
{
  "type": "req",
  "id": "unique-id",
  "method": "agent",
  "params": {
    "message": "check the weather for me"
  }
}

// Response
{
  "type": "res",
  "id": "unique-id",
  "ok": true,
  "payload": { ... }
}

// Server event
{
  "type": "event",
  "event": "agent",
  "payload": { ... }
}

All frames are JSON over a single WebSocket connection. The agent protocol uses no HTTP polling and no REST calls; it is full-duplex streaming.
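A minimal Python sketch of building and checking these frames is shown below. The helper names make_request and parse_frame are hypothetical, and the required-key sets are inferred only from the three examples above. Treating payload as optional is an assumption.

```python
import json
import uuid

# Required keys per frame type, inferred from the three example frames.
# Assumption: "payload" is optional (e.g. it may be absent on an error response).
REQUIRED_KEYS = {
    "req": {"id", "method", "params"},
    "res": {"id", "ok"},
    "event": {"event"},
}


def make_request(method, params):
    """Serialize a request frame with a fresh unique id."""
    frame = {"type": "req", "id": str(uuid.uuid4()), "method": method, "params": params}
    return json.dumps(frame)


def parse_frame(raw):
    """Decode one JSON text frame and check that the expected keys are present."""
    frame = json.loads(raw)
    if not isinstance(frame, dict):
        raise ValueError("frame must be a JSON object")
    kind = frame.get("type")
    if kind not in REQUIRED_KEYS:
        raise ValueError(f"unknown frame type: {kind!r}")
    missing = REQUIRED_KEYS[kind] - frame.keys()
    if missing:
        raise ValueError(f"{kind} frame is missing {sorted(missing)}")
    return frame
```

Validating at the boundary like this means the rest of a client can switch on frame["type"] without re-checking each field.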

Key Design Decisions

1. One Gateway, One Port

One Gateway instance per host. It listens on a single port (default bind address 127.0.0.1:18789). All clients and nodes connect to this port.

Benefit: simple deployment. No per-channel ports, no NAT configuration.

2. Device-based Pairing

Nodes (phones, desktop clients) connect to Gateway via device pairing, not user accounts. Approval is stored locally in Gateway’s pairing store.

This means five devices can all connect to the same Gateway, but each one needs its own pairing approval.
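As a rough illustration of per-device approval kept in a local store, consider the sketch below. The PairingStore class, its JSON file format, and the device ids are all hypothetical. OpenClaw's real pairing store may look quite different.

```python
import json
from pathlib import Path


class PairingStore:
    """Hypothetical local store of approved device ids (not OpenClaw's format)."""

    def __init__(self, path):
        self._path = Path(path)
        self._approved = set()
        if self._path.exists():
            # Reload approvals persisted by an earlier run.
            self._approved = set(json.loads(self._path.read_text()))

    def approve(self, device_id):
        # Approval is per device, so each new device passes through here once.
        self._approved.add(device_id)
        self._path.write_text(json.dumps(sorted(self._approved)))

    def is_approved(self, device_id):
        return device_id in self._approved
```

Because approval is keyed by device rather than by user account, approving "phone-1" says nothing about "laptop-1", even if both belong to the same person.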

3. Canvas Host

Gateway also serves HTTP:

/__openclaw__/canvas/ — Agent-editable HTML/CSS/JS
/__openclaw__/a2ui/ — A2UI host

This lets Agents generate web pages and serve them to users. Example: data analysis Agent generates charts, serves the page directly.

Comparison with Traditional Bot Frameworks

Aspect	Traditional Bot Frameworks	OpenClaw
Multi-channel	Separate project per channel	Unified integration
State management	Independent per channel	Shared Agent state
Deployment	Each bot deployed separately	One Gateway
Device capabilities	None	Camera/screen/location
Protocol	Channel-specific	Unified WebSocket

Who Should Use This

OpenClaw fits best when:

Personal AI Assistant: Run AI Agent across multiple devices, interact via different channels
Multi-channel Customer Service: One Agent connects to all external channels, unified response logic
Agent needing device capabilities: Requires phone camera, screen recording, location access

If you just need a Telegram Bot, OpenClaw is overkill. But when you need “one AI Agent, multiple entry points”, OpenClaw’s value is clear.

Limitations

Learning curve: Understanding Gateway + Client + Node relationships takes time
Not a visual platform: Configuration via JSON, not drag-and-drop
Production deployment: Need to consider Gateway HA and horizontal scaling
Channel support limited: Not all channels supported; WhatsApp support depends on Baileys library

Conclusion

OpenClaw’s core abstraction: Gateway as the hub.

Your AI Agent only implements the Gateway WebSocket API. Gateway handles all downstream channel integrations. Much cleaner than writing custom adapters for each channel.

That said, it is a relatively new project, judging by its GitHub activity, so evaluate its stability and community support before using it in production.

Repo: github.com/openclaw/openclaw