Building a 5-Role AI Collaborative Operating System with OpenClaw — Complete Technical Breakdown

Author: @gkxspace | Source: X/Twitter | Date: 2026-02-18

I spent a long time transforming OpenClaw from a single assistant into a multi-role collaborative operating system. Not the kind where you "run a few bots and let them chat separately."

5 AI roles. Shared gateway. Running on Discord and Telegram simultaneously. Clear division of labor, routing, memory isolation, and collaboration rules — working together like a real team.

In this article, I'm breaking down the entire build process: every layer of design decisions, specific configurations, and the pitfalls I fell into. All of it, laid out in the open.

If you're playing with OpenClaw, or you're interested in "how to make multiple AIs actually collaborate," this should save you a lot of detours.

The Conclusion First: This Isn't "Multiple Bots" — It's a Multi-Agent OS Under a Single Gateway

When most people hear "5 AI roles," their first thought is: you're running 5 independent bots, right?

Yes, but also no.

My architecture:

1 Gateway process, unified channel ingestion and routing
5 independent Agents: Commander, Strategist, Engineer, Creator, Think Tank
Each Agent has its own independent workspace (persona, rules, memory, sessions — all isolated)
Running Discord + Telegram simultaneously, with bindings for precise message dispatch
Private chats and group chats run on completely different mechanisms

An analogy: this isn't hiring 5 people and throwing them in a room to do whatever. This is building a company — with an org chart, job descriptions, communication protocols, private offices, and meeting rules.

OpenClaw itself is an open-source personal AI assistant framework that supports multiple platforms (Discord, Telegram, WhatsApp, etc.) and multiple models (Claude, GPT, Gemini, etc.) while keeping data fully local. Its multi-agent capability is the core reason I chose it: native support for independent per-agent workspaces, plus bindings-based routing, gives me the foundation to build a real collaborative system on top.

I. Overall Architecture: Single Gateway + Multi-Agent + Multi-Workspace + Multi-Channel

Let me start with the most fundamental architectural decisions.

1) Single Gateway Carries Everything

My current setup has one OpenClaw Gateway process carrying all capabilities — message ingestion, routing, session management, tool calls, memory indexing, state management — all in one gateway.

Why not run a separate service for each role? Three reasons:

Centralized ops: Only one Gateway to maintain, no separate services per role
Unified config: One master config governs global policy; monitoring and debugging are centralized
Collaboration foundation: For roles to collaborate, they need to be in the same runtime for efficient communication

2) 5 Agents in Parallel — Not 5 Loose Bots

My 5 fixed roles:

Role	Responsibilities
Commander (zongzhihui)	Global situational awareness, task decomposition, dispatch, correction, wrap-up
Strategist (junshi)	Strategic analysis, solution evaluation, risk forecasting
Engineer (engineer)	Technical execution, code implementation, system maintenance
Creator (creator)	Content creation, expression optimization, external output
Think Tank (zhiku)	Knowledge review, quality control, compliance checks

Each agent has its own workspace — workspace-engineer, workspace-junshi, etc. Persona files, rule files, memory files, and script assets are all independent and don't pollute each other.
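To make the roster and its isolation property concrete, here is a minimal Python sketch. It is not OpenClaw's config schema. The base directory is hypothetical, and the sketch assumes every workspace follows the workspace-<agent id> pattern seen in the two names above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical base directory; the article does not say where workspaces live.
BASE_DIR = PurePosixPath("/srv/openclaw")


@dataclass(frozen=True)
class Agent:
    agent_id: str
    title: str

    @property
    def workspace(self) -> PurePosixPath:
        # Assumption: every workspace is named "workspace-<agent id>",
        # as in workspace-engineer and workspace-junshi.
        return BASE_DIR / f"workspace-{self.agent_id}"


AGENTS = [
    Agent("zongzhihui", "Commander"),
    Agent("junshi", "Strategist"),
    Agent("engineer", "Engineer"),
    Agent("creator", "Creator"),
    Agent("zhiku", "Think Tank"),
]

# Isolation check: no two agents may share a workspace directory.
workspaces = [agent.workspace for agent in AGENTS]
assert len(set(workspaces)) == len(AGENTS)
```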

3) Multi-Channel Dual-Stack: Discord + Telegram

The same Gateway simultaneously connects Discord and Telegram, with each role bound at the accountId level on both channels.

This isn't "multi-platform redundant deployment" — it's "same brain cluster, different ingress layers." I've configured Discord as the main collaboration arena.

If you want multiple 🦞 to work together in a group, just pick Discord. One platform is enough. I've tried the others, and each fell short for this kind of group collaboration.

II. Routing Layer: Bindings Map "Accounts" to "Roles"

This is the entry logic for the entire system.

I configured explicit binding strategies for both channels: channel + accountId -> agentId.

Specifically:

discord + zongzhihui -> zongzhihui
discord + engineer -> engineer
telegram + creator -> creator
… 10 mappings total (5 roles × 2 channels)

Why do this?

Because the system decides "who should handle this message" at the entry layer, rather than having all agents hear it and compete to respond. If you don't get this right, all the collaboration downstream falls apart.

Think of bindings as the system's "front desk triage." Message comes in, check which channel and which account received it, route directly to the corresponding role. Clean and simple.
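The triage idea can be sketched as a plain lookup table. This Python sketch models the behavior only and is not OpenClaw's binding syntax. It assumes each account id equals the agent id it serves, as in the three mappings listed above, and the route function is a hypothetical name.

```python
from typing import Optional

AGENT_IDS = ["zongzhihui", "junshi", "engineer", "creator", "zhiku"]
CHANNELS = ["discord", "telegram"]

# (channel, account_id) -> agent_id.
# Assumption: each account id matches the agent id it is bound to.
BINDINGS: dict[tuple[str, str], str] = {
    (channel, agent_id): agent_id
    for channel in CHANNELS
    for agent_id in AGENT_IDS
}


def route(channel: str, account_id: str) -> Optional[str]:
    """Return the agent bound to this ingress point, or None if unbound."""
    return BINDINGS.get((channel, account_id))


print(len(BINDINGS))                  # 10 mappings: 5 roles x 2 channels
print(route("discord", "engineer"))   # engineer
print(route("whatsapp", "engineer"))  # None: no binding for this channel
```

Because the decision is a single dictionary lookup at the entry layer, exactly one agent (or none) ever receives a given message, which is what prevents agents from competing to respond.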

III. Session Isolation: How I Achieve "No Cross-Contamination in DMs, No Chaos in Groups"

This is one of the most critical engineering points in my system.

Core config: session.dmScope = per-account-channel-peer

This parameter means: DM context is isolated along three dimensions — "account + channel + peer user."

Why this choice?

The same person reaching the same role via Discord vs. Telegram won't have their contexts mixed
Different users reaching the same role are completely isolated
In multi-agent + multi-account scenarios, the risk of "cross-contamination" drops to near zero

In other words, I didn't just build "multiple roles"; I engineered a context isolation strategy.

Many people build multi-agent systems with clear role separation but terrible context management — User A's DM content leaks into User B's replies, or Discord conversation memory pollutes Telegram context.

per-account-channel-peer is OpenClaw's officially recommended isolation strategy for multi-account scenarios. In my testing, it's the most stable choice.
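One way to picture this scope is as a session key built from the three dimensions. The Python sketch below is illustrative only: the key format and the dm_session_key function are assumptions, and OpenClaw's internal session keys may look different.

```python
def dm_session_key(account_id: str, channel: str, peer_id: str) -> str:
    """Build a DM session key from the three isolation dimensions."""
    parts = (account_id, channel, peer_id)
    if any(not part or ":" in part for part in parts):
        raise ValueError("each part must be non-empty and must not contain ':'")
    return ":".join(parts)


# Each distinct key gets its own conversation history.
histories: dict[str, list[str]] = {}


def remember(account_id: str, channel: str, peer_id: str, text: str) -> None:
    key = dm_session_key(account_id, channel, peer_id)
    histories.setdefault(key, []).append(text)


remember("engineer", "discord", "alice", "deploy question")
remember("engineer", "telegram", "alice", "billing question")
remember("engineer", "discord", "bob", "bug report")

# Three separate sessions: same person on two channels, plus a second user.
assert len(histories) == 3
```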

IV. Group Chat Orchestration: Not "Let AIs Chat Freely" — "Rule-Driven Collaboration"

This is the most interesting part of the whole system, and also where the most pitfalls are.

Core Strategy: Commander Listens Globally + Other Roles Trigger on @mention

My Discord group chat strategy:

Commander: requireMention = false (global listening)

Can see all messages in the group by default
Responsible for global situational awareness, deciding whether collaboration is needed, task decomposition and dispatch

Other 4 roles: requireMention = true (@mention trigger)

Only act when explicitly @mentioned
Reduces noise, prevents talking over each other
Each role has configured mentionPatterns — for example, the Engineer can be triggered by @Engineer or @engineer

What's the essence of this combination?

Commander "sees the whole picture" — like the PM on a team
Specialist roles "trigger on demand" — like domain experts
Group conversation shifts from "free-form scatter" to "controlled relay"

In practice: you raise a question in the group, Commander first determines what type of task it is, then @mentions the appropriate role to handle it. The role handles it, Commander wraps up. The whole process feels like a real team in a meeting.
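The gate described above can be modeled in a few lines of Python. This sketches the behavior and is not OpenClaw's implementation. The GroupRule class and responders function are hypothetical, and only the Engineer pattern comes from the text; the patterns for the other roles are assumed.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GroupRule:
    agent_id: str
    require_mention: bool
    mention_patterns: list[str] = field(default_factory=list)


RULES = [
    # Commander listens to everything in the group.
    GroupRule("zongzhihui", require_mention=False),
    # Matched case-insensitively, so @Engineer and @engineer both trigger.
    GroupRule("engineer", True, [r"@engineer\b"]),
    # Assumed patterns for the remaining roles.
    GroupRule("junshi", True, [r"@junshi\b"]),
    GroupRule("creator", True, [r"@creator\b"]),
    GroupRule("zhiku", True, [r"@zhiku\b"]),
]


def responders(message: str) -> list[str]:
    """Return the ids of agents allowed to act on a group message."""
    allowed = []
    for rule in RULES:
        mentioned = any(
            re.search(pattern, message, flags=re.IGNORECASE)
            for pattern in rule.mention_patterns
        )
        if not rule.require_mention or mentioned:
            allowed.append(rule.agent_id)
    return allowed


print(responders("How should we ship this?"))        # ['zongzhihui']
print(responders("@Engineer please fix the build"))  # ['zongzhihui', 'engineer']
```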

V. Discord vs. Telegram: Why I Made Discord the Main Arena

Strictly speaking, it's not that "only Discord can do collaboration." It's that in my current configuration, Discord is best suited for multi-role public collaborative orchestration.

Specific reasons:

I have 5 accounts running in parallel on Discord, plus a clear @mention-based collaboration mechanism
Role identities are visible, conversation chains are visible, relay process is visible — it looks and feels like a team discussion
The Commander global-listen + other roles mention-gate strategy is more intuitive in group chat scenarios
My Discord groupPolicy is currently set to open, giving more flexibility

On the Telegram side, my strategy leans toward allowlist + mention gate — more controlled, more secure, better suited as a "controlled production channel."

So the summary: Discord is the collaboration stage; Telegram is the controlled production channel.
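The difference between the two group policies can be sketched as a small admission check. This Python sketch is a model of the idea, not OpenClaw's config format. The ChannelPolicy class, the accepts_group function, and the group ids are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelPolicy:
    group_policy: str  # "open" or "allowlist"
    allowed_groups: frozenset = frozenset()


def accepts_group(policy: ChannelPolicy, group_id: str) -> bool:
    """Decide whether messages from this group are processed at all."""
    if policy.group_policy == "open":
        return True
    if policy.group_policy == "allowlist":
        return group_id in policy.allowed_groups
    raise ValueError(f"unknown group policy: {policy.group_policy!r}")


# Hypothetical group ids, for illustration only.
DISCORD = ChannelPolicy("open")
TELEGRAM = ChannelPolicy("allowlist", frozenset({"team-ops"}))

print(accepts_group(DISCORD, "any-server"))     # True
print(accepts_group(TELEGRAM, "team-ops"))      # True
print(accepts_group(TELEGRAM, "random-group"))  # False
```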

VI. Config Layer + Prompt Layer: Dual-Track Governance

This is the biggest difference between my system and "just playing around."

I don't rely only on configuration, and I don't rely only on prompts. I run two tracks in parallel.

Track A: Configuration (Platform-Level Control)

These are hard configurations at the OpenClaw platform layer:

channel policy: groupPolicy, dmPolicy — controls basic strategies for group and private chats
requireMention: who must be explicitly @mentioned to respond by default
bindings: message routing mappings
dmScope: session isolation granularity
agentToAgent ping-pong limit: I set this to 0, directly suppressing meaningless back-and-forth between agents

That last one is critical. If you don't limit agent-to-agent ping-pong, you'll see two AIs in the group exchanging pleasantries, confirming things, looping infinitely. Setting it to 0 tells the system: agents don't auto-ping each other.
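To see why a limit of 0 stops the loop, here is a minimal Python model of a turn counter. The PingPongGuard class is hypothetical and only illustrates the behavior described; it is not how OpenClaw implements the limit.

```python
class PingPongGuard:
    """Caps automatic agent-to-agent replies within one exchange."""

    def __init__(self, max_turns: int) -> None:
        if max_turns < 0:
            raise ValueError("max_turns must be >= 0")
        self.max_turns = max_turns
        self._turns = 0

    def allow_auto_reply(self) -> bool:
        """Return True if one more automatic reply is permitted."""
        if self._turns >= self.max_turns:
            return False
        self._turns += 1
        return True

    def reset(self) -> None:
        """Start a fresh exchange, for example when a human sends a message."""
        self._turns = 0


strict = PingPongGuard(max_turns=0)
print(strict.allow_auto_reply())  # False: agents never auto-reply to each other

loose = PingPongGuard(max_turns=2)
print([loose.allow_auto_reply() for _ in range(3)])  # [True, True, False]
```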

Track B: Rules (Behavior-Level Control)

These are the rule files I've written in each workspace:

SOUL.md: The role's soul file — persona, tone, responsibilities, output quality baseline
AGENTS.md: Operations manual — collaboration checklist, memory read/write standards, lazy-loading strategy
ROLE-COLLAB-RULES.md: Role-specific collaboration boundaries and red lines
TEAM-RULEBOOK.md: Team-wide hard collaboration rules (shared by all roles)
TEAM-DIRECTORY.md: Mapping table of roles to real IDs, preventing wrong @mentions

The effect of layering both tracks: the platform layer sets hard limits first, and the behavior layer adds soft guidance second.

Don't put everything on the model's "self-discipline." Models make mistakes, drift, and forget rules. So you must set hard constraints at the config layer first, then add soft guidance at the prompt layer. Double insurance.

VII. Workspace File System: Each Role's "Private Office"

Each workspace has essentially the same file skeleton. This is important — it means I'm doing standardization, not just piling random files into each role.

Standard File Structure

File	Purpose
SOUL.md	Role soul: persona definition, behavior patterns, quality baseline
AGENTS.md	Operations manual: collaboration process, memory standards, checklists
ROLE-COLLAB-RULES.md	Collaboration boundaries: what this role can and can't do
IDENTITY.md	Identity definition: name, positioning, capability scope, external messaging
USER.md	User profile: preferences, goals, taboos, common terminology
TOOLS.md	Tool list: which tools are allowed, permission boundaries
MEMORY.md	Long-term memory: stable preferences, long-term decisions, reusable experience
GROUP_MEMORY.md	Group chat memory: only retains group-reusable and safe information
HEARTBEAT.md	Heartbeat spec: periodic self-check, failure recovery, state maintenance
memory/YYYY-MM-DD*.md	Daily log: today's task process, context fragments, on-the-spot decisions
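A standardized skeleton is easy to check mechanically. The Python sketch below takes its file list from the table above. The function names and the placeholder file contents are assumptions, and it is a helper script idea, not something OpenClaw ships.

```python
from pathlib import Path

# File names taken from the table above.
REQUIRED_FILES = [
    "SOUL.md",
    "AGENTS.md",
    "ROLE-COLLAB-RULES.md",
    "IDENTITY.md",
    "USER.md",
    "TOOLS.md",
    "MEMORY.md",
    "GROUP_MEMORY.md",
    "HEARTBEAT.md",
]


def missing_files(workspace: Path) -> list[str]:
    """List required files that are absent from a workspace."""
    return [name for name in REQUIRED_FILES if not (workspace / name).is_file()]


def scaffold(workspace: Path) -> None:
    """Create the memory/ directory and any missing files as placeholders."""
    (workspace / "memory").mkdir(parents=True, exist_ok=True)
    for name in REQUIRED_FILES:
        path = workspace / name
        if not path.exists():
            # Placeholder content only; existing files are never overwritten.
            path.write_text(f"# {name}\n", encoding="utf-8")
```

Running missing_files over all five workspaces gives a quick way to confirm that every role still matches the shared skeleton.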

VIII. Memory System: Lazy Loading + Layering + Archiving

Memory management is the most overlooked but most error-prone part of multi-agent systems.

My strategy isn't "remember as much as possible" — I've built explicit layers:

1) Short-term log (daily memory): Records today's task process, context fragments, on-the-spot decisions. Files are named by date, naturally creating a timeline.

2) Long-term memory (MEMORY.md): Distills stable preferences, long-term decisions, reusable experience, hard rules. Not everything goes in; only validated, stable information gets written.

3) Group chat long-term memory (GROUP_MEMORY.md): Only retains group-reusable and safe information. Private chat content never mixes in. This is a privacy red line.

4) Cold archive (archive): Old data gets periodically archived to prevent active context from bloating out of control. This is not deletion; the data is moved to lower-priority storage.

5) Retrieval mechanism (memory_search + memory_get): Semantic recall first, then precise read. Avoid full loading: context windows are limited resources, not to be wasted.
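The cold-archive step can be sketched as a small maintenance script. This Python sketch relies on the memory/YYYY-MM-DD*.md naming used for daily logs. The 14-day retention window, the memory/archive location, and the function name are assumptions for illustration.

```python
import shutil
from datetime import date, timedelta
from pathlib import Path


def archive_old_logs(workspace: Path, today: date, keep_days: int = 14) -> list[str]:
    """Move daily logs older than keep_days into memory/archive."""
    memory_dir = workspace / "memory"
    archive_dir = memory_dir / "archive"
    cutoff = today - timedelta(days=keep_days)
    moved = []
    # glob("*.md") is not recursive, so already archived files are skipped.
    for path in sorted(memory_dir.glob("*.md")):
        try:
            log_date = date.fromisoformat(path.name[:10])
        except ValueError:
            continue  # not a dated daily log; leave it alone
        if log_date < cutoff:
            archive_dir.mkdir(exist_ok=True)
            # Move, never delete: archived logs stay retrievable.
            shutil.move(str(path), str(archive_dir / path.name))
            moved.append(path.name)
    return moved
```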

The core value of this layering:

Private chat quality isn't polluted by group chat history
Group collaboration isn't disrupted by personal private context
Context window is "loaded on demand," not "fully injected"

I treat context budget as a resource management problem. Tokens are finite. Every memory you stuff in is consuming inference space. You have to be precise.
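The "recall first, then read, within a budget" idea can be sketched in Python as follows. The memory_search and memory_get functions here are simplified stand-ins with assumed signatures, not the real tools. Word overlap replaces semantic search, and one word counts as one token, which is a crude estimate.

```python
def memory_search(query: str, store: dict[str, str], top_k: int = 3) -> list[str]:
    """Stand-in for semantic recall: rank entries by words shared with the query."""
    query_words = set(query.lower().split())
    scored = []
    for key, text in store.items():
        overlap = len(query_words & set(text.lower().split()))
        if overlap:
            scored.append((overlap, key))
    # Highest overlap first; ties broken by key for stable ordering.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [key for _, key in scored[:top_k]]


def memory_get(key: str, store: dict[str, str]) -> str:
    """Stand-in for a precise read of one memory entry."""
    return store[key]


def load_context(query: str, store: dict[str, str], token_budget: int) -> list[str]:
    """Load only the recalled entries that fit within the token budget."""
    loaded, used = [], 0
    for key in memory_search(query, store):
        text = memory_get(key, store)
        cost = len(text.split())  # crude estimate: one token per word
        if used + cost > token_budget:
            continue  # skip entries that would overflow the budget
        loaded.append(text)
        used += cost
    return loaded


STORE = {
    "a": "deploy checklist run smoke test before deploy",
    "b": "user prefers short answers",
    "c": "deploy window is friday",
}

print(load_context("when do we deploy", STORE, token_budget=5))
# ['deploy window is friday']
```

With a budget of 5, the longer checklist entry is recalled but skipped because it does not fit, while the unrelated preference entry is never recalled at all.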

IX. DM Mode vs. Group Mode: Same Role, Two Operating Strategies

This is something many people don't think about: the same role should behave differently in DMs vs. group chats.

I've explicitly distinguished two modes in each role's SOUL.md:

DM mode:

Each role acts as a solo expert, handling user problems end-to-end
No collaboration process needed, just give a complete answer
Quality standard: "one person can handle it"

Group mode:

Follow team collaboration protocol for incremental relay
Each role only handles what it's best at
Commander is responsible for connecting and wrapping up

Specifically for each role:

Commander: Silently observes in groups by default, only forcefully intervenes when necessary — avoids talking over others
Engineer: Deliverables must be executable, verifiable, and rollback-able — not just giving a rough idea
Strategist: Conclusions must include assumptions and validation paths — not just gut feelings
Think Tank: Reviews must include issue severity levels + fix plans — not just saying "there's a problem"
Creator: Expression can't sacrifice truthfulness and executability — not just chasing aesthetics

This is where "the same role behaves differently in different contexts" comes from. Not relying on the model to judge on its own — relying on rule files to explicitly tell it.
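One way to make the mode split explicit is to keep each mode under its own heading and select the matching block by chat type. The Python sketch below assumes a SOUL.md layout with "## DM mode" and "## Group mode" headings. That layout, the sample text, and the mode_rules function are hypothetical, not the author's actual files.

```python
# Hypothetical SOUL.md excerpt, for illustration only.
SAMPLE_SOUL = """\
## DM mode
Act as a solo expert.
Give a complete answer.

## Group mode
Follow the team relay protocol.
"""


def mode_rules(soul_text: str, is_group: bool) -> str:
    """Return the rule lines under the heading that matches the chat type."""
    wanted = "group mode" if is_group else "dm mode"
    lines, capture = [], False
    for line in soul_text.splitlines():
        if line.startswith("## "):
            # A new heading starts; capture only if it is the wanted mode.
            capture = line[3:].strip().lower() == wanted
            continue
        if capture and line.strip():
            lines.append(line.strip())
    return "\n".join(lines)


print(mode_rules(SAMPLE_SOUL, is_group=True))
# Follow the team relay protocol.
```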

Final Thoughts

Multi-agent isn't just running multiple bots. It's an entire engineering system — from architecture design, routing strategy, session isolation, collaborative orchestration, memory management, rule governance, to automated checks. Every layer needs careful design.

OpenClaw provides a great foundation, but the engineering gap between "it runs" and "it runs well" is much larger than most people imagine.

If you're doing something similar, I hope this gives you some useful reference. This article is just the beginning — I'll be sharing more specific and detailed content in follow-up posts. 🦞