Architecture Diagram
The full system architecture from the README shows how all components interact.

Core Components
1. Channels Layer
Purpose: Handle inbound/outbound communication with users across platforms. Key files:
- src/channels/TelegramChannel.ts — Telegram integration via Telegraf
- src/channels/WhatsAppChannel.ts — WhatsApp integration via Baileys
- src/channels/DiscordChannel.ts — Discord integration via discord.js
- src/channels/Gateway.ts — REST API + WebSocket gateway

Inbound message flow:
- User sends message on a channel (e.g., Telegram)
- Channel writes message to short-term memory
- Channel pushes task to ActionQueue with metadata (source, chatId, userId)
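The three steps above can be sketched as follows. This is a minimal sketch with simplified shapes; the real ShortTermMemory and ActionQueue classes live in src/memory/ and carry more metadata than shown here.

```typescript
// Sketch only: simplified stand-ins for the real classes in src/memory/.
interface InboundMessage {
  source: "telegram" | "whatsapp" | "discord" | "gateway";
  chatId: string;
  userId: string;
  text: string;
}

class ShortTermMemory {
  private entries: InboundMessage[] = [];
  write(msg: InboundMessage): void {
    this.entries.push(msg);
  }
  recent(n: number): InboundMessage[] {
    return this.entries.slice(-n);
  }
}

class ActionQueue {
  private tasks: { priority: number; meta: InboundMessage }[] = [];
  push(priority: number, meta: InboundMessage): void {
    this.tasks.push({ priority, meta });
    this.tasks.sort((a, b) => b.priority - a.priority); // higher priority first
  }
  pop(): InboundMessage | undefined {
    return this.tasks.shift()?.meta;
  }
}

// A channel writes the message to short-term memory, then enqueues a task
// carrying the routing metadata (source, chatId, userId).
function handleInbound(
  msg: InboundMessage,
  memory: ShortTermMemory,
  queue: ActionQueue,
): void {
  memory.write(msg);
  queue.push(5, msg); // assumed mid-priority default on the 0-10 scale
}
```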
2. Action Queue
Purpose: Durable priority queue for tasks with retry, TTL, and chaining support. Key file: src/memory/ActionQueue.ts
Features:
- Priority-based execution (0-10, higher = more urgent)
- Automatic retry with exponential backoff
- Task dependencies (dependsOn field)
- TTL for completed/failed actions
- Atomic disk persistence via JSONAdapter
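The queue features above can be illustrated with a small sketch. The dependsOn field comes from the docs; every other field name and the backoff constants are assumptions, not the real ActionQueue schema.

```typescript
// Illustrative task shape; only dependsOn is documented, the rest is assumed.
interface QueuedAction {
  id: string;
  priority: number;    // 0-10, higher = more urgent
  attempts: number;
  maxAttempts: number;
  dependsOn?: string;  // id of a task that must complete first
  completedAt?: number; // set on completion/failure, used with TTL cleanup
}

// Exponential backoff: delay doubles per attempt, capped at maxDelayMs.
function retryDelayMs(attempt: number, baseMs = 1000, maxDelayMs = 60_000): number {
  return Math.min(baseMs * 2 ** attempt, maxDelayMs);
}

// A completed/failed action expires once its TTL has elapsed.
function isExpired(action: QueuedAction, ttlMs: number, now: number): boolean {
  return action.completedAt !== undefined && now - action.completedAt > ttlMs;
}
```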
3. Agent Core (Action Loop)
Purpose: Orchestrate the ReAct reasoning loop (see agent-loop.mdx). Key file: src/core/Agent.ts (lines 67-797)
Responsibilities:
- Pop next action from queue
- Run SimulationEngine (pre-task planning)
- Call DecisionEngine in a loop (reason → tool calls → observations)
- Execute skills via SkillsManager
- Write observations to memory
- Terminate when task is complete
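The responsibilities above boil down to a reason → act → observe loop. This is a minimal sketch, assuming simplified DecisionEngine and SkillsManager interfaces; the real orchestration in src/core/Agent.ts also runs the SimulationEngine and writes observations to memory.

```typescript
// Sketch of the ReAct loop; interfaces are stand-ins, not the real APIs.
type Decision =
  | { kind: "tool_call"; skill: string; args: unknown }
  | { kind: "final"; answer: string };

interface DecisionEngine {
  decide(observations: string[]): Decision;
}

interface SkillsManager {
  execute(skill: string, args: unknown): string;
}

function runActionLoop(
  engine: DecisionEngine,
  skills: SkillsManager,
  maxSteps = 10,
): string {
  const observations: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const decision = engine.decide(observations);          // reason
    if (decision.kind === "final") return decision.answer; // terminate when done
    const result = skills.execute(decision.skill, decision.args); // act
    observations.push(result);                             // observe
  }
  return "max steps reached";
}
```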
4. Decision Engine
Purpose: Assemble prompts and call LLM with retry logic and guardrails (see decision-pipeline.mdx). Key file: src/core/DecisionEngine.ts
Components:
- PromptRouter — Selectively activates 8 modular prompt helpers based on task intent
- DecisionPipeline — Applies guardrails (deduplication, loop detection, safety checks)
- ParserLayer — Normalizes LLM output to structured JSON with 3-tier fallback
- ContextCompactor — Automatically compacts prompt when context overflow occurs
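To make the ParserLayer's 3-tier fallback concrete, here is a sketch of one plausible tiering: strict JSON first, then extracting an embedded JSON object, then wrapping the raw text. The exact tier boundaries are assumptions, not the real implementation.

```typescript
// Hypothetical 3-tier fallback normalizing LLM output to structured JSON.
interface ParsedDecision {
  tier: 1 | 2 | 3;
  value: unknown;
}

function parseLlmOutput(raw: string): ParsedDecision {
  // Tier 1: the whole output is already valid JSON.
  try {
    return { tier: 1, value: JSON.parse(raw) };
  } catch { /* fall through */ }
  // Tier 2: extract the first {...} span and parse that.
  const match = raw.match(/\{[\s\S]*\}/);
  if (match) {
    try {
      return { tier: 2, value: JSON.parse(match[0]) };
    } catch { /* fall through */ }
  }
  // Tier 3: wrap the raw text so downstream code still sees structured output.
  return { tier: 3, value: { answer: raw.trim() } };
}
```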
5. Memory System
Purpose: Multi-tier memory storage with consolidation and semantic search (see memory-system.mdx). Key file: src/memory/MemoryManager.ts
Memory Types:
- Short-term — Recent step observations (last ~20 entries)
- Episodic — LLM-summarized conversation batches
- Long-term — Persistent markdown files (JOURNAL.md, LEARNING.md, USER.md)
- Vector — Semantic embeddings for full-history search
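The four tiers can be summarized as one interface plus a concrete short-term store showing the ~20-entry cap. Method names here are assumptions; the real MemoryManager API is in src/memory/MemoryManager.ts.

```typescript
// Hypothetical view of the four tiers; names are illustrative only.
interface MemoryTiers {
  addObservation(text: string): void;                 // short-term
  consolidateEpisode(summary: string): void;          // episodic
  appendLongTerm(
    file: "JOURNAL.md" | "LEARNING.md" | "USER.md",   // long-term markdown files
    entry: string,
  ): void;
  semanticSearch(query: string, k: number): Promise<string[]>; // vector
}

// Minimal short-term tier: a bounded buffer of recent step observations.
class ShortTermStore {
  private items: string[] = [];
  constructor(private cap = 20) {}
  add(text: string): void {
    this.items.push(text);
    if (this.items.length > this.cap) this.items.shift(); // drop the oldest
  }
  all(): string[] {
    return [...this.items];
  }
}
```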
6. Skills System
Purpose: Extensible tool registry with hot-reloadable plugins (see skills-system.mdx). Key file: src/core/SkillsManager.ts
Features:
- Core skills registered at startup (web_search, browser_navigate, run_command, etc.)
- Dynamic TypeScript/JavaScript plugins loaded from ~/.orcbot/plugins/
- SKILL.md agent skills (agentskills.io format) with progressive disclosure
- Intent-based skill routing
- Admin-only elevated skills
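Intent-based routing and admin-only skills can be sketched together: map a classified intent to a subset of skills, then filter out elevated skills for non-admin tasks. The intent labels and table below are assumptions, not the real routing map.

```typescript
// Hypothetical intent → skills table; skill names come from the docs above.
type Intent = "search" | "browse" | "shell" | "chat";

const ELEVATED = new Set(["run_command", "write_file", "manage_config"]);

const skillsByIntent: Record<Intent, string[]> = {
  search: ["web_search"],
  browse: ["browser_navigate"],
  shell: ["run_command"], // admin-only elevated skill
  chat: [],
};

function routeSkills(intent: Intent, isAdmin: boolean): string[] {
  const skills = skillsByIntent[intent];
  // Non-admin tasks never see elevated skills.
  return isAdmin ? skills : skills.filter((s) => !ELEVATED.has(s));
}
```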
7. Multi-LLM Provider
Purpose: Abstract LLM provider routing with automatic fallback. Key file: src/core/MultiLLM.ts
Supported Providers:
- OpenAI (GPT-4o, o1)
- Google Gemini
- AWS Bedrock
- OpenRouter (200+ models)
- NVIDIA NIM
- Local Ollama
Features:
- Automatic provider selection based on model prefix (e.g., gemini-2.0-flash → Google)
- Fallback chain on rate limits or errors
- Native tool calling for OpenAI and Google
- Token tracking and cost estimation
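Prefix-based selection and the fallback chain can be sketched as follows; the prefix table and default fallback are illustrative, not the real routing logic in MultiLLM.ts.

```typescript
// Hypothetical prefix table; real mappings live in src/core/MultiLLM.ts.
type Provider = "openai" | "google" | "bedrock" | "openrouter" | "nim" | "ollama";

const prefixTable: [string, Provider][] = [
  ["gpt-", "openai"],
  ["o1", "openai"],
  ["gemini-", "google"],
];

function selectProvider(model: string, fallback: Provider = "openrouter"): Provider {
  for (const [prefix, provider] of prefixTable) {
    if (model.startsWith(prefix)) return provider;
  }
  return fallback; // unknown prefixes route to the assumed fallback provider
}

// Fallback chain: try each provider until one succeeds (e.g., after a rate limit).
async function callWithFallback(
  chain: Provider[],
  call: (p: Provider) => Promise<string>,
): Promise<string> {
  let lastError: unknown;
  for (const provider of chain) {
    try {
      return await call(provider);
    } catch (err) {
      lastError = err; // rate limit or provider error: try the next provider
    }
  }
  throw lastError;
}
```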
Data Flow Example
Here’s a complete flow for a user request: a message arrives on a channel, is written to short-term memory, queued in the ActionQueue, processed by the agent loop, and the reply is sent back on the originating channel.

Configuration
All configuration is centralized in ConfigManager (src/config/ConfigManager.ts), which loads from:
- Environment variables
- Local ./orcbot.config.yaml
- Home ~/orcbot.config.yaml
- Global ~/.orcbot/orcbot.config.yaml
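The merge behind this load order can be sketched with a small helper, assuming the list above is ordered highest precedence first (environment variables win over local config, which wins over home and global); that ordering is an assumption, not stated by the docs.

```typescript
// Hypothetical config merge; ConfigManager's actual precedence may differ.
type Config = Record<string, string | undefined>;

// sources are ordered highest precedence first.
function mergeConfig(...sources: Config[]): Config {
  const merged: Config = {};
  // Apply lowest precedence first so higher-precedence sources overwrite.
  for (const source of sources.slice().reverse()) {
    Object.assign(merged, source);
  }
  return merged;
}
```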
Security Architecture
Local-First Design:
- All memory, logs, and profiles stored in ~/.orcbot/
- No hidden uploads or telemetry
- Secrets loaded from config (never hardcoded)
- Admin-only elevated skills (run_command, write_file, manage_config)
- Cross-channel send blocking (non-admin tasks can’t send to other channels)
- Plugin allow/deny lists
- Safe mode to disable dangerous operations
- Non-admin tasks don’t see journal/learning/episodic context (prevents cross-user leakage)
- Session scoping (per-channel-peer by default, configurable to main/per-peer)
- Memory deduplication prevents repeated storage of identical events
Performance Optimizations
Token Reduction:
- Compact skills prompt (names + usage only after step 1)
- Step history compaction (preserve first N + last M, summarize middle)
- Aggressive pruning of large tool outputs in history
- Per-action prompt caching (core instructions cached per action)
- Parallel async retrieval (semantic recall + episodic + RAG run concurrently)
- Context assembly caching (recent/episodic/extended cached for 5-30s)
- Vector memory background indexing (runs every 5 minutes)
- Atomic disk writes with .bak backups (JSONAdapter)
- Action queue persistence (survives crashes)
- LLM retry with exponential backoff
- Automatic context compaction on overflow
- Circuit breaker pattern in browser operations
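The circuit breaker mentioned for browser operations can be sketched as follows: after a run of consecutive failures the breaker opens and short-circuits calls until a cooldown passes. The threshold and cooldown values are assumptions.

```typescript
// Hypothetical circuit breaker; thresholds are illustrative only.
class CircuitBreaker {
  private failures = 0;
  private openedAt?: number;

  constructor(private threshold = 3, private cooldownMs = 30_000) {}

  canCall(now: number): boolean {
    if (this.openedAt === undefined) return true;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: half-open, allow a trial call.
      this.openedAt = undefined;
      this.failures = 0;
      return true;
    }
    return false; // open: short-circuit without touching the browser
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}
```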
Extension Points
Adding a New Channel:
- Implement BaseChannel interface in src/channels/
- Call agent.pushTask() for inbound messages
- Register channel skills (e.g., send_slack, react_slack)
- Add channel detection logic to skills-system
Adding a New Skill:
- Register via agent.skills.registerSkill() (for core skills)
- Or drop a .ts file in ~/.orcbot/plugins/ (for dynamic plugins)
- Skill interface:
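The real skill interface lives in src/core/SkillsManager.ts; as a hypothetical sketch, assuming illustrative field names, a skill and its registration might look like this:

```typescript
// Hypothetical skill shape and registry; not the real SkillsManager API.
interface Skill {
  name: string;
  description: string;
  adminOnly?: boolean;
  execute(args: Record<string, unknown>): Promise<string> | string;
}

class SkillRegistry {
  private skills = new Map<string, Skill>();
  registerSkill(skill: Skill): void {
    this.skills.set(skill.name, skill);
  }
  get(name: string): Skill | undefined {
    return this.skills.get(name);
  }
}

// Example: a hypothetical send_slack skill for a new Slack channel.
const registry = new SkillRegistry();
registry.registerSkill({
  name: "send_slack",
  description: "Send a message to a Slack channel",
  execute: ({ text }) => `sent: ${String(text)}`,
});
```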
Adding a New LLM Provider:
- Add provider logic in MultiLLM.ts
- Implement callWithTools() for native tool calling (optional)
- Add model prefix detection (e.g., claude- → Anthropic)
- Update config schema
Further Reading
- Agent Loop (ReAct) — Step-by-step reasoning process
- Memory System — Multi-tier storage and consolidation
- Decision Pipeline — Guardrails and safety layers
- Skills System — Tool registry and plugin architecture