Documentation Index
Fetch the complete documentation index at: https://docs.orcbot.buzzchat.site/llms.txt
Use this file to discover all available pages before exploring further.
OrcBot’s memory system is the foundation of its contextual awareness. Unlike stateless LLM wrappers, OrcBot maintains persistent, multi-tier memory that enables true continuity across conversations, tasks, and sessions.
Memory Architecture
Memory Types
1. Short-Term Memory
Purpose: Store recent step observations within the current action.
Implementation: MemoryManager.ts (lines 186-210)
Characteristics:
- Lives in-memory and on-disk (`memory.json`)
- Default limit: 20 entries (configurable via `memoryContextLimit`)
- Cleaned up after action completion
- Includes tool observations, user messages, and system injections
Entry format:
```typescript
interface MemoryEntry {
  id: string;              // e.g., "abc123-step-1" for action-scoped
  type: 'short';
  content: string;         // Observation text (max 1500 chars by default)
  timestamp: string;       // ISO 8601
  metadata?: {
    actionId?: string;     // Links memory to action
    step?: number;         // Step number within action
    skill?: string;        // Tool that was executed
    source?: string;       // 'telegram' | 'whatsapp' | etc.
    role?: 'user' | 'assistant' | 'system';
    [key: string]: any;
  };
}
```
Example:
```json
{
  "id": "task-42-step-3",
  "type": "short",
  "content": "Observation: Tool web_search returned 10 results. Top result: 'OpenAI releases GPT-5'...",
  "timestamp": "2025-01-15T14:23:01.234Z",
  "metadata": {
    "actionId": "task-42",
    "step": 3,
    "skill": "web_search"
  }
}
```
Retrieval:
```typescript
// Get recent short-term context
const recent = memory.getRecentContext(20); // Last 20 entries

// Get all memories for a specific action
const actionMemories = memory.getActionMemories(actionId);

// Search by type
const shortMemories = memory.searchMemory('short');
```
2. Episodic Memory
Purpose: LLM-generated summaries of conversation batches for durable thread context.
Implementation: MemoryManager.ts (lines 273-332)
Characteristics:
- Created via automatic consolidation when short-term memory exceeds the threshold (default: 30 entries)
- Grouped by platform + contact (e.g., `telegram:123456789`)
- Batch size: 12 exchanges (configurable via `interactionBatchSize`)
- Summarized as structured JSON: `{summary, facts, pending, tone, preferences, confidence}`
Consolidation trigger:
```typescript
// Automatically runs when short-term memory > consolidationThreshold
const shortCount = memory.searchMemory('short').length;
if (shortCount >= 30) {
  await memory.consolidate(llm);
}

// Manual consolidation at action completion
await memory.consolidateInteractions(llm, 'session_end');
```
Example episodic entry:
```json
{
  "id": "episodic-telegram-123456789-1705330981234",
  "type": "episodic",
  "content": "Interaction summary (telegram/123456789): {\"summary\": \"User asked about AI news. Agent searched and delivered 3 top articles.\", \"facts\": [\"User interested in AI/ML topics\"], \"pending\": [], \"tone\": \"informative\", \"preferences\": {\"format\": \"concise\"}, \"confidence\": 0.9}",
  "timestamp": "2025-01-15T14:29:41.234Z",
  "metadata": {
    "source": "telegram",
    "sourceId": "123456789",
    "reason": "threshold",
    "interactionCount": 12,
    "structured": true,
    "messageTypes": ["text", "photo"],
    "timeRange": {
      "from": "2025-01-15T14:00:00.000Z",
      "to": "2025-01-15T14:29:41.234Z"
    }
  }
}
```
Retrieval:
```typescript
// Get recent episodic summaries
const episodic = memory.searchMemory('episodic').slice(-5);

// Get semantically relevant episodic memories for a task
const relevant = await memory.getRelevantEpisodicMemories(
  "How do I deploy to production?",
  5
);
```
3. Long-Term Memory
Purpose: Persistent markdown files for durable facts, learning, and reflections.
Implementation: File-backed storage in ~/.orcbot/
Files:
| File | Purpose | Updated By |
|---|---|---|
| JOURNAL.md | Agent's self-reflections and activity logs | `update_journal` skill |
| LEARNING.md | Structured knowledge on various topics | `update_learning` skill |
| USER.md | User preferences and profile | `update_user_profile` skill |
| WORLD.md | Environment state and governance | `update_world` skill |
| MEMORY.md | General long-term facts | `memory_write` skill |
Example LEARNING.md entry:
```markdown
# Agent Learning Base

## Topic: Docker Deployment
**Last Updated:** 2025-01-15

**Key Facts:**
- Docker images are built from Dockerfiles
- Use `docker-compose.yml` for multi-container apps
- Volumes persist data between container restarts
- Expose ports with `-p host:container`

**Resources:**
- Official docs: https://docs.docker.com
- Best practices: Use multi-stage builds for smaller images

**When to use:**
- User asks about Docker
- Deployment tasks involving containers
```
Retrieval:
```typescript
// Loaded automatically during prompt assembly
const journalContent = fs.readFileSync(journalPath, 'utf-8');
const learningContent = fs.readFileSync(learningPath, 'utf-8');
const userContext = memory.getUserContext();
```
4. Vector Memory (Semantic Search)
Purpose: Semantic embeddings for full-history similarity search.
Implementation: VectorMemory.ts (lines 8-940)
Characteristics:
- Uses `text-embedding-3-small` (OpenAI) or `text-embedding-004` (Google)
- Stores embeddings in file-backed JSON (`vector_memory.json`)
- Background indexing every 5 minutes
- Cosine similarity search
- Max entries: 10,000 (configurable via `vectorMemoryMaxEntries`)
Initialization:
```typescript
memory.initVectorMemory({
  openaiApiKey: config.get('openaiApiKey'),
  googleApiKey: config.get('googleApiKey'),
  preferredProvider: 'openai',
  maxEntries: 10000
});
```
Indexing:
```typescript
// Automatic: all memories saved via saveMemory() are queued for indexing
memory.saveMemory({ ... });
// → Queued for next background index cycle

// Manual: force immediate indexing
await memory.vectorMemory.processQueue();
```
Retrieval:
```typescript
// Semantic search across all indexed memories
const results = await memory.semanticSearch(
  "How do I configure the database?",
  10,                         // limit
  { source: 'telegram' }      // optional metadata filter
);

// Deep recall (cross-session, excludes already-shown IDs)
const recalled = await memory.semanticRecall(
  "Previous deployment issues",
  8,                          // limit
  new Set(['id1', 'id2'])     // excludeIds
);
```
Response format:
```typescript
interface ScoredVectorEntry {
  id: string;
  content: string;
  score: number;      // Cosine similarity (0-1)
  type: string;       // 'short' | 'episodic' | 'long'
  timestamp?: string;
  metadata?: any;
}
```
5. Daily Memory Logs
Purpose: Append-only markdown logs organized by date.
Implementation: DailyMemory.ts (lines 38-742)
Characteristics:
- Files stored in `~/.orcbot/daily_memory/YYYY-MM-DD.md`
- Categorized entries (System, Research, Communication, Consolidation)
- Automatically appended for important events
- Read into extended context for awareness
Example log:
```markdown
# Daily Memory - 2025-01-15

## 14:23 - Research
Searched for "latest AI news 2025". Found 10 articles. Top: "OpenAI releases GPT-5".

## 14:29 - Consolidation
Consolidated 12 memories from Telegram user 123456789:
- User asked about AI news
- Agent delivered 3 top articles
- User satisfied with response

## 15:10 - System
Memory flush triggered. Response: No important information to store.
```
Retrieval:
```typescript
const dailyContext = memory.getDailyMemory().readRecentContext();
// Returns last 3 days of logs (configurable)
```
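An append to the daily log is a simple timestamped write under a date-named file. A sketch of such a helper (`appendDailyEntry` is a hypothetical name, not the actual `DailyMemory.ts` API):

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Append a categorized entry to today's daily log, creating the file
// with a date header if it does not exist yet. Illustrative helper only.
function appendDailyEntry(baseDir: string, category: string, text: string): string {
  const now = new Date();
  const date = now.toISOString().slice(0, 10);  // YYYY-MM-DD
  const time = now.toISOString().slice(11, 16); // HH:MM
  const file = path.join(baseDir, `${date}.md`);
  if (!fs.existsSync(file)) {
    fs.mkdirSync(baseDir, { recursive: true });
    fs.writeFileSync(file, `# Daily Memory - ${date}\n`);
  }
  fs.appendFileSync(file, `\n## ${time} - ${category}\n${text}\n`);
  return file;
}
```

Because the log is append-only, concurrent writers never rewrite history; at worst two entries interleave out of order within the same minute.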
Memory Lifecycle
1. Creation
```typescript
// User sends message
memory.saveMemory({
  id: `telegram-in-${Date.now()}`,
  type: 'short',
  content: 'User: How do I deploy to production?',
  metadata: {
    source: 'telegram',
    chatId: '123456789',
    role: 'user'
  }
});
// → Saved to memory.json
// → Queued for vector indexing
// → Tracked for consolidation
```
2. Consolidation
```typescript
// When short-term memory > 30 entries
await memory.consolidate(llm);
// 1. Takes oldest 20 short-term memories
// 2. Calls LLM to summarize
// 3. Saves episodic summary
// 4. Deletes those 20 short-term entries
// 5. Also appends to daily log
```
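The consolidation steps can be sketched as a single pass over the memory list (a simplified model; the `summarize` callback stands in for the LLM call, and the names are assumptions, not the actual `MemoryManager` internals):

```typescript
interface Entry { id: string; type: string; content: string; timestamp: string }

// Summarize the oldest short-term entries into one episodic entry,
// then drop the originals from the memory list.
async function consolidateSketch(
  memories: Entry[],
  batchSize: number,
  summarize: (text: string) => Promise<string> // stands in for the LLM call
): Promise<Entry[]> {
  const short = memories.filter((m) => m.type === 'short');
  const batch = short.slice(0, batchSize);          // 1. oldest N short-term
  if (batch.length === 0) return memories;
  const summary = await summarize(                  // 2. LLM summarization
    batch.map((m) => m.content).join('\n')
  );
  const episodic: Entry = {                         // 3. save episodic summary
    id: `episodic-${Date.now()}`,
    type: 'episodic',
    content: summary,
    timestamp: new Date().toISOString(),
  };
  const batchIds = new Set(batch.map((m) => m.id)); // 4. delete consolidated entries
  return [...memories.filter((m) => !batchIds.has(m.id)), episodic];
}
```

Note the asymmetry between the trigger threshold (30) and batch size (20): a buffer of recent short-term entries always survives consolidation, so the prompt never loses its freshest context.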
3. Retrieval
```typescript
// During prompt assembly (parallel)
const context = await memory.assemblePromptContext(taskDescription);
// Returns:
// {
//   recent: MemoryEntry[],         // Last 20 short-term
//   episodic: MemoryEntry[],       // Last 5 episodic or semantically relevant
//   semantic: ScoredVectorEntry[], // Top 8 similar memories
//   extended: string               // Daily logs + long-term files
// }
```
4. Cleanup
```typescript
// After action completion
const removed = memory.cleanupActionMemories(actionId);
logger.info(`Cleaned up ${removed} step memories for action ${actionId}`);
```
Memory Limits & Configuration
```yaml
# orcbot.config.yaml

# Short-term memory
memoryContextLimit: 20           # Max recent entries in prompt
memoryContentMaxLength: 1500     # Max chars per memory entry

# Episodic memory
memoryEpisodicLimit: 5           # Max episodic entries in prompt
interactionBatchSize: 12         # Entries per consolidation batch
interactionStaleMinutes: 10      # Auto-consolidate after N minutes

# Consolidation
memoryConsolidationThreshold: 30 # Trigger at N short-term entries
memoryConsolidationBatch: 20     # Consolidate oldest N entries

# Memory flush (reminder before consolidation)
memoryFlushSoftThreshold: 25     # Trigger at N entries
memoryFlushCooldownMinutes: 30   # Min interval between flushes

# Long-term context
memoryExtendedContextLimit: 2000 # Max chars of long-term files
journalContextLimit: 1500        # Max chars of JOURNAL.md tail
learningContextLimit: 1500       # Max chars of LEARNING.md tail

# Vector memory
vectorMemoryMaxEntries: 10000    # Max embeddings stored

# Thread context
threadContextRecentN: 8          # Recent messages from same contact
threadContextRelevantN: 8        # Semantically relevant messages
threadContextMaxLineLen: 420     # Truncate long lines

# Step history compaction
stepCompactionThreshold: 10      # Compact when > N steps
stepCompactionPreserveFirst: 2   # Keep first N steps
stepCompactionPreserveLast: 5    # Keep last N steps
```
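The step compaction settings translate into keep-first/keep-last trimming of an action's step history. A sketch under the defaults above, using a plain placeholder line for the collapsed middle (the real implementation may summarize it instead):

```typescript
// Compact a step history once it exceeds the threshold: keep the first
// and last few steps verbatim, collapse the middle into one marker line.
function compactSteps(
  steps: string[],
  threshold = 10,
  preserveFirst = 2,
  preserveLast = 5
): string[] {
  if (steps.length <= threshold) return steps;
  const head = steps.slice(0, preserveFirst);
  const tail = steps.slice(-preserveLast);
  const dropped = steps.length - preserveFirst - preserveLast;
  return [...head, `[compacted: ${dropped} earlier steps omitted]`, ...tail];
}
```

With 12 steps and the defaults, this yields 8 lines: the first 2 steps, one marker, and the last 5.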
Memory Deduplication
OrcBot prevents storing duplicate events within a 5-minute window:
```typescript
// Deduplication strategy (MemoryManager.ts:227-249)
function isDuplicateMemory(entry, existingMemories) {
  const cutoff = Date.now() - 5 * 60 * 1000; // 5 minutes
  return existingMemories.some((candidate) => {
    // 1. Check stable event ID (messageId, eventId, statusMessageId)
    const entryId = entry.metadata?.messageId;
    const candidateId = candidate.metadata?.messageId;
    if (entryId && candidateId && entryId === candidateId) return true;

    // 2. Fallback: same source + contact + content within the window
    const sameSource = candidate.metadata?.source === entry.metadata?.source;
    const sameContact = candidate.metadata?.sourceId === entry.metadata?.sourceId;
    const sameContent = candidate.content === entry.content;
    const recentEnough = new Date(candidate.timestamp).getTime() > cutoff;
    return sameSource && sameContact && sameContent && recentEnough;
  });
}
```
Session Scoping
OrcBot supports three session scoping modes (configurable via sessionScope):
1. main (Single Global Session)
All conversations share the same memory pool.
Use case: Single-user deployment, no multi-tenancy.
2. per-peer
Memories are scoped to a user across all channels.

```yaml
sessionScope: per-peer
identityLinks:
  telegram:123456789: user-alice
  whatsapp:+1234567890: user-alice
```

Use case: Same user contacts you on multiple platforms.
3. per-channel-peer (Default)
Memories are scoped to a user on a specific channel.
```yaml
sessionScope: per-channel-peer
```
Use case: Multi-tenant deployment, strict isolation.
Session ID format:
```typescript
// main
resolveSessionScopeId('telegram', { chatId: '123' })
// → "scope:main"

// per-peer
resolveSessionScopeId('telegram', { userId: '123' })
// → "scope:peer:telegram:123" (or "scope:peer:user-alice" if linked)

// per-channel-peer
resolveSessionScopeId('telegram', { chatId: '123' })
// → "scope:channel-peer:telegram:123"
```
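The resolution logic implied by these examples can be sketched as follows. This is a simplified model: the function name, explicit `scope` parameter, and identity-link lookup are assumptions, not the actual signature of `resolveSessionScopeId`.

```typescript
type SessionScope = 'main' | 'per-peer' | 'per-channel-peer';

// Resolve a session scope ID from the configured mode. In per-peer mode,
// identityLinks lets multiple platform IDs map to one canonical user.
function resolveScopeId(
  scope: SessionScope,
  platform: string,
  peerId: string,
  identityLinks: Record<string, string> = {}
): string {
  if (scope === 'main') return 'scope:main';
  if (scope === 'per-peer') {
    const linked = identityLinks[`${platform}:${peerId}`];
    return linked ? `scope:peer:${linked}` : `scope:peer:${platform}:${peerId}`;
  }
  return `scope:channel-peer:${platform}:${peerId}`;
}
```

The key design point is that the scope ID is a stable string key, so memory isolation reduces to partitioning storage by that key.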
Thread Context Retrieval
For follow-up messages, OrcBot retrieves thread context (recent + relevant messages from the same contact):
```typescript
// Automatically injected during prompt assembly
const threadContext = await buildThreadContext({
  source: 'telegram',
  sourceId: '123456789',
  taskDescription: 'Continue the research'
});

// Strategy:
// 1. Get last 8 messages from same contact (RECENT)
// 2. Semantic search for 8 relevant messages (RELEVANT)
// 3. Merge and deduplicate
// 4. Truncate lines > 420 chars

// Result:
// [
//   "[2025-01-15 14:00] (user) Can you research Docker?",
//   "[2025-01-15 14:01] (assistant) Sure, I'll look into Docker...",
//   "[2025-01-15 14:05] (user) What about Kubernetes?",
//   "[2025-01-15 14:06] (assistant) Kubernetes is...",
//   "[2025-01-15 14:10] (user) Continue the research"
// ]
```
This enables pronouns and context continuity:
User: "Tell me about Docker"
Agent: "Docker is a containerization platform..."
User: "How do I use it?" ← Thread context knows "it" = Docker
Contact Profiles
OrcBot maintains per-contact profiles in `~/.orcbot/profiles/`:
```json
{
  "jid": "telegram:123456789",
  "displayName": "Alice",
  "identity": {
    "primary": { "platform": "telegram", "id": "123456789" },
    "aliases": [
      { "platform": "telegram", "id": "123456789", "seenAt": "2025-01-15T14:00:00Z" },
      { "platform": "whatsapp", "id": "+1234567890", "seenAt": "2025-01-14T10:30:00Z" }
    ]
  },
  "platform": "telegram",
  "platformIds": {
    "telegram": "123456789",
    "whatsapp": "+1234567890"
  },
  "lastSeenAt": "2025-01-15T14:23:01Z",
  "lastMessageType": "text",
  "createdAt": "2025-01-10T08:00:00Z",
  "lastUpdated": "2025-01-15T14:23:01Z"
}
```
Retrieval:
```typescript
const profile = memory.getContactProfile('telegram:123456789');
// Returns a JSON string or null
```
Memory Flush System
Inspired by OpenClaw, OrcBot proactively reminds the LLM to write important memories before consolidation:
```typescript
// Triggered when short-term memory > 25 entries
if (shortCount >= 25 && now - lastFlushAt > 30 * 60 * 1000) {
  await memory.memoryFlush(llm);
}

// Sends this prompt to the LLM:
// "The conversation history is approaching consolidation. Review recent context
//  and use memory_write to store any important information (user preferences,
//  decisions, learned facts). Ignore temporary conversation context."
```
This prevents accidental loss of important facts during consolidation.
Token costs:
- Recent context: ~1,000-3,000 tokens
- Episodic: ~500-1,500 tokens
- Long-term files: ~500-2,000 tokens
- Vector search: ~500-2,000 tokens
- Total memory context per step: ~2,500-8,500 tokens
Optimization tips:
- Use `memoryContextLimit` to reduce recent entries
- Enable step compaction to trim history
- Set `memoryContentMaxLength` to 1500 (prevents bloat from large tool outputs)
- Use `isLeanMode: true` for simple tasks (skips semantic/episodic retrieval)
Latency:
- Semantic search: ~200-500ms (depends on embedding API)
- Consolidation: ~2-5 seconds (LLM call)
- Daily log read: ~50-100ms
- Parallel retrieval: ~300-600ms (all async ops run concurrently)
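The ~300-600ms parallel figure follows from running the retrieval paths concurrently, so total latency is the slowest path rather than the sum of all paths. A minimal sketch with `Promise.all` (the fetcher shape is illustrative, not the actual `assemblePromptContext` signature):

```typescript
// Run all retrieval paths concurrently; await the combined result once.
async function assembleContextSketch(fetchers: {
  recent: () => Promise<string[]>;
  episodic: () => Promise<string[]>;
  semantic: () => Promise<string[]>;
  extended: () => Promise<string>;
}) {
  const [recent, episodic, semantic, extended] = await Promise.all([
    fetchers.recent(),   // in-memory read, fast
    fetchers.episodic(), // may hit the embedding API
    fetchers.semantic(), // embedding API + similarity scan
    fetchers.extended(), // daily logs + long-term files from disk
  ]);
  return { recent, episodic, semantic, extended };
}
```

If any single path fails, `Promise.all` rejects; a production variant would use `Promise.allSettled` to degrade gracefully instead.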
Debugging Memory Issues
1. Inspect raw memory:
```bash
cat ~/.orcbot/memory.json | jq '.memories | length'
# Should be < 50 (auto-consolidates at 30)
```
2. Check vector memory stats:
```typescript
const stats = memory.vectorMemory?.getStats();
console.log(stats);
// { indexed: 1234, queued: 5, totalEmbeddings: 1234 }
```
3. View episodic summaries:
```bash
cat ~/.orcbot/memory.json | jq '.memories[] | select(.type == "episodic") | .content'
```
4. Test semantic search:
```typescript
const results = await memory.semanticSearch("test query", 5);
console.log(results.map(r => ({ content: r.content.slice(0, 80), score: r.score })));
```
Further Reading