OrcBot’s memory system is the foundation of its contextual awareness. Unlike stateless LLM wrappers, OrcBot maintains persistent, multi-tier memory that enables true continuity across conversations, tasks, and sessions.

Memory Architecture

Memory Types

1. Short-Term Memory

Purpose: Store recent step observations within the current action.
Implementation: MemoryManager.ts (lines 186-210)
Characteristics:
  • Lives in-memory and on-disk (memory.json)
  • Default limit: 20 entries (configurable via memoryContextLimit)
  • Cleaned up after action completion
  • Includes tool observations, user messages, system injections
Entry format:
interface MemoryEntry {
  id: string;           // e.g., "abc123-step-1" for action-scoped
  type: 'short';
  content: string;      // Observation text (max 1500 chars by default)
  timestamp: string;    // ISO 8601
  metadata?: {
    actionId?: string;  // Links memory to action
    step?: number;      // Step number within action
    skill?: string;     // Tool that was executed
    source?: string;    // 'telegram' | 'whatsapp' | etc.
    role?: 'user' | 'assistant' | 'system';
    [key: string]: any;
  };
}
Example:
{
  "id": "task-42-step-3",
  "type": "short",
  "content": "Observation: Tool web_search returned 10 results. Top result: 'OpenAI releases GPT-5'...",
  "timestamp": "2025-01-15T14:23:01.234Z",
  "metadata": {
    "actionId": "task-42",
    "step": 3,
    "skill": "web_search"
  }
}
Retrieval:
// Get recent short-term context
const recent = memory.getRecentContext(20); // Last 20 entries

// Get all memories for a specific action
const actionMemories = memory.getActionMemories(actionId);

// Search by type
const shortMemories = memory.searchMemory('short');

2. Episodic Memory

Purpose: LLM-generated summaries of conversation batches for durable thread context.
Implementation: MemoryManager.ts (lines 273-332)
Characteristics:
  • Created via automatic consolidation when short-term memory exceeds threshold (default: 30 entries)
  • Grouped by platform + contact (e.g., telegram:123456789)
  • Batch size: 12 exchanges (configurable via interactionBatchSize)
  • Summarized with structured JSON: {summary, facts, pending, tone, preferences, confidence}
Consolidation trigger:
// Automatically runs when short-term memory > consolidationThreshold
const shortCount = memory.searchMemory('short').length;
if (shortCount >= 30) {
  await memory.consolidate(llm);
}

// Manual consolidation at action completion
await memory.consolidateInteractions(llm, 'session_end');
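The grouping step behind consolidation (batches of 12 per platform + contact) can be sketched as follows. This is an illustrative helper, not the actual MemoryManager internals; `groupForConsolidation` and the entry shape are assumptions.

```typescript
// Sketch: group short-term entries by platform + contact, then split each
// contact's entries into batches for summarization. Hypothetical helper.
interface ShortEntry {
  content: string;
  metadata?: { source?: string; sourceId?: string };
}

function groupForConsolidation(
  entries: ShortEntry[],
  batchSize = 12
): Map<string, ShortEntry[][]> {
  const byContact = new Map<string, ShortEntry[]>();
  for (const e of entries) {
    // Group key mirrors the documented format, e.g. "telegram:123456789"
    const key = `${e.metadata?.source ?? 'unknown'}:${e.metadata?.sourceId ?? 'unknown'}`;
    if (!byContact.has(key)) byContact.set(key, []);
    byContact.get(key)!.push(e);
  }
  // Split each contact's entries into batches of `batchSize`
  const batched = new Map<string, ShortEntry[][]>();
  for (const [key, list] of byContact) {
    const batches: ShortEntry[][] = [];
    for (let i = 0; i < list.length; i += batchSize) {
      batches.push(list.slice(i, i + batchSize));
    }
    batched.set(key, batches);
  }
  return batched;
}
```

Each batch would then be handed to the LLM to produce one structured episodic summary.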
Example episodic entry:
{
  "id": "episodic-telegram-123456789-1705330981234",
  "type": "episodic",
  "content": "Interaction summary (telegram/123456789): {\"summary\": \"User asked about AI news. Agent searched and delivered 3 top articles.\", \"facts\": [\"User interested in AI/ML topics\"], \"pending\": [], \"tone\": \"informative\", \"preferences\": {\"format\": \"concise\"}, \"confidence\": 0.9}",
  "timestamp": "2025-01-15T14:29:41.234Z",
  "metadata": {
    "source": "telegram",
    "sourceId": "123456789",
    "reason": "threshold",
    "interactionCount": 12,
    "structured": true,
    "messageTypes": ["text", "photo"],
    "timeRange": {
      "from": "2025-01-15T14:00:00.000Z",
      "to": "2025-01-15T14:29:41.234Z"
    }
  }
}
Retrieval:
// Get recent episodic summaries
const episodic = memory.searchMemory('episodic').slice(-5);

// Get semantically relevant episodic memories for a task
const relevant = await memory.getRelevantEpisodicMemories(
  "How do I deploy to production?",
  5
);

3. Long-Term Memory

Purpose: Persistent markdown files for durable facts, learning, and reflections.
Implementation: File-backed storage in ~/.orcbot/
Files:

File         Purpose                                       Updated By
JOURNAL.md   Agent’s self-reflections and activity logs    update_journal skill
LEARNING.md  Structured knowledge on various topics        update_learning skill
USER.md      User preferences and profile                  update_user_profile skill
WORLD.md     Environment state and governance              update_world skill
MEMORY.md    General long-term facts                       memory_write skill
Example LEARNING.md entry:
# Agent Learning Base

## Topic: Docker Deployment

**Last Updated:** 2025-01-15

**Key Facts:**
- Docker images are built from Dockerfiles
- Use `docker-compose.yml` for multi-container apps
- Volumes persist data between container restarts
- Expose ports with `-p host:container`

**Resources:**
- Official docs: https://docs.docker.com
- Best practices: Use multi-stage builds for smaller images

**When to use:**
- User asks about Docker
- Deployment tasks involving containers
Retrieval:
// Loaded automatically during prompt assembly
const journalContent = fs.readFileSync(journalPath, 'utf-8');
const learningContent = fs.readFileSync(learningPath, 'utf-8');
const userContext = memory.getUserContext();

4. Vector Memory

Purpose: Semantic embeddings for full-history similarity search.
Implementation: VectorMemory.ts (lines 8-940)
Characteristics:
  • Uses text-embedding-3-small (OpenAI) or text-embedding-004 (Google)
  • Stores embeddings in file-backed JSON (vector_memory.json)
  • Background indexing every 5 minutes
  • Cosine similarity search
  • Max entries: 10,000 (configurable via vectorMemoryMaxEntries)
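The similarity search step can be sketched with a plain cosine-similarity ranking; the names below (`StoredEmbedding`, `topK`) are illustrative, not VectorMemory's actual API.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Guard against zero-norm vectors
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

interface StoredEmbedding { id: string; embedding: number[] }

// Rank all stored embeddings against the query and keep the top k.
function topK(query: number[], store: StoredEmbedding[], k: number) {
  return store
    .map(e => ({ id: e.id, score: cosineSimilarity(query, e.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is typical for file-backed stores at the documented 10,000-entry scale; no approximate index is needed.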
Initialization:
memory.initVectorMemory({
  openaiApiKey: config.get('openaiApiKey'),
  googleApiKey: config.get('googleApiKey'),
  preferredProvider: 'openai',
  maxEntries: 10000
});
Indexing:
// Automatic: All memories saved via saveMemory() are queued for indexing
memory.saveMemory({ ... });
// → Queued for next background index cycle

// Manual: Force immediate indexing
await memory.vectorMemory.processQueue();
Retrieval:
// Semantic search across all indexed memories
const results = await memory.semanticSearch(
  "How do I configure the database?",
  10,                     // limit
  { source: 'telegram' }  // optional metadata filter
);

// Deep recall (cross-session, excludes already-shown IDs)
const recalled = await memory.semanticRecall(
  "Previous deployment issues",
  8,                          // limit
  new Set(['id1', 'id2'])     // excludeIds
);
Response format:
interface ScoredVectorEntry {
  id: string;
  content: string;
  score: number;       // Cosine similarity (0-1)
  type: string;        // 'short' | 'episodic' | 'long'
  timestamp?: string;
  metadata?: any;
}

5. Daily Memory Logs

Purpose: Append-only markdown logs organized by date.
Implementation: DailyMemory.ts (lines 38-742)
Characteristics:
  • Files stored in ~/.orcbot/daily_memory/YYYY-MM-DD.md
  • Categorized entries (System, Research, Communication, Consolidation)
  • Automatically appended for important events
  • Read into extended context for awareness
Example log:
# Daily Memory - 2025-01-15

## 14:23 - Research
Searched for "latest AI news 2025". Found 10 articles. Top: "OpenAI releases GPT-5".

## 14:29 - Consolidation
Consolidated 12 memories from Telegram user 123456789:
- User asked about AI news
- Agent delivered 3 top articles
- User satisfied with response

## 15:10 - System
Memory flush triggered. Response: No important information to store.
Retrieval:
const dailyContext = memory.getDailyMemory().readRecentContext();
// Returns last 3 days of logs (configurable)
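The date-keyed filename scheme (`YYYY-MM-DD.md`) can be sketched as below; the helper names are illustrative stand-ins for DailyMemory's internals.

```typescript
// Format a Date as the daily log filename, e.g. "2025-01-15.md".
function dailyLogFilename(d: Date): string {
  const pad = (n: number) => String(n).padStart(2, '0');
  return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}.md`;
}

// Filenames for the last `days` days (most recent first), as a
// readRecentContext-style reader might enumerate them under
// ~/.orcbot/daily_memory/ before reading whichever files exist.
function recentLogFilenames(days: number, today = new Date()): string[] {
  return Array.from({ length: days }, (_, i) => {
    const d = new Date(today);
    d.setDate(d.getDate() - i);
    return dailyLogFilename(d);
  });
}
```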

Memory Lifecycle

1. Creation

// User sends message
memory.saveMemory({
  id: `telegram-in-${Date.now()}`,
  type: 'short',
  content: 'User: How do I deploy to production?',
  metadata: {
    source: 'telegram',
    chatId: '123456789',
    role: 'user'
  }
});
// → Saved to memory.json
// → Queued for vector indexing
// → Tracked for consolidation

2. Consolidation

// When short-term memory > 30 entries
await memory.consolidate(llm);
// 1. Takes oldest 20 short-term memories
// 2. Calls LLM to summarize
// 3. Saves episodic summary
// 4. Deletes those 20 short-term entries
// 5. Also appends to daily log

3. Retrieval

// During prompt assembly (parallel)
const context = await memory.assemblePromptContext(taskDescription);
// Returns:
// {
//   recent: MemoryEntry[],      // Last 20 short-term
//   episodic: MemoryEntry[],    // Last 5 episodic or semantically relevant
//   semantic: ScoredVectorEntry[], // Top 8 similar memories
//   extended: string            // Daily logs + long-term files
// }

4. Cleanup

// After action completion
const removed = memory.cleanupActionMemories(actionId);
logger.info(`Cleaned up ${removed} step memories for action ${actionId}`);

Memory Limits & Configuration

# orcbot.config.yaml

# Short-term memory
memoryContextLimit: 20                    # Max recent entries in prompt
memoryContentMaxLength: 1500              # Max chars per memory entry

# Episodic memory
memoryEpisodicLimit: 5                    # Max episodic entries in prompt
interactionBatchSize: 12                  # Entries per consolidation batch
interactionStaleMinutes: 10               # Auto-consolidate after N minutes

# Consolidation
memoryConsolidationThreshold: 30          # Trigger at N short-term entries
memoryConsolidationBatch: 20              # Consolidate oldest N entries

# Memory flush (reminder before consolidation)
memoryFlushSoftThreshold: 25              # Trigger at N entries
memoryFlushCooldownMinutes: 30            # Min interval between flushes

# Long-term context
memoryExtendedContextLimit: 2000          # Max chars of long-term files
journalContextLimit: 1500                 # Max chars of JOURNAL.md tail
learningContextLimit: 1500                # Max chars of LEARNING.md tail

# Vector memory
vectorMemoryMaxEntries: 10000             # Max embeddings stored

# Thread context
threadContextRecentN: 8                   # Recent messages from same contact
threadContextRelevantN: 8                 # Semantically relevant messages
threadContextMaxLineLen: 420              # Truncate long lines

# Step history compaction
stepCompactionThreshold: 10               # Compact when > N steps
stepCompactionPreserveFirst: 2            # Keep first N steps
stepCompactionPreserveLast: 5             # Keep last N steps
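The step-compaction settings at the end of the config can be sketched as follows: when the step count exceeds the threshold, keep the first 2 and last 5 steps and collapse the middle into a placeholder. `compactSteps` is a hypothetical helper, not the real API.

```typescript
// Compact step history per the documented defaults: threshold 10,
// preserve first 2 and last 5, summarize the dropped middle.
function compactSteps(
  steps: string[],
  threshold = 10,
  preserveFirst = 2,
  preserveLast = 5
): string[] {
  if (steps.length <= threshold) return steps; // nothing to compact
  const dropped = steps.length - preserveFirst - preserveLast;
  return [
    ...steps.slice(0, preserveFirst),
    `[... ${dropped} earlier steps compacted ...]`,
    ...steps.slice(-preserveLast),
  ];
}
```

In the real system the placeholder would presumably be an LLM summary of the dropped steps rather than a bare marker.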

Memory Deduplication

OrcBot prevents storing duplicate events within a 5-minute window:
// Deduplication strategy (MemoryManager.ts:227-249)
function isDuplicateMemory(entry, existingMemories) {
  const cutoff = Date.now() - (5 * 60 * 1000); // 5 minutes
  
  return existingMemories.some(candidate => {
    // 1. Check stable event ID (messageId, eventId, statusMessageId)
    const entryId = entry.metadata?.messageId;
    const candidateId = candidate.metadata?.messageId;
    if (entryId && candidateId && entryId === candidateId) return true;
    
    // 2. Fallback: same source + contact + content
    const sameSource = candidate.metadata?.source === entry.metadata?.source;
    const sameContact = candidate.metadata?.sourceId === entry.metadata?.sourceId;
    const sameContent = candidate.content === entry.content;
    const recentEnough = new Date(candidate.timestamp).getTime() > cutoff;
    
    return sameSource && sameContact && sameContent && recentEnough;
  });
}

Session Scoping

OrcBot supports three session scoping modes (configurable via sessionScope):

1. main (Single Global Session)

All conversations share the same memory pool.
sessionScope: main
Use case: Single-user deployment, no multi-tenancy.

2. per-peer (Cross-Platform Identity)

Memories are scoped to a user across all channels.
sessionScope: per-peer
identityLinks:
  "telegram:123456789": user-alice
  "whatsapp:+1234567890": user-alice
Use case: Same user contacts you on multiple platforms.

3. per-channel-peer (Default)

Memories are scoped to a user on a specific channel.
sessionScope: per-channel-peer
Use case: Multi-tenant deployment, strict isolation.

Session ID format:
// main
resolveSessionScopeId('telegram', { chatId: '123' }) 
// → "scope:main"

// per-peer
resolveSessionScopeId('telegram', { userId: '123' })
// → "scope:peer:telegram:123" (or "scope:peer:user-alice" if linked)

// per-channel-peer
resolveSessionScopeId('telegram', { chatId: '123' })
// → "scope:channel-peer:telegram:123"
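The scope-ID derivation above can be sketched as a pure function. This mirrors the documented output formats but is not the actual `resolveSessionScopeId` implementation (which presumably reads the scope from config rather than taking it as a parameter).

```typescript
// Illustrative derivation of session scope IDs for the three modes.
type SessionScope = 'main' | 'per-peer' | 'per-channel-peer';

function resolveScopeId(
  scope: SessionScope,
  platform: string,
  ids: { chatId?: string; userId?: string },
  identityLinks: Record<string, string> = {}
): string {
  if (scope === 'main') return 'scope:main';
  if (scope === 'per-peer') {
    const key = `${platform}:${ids.userId ?? ids.chatId}`;
    // Linked identities collapse to one canonical cross-platform peer ID
    const linked = identityLinks[key];
    return linked ? `scope:peer:${linked}` : `scope:peer:${key}`;
  }
  // per-channel-peer: isolate by platform AND contact
  return `scope:channel-peer:${platform}:${ids.chatId ?? ids.userId}`;
}
```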

Thread Context Retrieval

For follow-up messages, OrcBot retrieves thread context (recent + relevant messages from the same contact):
// Automatically injected during prompt assembly
const threadContext = await buildThreadContext({
  source: 'telegram',
  sourceId: '123456789',
  taskDescription: 'Continue the research'
});

// Strategy:
// 1. Get last 8 messages from same contact (RECENT)
// 2. Semantic search for 8 relevant messages (RELEVANT)
// 3. Merge and deduplicate
// 4. Truncate lines > 420 chars

// Result:
// [
//   "[2025-01-15 14:00] (user) Can you research Docker?",
//   "[2025-01-15 14:01] (assistant) Sure, I'll look into Docker...",
//   "[2025-01-15 14:05] (user) What about Kubernetes?",
//   "[2025-01-15 14:06] (assistant) Kubernetes is...",
//   "[2025-01-15 14:10] (user) Continue the research"
// ]
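Steps 3 and 4 of the strategy above (merge, deduplicate, truncate) can be sketched as follows; `mergeThreadContext` is an illustrative helper, not the real buildThreadContext internals.

```typescript
// Merge RECENT and RELEVANT message lines, drop exact duplicates, and
// truncate lines longer than maxLineLen (the documented default is 420).
function mergeThreadContext(
  recent: string[],
  relevant: string[],
  maxLineLen = 420
): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const line of [...recent, ...relevant]) {
    if (seen.has(line)) continue; // a message found by both passes appears once
    seen.add(line);
    merged.push(
      line.length > maxLineLen ? line.slice(0, maxLineLen) + '…' : line
    );
  }
  return merged;
}
```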
This enables pronouns and context continuity:
User: "Tell me about Docker"
Agent: "Docker is a containerization platform..."
User: "How do I use it?"  ← Thread context knows "it" = Docker

Contact Profiles

OrcBot maintains per-contact profiles in ~/.orcbot/profiles/:
{
  "jid": "telegram:123456789",
  "displayName": "Alice",
  "identity": {
    "primary": { "platform": "telegram", "id": "123456789" },
    "aliases": [
      { "platform": "telegram", "id": "123456789", "seenAt": "2025-01-15T14:00:00Z" },
      { "platform": "whatsapp", "id": "+1234567890", "seenAt": "2025-01-14T10:30:00Z" }
    ]
  },
  "platform": "telegram",
  "platformIds": {
    "telegram": "123456789",
    "whatsapp": "+1234567890"
  },
  "lastSeenAt": "2025-01-15T14:23:01Z",
  "lastMessageType": "text",
  "createdAt": "2025-01-10T08:00:00Z",
  "lastUpdated": "2025-01-15T14:23:01Z"
}
Retrieval:
const profile = memory.getContactProfile('telegram:123456789');
// Returns JSON string or null

Memory Flush System

Inspired by OpenClaw, OrcBot proactively reminds the LLM to write important memories before consolidation:
// Triggered when short-term memory > 25 entries
if (shortCount >= 25 && Date.now() - lastFlushAt > 30 * 60 * 1000) {
  await memory.memoryFlush(llm);
}

// Sends this prompt to LLM:
// "The conversation history is approaching consolidation. Review recent context
//  and use memory_write to store any important information (user preferences,
//  decisions, learned facts). Ignore temporary conversation context."
This prevents accidental loss of important facts during consolidation.

Performance Considerations

Token costs:
  • Recent context: ~1,000-3,000 tokens
  • Episodic: ~500-1,500 tokens
  • Long-term files: ~500-2,000 tokens
  • Vector search: ~500-2,000 tokens
  • Total memory context per step: ~2,500-8,500 tokens
Optimization tips:
  • Use memoryContextLimit to reduce recent entries
  • Enable step compaction to trim history
  • Set memoryContentMaxLength to 1500 (prevents bloat from large tool outputs)
  • Use isLeanMode: true for simple tasks (skips semantic/episodic retrieval)
Latency:
  • Semantic search: ~200-500ms (depends on embedding API)
  • Consolidation: ~2-5 seconds (LLM call)
  • Daily log read: ~50-100ms
  • Parallel retrieval: ~300-600ms (all async ops run concurrently)
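The parallel-retrieval figure reflects that all memory sources are fetched concurrently, so total latency tracks the slowest source rather than the sum. A minimal sketch of the pattern, with stand-in fetchers for the real retrieval calls:

```typescript
// Fetch recent, episodic, and semantic context concurrently.
async function assembleContext(
  fetchRecent: () => Promise<string[]>,
  fetchEpisodic: () => Promise<string[]>,
  fetchSemantic: () => Promise<string[]>
) {
  // Promise.all starts all three fetches at once and awaits them together,
  // so a 500ms semantic search overlaps the faster file reads.
  const [recent, episodic, semantic] = await Promise.all([
    fetchRecent(),
    fetchEpisodic(),
    fetchSemantic(),
  ]);
  return { recent, episodic, semantic };
}
```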

Debugging Memory Issues

1. Inspect raw memory:
jq '.memories | length' ~/.orcbot/memory.json
# Should be < 50 (auto-consolidates at 30)
2. Check vector memory stats:
const stats = memory.vectorMemory?.getStats();
console.log(stats);
// { indexed: 1234, queued: 5, totalEmbeddings: 1234 }
3. View episodic summaries:
jq -r '.memories[] | select(.type == "episodic") | .content' ~/.orcbot/memory.json
4. Test semantic search:
const results = await memory.semanticSearch("test query", 5);
console.log(results.map(r => ({ content: r.content.slice(0, 80), score: r.score })));

Further Reading