Overview
OrcBot uses Vitest for unit and integration testing. This guide covers testing strategies, running tests, and writing new tests for skills and core components.
Quick Start
Running Tests
# Run all tests
npm test
# Run tests in watch mode
npm run test:watch
# Run a specific test file
npx vitest run tests/memory.test.ts
# Run tests matching a pattern
npx vitest run -t "MemoryManager"
Test Output
✓ tests/memory.test.ts (5)
  ✓ MemoryManager (5)
    ✓ should save and retrieve short memory
    ✓ should consolidate episodic memory
    ✓ should perform semantic search
    ✓ should respect memory limits
    ✓ should flush stale memories

Test Files  1 passed (1)
Tests  5 passed (5)
Duration  2.34s
Test Structure
File Organization
tests/
├── memory.test.ts          # Memory system tests
├── skills/
│   ├── web-search.test.ts  # Web search skill tests
│   ├── files.test.ts       # File operation tests
│   └── shell.test.ts       # Shell execution tests
├── core/
│   ├── agent.test.ts       # Agent loop tests
│   ├── decision.test.ts    # Decision pipeline tests
│   └── skills.test.ts      # Skills manager tests
└── utils/
    ├── parser.test.ts      # JSON parser tests
    └── logger.test.ts      # Logger tests
Test File Template
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { YourComponent } from '../src/core/YourComponent';

describe('YourComponent', () => {
  let component: YourComponent;

  beforeEach(() => {
    // Setup
    component = new YourComponent();
  });

  afterEach(() => {
    // Cleanup
    component.cleanup();
  });

  it('should do something', async () => {
    // Arrange
    const input = 'test';

    // Act
    const result = await component.process(input);

    // Assert
    expect(result).toBe('expected');
  });
});
Testing Strategies
Unit Tests
Test individual functions and classes in isolation.
Example: Testing a Parser Function
import { parseDecisionOutput } from '../src/core/ParserLayer';

it('should parse valid JSON decision', () => {
  const input = `{"reasoning": "test", "tools": [], "completed": false}`;
  const result = parseDecisionOutput(input);

  expect(result).toEqual({
    reasoning: 'test',
    tools: [],
    completed: false
  });
});

it('should handle malformed JSON with fallback', () => {
  const input = `{reasoning: test, tools: []}`;
  const result = parseDecisionOutput(input);

  expect(result).toBeDefined();
  expect(result.reasoning).toContain('test');
});
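The fallback behaviour checked by the second test can be pictured with a small stand-in parser. This is a minimal sketch, not OrcBot's actual `parseDecisionOutput`: the `Decision` shape is inferred from the tests above, and `parseDecisionLenient` is a hypothetical name. It assumes the fallback simply keeps the raw text as the reasoning.

```typescript
// Hypothetical stand-in for a tolerant decision parser; not OrcBot's real ParserLayer.
interface Decision {
  reasoning: string;
  tools: unknown[];
  completed: boolean;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function parseDecisionLenient(raw: string): Decision {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (isRecord(parsed)) {
      // Strict path: valid JSON object, with missing or mistyped fields defaulted.
      return {
        reasoning: typeof parsed.reasoning === 'string' ? parsed.reasoning : '',
        tools: Array.isArray(parsed.tools) ? parsed.tools : [],
        completed: parsed.completed === true,
      };
    }
  } catch {
    // Not valid JSON: fall through to the fallback below.
  }
  // Fallback path (assumed behaviour): keep the raw text as reasoning and never mark the task complete.
  return { reasoning: raw, tools: [], completed: false };
}
```

Returning a safe default instead of throwing keeps the agent loop running on a malformed LLM reply, which is why the test only asserts that a result exists and still carries the original text.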
Integration Tests
Test multiple components working together.
Example: Testing Agent with Memory
import { Agent } from '../src/core/Agent';
import { MemoryManager } from '../src/memory/MemoryManager';

it('should save action results to memory', async () => {
  const agent = new Agent();
  const memory = agent.memory;

  await agent.processAction({
    actionId: 'test-1',
    task: 'Test task',
    priority: 5
  });

  const recent = memory.getRecentContext();
  expect(recent).toContain('Test task');
});
Mock Testing
Mock external dependencies (LLM, APIs) for fast, reliable tests.
Example: Mocking LLM Calls
import { vi } from 'vitest';
import { MultiLLM } from '../src/core/MultiLLM';

it('should handle LLM responses', async () => {
  const mockLLM = new MultiLLM({});

  // Mock the chat method
  vi.spyOn(mockLLM, 'chat').mockResolvedValue({
    content: '{"reasoning": "mocked", "tools": [], "completed": false}'
  });

  const response = await mockLLM.chat([
    { role: 'user', content: 'test' }
  ]);

  expect(response.content).toContain('mocked');
});
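When a test needs several LLM turns in a row, a hand-written test double can be easier to follow than chained spies. The sketch below is an illustration only: `ScriptedLLM` and its message and response types are hypothetical, modelled on the `chat` call in the example above rather than on the real `MultiLLM` interface.

```typescript
// Hypothetical test double; the message and response shapes are inferred from the example above.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatResponse {
  content: string;
}

class ScriptedLLM {
  // Every message list passed to chat(), in call order.
  readonly calls: ChatMessage[][] = [];
  private readonly queue: string[];

  constructor(responses: string[]) {
    // Copy so the caller's array is not mutated.
    this.queue = [...responses];
  }

  async chat(messages: ChatMessage[]): Promise<ChatResponse> {
    this.calls.push(messages);
    const next = this.queue.shift();
    if (next === undefined) {
      // Fail loudly if the code under test makes more calls than the test scripted.
      throw new Error('ScriptedLLM: no scripted responses left');
    }
    return { content: next };
  }
}
```

A test would construct it with the replies it expects the agent to consume, pass it wherever the code under test accepts an LLM, and then assert on `calls` to check what was sent.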
End-to-End Tests
Test the full agent loop with real skills (use sparingly, they’re slow).
Example: Testing Full Action Execution
import { Agent } from '../src/core/Agent';

it('should complete a simple task end-to-end', async () => {
  const agent = new Agent();

  const result = await agent.processAction({
    actionId: 'e2e-1',
    task: 'Read the file test.txt and tell me its contents',
    priority: 5
  });

  expect(result.completed).toBe(true);
  expect(result.observations).toContain('File contents:');
}, 30000); // 30s timeout for E2E tests
Testing Skills
Skill Test Template
import { describe, it, expect } from 'vitest';
import { yourSkill } from '../src/skills/yourSkill';

describe('yourSkill', () => {
  it('should have correct metadata', () => {
    expect(yourSkill.name).toBe('your_skill_name');
    expect(yourSkill.description).toBeDefined();
    expect(yourSkill.parameters).toBeDefined();
  });

  it('should execute successfully', async () => {
    const context = {
      actionId: 'test-1',
      userId: 'test-user',
      channelType: 'cli'
    };

    const result = await yourSkill.handler(
      { param1: 'value1' },
      context
    );

    expect(result).toContain('expected output');
  });

  it('should handle errors gracefully', async () => {
    const context = { actionId: 'test-1' };

    const result = await yourSkill.handler(
      { param1: 'invalid' },
      context
    );

    expect(result).toContain('error');
  });
});
Testing Web Skills
Mock HTTP Requests:
import { vi } from 'vitest';
import { webSearchSkill } from '../src/core/Agent';

it('should search the web', async () => {
  // Mock fetch
  global.fetch = vi.fn().mockResolvedValue({
    ok: true,
    json: async () => ({
      organic: [
        { title: 'Result 1', snippet: 'Description 1' }
      ]
    })
  });

  const result = await webSearchSkill.handler(
    { query: 'test query' },
    { actionId: 'test-1' }
  );

  expect(result).toContain('Result 1');
  expect(fetch).toHaveBeenCalledWith(
    expect.stringContaining('test query')
  );
});
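One thing to watch in the assertion above: a skill that URL-encodes its query will request `test%20query` or `test+query`, which a plain substring match on `test query` misses. The sketch below shows one way to handle this, assuming the skill passes the URL as the first argument to `fetch`. `makeFetchStub` and `wasRequestedWith` are hypothetical helpers, not part of OrcBot or Vitest.

```typescript
// Hypothetical helpers: a fetch-like stub that records every requested URL.
interface StubResponse {
  ok: boolean;
  status: number;
  json: () => Promise<unknown>;
}

function makeFetchStub(payload: unknown) {
  const requestedUrls: string[] = [];
  const fetchStub = async (input: string | { toString(): string }): Promise<StubResponse> => {
    // Record the URL before resolving so tests can inspect it afterwards.
    requestedUrls.push(String(input));
    return { ok: true, status: 200, json: async () => payload };
  };
  return { fetchStub, requestedUrls };
}

// True if any recorded URL contains `text` once "+" spaces and percent-encoding are decoded.
// Note: decodeURIComponent throws on malformed escapes, which is acceptable in a test helper.
function wasRequestedWith(urls: string[], text: string): boolean {
  return urls.some((url) => decodeURIComponent(url.replace(/\+/g, ' ')).includes(text));
}
```

In a Vitest test the stub can be installed with `vi.stubGlobal('fetch', fetchStub)` and removed with `vi.unstubAllGlobals()` in `afterEach`, so the replacement does not leak into other tests.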
Testing File Skills
Use Temporary Directories:
import fs from 'fs';
import path from 'path';
import os from 'os';
import { readFileSkill, writeFileSkill } from '../src/core/Agent';

it('should write and read files', async () => {
  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'orcbot-test-'));
  const testFile = path.join(tempDir, 'test.txt');

  // Write
  await writeFileSkill.handler(
    { filePath: testFile, content: 'Hello' },
    { actionId: 'test-1' }
  );

  // Read
  const result = await readFileSkill.handler(
    { filePath: testFile },
    { actionId: 'test-1' }
  );

  expect(result).toContain('Hello');

  // Cleanup
  fs.rmSync(tempDir, { recursive: true });
});
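In the test above, the cleanup line is skipped whenever an assertion fails first, so failed runs leave directories behind. A small wrapper with `try`/`finally` avoids that. This is a sketch using only Node's standard library; `withTempDir` is a hypothetical helper name, not something OrcBot ships.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: runs `fn` with a fresh temp directory and always removes it afterwards,
// even when `fn` throws or a test assertion inside it fails.
async function withTempDir<T>(fn: (dir: string) => Promise<T> | T): Promise<T> {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'orcbot-test-'));
  try {
    return await fn(dir);
  } finally {
    // force: true avoids an error if the callback already removed the directory.
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

Inside a test, the write and read steps would move into the callback: `await withTempDir(async (dir) => { /* call the skills with paths under dir */ })`.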
Testing Memory System
Short Memory Tests
import { MemoryManager } from '../src/memory/MemoryManager';

it('should save and retrieve short memory', () => {
  const memory = new MemoryManager('/tmp/orcbot-test');

  memory.saveMemory({
    actionId: 'test-1',
    step: 1,
    content: 'Test observation'
  });

  const recent = memory.getRecentContext();
  expect(recent).toContain('Test observation');
});
Episodic Memory Tests
it('should consolidate episodic memory', async () => {
  const memory = new MemoryManager('/tmp/orcbot-test');

  // Add many short memories
  for (let i = 0; i < 50; i++) {
    memory.saveMemory({
      actionId: `test-${i}`,
      step: 1,
      content: `Memory ${i}`
    });
  }

  // Trigger consolidation
  await memory.consolidateIfNeeded();

  const episodic = memory.getEpisodicMemory();
  expect(episodic.length).toBeGreaterThan(0);
});
Vector Memory Tests
it('should perform semantic search', async () => {
  const memory = new MemoryManager('/tmp/orcbot-test');

  // Add searchable content
  await memory.saveToVectorMemory('OrcBot is an autonomous agent');
  await memory.saveToVectorMemory('TypeScript is a programming language');

  // Search
  const results = await memory.semanticSearch('agent');
  expect(results[0].content).toContain('autonomous');
});
Testing Decision Pipeline
Deduplication Tests
import { DecisionPipeline } from '../src/core/DecisionPipeline';

it('should deduplicate repeated tool calls', () => {
  const pipeline = new DecisionPipeline();

  const decision = {
    reasoning: 'test',
    tools: [
      { name: 'web_search', metadata: { query: 'test' } },
      { name: 'web_search', metadata: { query: 'test' } }
    ],
    completed: false
  };

  const filtered = pipeline.applyGuardrails(decision, [
    { name: 'web_search', metadata: { query: 'test' } }
  ]);

  expect(filtered.tools).toHaveLength(0); // Duplicate removed
});
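To make the expectation concrete: both proposed calls match an entry already in the history, so nothing survives. One way such a filter could work is sketched below. This is an assumption for illustration, not the actual `applyGuardrails` implementation; `ToolCall`, `toolKey`, and `dedupeTools` are hypothetical names.

```typescript
// Hypothetical sketch of call deduplication; OrcBot's real guardrails may differ.
interface ToolCall {
  name: string;
  metadata: Record<string, unknown>;
}

// Builds a comparison key from the tool name and its metadata.
// Top-level metadata keys are sorted so that key order does not affect equality.
function toolKey(call: ToolCall): string {
  const sorted = Object.keys(call.metadata)
    .sort()
    .map((k) => [k, call.metadata[k]]);
  return `${call.name}:${JSON.stringify(sorted)}`;
}

// Drops calls that already appear in `history` and repeats within `proposed` itself.
function dedupeTools(proposed: ToolCall[], history: ToolCall[]): ToolCall[] {
  const seen = new Set(history.map(toolKey));
  const kept: ToolCall[] = [];
  for (const call of proposed) {
    const key = toolKey(call);
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(call);
  }
  return kept;
}
```

With an empty history, the same two proposed calls would collapse to one, which is a useful second test case alongside the one above.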
Termination Review Tests
it('should block premature completion', () => {
  const pipeline = new DecisionPipeline();

  const decision = {
    reasoning: 'Done',
    tools: [],
    completed: true
  };

  const result = pipeline.reviewTermination(decision, {
    deepToolsUsed: ['web_search'],
    lastSentStep: 0,
    currentStep: 2
  });

  expect(result.shouldBlock).toBe(true);
  expect(result.auditCodes).toContain('UNSENT_RESULTS');
});
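The test reads as: the agent used a deep tool, has not sent anything to the user since step 0, and now claims to be done at step 2, so completion is blocked. The sketch below encodes that reading as a single rule. The rule itself and the meaning of each field are assumptions inferred from the test, and `reviewTerminationSketch` is a hypothetical name; the real `reviewTermination` may apply different or additional checks.

```typescript
// Hypothetical sketch; the rule and field meanings are assumptions inferred from the test above.
interface TerminationContext {
  deepToolsUsed: string[]; // tools that gathered results during this action
  lastSentStep: number;    // assumed: step at which a message was last sent to the user
  currentStep: number;
}

interface TerminationReview {
  shouldBlock: boolean;
  auditCodes: string[];
}

function reviewTerminationSketch(
  decision: { completed: boolean },
  ctx: TerminationContext,
): TerminationReview {
  const auditCodes: string[] = [];
  const usedDeepTools = ctx.deepToolsUsed.length > 0;
  // Assumed rule: results count as unsent when nothing was sent at the current step.
  const hasUnsentSteps = ctx.lastSentStep < ctx.currentStep;

  if (decision.completed && usedDeepTools && hasUnsentSteps) {
    auditCodes.push('UNSENT_RESULTS');
  }
  // Block completion whenever at least one audit code was raised.
  return { shouldBlock: auditCodes.length > 0, auditCodes };
}
```

Under this reading, useful companion cases are a decision with `completed: false` and one where `lastSentStep` equals `currentStep`; neither should be blocked.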
Testing Configuration
vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'node',
    setupFiles: ['./tests/setup.ts'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'html'],
      exclude: [
        'node_modules/',
        'dist/',
        'tests/',
        '**/*.test.ts'
      ]
    },
    testTimeout: 10000,
    hookTimeout: 10000
  }
});
Test Setup File
// tests/setup.ts
import { beforeAll, afterAll } from 'vitest';
import fs from 'fs';
import path from 'path';
import os from 'os';

let tempDir: string;

beforeAll(() => {
  // Create temp directory for tests
  tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'orcbot-test-'));
  process.env.ORCBOT_DATA_DIR = tempDir;
});

afterAll(() => {
  // Cleanup
  if (fs.existsSync(tempDir)) {
    fs.rmSync(tempDir, { recursive: true });
  }
});
Continuous Integration
GitHub Actions
# .github/workflows/test.yml
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
      - run: npm install
      - run: npm run build
      # Pass --coverage so the upload step below has a report to send
      - run: npm test -- --coverage
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/coverage-final.json
Best Practices
Write Fast Tests
Mock external APIs
Use in-memory storage
Keep test data small
Parallelize tests
Test Boundaries
Test error handling
Test edge cases
Test invalid inputs
Test resource limits
Keep Tests Isolated
No shared state
Clean up after each test
Use temp directories
Reset mocks
Maintain Coverage
Aim for 80%+ coverage
Cover critical paths
Test new features
Update tests with code changes
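The 80% target can be enforced rather than just tracked. The fragment below is a sketch that assumes Vitest 1.0 or later, where limits live under `coverage.thresholds`; earlier releases used a different layout, so check the installed version. The numbers are illustrative and would be merged into the existing `vitest.config.ts`.

```typescript
// vitest.config.ts (fragment): fail the run when coverage drops below the target.
// Assumes Vitest >= 1.0; threshold values here are illustrative.
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      thresholds: {
        lines: 80,
        functions: 80,
        branches: 80,
        statements: 80
      }
    }
  }
});
```

With thresholds set, a coverage run exits with a non-zero status when any metric falls short, so CI catches regressions without a separate check.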
Troubleshooting
Tests Timing Out
Symptoms: Tests fail with “Test timeout” error.
Solutions:
// Increase timeout for specific test
it('slow test', async () => {
  // test code
}, 30000); // 30 seconds

// Or in vitest.config.ts
export default defineConfig({
  test: {
    testTimeout: 20000
  }
});
Flaky Tests
Symptoms: Tests pass sometimes, fail other times.
Common Causes:
Race conditions
Shared state between tests
Network/filesystem timing
Solutions:
// Use vi.waitFor to retry an assertion until it passes or times out
import { vi, expect, afterEach } from 'vitest';

await vi.waitFor(() => {
  expect(memory.getRecentContext()).toContain('Test task');
}, { timeout: 2000, interval: 50 });

// Add proper cleanup
afterEach(async () => {
  await agent.cleanup();
  vi.clearAllMocks();
});
Memory Leaks
Symptoms: Tests slow down or crash after many runs.
Solutions:
afterEach(async () => {
  // Clear event listeners
  eventBus.removeAllListeners();

  // Close connections (the hook must be async to await this)
  await memory.close();

  // Clear intervals
  clearInterval(heartbeatInterval);
});
Vitest Documentation: Official Vitest testing framework docs
Agent Architecture: Understand the agent structure for better tests
Skills System: Learn how skills work to write skill tests
Contributing Guide: Guidelines for contributing tests