Architecture Overview¶
Nova AI's multi-agent orchestration system design and component overview.
Table of Contents¶
- System Architecture
- Core Components
- Data Flow
- Agent System
- MCP Integration
- Performance Optimizations
System Architecture¶
User → Slash Command → Claude SDK Executor → Agent Layer → MCP Tools → Services
(/novaai) ↓
Session Management
↓
Cost Tracking
↓
Approval Gates (Security)
Architecture Layers¶
1. Interface Layer
- Claude Code CLI with /novaai and /novaai-execute commands
- Python SDK for programmatic access
- RESTful API (future)
2. Orchestration Layer
- ClaudeSDKExecutor - Main orchestration engine
- SessionManager - State persistence and context compression
- CostTracker - Token usage and cost monitoring
- ApprovalGate - Security approval workflow
3. Agent Layer (6 specialized agents)
- orchestrator - Task decomposition and coordination
- architect - System design and technical decisions
- implementer - Code implementation
- code-reviewer - Security and quality validation
- tester - Test generation and execution
- debugger - Error analysis and fixing
4. Tool Layer - SDK MCP - In-process tools (10-100x faster) - Knowledge Base (FAISS/HNSW) - GitHub operations - External MCP - Stdio servers - Browser automation - Voice transcription/TTS - AWS/Docker operations
5. Service Layer - Anthropic API (Claude Messages) - GitHub API - LangFuse (observability) - Vector databases
Core Components¶
ClaudeSDKExecutor¶
Main orchestration engine wrapping Claude Agent SDK.
Responsibilities: - Task execution with retry logic - Agent lifecycle management - Session persistence - Tool coordination - Cost tracking
Key Features: - API retry with exponential backoff - Circuit breaker pattern (5 failures → open) - Session compression (88-95% overhead reduction) - Prompt caching (90% cost savings)
executor = ClaudeSDKExecutor(
project_root=Path.cwd(),
agent_name="orchestrator",
use_sdk_mcp=True, # Enable SDK MCP
model="sonnet" # Sonnet 4.5
)
result = await executor.run_task("implement feature")
File: src/orchestrator/claude_sdk_executor.py
SessionManager¶
Manages conversation state and context compression.
Features: - Lightweight state (max 150K tokens) - Resume conversations across executions - Compress decision history (10 decisions → summary) - Session cleanup (auto-expire after 24h)
File: src/orchestrator/session_manager.py
CostTracker¶
Monitors token usage and API costs.
Metrics Tracked: - Input/output tokens - Cache hits/misses - Total cost (USD) - Duration (ms)
File: src/orchestrator/cost_tracker.py
Data Flow¶
Task Execution Flow¶
1. User Request → /novaai "implement feature X"
↓
2. Command Parser → Extract task description
↓
3. ClaudeSDKExecutor → Initialize session (or resume)
↓
4. Orchestrator Agent → Decompose task into subtasks
↓
5. Specialist Agents → Execute subtasks in parallel
↓
6. Code-Reviewer Agent → Validate output (mandatory)
↓
7. Results Aggregation → Combine agent outputs
↓
8. Session Save → Persist state for resumption
↓
9. Response → Return to user
Knowledge Base Query Flow¶
User Query → KB Server (SDK MCP)
↓
Vector Search (FAISS/HNSW)
↓
Top-K Results (similarity ranked)
↓
Context Enhancement
↓
Return to Agent
Performance: 69.6x faster than external MCP (4ms vs 278ms)
Agent System¶
Agent Specification¶
Each agent defined in .claude/agents/<agent-name>.md:
---
name: code-reviewer
description: Security and quality code review
tools:
- Read
- Grep
- Glob
- Bash
model: sonnet
---
# Agent Instructions
[Markdown instructions for agent behavior]
Agent Coordination¶
Orchestrator Agent coordinates specialist agents:
- Task Analysis - Understand user request
- Decomposition - Break into subtasks
- Assignment - Route subtasks to specialists
- Aggregation - Combine results
- Validation - Code review before returning
Dev→Review→Refine Loop:
Agent Tools¶
Standard Tools:
- Read - File reading
- Write - File writing
- Edit - In-place editing
- Grep - Content search
- Glob - File pattern matching
- Bash - Command execution
MCP Tools (via SDK):
- search_knowledge_base - Vector search
- github_search_code - GitHub search
- github_get_pr - PR retrieval
MCP Integration¶
SDK MCP (In-Process)¶
Advantages: - 10-100x faster (no stdio overhead) - Direct Python function calls - Shared memory space - No serialization overhead
Implementation:
# Register SDK MCP server
kb_sdk_server = build_kb_sdk_server()
mcp_servers = [kb_sdk_server]
client = ClaudeAgent(
mcp_servers=mcp_servers, # In-process
system_prompt="..."
)
Files:
- src/orchestrator/tools/sdk_mcp_kb_server.py
- src/orchestrator/tools/sdk_mcp_github_server.py
External MCP (Stdio)¶
Status: Not used - Nova AI uses SDK MCP only for maximum performance (10-100x speedup).
All MCP servers are in-process (SDK-based). No stdio/external MCP servers are configured.
Performance Optimizations¶
1. Prompt Caching (90% Cost Savings)¶
system_prompt = {
"type": "preset",
"preset": "claude_code",
"cache_control": {"type": "ephemeral"} # Cache system prompt
}
Results: - First request: 50K tokens @ $3/M = $0.15 - Cached request: 5K tokens @ $0.30/M = $0.002 - 98.7% cost reduction on cache hits
2. SDK MCP (10-100x Faster)¶
| Operation | External MCP | SDK MCP | Speedup |
|---|---|---|---|
| KB Search | 278ms | 4ms | 69.6x |
| GitHub API | 150ms | 15ms | 10x |
3. Session Compression (88-95% Overhead)¶
Before: - Full conversation history (150K tokens) - Every message added to history - Context window fills quickly
After: - Lightweight state (decision summaries) - Compress every 10 decisions - Resume with minimal context
Savings: - Overhead: 50K → 5K tokens (90% reduction) - Cost per resume: $0.15 → $0.015
4. Parallel Agent Execution¶
# Execute agents in parallel
tasks = [
ParallelTask("code-reviewer", "review src/"),
ParallelTask("tester", "run tests"),
ParallelTask("architect", "assess design")
]
results = await parallel_executor.execute(tasks)
Performance: - Sequential: 3 × 60s = 180s - Parallel: max(60s) = 60s - 3x faster
Related Documentation¶
- Security Architecture - 7 defense layers
- API Reference - Python SDK
- Developer Guide - Contributing workflow
- Architecture Decision Records - Design decisions
Last Updated: November 2025 Version: 2.3.0