API Overview¶
Complete reference for Nova AI modules, classes, and entry points.
Core Modules¶
Orchestrator (src/orchestrator/)¶
Multi-agent orchestration engine coordinating specialized agents and workflows.
ClaudeSDKExecutor¶
Location: src/orchestrator/claude_sdk_executor.py
Main executor for running agent tasks with the Claude SDK.
from src.orchestrator.claude_sdk_executor import ClaudeSDKExecutor
from pathlib import Path
# Create executor
executor = ClaudeSDKExecutor(
    project_root=Path.cwd(),
    agent_name="orchestrator",  # or "implementer", "code-reviewer", etc.
    use_sdk_mcp=True,  # Enable in-process MCP servers (10-100x faster)
    model="sonnet"  # "sonnet", "opus", or "haiku"
)
# Execute task
result = await executor.run_task("implement user authentication")
# Access results
print(result.transcript) # Full conversation
print(result.session_id) # For session continuation
print(result.usage_metrics) # Token usage and costs
Key Methods:
- run_task(prompt) - Execute a single task
- run_task_async(prompt) - Asynchronous execution
- run_task_with_result(prompt) - Returns an ExecutorResult object (see the sketch below)
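A minimal sketch of result handling, assuming run_task_with_result is awaitable like run_task above and returns the ExecutorResult described under Data Models:
# Illustrative usage; exact call signatures may differ.
result = await executor.run_task_with_result("implement user authentication")
if result.success:
    print(result.usage_metrics.total_tokens)
else:
    print(f"Task failed: {result.error}")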
SessionManager¶
Location: src/orchestrator/session_manager.py
Lightweight session state management with aggressive compression.
from src.orchestrator.session_manager import SessionManager
manager = SessionManager()
# Retrieve session state
state = manager.get_session_state(session_id)
# Compress state (max 10 decisions, 150K tokens)
manager.prune_session_if_needed(session_id)
# Fork for exploration
forked_id = manager.fork_session(session_id)
Features:
- Aggressive compression (max 10 decisions)
- Fork support for parallel exploration
- Automatic pruning at 150K tokens
- Pydantic-based serialization
CostTracker¶
Location: src/orchestrator/cost_tracker.py
Token usage and cost tracking with LangFuse integration.
from src.orchestrator.cost_tracker import CostTracker
tracker = CostTracker()
# Track usage
tracker.track_usage(
    model="claude-sonnet-4-5-20250929",
    input_tokens=1000,
    output_tokens=500,
    cache_read_tokens=5000
)
# Get summary
summary = tracker.get_cost_summary()
print(f"Total cost: ${summary['total_cost']}")
print(f"Cache hit rate: {summary['cache_hit_rate']}%")
# LangFuse tracing
tracker.enable_langfuse(
    public_key="your-key",
    secret_key="your-secret"
)
Knowledge Base (src/kb_store/)¶
FAISS/HNSW-based semantic search for documentation and specifications.
LocalKB¶
Location: src/kb_store/retriever.py
from src.kb_store.retriever import LocalKB
# Initialize
kb = LocalKB(index_dir="vector/local-kb")
# Search
results = kb.search_by_text(
    query="How to implement authentication?",
    top_k=5
)
for doc, score in results:
    print(f"Score: {score:.3f}")
    print(f"Content: {doc}")
DocumentIndexer¶
Location: src/kb_store/ingestor.py
from src.kb_store.ingestor import DocumentIndexer
indexer = DocumentIndexer(output_dir="vector/local-kb")
# Index documents
await indexer.index_directory(
    source_dir="docs/",
    file_patterns=["*.md", "*.txt"]
)
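A typical flow is to index a directory once and then query it with LocalKB; a minimal sketch, assuming both classes point at the same index directory (paths and query text are illustrative):
# Index once, then query the same index directory.
indexer = DocumentIndexer(output_dir="vector/local-kb")
await indexer.index_directory(source_dir="docs/", file_patterns=["*.md"])
kb = LocalKB(index_dir="vector/local-kb")
for doc, score in kb.search_by_text(query="coding standards", top_k=3):
    print(f"{score:.3f}  {doc}")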
MCP Servers¶
SDK MCP Servers (In-Process, 10-100x faster)¶
Location: src/orchestrator/tools/sdk_mcp_*.py
# KB Search
from src.orchestrator.tools.sdk_mcp_kb_server import search_knowledge_base
results = await search_knowledge_base(
    query="API standards",
    top_k=5
)
# GitHub Operations
from src.orchestrator.tools.sdk_mcp_github_server import create_pull_request
pr_url = await create_pull_request(
    title="Add feature X",
    body="Description",
    base="main"
)
# External Memory
from src.orchestrator.tools.sdk_mcp_memory_server import store_memory
await store_memory(
    content="Important context",
    category="decisions"
)
External MCP Servers (Stdio)¶
Location: mcp/servers/
There are no external stdio MCP servers to run. All MCP servers are SDK-based and run in-process, giving a 10-100x speedup over stdio transport:
- kb - src/orchestrator/tools/sdk_mcp_kb_server.py
- github - src/orchestrator/tools/sdk_mcp_github_server.py
- memory - src/orchestrator/tools/sdk_mcp_memory_server.py
Entry Points¶
Slash Commands¶
Location: .claude/commands/novaai.md
# Interactive mode with planning
/novaai implement user authentication
# The command expands to:
# - Phase 0: Clarifying questions
# - Phase 1: KB search for standards
# - Phase 2: Planning with architect
# - Phase 3: Implementation with implementer
# - Phase 4: Code review
# - Phase 5: Testing
# - Phase 6: Verification
Python Scripts¶
Location: scripts/
# Index knowledge base
python scripts/index_kb.py --source docs/ --output vector/local-kb
# Health check
python scripts/mcp_health_check.py
# Cost analysis
python scripts/analyze_costs.py --langfuse
Data Models¶
ExecutorResult¶
@dataclass
class ExecutorResult:
    transcript: str  # Full conversation
    session_id: str  # For continuation
    usage_metrics: UsageMetrics  # Token usage
    success: bool  # Execution status
    error: str | None  # Error message if failed
UsageMetrics¶
@dataclass
class UsageMetrics:
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_creation_tokens: int
    @property
    def total_tokens(self) -> int: ...
    @property
    def cache_hit_rate(self) -> float: ...
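A brief usage sketch, assuming the dataclass can be constructed directly and that cache_hit_rate is reported as a percentage (as in the cost summary above); the values are illustrative:
metrics = UsageMetrics(
    input_tokens=1000,
    output_tokens=500,
    cache_read_tokens=5000,
    cache_creation_tokens=0,
)
print(metrics.total_tokens)  # Combined token count (assumed semantics)
print(f"{metrics.cache_hit_rate:.1f}%")  # Share of input served from cache (assumed semantics)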
SessionState¶
@dataclass
class SessionState:
    session_id: str
    agent_name: str
    decisions: list[Decision]  # Max 10
    context_tokens: int  # Max 150K
    created_at: datetime
    last_activity: datetime
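As a usage sketch, continuing the SessionManager example above and assuming get_session_state returns a SessionState instance:
# Hypothetical inspection of a live session.
state = manager.get_session_state(session_id)
print(state.agent_name, state.context_tokens)
print(f"{len(state.decisions)} decisions recorded (capped at 10)")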
Configuration¶
Agent Configuration¶
Location: .claude/agents/*.md
---
name: implementer
description: Specialized code implementation agent
tools:
- Read
- Write
- Edit
- Grep
- Glob
- Bash
model: sonnet
---
MCP Configuration¶
SDK MCP - Configured in Python code (no .mcp.json needed):
from src.orchestrator.claude_sdk_executor import ClaudeSDKExecutor
executor = ClaudeSDKExecutor(
    agent_name="orchestrator",
    use_sdk_mcp=True  # Enables kb, github, memory servers
)
All MCP servers are SDK-based (in-process). No external configuration needed.
Error Handling¶
All APIs use standard Python exceptions:
try:
    result = await executor.run_task("task")
except ValueError as e:
    # Invalid input
    print(f"Validation error: {e}")
except RuntimeError as e:
    # Execution error
    print(f"Runtime error: {e}")
except Exception as e:
    # Unexpected error
    print(f"Unexpected error: {e}")
Performance¶
Prompt Caching¶
# Automatically enabled for agent instructions
# 90% cost reduction on cache hits
# No configuration needed - handled by SDK
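To confirm caching is working, one option is to inspect the usage metrics returned by the executor; a sketch assuming the ExecutorResult fields described under Data Models:
# Illustrative check against the 80%+ target from Best Practices.
result = await executor.run_task("review the auth module")
if result.usage_metrics.cache_hit_rate < 80:
    print("Cache hit rate below the 80% target - check prompt stability")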
Session Continuation¶
# Reuse sessions for 88-95% overhead reduction
result1 = await executor.run_task("Task 1")
# Continue session
executor2 = ClaudeSDKExecutor(
    agent_name="code-reviewer",
    session_id=result1.session_id  # Reuse!
)
result2 = await executor2.run_task("Task 2")
SDK MCP vs External MCP¶
# External MCP (stdio): ~150ms per call
# SDK MCP (in-process): ~2.15ms per call
# 69.6x faster!
# Use SDK MCP by default:
executor = ClaudeSDKExecutor(use_sdk_mcp=True)
Best Practices¶
- Always use session continuation for related tasks
- Enable SDK MCP for a 10-100x speedup
- Monitor cache hit rates (target: 80%+)
- Use the appropriate agent (don't use the orchestrator for simple tasks)
- Implement retry logic in production (a minimal sketch follows; see the architecture docs for details)
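A minimal retry sketch with exponential backoff, assuming transient failures surface as RuntimeError (as in Error Handling above); the helper is hypothetical, not part of the Nova AI API:
import asyncio
async def run_with_retry(executor, prompt, attempts=3):
    # Hypothetical helper: retry a task, backing off between attempts.
    for attempt in range(1, attempts + 1):
        try:
            return await executor.run_task(prompt)
        except RuntimeError:
            if attempt == attempts:
                raise
            await asyncio.sleep(2 ** attempt)  # Exponential backoff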
See Also¶
- Developer Guide - Issue → PR workflow
- Architecture Overview - System design
- Coding Style - Code conventions
- Testing Guide - Test standards
Last Updated: November 7, 2025
Version: 2.3.0