API Overview¶
Complete reference for Nova AI modules, classes, and entry points.
Core Modules¶
Orchestrator (src/orchestrator/)¶
Multi-agent orchestration engine coordinating specialized agents and workflows.
ClaudeSDKExecutor¶
Location: src/orchestrator/claude_sdk_executor.py
Main executor for running agent tasks with the Claude SDK.
from src.orchestrator.claude_sdk_executor import ClaudeSDKExecutor
from pathlib import Path
# Create executor
executor = ClaudeSDKExecutor(
    project_root=Path.cwd(),
    agent_name="orchestrator",  # or "implementer", "code-reviewer", etc.
    use_sdk_mcp=True,  # Enable in-process MCP servers (10-100x faster)
    model="sonnet"  # "sonnet", "opus", or "haiku"
)
# Execute task
result = await executor.run_task("implement user authentication")
# Access results
print(result.transcript) # Full conversation
print(result.session_id) # For session continuation
print(result.usage_metrics) # Token usage and costs
Key Methods:
- run_task(prompt) - Execute a single task
- run_task_async(prompt) - Asynchronous execution
- run_task_with_result(prompt) - Returns an ExecutorResult object (see the sketch below)
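A minimal sketch of result handling, assuming run_task_with_result is awaitable like run_task above and returns the ExecutorResult described under Data Models:
# Illustrative usage; exact call signatures may differ.
result = await executor.run_task_with_result("implement user authentication")
if result.success:
    print(result.usage_metrics.total_tokens)
else:
    print(f"Task failed: {result.error}")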
SessionManager¶
Location: src/orchestrator/session_manager.py
Lightweight session state management with aggressive compression.
from src.orchestrator.session_manager import SessionManager
manager = SessionManager()
# Retrieve session state
state = manager.get_session_state(session_id)
# Compress state (max 10 decisions, 150K tokens)
manager.prune_session_if_needed(session_id)
# Fork for exploration
forked_id = manager.fork_session(session_id)
Features:
- Aggressive compression (max 10 decisions)
- Fork support for parallel exploration
- Automatic pruning at 150K tokens
- Pydantic-based serialization
CostTracker¶
Location: src/orchestrator/cost_tracker.py
Token usage and cost tracking with LangFuse integration.
from src.orchestrator.cost_tracker import CostTracker
tracker = CostTracker()
# Track usage
tracker.track_usage(
    model="claude-sonnet-4-5-20250929",
    input_tokens=1000,
    output_tokens=500,
    cache_read_tokens=5000
)
# Get summary
summary = tracker.get_cost_summary()
print(f"Total cost: ${summary['total_cost']}")
print(f"Cache hit rate: {summary['cache_hit_rate']}%")
# LangFuse tracing
tracker.enable_langfuse(
    public_key="your-key",
    secret_key="your-secret"
)
Knowledge Base (src/kb_store/)¶
FAISS/HNSW-based semantic search for documentation and specifications.
LocalKB¶
Location: src/kb_store/retriever.py
from src.kb_store.retriever import LocalKB
# Initialize
kb = LocalKB(index_dir="vector/local-kb")
# Search
results = kb.search_by_text(
    query="How to implement authentication?",
    top_k=5
)
for doc, score in results:
    print(f"Score: {score:.3f}")
    print(f"Content: {doc}")
DocumentIndexer¶
Location: src/kb_store/ingestor.py
from src.kb_store.ingestor import DocumentIndexer
indexer = DocumentIndexer(output_dir="vector/local-kb")
# Index documents
await indexer.index_directory(
    source_dir="docs/",
    file_patterns=["*.md", "*.txt"]
)
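A typical flow is to index a directory once and then query it with LocalKB; a minimal sketch, assuming both classes point at the same index directory (paths and query text are illustrative):
# Index once, then query the same index directory.
indexer = DocumentIndexer(output_dir="vector/local-kb")
await indexer.index_directory(source_dir="docs/", file_patterns=["*.md"])
kb = LocalKB(index_dir="vector/local-kb")
for doc, score in kb.search_by_text(query="coding standards", top_k=3):
    print(f"{score:.3f}  {doc}")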
MCP Servers¶
SDK MCP Servers (In-Process, 10-100x faster)¶
Location: src/orchestrator/tools/sdk_mcp_*.py
# KB Search
from src.orchestrator.tools.sdk_mcp_kb_server import search_knowledge_base
results = await search_knowledge_base(
    query="API standards",
    top_k=5
)
# GitHub Operations
from src.orchestrator.tools.sdk_mcp_github_server import create_pull_request
pr_url = await create_pull_request(
    title="Add feature X",
    body="Description",
    base="main"
)
# External Memory
from src.orchestrator.tools.sdk_mcp_memory_server import store_memory
await store_memory(
    content="Important context",
    category="decisions"
)
External MCP Servers (Stdio)¶
Location: mcp/servers/
There are no external stdio MCP servers to run. All MCP servers are SDK-based and run in-process, giving a 10-100x speedup over stdio transport:
- kb - src/orchestrator/tools/sdk_mcp_kb_server.py
- github - src/orchestrator/tools/sdk_mcp_github_server.py
- memory - src/orchestrator/tools/sdk_mcp_memory_server.py
Entry Points¶
Slash Commands¶
Location: .claude/commands/novaai.md
# Interactive mode with planning
/novaai implement user authentication
# The command expands to:
# - Phase 0: Clarifying questions
# - Phase 1: KB search for standards
# - Phase 2: Planning with architect
# - Phase 3: Implementation with implementer
# - Phase 4: Code review
# - Phase 5: Testing
# - Phase 6: Verification
Python Scripts¶
Location: scripts/
# Index knowledge base
python scripts/index_kb.py --source docs/ --output vector/local-kb
# Health check
python scripts/mcp_health_check.py
# Cost analysis
python scripts/analyze_costs.py --langfuse
Data Models¶
ExecutorResult¶
@dataclass
class ExecutorResult:
    transcript: str  # Full conversation
    session_id: str  # For continuation
    usage_metrics: UsageMetrics  # Token usage
    success: bool  # Execution status
    error: str | None  # Error message if failed
UsageMetrics¶
@dataclass
class UsageMetrics:
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_creation_tokens: int
    @property
    def total_tokens(self) -> int: ...
    @property
    def cache_hit_rate(self) -> float: ...
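A brief usage sketch, assuming the dataclass can be constructed directly and that cache_hit_rate is reported as a percentage (as in the cost summary above); the values are illustrative:
metrics = UsageMetrics(
    input_tokens=1000,
    output_tokens=500,
    cache_read_tokens=5000,
    cache_creation_tokens=0,
)
print(metrics.total_tokens)  # Combined token count (assumed semantics)
print(f"{metrics.cache_hit_rate:.1f}%")  # Share of input served from cache (assumed semantics)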
SessionState¶
@dataclass
class SessionState:
    session_id: str
    agent_name: str
    decisions: list[Decision]  # Max 10
    context_tokens: int  # Max 150K
    created_at: datetime
    last_activity: datetime
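As a usage sketch, continuing the SessionManager example above and assuming get_session_state returns a SessionState instance:
# Hypothetical inspection of a live session.
state = manager.get_session_state(session_id)
print(state.agent_name, state.context_tokens)
print(f"{len(state.decisions)} decisions recorded (capped at 10)")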
Configuration¶
Agent Configuration¶
Location: .claude/agents/*.md
---
name: implementer
description: Specialized code implementation agent
tools:
- Read
- Write
- Edit
- Grep
- Glob
- Bash
model: sonnet
---
MCP Configuration¶
SDK MCP - Configured in Python code (no .mcp.json needed):
from src.orchestrator.claude_sdk_executor import ClaudeSDKExecutor
executor = ClaudeSDKExecutor(
    agent_name="orchestrator",
    use_sdk_mcp=True  # Enables kb, github, memory servers
)
All MCP servers are SDK-based (in-process). No external configuration needed.
Error Handling¶
All APIs use standard Python exceptions:
try:
    result = await executor.run_task("task")
except ValueError as e:
    # Invalid input
    print(f"Validation error: {e}")
except RuntimeError as e:
    # Execution error
    print(f"Runtime error: {e}")
except Exception as e:
    # Unexpected error
    print(f"Unexpected error: {e}")
Performance¶
Prompt Caching¶
# Automatically enabled for agent instructions
# 90% cost reduction on cache hits
# No configuration needed - handled by SDK
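To confirm caching is working, one option is to inspect the usage metrics returned by the executor; a sketch assuming the ExecutorResult fields described under Data Models:
# Illustrative check against the 80%+ target from Best Practices.
result = await executor.run_task("review the auth module")
if result.usage_metrics.cache_hit_rate < 80:
    print("Cache hit rate below the 80% target - check prompt stability")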
Session Continuation¶
# Reuse sessions for 88-95% overhead reduction
result1 = await executor.run_task("Task 1")
# Continue session
executor2 = ClaudeSDKExecutor(
    agent_name="code-reviewer",
    session_id=result1.session_id  # Reuse!
)
result2 = await executor2.run_task("Task 2")
SDK MCP vs External MCP¶
# External MCP (stdio): ~150ms per call
# SDK MCP (in-process): ~2.15ms per call
# 69.6x faster!
# Use SDK MCP by default:
executor = ClaudeSDKExecutor(use_sdk_mcp=True)
Best Practices¶
- Always use session continuation for related tasks
- Enable SDK MCP for a 10-100x speedup
- Monitor cache hit rates (target: 80%+)
- Use the appropriate agent (don't use the orchestrator for simple tasks)
- Implement retry logic in production (a minimal sketch follows; see the architecture docs for details)
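A minimal retry sketch with exponential backoff, assuming transient failures surface as RuntimeError (as in Error Handling above); the helper is hypothetical, not part of the Nova AI API:
import asyncio
async def run_with_retry(executor, prompt, attempts=3):
    # Hypothetical helper: retry a task, backing off between attempts.
    for attempt in range(1, attempts + 1):
        try:
            return await executor.run_task(prompt)
        except RuntimeError:
            if attempt == attempts:
                raise
            await asyncio.sleep(2 ** attempt)  # Exponential backoff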
See Also¶
- Developer Guide - Issue → PR workflow
- Architecture Overview - System design
- Coding Style - Code conventions
- Testing Guide - Test standards
Last Updated: November 7, 2025
Version: 2.3.0