
Cost Tracking Guide

Learn how to monitor and optimize Claude API costs with LangFuse integration.

Overview

Nova AI includes production-grade cost tracking with:

  • Token usage tracking - Input, output, cached tokens
  • Cost calculation - Accurate pricing per model (Haiku, Sonnet, Opus)
  • Per-agent attribution - Track costs by agent and task
  • Time-series aggregation - Daily/weekly/monthly reports
  • LangFuse integration - Real-time observability dashboard
  • Cost optimization tips - Cut the cost of repeated context by up to 90% with prompt caching
graph LR
    A[Claude API Call] --> B[CostTracker]
    B --> C[Token Metrics]
    B --> D[Cost Calculation]
    B --> E[LangFuse Export]
    E --> F[Dashboard]

    style B fill:#3f51b5,color:#fff

Quick Start

1. Enable Cost Tracking

Cost tracking is enabled by default in Nova AI. Basic tracking needs no configuration.

2. View Cost Summary

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Get overall summary
summary = tracker.get_cost_summary()
print(f"Total cost: ${summary['total_cost_usd']:.4f}")
print(f"Total tokens: {summary['total_tokens']:,}")

3. Enable LangFuse (Optional)

For real-time dashboard and advanced analytics:

# Sign up at https://cloud.langfuse.com (free tier)

# Set environment variables
export LANGFUSE_PUBLIC_KEY="your-public-key"
export LANGFUSE_SECRET_KEY="your-secret-key"
export LANGFUSE_HOST="https://cloud.langfuse.com"

# Restart Nova AI

Cost Tracking Features

Token Usage Tracking

Track all token types:

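| Token Type  | Cost (Sonnet 4.5) | Description                       |
|-------------|-------------------|-----------------------------------|
| Input       | $3.00 / MTok      | New input tokens                  |
| Output      | $15.00 / MTok     | Generated tokens                  |
| Cache Write | $3.75 / MTok      | Writing to cache                  |
| Cache Read  | $0.30 / MTok      | Reading from cache (90% discount) |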

Example:

from src.orchestrator.cost_tracker import track_agent_call

async with track_agent_call(agent_name="implementer", task="Add authentication"):
    result = await executor.run_task("implement user authentication")

# Automatically tracked (example values):
# - Input tokens: 25,000
# - Output tokens: 5,000
# - Cache read tokens: 180,000 (billed at a 90% discount)
# - Total cost: ~$0.20 at Sonnet 4.5 rates
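
The internals of track_agent_call are not shown in this guide. As a rough sketch of the pattern, an async context manager can time the wrapped block and store a record when it exits. The names track_call, CallRecord, and RECORDS below are hypothetical stand-ins, not Nova AI's API.

```python
import asyncio
import time
from contextlib import asynccontextmanager
from dataclasses import dataclass, field


@dataclass
class CallRecord:
    agent_name: str
    task: str
    duration_s: float = 0.0
    usage: dict = field(default_factory=dict)


RECORDS: list[CallRecord] = []  # hypothetical in-memory store


@asynccontextmanager
async def track_call(agent_name: str, task: str):
    """Hypothetical stand-in for track_agent_call: times the block, stores a record."""
    record = CallRecord(agent_name=agent_name, task=task)
    start = time.perf_counter()
    try:
        yield record  # the caller can attach token usage to record.usage
    finally:
        record.duration_s = time.perf_counter() - start
        RECORDS.append(record)  # stored even if the block raises


async def demo() -> CallRecord:
    async with track_call("implementer", "Add authentication") as record:
        await asyncio.sleep(0)  # placeholder for the real agent call
        record.usage = {"input": 25_000, "output": 5_000, "cache_read": 180_000}
    return record


record = asyncio.run(demo())
print(record.agent_name, record.usage)
```

Storing the record in a finally clause means failed calls are still counted, which matters because a call that raises may already have consumed tokens.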

Per-Agent Cost Attribution

Track costs by agent:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Get costs by agent
agent_costs = tracker.get_costs_by_agent()

for agent, cost in agent_costs.items():
    print(f"{agent}: ${cost:.4f}")

Example Output:

orchestrator: $2.45
implementer: $8.32
code-reviewer: $1.89
tester: $0.54
Total: $13.20
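
A per-agent breakdown like this is a grouped sum over individual call records. The sketch below assumes a hypothetical record shape of (agent name, cost) pairs; the tracker's real storage format is not documented here.

```python
from collections import defaultdict

# Hypothetical per-call records: (agent name, cost in USD).
calls = [
    ("orchestrator", 1.20),
    ("implementer", 8.32),
    ("orchestrator", 1.25),
    ("code-reviewer", 1.89),
    ("tester", 0.54),
]


def costs_by_agent(records):
    """Sum per-call costs into one total per agent."""
    totals = defaultdict(float)
    for agent, cost in records:
        totals[agent] += cost
    return dict(totals)


agent_costs = costs_by_agent(calls)
# Print agents from most to least expensive, then the grand total.
for agent, cost in sorted(agent_costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{agent}: ${cost:.2f}")
print(f"Total: ${sum(agent_costs.values()):.2f}")
```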

Time-Series Reports

Track costs over time:

from datetime import datetime, timedelta

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Get daily costs for the last 30 days
end_date = datetime.now()
start_date = end_date - timedelta(days=30)

daily_costs = tracker.get_daily_costs(
    start_date=start_date,
    end_date=end_date
)

for date, cost in daily_costs.items():
    print(f"{date}: ${cost:.2f}")

Example Output:

2025-10-01: $15.23
2025-10-02: $22.45
2025-10-03: $18.90
...
Total (30 days): $547.82
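
Daily, weekly, and monthly reports are the same aggregation with a different grouping key. A minimal sketch, assuming hypothetical (timestamp, cost) records rather than the tracker's actual data model:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical call records: (timestamp, cost in USD).
calls = [
    (datetime(2025, 10, 1, 9, 15), 5.23),
    (datetime(2025, 10, 1, 16, 40), 10.00),
    (datetime(2025, 10, 2, 11, 5), 22.45),
]


def daily_costs(records):
    """Group call costs by calendar date (ISO string) and sum them."""
    totals = defaultdict(float)
    for timestamp, cost in records:
        totals[timestamp.date().isoformat()] += cost
    return dict(sorted(totals.items()))


def monthly_costs(records):
    """Same aggregation, keyed by YYYY-MM instead of the full date."""
    totals = defaultdict(float)
    for timestamp, cost in records:
        totals[timestamp.strftime("%Y-%m")] += cost
    return dict(sorted(totals.items()))


for day, cost in daily_costs(calls).items():
    print(f"{day}: ${cost:.2f}")
```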

Cost Summary Reports

Generate comprehensive reports:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Full summary
summary = tracker.get_cost_summary(
    start_date="2025-10-01",
    end_date="2025-10-31"
)

print(f"""
Cost Summary (October 2025)
{'='*40}
Total API calls: {summary['total_calls']:,}
Total tokens: {summary['total_tokens']:,}
Total cost: ${summary['total_cost_usd']:.2f}

Breakdown by model:
- Sonnet 4.5: ${summary['sonnet_cost']:.2f} ({summary['sonnet_percent']:.1f}%)
- Haiku 4.5: ${summary['haiku_cost']:.2f} ({summary['haiku_percent']:.1f}%)

Cache efficiency:
- Cache hit rate: {summary['cache_hit_rate']:.1f}%
- Cache savings: ${summary['cache_savings']:.2f} (90% discount)

Top agents:
1. implementer: ${summary['top_agents'][0]['cost']:.2f}
2. code-reviewer: ${summary['top_agents'][1]['cost']:.2f}
3. orchestrator: ${summary['top_agents'][2]['cost']:.2f}
""")
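
This guide does not define how cache_hit_rate and cache_savings are computed. One plausible definition, shown here as an assumption, treats the hit rate as the share of prompt tokens read from cache and the savings as the difference between the input rate and the cache-read rate for those tokens.

```python
INPUT_PRICE = 3.00       # USD per MTok, Sonnet 4.5 input
CACHE_READ_PRICE = 0.30  # USD per MTok, Sonnet 4.5 cache read


def cache_hit_rate(input_tokens, cache_write_tokens, cache_read_tokens):
    """Percent of prompt tokens served from cache (one possible definition)."""
    prompt_tokens = input_tokens + cache_write_tokens + cache_read_tokens
    if prompt_tokens == 0:
        return 0.0
    return 100.0 * cache_read_tokens / prompt_tokens


def cache_savings_usd(cache_read_tokens):
    """What the cached tokens would have cost as fresh input, minus what they did cost."""
    return cache_read_tokens / 1_000_000 * (INPUT_PRICE - CACHE_READ_PRICE)


print(f"Cache hit rate: {cache_hit_rate(25_000, 0, 180_000):.1f}%")
print(f"Cache savings: ${cache_savings_usd(180_000):.3f}")
```

For 25,000 input tokens and 180,000 cache-read tokens this gives a hit rate of about 87.8% and savings of $0.486. The savings figure ignores the premium paid for cache writes, so it is an upper bound.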

LangFuse Integration

Setup

  1. Sign up at cloud.langfuse.com (free tier: 50K events/month)

  2. Get API keys from Settings → API Keys

  3. Set environment variables:

export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"

  4. Verify connection:

from src.orchestrator.cost_tracker import verify_langfuse_connection

if verify_langfuse_connection():
    print("✅ LangFuse connected")
else:
    print("❌ LangFuse connection failed")

Dashboard Features

LangFuse provides:

  • Real-time cost tracking - Live cost updates
  • Token usage graphs - Visualize token usage over time
  • Agent performance - Compare agent efficiency
  • Trace inspection - Detailed call traces
  • Cost alerts - Email alerts for budget thresholds

Trace Inspection

LangFuse captures full execution traces:

Task: Implement user authentication
├─ orchestrator.run_task (150ms, $0.02)
│  ├─ KB search: "authentication patterns" (8ms, $0.001)
│  └─ Create plan (142ms, $0.019)
├─ implementer.run_task (4,500ms, $0.12)
│  ├─ Read: src/auth/routes.py (5ms)
│  ├─ Write: src/auth/service.py (50ms)
│  └─ Write: tests/auth/test_service.py (45ms)
└─ code-reviewer.run_task (2,100ms, $0.04)
   ├─ Security scan (1,200ms, $0.025)
   └─ Best practices check (900ms, $0.015)

Total: 6,750ms, $0.18
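
The totals in the trace above are the sums of the three top-level spans. The sketch below reproduces them and adds each span's share of the cost. It assumes the spans run sequentially, so durations add up.

```python
# Top-level spans from the trace above: (name, duration in ms, cost in USD).
spans = [
    ("orchestrator.run_task", 150, 0.02),
    ("implementer.run_task", 4_500, 0.12),
    ("code-reviewer.run_task", 2_100, 0.04),
]

total_ms = sum(ms for _, ms, _ in spans)
total_usd = sum(usd for _, _, usd in spans)
print(f"Total: {total_ms:,}ms, ${total_usd:.2f}")

# Share of total cost per span, largest first.
for name, _, usd in sorted(spans, key=lambda s: s[2], reverse=True):
    print(f"{name}: {100 * usd / total_usd:.0f}% of cost")
```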

Cost Optimization

1. Enable Prompt Caching (90% Savings)

Enabled by default in Nova AI:

# Automatic prompt caching
executor = ClaudeSDKExecutor(
    agent_name="orchestrator",
    use_sdk_mcp=True  # Enables caching
)

# Result: 90% cost reduction on repeated context

Example Savings:

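| Scenario             | Without Caching | With Caching | Savings |
|----------------------|-----------------|--------------|---------|
| KB search (repeated) | $0.10           | $0.01        | 90%     |
| Session continuation | $0.25           | $0.03        | 88%     |
| Multi-agent workflow | $0.50           | $0.08        | 84%     |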
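
Savings fall short of 90% in the last two rows because the first request pays the cache-write rate, which is higher than the normal input rate. A rough model using the Sonnet 4.5 rates from this guide, assuming the cache stays warm between requests:

```python
INPUT, CACHE_WRITE, CACHE_READ = 3.00, 3.75, 0.30  # USD per MTok, Sonnet 4.5


def context_cost(context_tokens, requests, cached):
    """Cost in USD of sending the same context `requests` times."""
    mtok = context_tokens / 1_000_000
    if not cached or requests == 0:
        return mtok * INPUT * requests
    # The first request writes the cache; every later request reads it.
    return mtok * (CACHE_WRITE + CACHE_READ * (requests - 1))


for n in (1, 2, 10, 100):
    plain = context_cost(100_000, n, cached=False)
    cached = context_cost(100_000, n, cached=True)
    saved = 100 * (1 - cached / plain)
    print(f"{n:>3} requests: ${plain:.2f} -> ${cached:.2f} ({saved:.1f}% saved)")
```

Under this model a context sent only once costs 25% more with caching, ten requests save roughly 78%, and the savings approach 90% only as the number of reuses grows.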

2. Use Session Continuation

Reuse sessions for related tasks:

# First task (no session)
result1 = await executor.run_task("implement feature A")
# Cost: $0.25

# Second task (reuse session)
executor2 = ClaudeSDKExecutor(
    agent_name="implementer",
    session_id=result1.session_id  # Reuse session
)
result2 = await executor2.run_task("implement feature B")
# Cost: $0.03 (88% savings)

3. Choose the Right Model

Match the model to the task's complexity:

| Task Complexity | Model      | Input Price  | When to Use                      |
|-----------------|------------|--------------|----------------------------------|
| Simple          | Haiku 4.5  | $0.80/MTok   | Code review, testing             |
| Standard        | Sonnet 4.5 | $3.00/MTok   | All agents (recommended)         |
| Complex         | Opus 4     | $15.00/MTok  | Architecture, critical decisions |

Nova AI uses Sonnet 4.5 for all agents (Anthropic's recommendation for agentic workflows).

4. Batch Operations

Combine multiple operations:

# BAD: Multiple separate calls
result1 = await executor.run_task("implement feature A")  # $0.25
result2 = await executor.run_task("implement feature B")  # $0.25
result3 = await executor.run_task("implement feature C")  # $0.25
# Total: $0.75

# GOOD: Single batch call
result = await executor.run_task(
    "implement features A, B, and C"
)
# Total: $0.30 (60% savings)

5. Use SDK MCP (10-100x Faster)

SDK MCP reduces token usage:

# stdio MCP (slower, more tokens)
executor = ClaudeSDKExecutor(use_sdk_mcp=False)
result = await executor.run_task("search KB")
# Tokens: 5,000, Cost: $0.02

# SDK MCP (faster, fewer tokens)
executor = ClaudeSDKExecutor(use_sdk_mcp=True)
result = await executor.run_task("search KB")
# Tokens: 500, Cost: $0.002 (90% savings)

Cost Monitoring

Set Budget Alerts

from src.orchestrator.cost_tracker import set_budget_alert

# Alert when daily cost exceeds $10
set_budget_alert(
    threshold_usd=10.0,
    period="daily",
    email="admin@example.com"
)
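
The logic behind a daily alert is a threshold check on the day's total. A minimal sketch, with hypothetical records and a notify callback standing in for email delivery:

```python
from datetime import date, datetime

# Hypothetical call records: (timestamp, cost in USD).
calls = [
    (datetime(2025, 10, 3, 9, 0), 4.10),
    (datetime(2025, 10, 3, 13, 30), 6.25),
    (datetime(2025, 10, 2, 18, 0), 9.00),
]


def cost_on(records, day: date) -> float:
    """Total cost of all calls made on the given calendar day."""
    return sum(cost for ts, cost in records if ts.date() == day)


def check_daily_budget(records, day: date, threshold_usd: float, notify=print) -> bool:
    """Call `notify` and return True when the day's spend exceeds the threshold."""
    spent = cost_on(records, day)
    if spent > threshold_usd:
        notify(f"Budget alert: ${spent:.2f} spent on {day}, limit ${threshold_usd:.2f}")
        return True
    return False


check_daily_budget(calls, date(2025, 10, 3), threshold_usd=10.0)
```

Passing the notifier as a parameter keeps the check testable and lets the same function drive email, chat, or log output.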

Real-Time Monitoring

from src.orchestrator.cost_tracker import get_tracker, track_agent_call

tracker = get_tracker()

# Monitor current task
async with track_agent_call(agent_name="implementer", task="Feature X"):
    result = await executor.run_task("implement feature X")

# Check cost after task
task_cost = tracker.get_last_task_cost()
print(f"Task cost: ${task_cost:.4f}")

# Check if over budget
if tracker.is_over_budget(daily_limit=10.0):
    print("⚠️ Daily budget exceeded!")

Export Reports

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Export to CSV
tracker.export_to_csv(
    filepath="cost_report_october.csv",
    start_date="2025-10-01",
    end_date="2025-10-31"
)

# Export to JSON
tracker.export_to_json(
    filepath="cost_report_october.json",
    start_date="2025-10-01",
    end_date="2025-10-31"
)
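
The export schema is not documented in this guide. As an illustration of a date-filtered CSV export, the sketch below uses Python's csv module with hypothetical column names and records:

```python
import csv
import io

# Hypothetical records; the real export schema may differ.
records = [
    {"date": "2025-10-01", "agent": "implementer", "model": "sonnet-4.5", "cost_usd": 8.32},
    {"date": "2025-10-01", "agent": "tester", "model": "sonnet-4.5", "cost_usd": 0.54},
    {"date": "2025-11-02", "agent": "tester", "model": "sonnet-4.5", "cost_usd": 0.10},
]


def export_csv(rows, out, start_date, end_date):
    """Write rows whose ISO date falls in [start_date, end_date] to `out` as CSV."""
    writer = csv.DictWriter(out, fieldnames=["date", "agent", "model", "cost_usd"])
    writer.writeheader()
    written = 0
    for row in rows:
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if start_date <= row["date"] <= end_date:
            writer.writerow(row)
            written += 1
    return written


buffer = io.StringIO()
count = export_csv(records, buffer, "2025-10-01", "2025-10-31")
print(buffer.getvalue())
```

Writing to a file works the same way: pass a handle from open(path, "w", newline="") in place of the StringIO buffer.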

Pricing Reference

Claude 4 Pricing (October 2025)

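| Model      | Input  | Output | Cache Write | Cache Read |
|------------|--------|--------|-------------|------------|
| Haiku 4.5  | $0.80  | $4.00  | $1.00       | $0.08      |
| Sonnet 4.5 | $3.00  | $15.00 | $3.75       | $0.30      |
| Opus 4     | $15.00 | $75.00 | $18.75      | $1.50      |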

All prices in USD per million tokens (MTok).
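
The table translates directly into a lookup plus one multiplication per token type. In the sketch below, the PRICES dictionary and its keys are illustrative and may not match the structure of the tracker's own PRICING constant.

```python
# USD per million tokens, copied from the table above.
PRICES = {
    "haiku-4.5":  {"input": 0.80,  "output": 4.00,  "cache_write": 1.00,  "cache_read": 0.08},
    "sonnet-4.5": {"input": 3.00,  "output": 15.00, "cache_write": 3.75,  "cache_read": 0.30},
    "opus-4":     {"input": 15.00, "output": 75.00, "cache_write": 18.75, "cache_read": 1.50},
}


def call_cost(model, **tokens):
    """Cost in USD for one call; `tokens` maps token type to token count."""
    rates = PRICES[model]
    return sum(count / 1_000_000 * rates[kind] for kind, count in tokens.items())


# Same workload priced on each model.
usage = {"input": 25_000, "output": 5_000, "cache_read": 180_000}
for model in PRICES:
    print(f"{model}: ${call_cost(model, **usage):.4f}")
```

An unknown model name or token type raises KeyError, which is usually preferable to silently pricing a call at zero.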

Example Cost Calculations

Feature Implementation (Sonnet 4.5):

Input tokens: 25,000 (25K)
Output tokens: 5,000 (5K)
Cache read tokens: 180,000 (180K)

Cost calculation:
- Input: (25K / 1M) × $3.00 = $0.075
- Output: (5K / 1M) × $15.00 = $0.075
- Cache read: (180K / 1M) × $0.30 = $0.054

Total: $0.204 (~$0.20)

Code Review (Sonnet 4.5):

Input tokens: 20,000 (20K)
Output tokens: 2,000 (2K)
Cache read tokens: 150,000 (150K)

Cost calculation:
- Input: (20K / 1M) × $3.00 = $0.060
- Output: (2K / 1M) × $15.00 = $0.030
- Cache read: (150K / 1M) × $0.30 = $0.045

Total: $0.135 (~$0.14)

Best Practices

1. Enable Caching

Always enable prompt caching (default in Nova AI):

 ✅ executor = ClaudeSDKExecutor(use_sdk_mcp=True)
 ❌ executor = ClaudeSDKExecutor(use_sdk_mcp=False)

2. Reuse Sessions

Reuse sessions for related tasks:

 ✅ ClaudeSDKExecutor(session_id=previous_session_id)
 ❌ New executor for each task

3. Batch Operations

Combine multiple operations:

 ✅ "implement features A, B, and C"
 ❌ Three separate calls

4. Monitor Regularly

Check costs daily:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()
daily_cost = tracker.get_today_cost()
print(f"Today: ${daily_cost:.2f}")

5. Set Budgets

Set daily/monthly budgets:

set_budget_alert(threshold_usd=10.0, period="daily")

Troubleshooting

LangFuse Connection Failed

Error: Failed to connect to LangFuse

Solution: Verify API keys:

echo $LANGFUSE_PUBLIC_KEY
echo $LANGFUSE_SECRET_KEY
echo $LANGFUSE_HOST
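
Echoing the secret key prints it to the terminal and may leave it in scrollback or logs. As an alternative sketch, the check below reports only which variables are missing. The helper name missing_langfuse_vars is hypothetical.

```python
import os

REQUIRED = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")


def missing_langfuse_vars(env=os.environ):
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


missing = missing_langfuse_vars()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All LangFuse variables are set")
```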

Cost Tracking Not Working

Solution: Verify cost tracker is initialized:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()
if tracker.is_enabled():
    print("✅ Cost tracking enabled")
else:
    print("❌ Cost tracking disabled")

Incorrect Cost Calculations

Solution: Verify that the pricing table matches Anthropic's current published rates:

from src.orchestrator.cost_tracker import PRICING

print(PRICING)

Next Steps