Cost Tracking Guide¶

Learn how to monitor and optimize Claude API costs with LangFuse integration.

Overview¶

Nova AI includes production-grade cost tracking with:

Token usage tracking - Input, output, cached tokens
Cost calculation - Accurate pricing per model (Haiku, Sonnet, Opus)
Per-agent attribution - Track costs by agent and task
Time-series aggregation - Daily/weekly/monthly reports
LangFuse integration - Real-time observability dashboard
Cost optimization tips - Reduce costs by up to 90% with prompt caching

graph LR
    A[Claude API Call] --> B[CostTracker]
    B --> C[Token Metrics]
    B --> D[Cost Calculation]
    B --> E[LangFuse Export]
    E --> F[Dashboard]

    style B fill:#3f51b5,color:#fff

Quick Start¶

1. Enable Cost Tracking¶

Cost tracking is enabled automatically in Nova AI; basic tracking needs no configuration.

2. View Cost Summary¶

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Get overall summary
summary = tracker.get_cost_summary()
print(f"Total cost: ${summary['total_cost_usd']:.4f}")
print(f"Total tokens: {summary['total_tokens']:,}")

3. Enable LangFuse (Optional)¶

For real-time dashboard and advanced analytics:

# Sign up at https://cloud.langfuse.com (free tier)

# Set environment variables
export LANGFUSE_PUBLIC_KEY="your-public-key"
export LANGFUSE_SECRET_KEY="your-secret-key"
export LANGFUSE_HOST="https://cloud.langfuse.com"

# Restart Nova AI

Cost Tracking Features¶

Token Usage Tracking¶

Track all token types:

Token Type	Cost (Sonnet 4.5)	Description
Input	$3.00 / MTok	New input tokens
Output	$15.00 / MTok	Generated tokens
Cache Write	$3.75 / MTok	Writing to cache
Cache Read	$0.30 / MTok	Reading from cache (90% discount)

Example:

from src.orchestrator.cost_tracker import track_agent_call

async with track_agent_call(agent_name="implementer", task="Add authentication"):
    result = await executor.run_task("implement user authentication")

# Automatically tracked:
# - Input tokens: 25,000
# - Output tokens: 5,000
# - Cache read tokens: 180,000 (billed at a 90% discount)
# - Total cost: $0.20

Per-Agent Cost Attribution¶

Track costs by agent:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Get costs by agent
agent_costs = tracker.get_costs_by_agent()

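for agent, cost in agent_costs.items():
    print(f"{agent}: ${cost:.2f}")

# Sum the per-agent figures for the overall total
print(f"Total: ${sum(agent_costs.values()):.2f}")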

Example Output:

orchestrator: $2.45
implementer: $8.32
code-reviewer: $1.89
tester: $0.54
Total: $13.20
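
Per-agent attribution amounts to grouping per-call cost records by agent name. The sketch below is illustrative only: `CallRecord` and `costs_by_agent` are hypothetical names, not part of Nova AI's `CostTracker`, and the sample records are invented so that the totals match the example output above.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CallRecord:
    """One tracked API call (hypothetical structure, for illustration only)."""
    agent: str
    cost_usd: Decimal


def costs_by_agent(records):
    """Sum call costs per agent name."""
    totals = defaultdict(Decimal)  # Decimal() starts each agent at 0
    for record in records:
        totals[record.agent] += record.cost_usd
    return dict(totals)


# Invented sample data: two implementer calls, one call for each other agent.
records = [
    CallRecord("orchestrator", Decimal("2.45")),
    CallRecord("implementer", Decimal("5.00")),
    CallRecord("implementer", Decimal("3.32")),
    CallRecord("code-reviewer", Decimal("1.89")),
    CallRecord("tester", Decimal("0.54")),
]

agent_costs = costs_by_agent(records)
for agent, cost in agent_costs.items():
    print(f"{agent}: ${cost:.2f}")
print(f"Total: ${sum(agent_costs.values()):.2f}")
```

`Decimal` is used here so that cent-level sums stay exact; a tracker that stores floats would need rounding at display time instead.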

Time-Series Reports¶

Track costs over time:

from datetime import datetime, timedelta

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Get daily costs for last 30 days
start_date = datetime.now() - timedelta(days=30)
end_date = datetime.now()

daily_costs = tracker.get_daily_costs(
    start_date=start_date,
    end_date=end_date
)

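for date, cost in daily_costs.items():
    print(f"{date}: ${cost:.2f}")

# Sum the daily figures for the 30-day total
print(f"Total (30 days): ${sum(daily_costs.values()):.2f}")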

Example Output:

2025-10-01: $15.23
2025-10-02: $22.45
2025-10-03: $18.90
...
Total (30 days): $547.82
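
Daily, weekly, and monthly reports differ only in the key used to bucket each record. The following is a minimal sketch of that idea, assuming records arrive as `(date, cost)` pairs; `aggregate_costs` and its `period` values are hypothetical and do not describe how `get_daily_costs` is implemented.

```python
from collections import defaultdict
from datetime import date


def aggregate_costs(records, period="daily"):
    """Group (date, cost_usd) pairs into daily, weekly, or monthly totals."""
    totals = defaultdict(float)
    for day, cost in records:
        if period == "daily":
            key = day.isoformat()
        elif period == "weekly":
            iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
            key = f"{iso[0]}-W{iso[1]:02d}"
        elif period == "monthly":
            key = f"{day.year}-{day.month:02d}"
        else:
            raise ValueError(f"unknown period: {period}")
        totals[key] += cost
    return dict(totals)


# Invented sample data; the first day has two calls.
records = [
    (date(2025, 10, 1), 10.00),
    (date(2025, 10, 1), 5.23),
    (date(2025, 10, 2), 22.45),
    (date(2025, 10, 3), 18.90),
]

daily = aggregate_costs(records, "daily")
weekly = aggregate_costs(records, "weekly")
monthly = aggregate_costs(records, "monthly")

for day, cost in daily.items():
    print(f"{day}: ${cost:.2f}")
```

Weekly buckets use ISO week numbers, so a week that spans a year boundary is keyed by its ISO year rather than the calendar year.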

Cost Summary Reports¶

Generate comprehensive reports:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Full summary
summary = tracker.get_cost_summary(
    start_date="2025-10-01",
    end_date="2025-10-31"
)

print(f"""
Cost Summary (October 2025)
{'='*40}
Total API calls: {summary['total_calls']:,}
Total tokens: {summary['total_tokens']:,}
Total cost: ${summary['total_cost_usd']:.2f}

Breakdown by model:
- Sonnet 4.5: ${summary['sonnet_cost']:.2f} ({summary['sonnet_percent']:.1f}%)
- Haiku 4.5: ${summary['haiku_cost']:.2f} ({summary['haiku_percent']:.1f}%)

Cache efficiency:
- Cache hit rate: {summary['cache_hit_rate']:.1f}%
- Cache savings: ${summary['cache_savings']:.2f} (90% discount)

Top agents:
1. implementer: ${summary['top_agents'][0]['cost']:.2f}
2. code-reviewer: ${summary['top_agents'][1]['cost']:.2f}
3. orchestrator: ${summary['top_agents'][2]['cost']:.2f}
""")

LangFuse Integration¶

Setup¶

Sign up at cloud.langfuse.com (free tier: 50K events/month)
Get API keys from Settings → API Keys
Set environment variables:

export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"

Verify connection:

from src.orchestrator.cost_tracker import verify_langfuse_connection

if verify_langfuse_connection():
    print("✅ LangFuse connected")
else:
    print("❌ LangFuse connection failed")

Dashboard Features¶

LangFuse provides:

Real-time cost tracking - Live cost updates
Token usage graphs - Visualize token usage over time
Agent performance - Compare agent efficiency
Trace inspection - Detailed call traces
Cost alerts - Email alerts for budget thresholds

Trace Inspection¶

LangFuse captures full execution traces:

Task: Implement user authentication
│
├─ orchestrator.run_task (150ms, $0.02)
│  ├─ KB search: "authentication patterns" (8ms, $0.001)
│  └─ Create plan (142ms, $0.019)
│
├─ implementer.run_task (4,500ms, $0.12)
│  ├─ Read: src/auth/routes.py (5ms)
│  ├─ Write: src/auth/service.py (50ms)
│  └─ Write: tests/auth/test_service.py (45ms)
│
└─ code-reviewer.run_task (2,100ms, $0.04)
   ├─ Security scan (1,200ms, $0.025)
   └─ Best practices check (900ms, $0.015)

Total: 6,750ms, $0.18
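
The totals in the trace above come from the three top-level spans only; in this example the child spans are already counted in their parent's figures, so adding them again would double count. A small sketch of that roll-up, using a hypothetical `Span` record rather than LangFuse's own trace objects:

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Span:
    """A top-level span from the example trace (hypothetical structure)."""
    name: str
    duration_ms: int
    cost_usd: Decimal


def trace_totals(spans):
    """Return (total duration in ms, total cost in USD) for top-level spans."""
    total_ms = sum(span.duration_ms for span in spans)
    total_cost = sum((span.cost_usd for span in spans), Decimal("0"))
    return total_ms, total_cost


top_level = [
    Span("orchestrator.run_task", 150, Decimal("0.02")),
    Span("implementer.run_task", 4500, Decimal("0.12")),
    Span("code-reviewer.run_task", 2100, Decimal("0.04")),
]

total_ms, total_cost = trace_totals(top_level)
print(f"Total: {total_ms:,}ms, ${total_cost:.2f}")
```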

Cost Optimization¶

1. Enable Prompt Caching (Up to 90% Savings)¶

Enabled by default in Nova AI:

# Automatic prompt caching
executor = ClaudeSDKExecutor(
    agent_name="orchestrator",
    use_sdk_mcp=True  # Enables caching
)

# Result: up to 90% cost reduction on repeated context

Example Savings:

Scenario	Without Caching	With Caching	Savings
KB search (repeated)	$0.10	$0.01	90%
Session continuation	$0.25	$0.03	88%
Multi-agent workflow	$0.50	$0.08	84%
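
The Savings column is the relative difference between the two cost columns. As a quick check, the helper below (a hypothetical `savings_percent`, not a Nova AI function) recomputes the percentages from the dollar figures in the table:

```python
def savings_percent(baseline_usd, optimized_usd):
    """Percentage saved relative to the baseline cost."""
    if baseline_usd <= 0:
        raise ValueError("baseline_usd must be positive")
    return (baseline_usd - optimized_usd) / baseline_usd * 100


# (without caching, with caching) in USD, copied from the table above
scenarios = {
    "KB search (repeated)": (0.10, 0.01),
    "Session continuation": (0.25, 0.03),
    "Multi-agent workflow": (0.50, 0.08),
}

for name, (without_cache, with_cache) in scenarios.items():
    print(f"{name}: {savings_percent(without_cache, with_cache):.0f}%")
```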

2. Use Session Continuation¶

Reuse sessions for related tasks:

# First task (no session)
result1 = await executor.run_task("implement feature A")
# Cost: $0.25

# Second task (reuse session)
executor2 = ClaudeSDKExecutor(
    agent_name="implementer",
    session_id=result1.session_id  # Reuse session
)
result2 = await executor2.run_task("implement feature B")
# Cost: $0.03 (88% savings)

3. Choose the Right Model¶

Match the model to the task's complexity:

Task Complexity	Model	Input Cost	When to Use
Simple	Haiku 4.5	$1.00/MTok	Code review, testing
Standard	Sonnet 4.5	$3.00/MTok	All agents (recommended)
Complex	Opus 4	$15.00/MTok	Architecture, critical decisions

Nova AI uses Sonnet 4.5 for all agents (Anthropic's recommendation for agentic workflows).

4. Batch Operations¶

Combine multiple operations:

# BAD: Multiple separate calls
result1 = await executor.run_task("implement feature A")  # $0.25
result2 = await executor.run_task("implement feature B")  # $0.25
result3 = await executor.run_task("implement feature C")  # $0.25
# Total: $0.75

# GOOD: Single batch call
result = await executor.run_task(
    "implement features A, B, and C"
)
# Total: $0.30 (60% savings)

5. Use SDK MCP (10-100x Faster)¶

SDK MCP reduces token usage:

# stdio MCP (slower, more tokens)
executor = ClaudeSDKExecutor(use_sdk_mcp=False)
result = await executor.run_task("search KB")
# Tokens: 5,000, Cost: $0.02

# SDK MCP (faster, fewer tokens)
executor = ClaudeSDKExecutor(use_sdk_mcp=True)
result = await executor.run_task("search KB")
# Tokens: 500, Cost: $0.002 (90% savings)

Cost Monitoring¶

Set Budget Alerts¶

from src.orchestrator.cost_tracker import set_budget_alert

# Alert when daily cost exceeds $10
set_budget_alert(
    threshold_usd=10.0,
    period="daily",
    email="admin@example.com"
)
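
A budget alert reduces to comparing spend for the period against the threshold. The sketch below shows one way to classify that comparison; `budget_status` and its 80% warning level are assumptions made for illustration and are not part of `set_budget_alert`.

```python
def budget_status(spent_usd, limit_usd, warn_fraction=0.8):
    """Classify spend against a budget as 'ok', 'warning', or 'exceeded'."""
    if limit_usd <= 0:
        raise ValueError("limit_usd must be positive")
    if spent_usd > limit_usd:
        return "exceeded"
    # Warn once spend reaches the chosen fraction of the limit.
    if spent_usd >= warn_fraction * limit_usd:
        return "warning"
    return "ok"


daily_limit = 10.0
for spent in (4.00, 8.50, 10.01):
    print(f"${spent:.2f} of ${daily_limit:.2f}: {budget_status(spent, daily_limit)}")
```

A warning level below the hard limit leaves time to react before the budget is actually exceeded.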

Real-Time Monitoring¶

from src.orchestrator.cost_tracker import get_tracker, track_agent_call

tracker = get_tracker()

# Monitor current task
async with track_agent_call(agent_name="implementer", task="Feature X"):
    result = await executor.run_task("implement feature X")

# Check cost after task
task_cost = tracker.get_last_task_cost()
print(f"Task cost: ${task_cost:.4f}")

# Check if over budget
if tracker.is_over_budget(daily_limit=10.0):
    print("⚠️ Daily budget exceeded!")

Export Reports¶

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()

# Export to CSV
tracker.export_to_csv(
    filepath="cost_report_october.csv",
    start_date="2025-10-01",
    end_date="2025-10-31"
)

# Export to JSON
tracker.export_to_json(
    filepath="cost_report_october.json",
    start_date="2025-10-01",
    end_date="2025-10-31"
)
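
For readers who need a custom report layout, the standard library's `csv` module is enough to write cost rows by hand. This is a sketch under assumptions: the column names and sample rows are invented, and the actual columns written by `export_to_csv` may differ.

```python
import csv
import io

# Hypothetical column layout for a cost report
FIELDNAMES = ["date", "agent", "model", "cost_usd"]


def write_cost_csv(rows, fileobj):
    """Write cost rows (dicts keyed by FIELDNAMES) as CSV to a file-like object."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Invented sample rows
rows = [
    {"date": "2025-10-01", "agent": "implementer", "model": "Sonnet 4.5", "cost_usd": "8.32"},
    {"date": "2025-10-01", "agent": "tester", "model": "Sonnet 4.5", "cost_usd": "0.54"},
]

buffer = io.StringIO()
write_cost_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file instead of `io.StringIO`, open it with `newline=""` so the `csv` module controls line endings.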

Pricing Reference¶

Claude 4 Pricing (October 2025)¶

Model	Input	Output	Cache Write	Cache Read
Haiku 4.5	$1.00	$5.00	$1.25	$0.10
Sonnet 4.5	$3.00	$15.00	$3.75	$0.30
Opus 4	$15.00	$75.00	$18.75	$1.50

All prices in USD per million tokens (MTok).

Example Cost Calculations¶

Feature Implementation (Sonnet 4.5):

Input tokens: 25,000 (25K)
Output tokens: 5,000 (5K)
Cache read tokens: 180,000 (180K)

Cost calculation:
- Input: (25K / 1M) × $3.00 = $0.075
- Output: (5K / 1M) × $15.00 = $0.075
- Cache read: (180K / 1M) × $0.30 = $0.054

Total: $0.204 (~$0.20)

Code Review (Sonnet 4.5):

Input tokens: 20,000 (20K)
Output tokens: 2,000 (2K)
Cache read tokens: 150,000 (150K)

Cost calculation:
- Input: (20K / 1M) × $3.00 = $0.060
- Output: (2K / 1M) × $15.00 = $0.030
- Cache read: (150K / 1M) × $0.30 = $0.045

Total: $0.135 (~$0.14)
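
The two calculations above follow the same formula: tokens divided by one million, multiplied by the per-MTok rate, summed over token types. A minimal sketch that reproduces both totals from the Sonnet 4.5 rates in the pricing table (`call_cost_usd` is a hypothetical helper, not the tracker's own function):

```python
# USD per million tokens, from the Sonnet 4.5 row of the pricing table
SONNET_4_5_RATES = {
    "input": 3.00,
    "output": 15.00,
    "cache_write": 3.75,
    "cache_read": 0.30,
}


def call_cost_usd(rates, input_tokens=0, output_tokens=0,
                  cache_write_tokens=0, cache_read_tokens=0):
    """Cost of one call: sum of (tokens / 1M) x rate over all token types."""
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_write": cache_write_tokens,
        "cache_read": cache_read_tokens,
    }
    return sum(tokens / 1_000_000 * rates[kind] for kind, tokens in usage.items())


feature_cost = call_cost_usd(
    SONNET_4_5_RATES,
    input_tokens=25_000, output_tokens=5_000, cache_read_tokens=180_000,
)
review_cost = call_cost_usd(
    SONNET_4_5_RATES,
    input_tokens=20_000, output_tokens=2_000, cache_read_tokens=150_000,
)

# Same feature call if the 180K cached tokens were billed as regular input
feature_cost_uncached = call_cost_usd(
    SONNET_4_5_RATES, input_tokens=205_000, output_tokens=5_000,
)

print(f"Feature implementation: ${feature_cost:.3f}")
print(f"Code review: ${review_cost:.3f}")
print(f"Feature implementation without cache reads: ${feature_cost_uncached:.3f}")
```

The last figure shows why cache reads matter: the discount applies only to the cached portion of the prompt, so the saving on the whole call is smaller than the 90% discount on cache reads.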

Best Practices¶

1. Enable Caching¶

Always enable prompt caching (default in Nova AI):

✅ executor = ClaudeSDKExecutor(use_sdk_mcp=True)
❌ executor = ClaudeSDKExecutor(use_sdk_mcp=False)

2. Reuse Sessions¶

Reuse sessions for related tasks:

✅ ClaudeSDKExecutor(session_id=previous_session_id)
❌ New executor for each task

3. Batch Operations¶

Combine multiple operations:

✅ "implement features A, B, and C"
❌ Three separate calls

4. Monitor Regularly¶

Check costs daily:

tracker = get_tracker()
daily_cost = tracker.get_today_cost()
print(f"Today: ${daily_cost:.2f}")

5. Set Budgets¶

Set daily/monthly budgets:

set_budget_alert(threshold_usd=10.0, period="daily")

Troubleshooting¶

LangFuse Connection Failed¶

Error: Failed to connect to LangFuse

Solution: Verify that the API keys and host are set in the current shell:

echo $LANGFUSE_PUBLIC_KEY
echo $LANGFUSE_SECRET_KEY
echo $LANGFUSE_HOST
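
Echoing the secret key prints it to the terminal. An alternative is to check only whether each variable is set, as in this sketch (`missing_langfuse_vars` is a hypothetical helper; only the three variable names come from this guide):

```python
import os

REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")


def missing_langfuse_vars(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_langfuse_vars()
if missing:
    print("Missing LangFuse settings: " + ", ".join(missing))
else:
    print("All LangFuse settings are present")
```

This confirms only that the variables exist; whether the keys are valid still has to be checked against LangFuse itself.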

Cost Tracking Not Working¶

Solution: Verify that the cost tracker is initialized:

from src.orchestrator.cost_tracker import get_tracker

tracker = get_tracker()
if tracker.is_enabled():
    print("✅ Cost tracking enabled")
else:
    print("❌ Cost tracking disabled")

Incorrect Cost Calculations¶

Solution: Verify that the pricing table is up to date:

from src.orchestrator.cost_tracker import PRICING

print(PRICING)
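
Beyond reading the printed table, a quick consistency check can catch typos in individual rates. The sketch below relies on ratios visible in the Sonnet 4.5 and Opus 4 rows of the pricing reference (output is 5x input, cache write is 1.25x input, cache read is 0.1x input). These ratios are an observation about that table, not a guarantee for every model, and the dictionary layout is an assumption; the real `PRICING` structure may differ.

```python
import math

# USD per million tokens, copied from the pricing reference table
REFERENCE_RATES = {
    "Sonnet 4.5": {"input": 3.00, "output": 15.00, "cache_write": 3.75, "cache_read": 0.30},
    "Opus 4": {"input": 15.00, "output": 75.00, "cache_write": 18.75, "cache_read": 1.50},
}

# Expected multiple of the input rate for each other rate (assumed ratios)
EXPECTED_RATIOS = {"output": 5.0, "cache_write": 1.25, "cache_read": 0.10}


def pricing_anomalies(rates_by_model):
    """List (model, rate name) pairs whose rate breaks the expected ratio."""
    problems = []
    for model, rates in rates_by_model.items():
        for kind, ratio in EXPECTED_RATIOS.items():
            if not math.isclose(rates[kind], ratio * rates["input"]):
                problems.append((model, kind))
    return problems


print(pricing_anomalies(REFERENCE_RATES) or "No anomalies found")
```

An entry flagged by this check is worth comparing against the provider's current price list before trusting any report built on it.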

Next Steps¶

API Reference - CostTracker API documentation (Cost Tracker API)
LangFuse Dashboard - Set up real-time monitoring (LangFuse Docs)
Architecture - Cost tracking system design