Cost Tracking Guide
Learn how to monitor and optimize Claude API costs with LangFuse integration.
Overview
Nova AI includes production-grade cost tracking with:
- Token usage tracking - Input, output, cached tokens
- Cost calculation - Accurate pricing per model (Haiku, Sonnet, Opus)
- Per-agent attribution - Track costs by agent and task
- Time-series aggregation - Daily/weekly/monthly reports
- LangFuse integration - Real-time observability dashboard
- Cost optimization tips - Reduce costs by up to 90% on repeated context with caching
graph LR
A[Claude API Call] --> B[CostTracker]
B --> C[Token Metrics]
B --> D[Cost Calculation]
B --> E[LangFuse Export]
E --> F[Dashboard]
style B fill:#3f51b5,color:#fff
Quick Start
1. Enable Cost Tracking
Cost tracking is automatically enabled in Nova AI. No configuration needed for basic tracking.
2. View Cost Summary
from src.orchestrator.cost_tracker import get_tracker
tracker = get_tracker()
# Get overall summary
summary = tracker.get_cost_summary()
print(f"Total cost: ${summary['total_cost_usd']:.4f}")
print(f"Total tokens: {summary['total_tokens']:,}")
3. Enable LangFuse (Optional)
For real-time dashboard and advanced analytics:
# Sign up at https://cloud.langfuse.com (free tier)
# Set environment variables
export LANGFUSE_PUBLIC_KEY="your-public-key"
export LANGFUSE_SECRET_KEY="your-secret-key"
export LANGFUSE_HOST="https://cloud.langfuse.com"
# Restart Nova AI
Cost Tracking Features
Token Usage Tracking
Track all token types:
| Token Type | Cost (Sonnet 4.5) | Description |
|---|---|---|
| Input | $3.00 / MTok | New input tokens |
| Output | $15.00 / MTok | Generated tokens |
| Cache Write | $3.75 / MTok | Writing to cache |
| Cache Read | $0.30 / MTok | Reading from cache (90% discount) |
Example:
from src.orchestrator.cost_tracker import track_agent_call
async with track_agent_call(agent_name="implementer", task="Add authentication"):
    result = await executor.run_task("implement user authentication")
# Automatically tracked:
# - Input tokens: 25,000
# - Output tokens: 5,000
# - Cache read tokens: 180,000 (billed at a 90% discount)
# - Total cost: $0.20 (see Example Cost Calculations)
Per-Agent Cost Attribution
Track costs by agent:
from src.orchestrator.cost_tracker import get_tracker
tracker = get_tracker()
# Get costs by agent
agent_costs = tracker.get_costs_by_agent()
for agent, cost in agent_costs.items():
    print(f"{agent}: ${cost:.4f}")
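To make the idea of per-agent attribution concrete, here is a minimal, self-contained sketch of how costs might be rolled up by agent. It does not use the Nova AI tracker; the record layout (`agent`, `cost_usd`) and the helper `costs_by_agent` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical call records; the real CostTracker may store these differently.
calls = [
    {"agent": "implementer", "cost_usd": 0.12},
    {"agent": "code-reviewer", "cost_usd": 0.04},
    {"agent": "orchestrator", "cost_usd": 0.02},
    {"agent": "implementer", "cost_usd": 0.08},
]


def costs_by_agent(records):
    """Sum cost_usd for each agent name."""
    totals = defaultdict(float)
    for record in records:
        totals[record["agent"]] += record["cost_usd"]
    return dict(totals)


agent_costs = costs_by_agent(calls)

# Print the most expensive agents first.
for agent, cost in sorted(agent_costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{agent}: ${cost:.4f}")
```

Sorting by cost puts the agents that dominate spend at the top, which is usually where optimization effort pays off first.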
Time-Series Reports
Track costs over time:
from datetime import datetime, timedelta
from src.orchestrator.cost_tracker import get_tracker
tracker = get_tracker()
# Get daily costs for the last 30 days
end_date = datetime.now()
start_date = end_date - timedelta(days=30)
daily_costs = tracker.get_daily_costs(
    start_date=start_date,
    end_date=end_date,
)
for date, cost in daily_costs.items():
    print(f"{date}: ${cost:.2f}")
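The daily aggregation itself can be sketched without the tracker. The example below groups hypothetical cost records by calendar day within a date range; the record fields (`timestamp`, `cost_usd`) and the function name `aggregate_daily` are illustrative assumptions, not the Nova AI implementation.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical cost records, one per API call.
records = [
    {"timestamp": datetime(2025, 10, 1, 9, 30), "cost_usd": 0.20},
    {"timestamp": datetime(2025, 10, 1, 14, 5), "cost_usd": 0.14},
    {"timestamp": datetime(2025, 10, 2, 11, 0), "cost_usd": 0.18},
    {"timestamp": datetime(2025, 11, 3, 8, 0), "cost_usd": 0.50},  # outside the range
]


def aggregate_daily(records, start_date, end_date):
    """Sum costs per ISO date for records inside [start_date, end_date]."""
    totals = defaultdict(float)
    for record in records:
        if start_date <= record["timestamp"] <= end_date:
            day = record["timestamp"].date().isoformat()
            totals[day] += record["cost_usd"]
    # Return days in chronological order.
    return dict(sorted(totals.items()))


daily = aggregate_daily(records, datetime(2025, 10, 1), datetime(2025, 10, 31, 23, 59))
for day, cost in daily.items():
    print(f"{day}: ${cost:.2f}")
```

Weekly or monthly reports follow the same pattern with a different grouping key, for example the ISO week number or the year and month.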
Cost Summary Reports
Generate comprehensive reports:
from src.orchestrator.cost_tracker import get_tracker
tracker = get_tracker()
# Full summary
summary = tracker.get_cost_summary(
start_date="2025-10-01",
end_date="2025-10-31"
)
print(f"""
Cost Summary (October 2025)
{'='*40}
Total API calls: {summary['total_calls']:,}
Total tokens: {summary['total_tokens']:,}
Total cost: ${summary['total_cost_usd']:.2f}
Breakdown by model:
- Sonnet 4.5: ${summary['sonnet_cost']:.2f} ({summary['sonnet_percent']:.1f}%)
- Haiku 4.5: ${summary['haiku_cost']:.2f} ({summary['haiku_percent']:.1f}%)
Cache efficiency:
- Cache hit rate: {summary['cache_hit_rate']:.1f}%
- Cache savings: ${summary['cache_savings']:.2f} (90% discount)
Top agents:
1. implementer: ${summary['top_agents'][0]['cost']:.2f}
2. code-reviewer: ${summary['top_agents'][1]['cost']:.2f}
3. orchestrator: ${summary['top_agents'][2]['cost']:.2f}
""")
LangFuse Integration
Setup
- Sign up at cloud.langfuse.com (free tier: 50K events/month)
- Get API keys from Settings → API Keys
- Set environment variables:
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"
- Verify connection:
from src.orchestrator.cost_tracker import verify_langfuse_connection
if verify_langfuse_connection():
    print("✅ LangFuse connected")
else:
    print("❌ LangFuse connection failed")
Dashboard Features
LangFuse provides:
- Real-time cost tracking - Live cost updates
- Token usage graphs - Visualize token usage over time
- Agent performance - Compare agent efficiency
- Trace inspection - Detailed call traces
- Cost alerts - Email alerts for budget thresholds
Trace Inspection
LangFuse captures full execution traces:
Task: Implement user authentication
│
├─ orchestrator.run_task (150ms, $0.02)
│ ├─ KB search: "authentication patterns" (8ms, $0.001)
│ └─ Create plan (142ms, $0.019)
│
├─ implementer.run_task (4,500ms, $0.12)
│ ├─ Read: src/auth/routes.py (5ms)
│ ├─ Write: src/auth/service.py (50ms)
│ └─ Write: tests/auth/test_service.py (45ms)
│
└─ code-reviewer.run_task (2,100ms, $0.04)
├─ Security scan (1,200ms, $0.025)
└─ Best practices check (900ms, $0.015)
Total: 6,750ms, $0.18
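The totals at the bottom of the trace are the sums of the three top-level agent spans. A minimal sketch of that roll-up is shown below, using the numbers from the trace above; the span field names (`duration_ms`, `cost_usd`) and the helper `trace_totals` are assumptions for illustration and do not reflect the LangFuse data model.

```python
# Top-level spans copied from the example trace; field names are hypothetical.
spans = [
    {"name": "orchestrator.run_task", "duration_ms": 150, "cost_usd": 0.02},
    {"name": "implementer.run_task", "duration_ms": 4500, "cost_usd": 0.12},
    {"name": "code-reviewer.run_task", "duration_ms": 2100, "cost_usd": 0.04},
]


def trace_totals(spans):
    """Return (total duration in ms, total cost in USD) for a list of spans."""
    total_ms = sum(span["duration_ms"] for span in spans)
    # Round to avoid floating-point noise when adding small dollar amounts.
    total_usd = round(sum(span["cost_usd"] for span in spans), 4)
    return total_ms, total_usd


total_ms, total_usd = trace_totals(spans)
print(f"Total: {total_ms:,}ms, ${total_usd:.2f}")
```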
Cost Optimization
1. Enable Prompt Caching (Up to 90% Savings)
Enabled by default in Nova AI:
# Automatic prompt caching
executor = ClaudeSDKExecutor(
    agent_name="orchestrator",
    use_sdk_mcp=True,  # Enables caching
)
# Result: cached (repeated) context is billed at about 10% of the normal input rate
Example Savings:
| Scenario | Without Caching | With Caching | Savings |
|---|---|---|---|
| KB search (repeated) | $0.10 | $0.01 | 90% |
| Session continuation | $0.25 | $0.03 | 88% |
| Multi-agent workflow | $0.50 | $0.08 | 84% |
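How much caching saves on a given call depends on what share of the tokens is served from cache, because output tokens and uncached input are still billed at the full rate. The sketch below compares a call with and without caching using the Sonnet 4.5 rates from this guide. The function names are hypothetical, and cache write costs are ignored for simplicity.

```python
# Sonnet 4.5 rates in USD per million tokens, as listed in this guide.
INPUT_PER_MTOK = 3.00
OUTPUT_PER_MTOK = 15.00
CACHE_READ_PER_MTOK = 0.30


def call_cost(input_tokens, output_tokens, cache_read_tokens=0):
    """Cost of one call in USD (cache writes are not modeled here)."""
    return (
        input_tokens * INPUT_PER_MTOK
        + output_tokens * OUTPUT_PER_MTOK
        + cache_read_tokens * CACHE_READ_PER_MTOK
    ) / 1_000_000


def caching_savings(input_tokens, output_tokens, cache_read_tokens):
    """Return (cost with cache, cost without cache, fractional savings)."""
    with_cache = call_cost(input_tokens, output_tokens, cache_read_tokens)
    # Without caching, the cached tokens would be billed as regular input.
    without_cache = call_cost(input_tokens + cache_read_tokens, output_tokens)
    return with_cache, without_cache, 1 - with_cache / without_cache


with_cache, without_cache, savings = caching_savings(25_000, 5_000, 180_000)
print(f"With cache: ${with_cache:.3f}")
print(f"Without cache: ${without_cache:.3f}")
print(f"Savings: {savings:.1%}")
```

The 90% figure applies to the cached tokens themselves; the overall saving for a call approaches it only when cached context dominates the token count.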
2. Use Session Continuation
Reuse sessions for related tasks:
# First task (no session)
result1 = await executor.run_task("implement feature A")
# Cost: $0.25
# Second task (reuse session)
executor2 = ClaudeSDKExecutor(
    agent_name="implementer",
    session_id=result1.session_id,  # Reuse session
)
result2 = await executor2.run_task("implement feature B")
# Cost: $0.03 (88% savings)
3. Choose the Right Model
Use appropriate model for task complexity:
| Task Complexity | Model | Input Cost | When to Use |
|---|---|---|---|
| Simple | Haiku 4.5 | $0.80/MTok | Code review, testing |
| Standard | Sonnet 4.5 | $3.00/MTok | All agents (recommended) |
| Complex | Opus 4 | $15.00/MTok | Architecture, critical decisions |
Nova AI uses Sonnet 4.5 for all agents (Anthropic's recommendation for agentic workflows).
4. Batch Operations
Combine multiple operations:
# BAD: Multiple separate calls
result1 = await executor.run_task("implement feature A") # $0.25
result2 = await executor.run_task("implement feature B") # $0.25
result3 = await executor.run_task("implement feature C") # $0.25
# Total: $0.75
# GOOD: Single batch call
result = await executor.run_task(
    "implement features A, B, and C"
)
# Total: $0.30 (60% savings)
5. Use SDK MCP (10-100x Faster)
SDK MCP reduces token usage:
# stdio MCP (slower, more tokens)
executor = ClaudeSDKExecutor(use_sdk_mcp=False)
result = await executor.run_task("search KB")
# Tokens: 5,000, Cost: $0.02
# SDK MCP (faster, fewer tokens)
executor = ClaudeSDKExecutor(use_sdk_mcp=True)
result = await executor.run_task("search KB")
# Tokens: 500, Cost: $0.002 (90% savings)
Cost Monitoring
Set Budget Alerts
from src.orchestrator.cost_tracker import set_budget_alert
# Alert when daily cost exceeds $10
set_budget_alert(
    threshold_usd=10.0,
    period="daily",
    email="admin@example.com",
)
Real-Time Monitoring
from src.orchestrator.cost_tracker import get_tracker, track_agent_call
tracker = get_tracker()
# Monitor current task
async with track_agent_call(agent_name="implementer", task="Feature X"):
    result = await executor.run_task("implement feature X")
# Check cost after task
task_cost = tracker.get_last_task_cost()
print(f"Task cost: ${task_cost:.4f}")
# Check if over budget
if tracker.is_over_budget(daily_limit=10.0):
    print("⚠️ Daily budget exceeded!")
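A budget check like the one above boils down to comparing accumulated spend against a limit. The following sketch adds an early-warning level before the limit is reached; `budget_status` and its `warn_ratio` parameter are hypothetical and are not part of the Nova AI tracker API.

```python
def budget_status(spent_usd, limit_usd, warn_ratio=0.8):
    """Classify spend as 'ok', 'warning', or 'exceeded' relative to a limit.

    warn_ratio is the fraction of the limit at which a warning starts.
    """
    if limit_usd <= 0:
        raise ValueError("limit_usd must be positive")
    if spent_usd > limit_usd:
        return "exceeded"
    if spent_usd >= warn_ratio * limit_usd:
        return "warning"
    return "ok"


# Example: a $10 daily limit with different amounts already spent.
for spent in (4.00, 8.50, 10.01):
    print(f"${spent:.2f} spent: {budget_status(spent, limit_usd=10.0)}")
```

A warning level gives you time to react, for example by pausing non-critical agents, before the hard limit is hit.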
Export Reports
from src.orchestrator.cost_tracker import get_tracker
tracker = get_tracker()
# Export to CSV
tracker.export_to_csv(
    filepath="cost_report_october.csv",
    start_date="2025-10-01",
    end_date="2025-10-31",
)
# Export to JSON
tracker.export_to_json(
    filepath="cost_report_october.json",
    start_date="2025-10-01",
    end_date="2025-10-31",
)
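If you need to post-process reports or build your own export, the standard library is enough. The sketch below serializes hypothetical cost records to CSV and JSON text; the column names (`date`, `agent`, `cost_usd`) are assumptions, and the files written by `export_to_csv` and `export_to_json` may use a different layout.

```python
import csv
import io
import json

# Hypothetical report rows; the real export may contain more columns.
rows = [
    {"date": "2025-10-01", "agent": "implementer", "cost_usd": 0.2},
    {"date": "2025-10-01", "agent": "code-reviewer", "cost_usd": 0.14},
]


def rows_to_csv(rows):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["date", "agent", "cost_usd"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_json(rows):
    """Render rows as pretty-printed JSON text."""
    return json.dumps(rows, indent=2)


csv_text = rows_to_csv(rows)
json_text = rows_to_json(rows)
print(csv_text)
print(json_text)
```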
Pricing Reference
Claude 4 Pricing (October 2025)
| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| Haiku 4.5 | $0.80 | $4.00 | $1.00 | $0.08 |
| Sonnet 4.5 | $3.00 | $15.00 | $3.75 | $0.30 |
| Opus 4 | $15.00 | $75.00 | $18.75 | $1.50 |
All prices in USD per million tokens (MTok).
Example Cost Calculations
Feature Implementation (Sonnet 4.5):
Input tokens: 25,000 (25K)
Output tokens: 5,000 (5K)
Cache read tokens: 180,000 (180K)
Cost calculation:
- Input: (25K / 1M) × $3.00 = $0.075
- Output: (5K / 1M) × $15.00 = $0.075
- Cache read: (180K / 1M) × $0.30 = $0.054
Total: $0.204 (~$0.20)
Code Review (Sonnet 4.5):
Input tokens: 20,000 (20K)
Output tokens: 2,000 (2K)
Cache read tokens: 150,000 (150K)
Cost calculation:
- Input: (20K / 1M) × $3.00 = $0.060
- Output: (2K / 1M) × $15.00 = $0.030
- Cache read: (150K / 1M) × $0.30 = $0.045
Total: $0.135 (~$0.14)
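The two calculations above can be reproduced with a small helper. This is a sketch, not the Nova AI implementation: the rates are copied from the pricing table in this guide and should be checked against current published pricing, and the model keys (`"sonnet-4.5"` and so on) are illustrative labels rather than API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens for each token type."""
    input: float
    output: float
    cache_write: float
    cache_read: float


# Rates copied from the pricing table in this guide (keys are hypothetical).
PRICING = {
    "haiku-4.5": Rates(0.80, 4.00, 1.00, 0.08),
    "sonnet-4.5": Rates(3.00, 15.00, 3.75, 0.30),
    "opus-4": Rates(15.00, 75.00, 18.75, 1.50),
}


def estimate_cost(model, input_tokens=0, output_tokens=0,
                  cache_write_tokens=0, cache_read_tokens=0):
    """Estimate the cost of one call in USD from its token counts."""
    rates = PRICING[model]
    return (
        input_tokens * rates.input
        + output_tokens * rates.output
        + cache_write_tokens * rates.cache_write
        + cache_read_tokens * rates.cache_read
    ) / 1_000_000


feature = estimate_cost("sonnet-4.5", input_tokens=25_000,
                        output_tokens=5_000, cache_read_tokens=180_000)
review = estimate_cost("sonnet-4.5", input_tokens=20_000,
                       output_tokens=2_000, cache_read_tokens=150_000)
print(f"Feature implementation: ${feature:.3f}")
print(f"Code review: ${review:.3f}")
```

Keeping the rates in one table makes it easy to update prices in a single place when they change.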
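Best Practices
1. Enable Caching
Keep prompt caching enabled (the default in Nova AI); see Enable Prompt Caching above.
2. Reuse Sessions
Pass the previous result's session_id to the next executor when tasks are related.
3. Batch Operations
Combine related operations into a single run_task call where practical.
4. Monitor Regularly
Check costs daily, for example with get_daily_costs() or the LangFuse dashboard.
5. Set Budgets
Set daily or monthly budgets with set_budget_alert().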
Troubleshooting
LangFuse Connection Failed
Solution: Verify that the API keys are set in the environment and that LANGFUSE_HOST points at your LangFuse instance.
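A quick first check is whether the three environment variables from the Setup section are present at all. The helper below is a hypothetical sketch that only inspects the environment; it does not contact LangFuse, so use `verify_langfuse_connection()` afterwards to test the connection itself.

```python
import os

# Variable names taken from the Setup section of this guide.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")


def missing_langfuse_vars(env=None):
    """Return the required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_langfuse_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All LangFuse environment variables are set")
```

If all variables are set and the connection still fails, confirm that the keys belong to the project hosted at the configured `LANGFUSE_HOST`.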
Cost Tracking Not Working
Solution: Verify cost tracker is initialized:
from src.orchestrator.cost_tracker import get_tracker
tracker = get_tracker()
if tracker.is_enabled():
    print("✅ Cost tracking enabled")
else:
    print("❌ Cost tracking disabled")
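Incorrect Cost Calculations
Solution: Verify that the pricing table is up to date by comparing it with the Pricing Reference section above and with Anthropic's current published prices.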
Next Steps
- API Reference: CostTracker API documentation
- LangFuse Dashboard: Set up real-time monitoring
- Architecture: Cost tracking system design