ADR-005: Skills System Architecture

Date: 2025-10-26
Status: Accepted

Context

Nova AI's agents need specialized knowledge across diverse domains:

Python best practices (PEP 8, type hints, async patterns)
Testing strategies (pytest, mocking, coverage)
Security patterns (input validation, authentication, encryption)
GitHub automation (PR workflows, commit conventions, CI/CD)
Documentation standards (Sphinx, docstrings, ADRs)

Initial Approach Problems

Monolithic Agent Prompts:

<!-- code-reviewer.md -->
You are a code reviewer. You must check:
- Security: SQL injection, XSS, CSRF...
- Style: PEP 8, naming, imports...
- Testing: Coverage, edge cases...
- Documentation: Docstrings, comments...
(... 5000 lines of detailed instructions)

Issues:

1. Prompt Length: Agent files ballooned to 5K+ lines (>80% of token limit)
2. Context Waste: Most knowledge irrelevant for specific tasks
3. Duplication: Same security patterns in 5+ agent files
4. Maintenance: Updates required editing multiple files
5. No Reusability: Knowledge locked in specific agents

Requirements

Modularity: Knowledge separated into reusable skills
Progressive Disclosure: Load only relevant knowledge per task
Maintainability: Update knowledge in one place
Composability: Mix and match skills per agent
Efficiency: Minimize prompt tokens (leverage caching)

Decision

We implemented a hierarchical skills system with progressive disclosure via @import directives.

Architecture

.claude/skills/
├── README.md                          # Skills system documentation
├── development/                       # Coding best practices
│   ├── SKILL.md                      # Domain overview
│   ├── python-best-practices.md      # PEP 8, type hints, async
│   ├── testing-strategies.md         # pytest, mocking, coverage
│   ├── error-handling.md             # Exception patterns
│   └── performance-optimization.md   # Profiling, caching
├── github-automation/                # GitHub workflows
│   ├── SKILL.md                      # Domain overview
│   ├── pr-workflows.md               # Creating, reviewing PRs
│   ├── commit-conventions.md         # Conventional commits
│   ├── ci-cd-patterns.md             # GitHub Actions best practices
│   └── release-management.md         # Versioning, changelogs
├── operations/                       # DevOps and deployment
│   ├── SKILL.md                      # Domain overview
│   ├── deployment-strategies.md      # Blue-green, canary, rollback
│   ├── monitoring.md                 # Logging, metrics, alerts
│   └── security-hardening.md         # Secrets, permissions, auditing
└── meta/                             # Meta-skills
    ├── SKILL.md                      # Domain overview
    ├── agent-communication.md        # Inter-agent protocols
    ├── cost-optimization.md          # Token usage, caching
    └── debugging-agents.md           # Agent troubleshooting

Progressive Disclosure via @import

Agent Base Prompt (concise):

<!-- .claude/agents/code-reviewer.md -->
---
name: code-reviewer
mode: auto
tools:
  - allow: Read, Grep, Glob
  - ask: Bash, Write
---

# Code Reviewer Agent

You perform security, correctness, and maintainability reviews.

Core checklist:
- Security vulnerabilities
- Logic errors and edge cases
- Code style and maintainability
- Test coverage

@import .claude/skills/development/python-best-practices.md
@import .claude/skills/development/testing-strategies.md
@import .claude/skills/operations/security-hardening.md

Skill File (detailed, cached):

<!-- .claude/skills/development/python-best-practices.md -->

# Python Best Practices

## Type Hints
Always use type hints for function signatures:
```python
from typing import Any, Dict, List

import pandas as pd

def process_data(items: List[Dict[str, Any]]) -> pd.DataFrame:
    """Process items into DataFrame."""
    ...
```

## Async Patterns

Use async/await for I/O-bound operations:

```python
import httpx

async def fetch_data(url: str) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.json()
```

(... detailed patterns with examples)

How It Works:
1. The agent base prompt (500 tokens) is loaded on every call
2. @import directives trigger skill loading (about 2K tokens each)
3. Skills are marked with cache_control, so cached reads cost 90% less
4. Only imported skills are loaded, not the entire skills library
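
The effect of loading only imported skills can be estimated with a few lines of Python. This is a rough sketch, not the project's accounting: it takes the 500-token base prompt and 2K-token skill size from the list above as averages, and the function name `estimate_prompt_tokens` is hypothetical.

```python
import re

# Figures taken from the list above; treat them as rough averages.
BASE_PROMPT_TOKENS = 500
TOKENS_PER_SKILL = 2_000


def estimate_prompt_tokens(agent_prompt: str) -> int:
    """Estimate prompt size from the number of @import lines in an agent file."""
    imports = re.findall(r"^@import\s+\S+\.md\s*$", agent_prompt, flags=re.MULTILINE)
    return BASE_PROMPT_TOKENS + TOKENS_PER_SKILL * len(imports)


code_reviewer = """\
# Code Reviewer Agent
@import .claude/skills/development/python-best-practices.md
@import .claude/skills/development/testing-strategies.md
@import .claude/skills/operations/security-hardening.md
"""

print(estimate_prompt_tokens(code_reviewer))  # 500 + 3 * 2000 = 6500
```

Under these assumptions the code reviewer carries three skills' worth of tokens, while every skill it does not import costs nothing.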

Skill Composition

Agents can import multiple skills:

```markdown
<!-- .claude/agents/architect.md -->
@import .claude/skills/development/python-best-practices.md
@import .claude/skills/development/performance-optimization.md
@import .claude/skills/operations/deployment-strategies.md
@import .claude/skills/meta/cost-optimization.md
```

Benefits:
- Mix domain expertise per agent
- Share common knowledge (DRY principle)
- Cache skill content across agents (90% savings)
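
To see which skills are actually shared, one could build an index from skill path to the agents that import it. The sketch below works on in-memory prompt text. The helper name `skill_usage` is hypothetical and is not part of the executor described in this ADR.

```python
import re
from collections import defaultdict
from typing import Dict, List

IMPORT_RE = re.compile(r"^@import\s+(\S+\.md)\s*$", re.MULTILINE)


def skill_usage(agents: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each imported skill path to the sorted list of agents importing it."""
    usage: Dict[str, List[str]] = defaultdict(list)
    for name, prompt in agents.items():
        for path in IMPORT_RE.findall(prompt):
            usage[path].append(name)
    return {path: sorted(names) for path, names in usage.items()}


agents = {
    "code-reviewer": (
        "@import .claude/skills/development/python-best-practices.md\n"
        "@import .claude/skills/development/testing-strategies.md\n"
    ),
    "architect": (
        "@import .claude/skills/development/python-best-practices.md\n"
        "@import .claude/skills/meta/cost-optimization.md\n"
    ),
}

for path, names in sorted(skill_usage(agents).items()):
    print(f"{path}: {', '.join(names)}")
```

Skills that list more than one agent are the ones where a single edit propagates everywhere.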

Skill Metadata (YAML Frontmatter)

Each skill file includes metadata:

---
skill_name: python-best-practices
domain: development
version: 1.2.0
updated: 2025-10-26
cache: true  # Enable prompt caching
prerequisites: []  # Other skills to import first
applicable_agents:
  - code-reviewer
  - architect
  - debugger
---
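
The `prerequisites` field implies a load order: a skill's prerequisites come before the skill itself. The ADR does not say how this is resolved, so the following is only one possible sketch, using the standard library's `graphlib`. The prerequisite relationships in the example are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def load_order(prerequisites: Dict[str, List[str]]) -> List[str]:
    """Return skill names ordered so that prerequisites come first.

    `prerequisites` maps a skill name to the skills it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(prerequisites).static_order())


# Hypothetical metadata: testing-strategies depends on python-best-practices.
skills = {
    "python-best-practices": [],
    "testing-strategies": ["python-best-practices"],
}
print(load_order(skills))

try:
    load_order({"a": ["b"], "b": ["a"]})
except CycleError:
    print("circular prerequisites detected")
```

Failing on cycles at load time keeps a bad `prerequisites` entry from silently producing an arbitrary order.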

Implementation

Agent Loading with Skills (src/orchestrator/claude_sdk_executor.py):

import re
from typing import Any, Dict, List, Tuple

# Method of the executor class; self.agents_dir and self.project_root are Path objects.
def load_agent_with_skills(self, agent_name: str) -> Tuple[str, List[Dict[str, Any]]]:
    """Load the agent prompt and the skill blocks named by its @import directives."""
    agent_path = self.agents_dir / f"{agent_name}.md"
    agent_content = agent_path.read_text()

    # Find all @import directives
    imports = re.findall(r'@import\s+(.+\.md)', agent_content)

    # Load skill files
    skill_content = []
    for skill_path in imports:
        full_path = self.project_root / skill_path
        skill = full_path.read_text()

        # Mark skill for prompt caching
        skill_content.append({
            "type": "text",
            "text": skill,
            "cache_control": {"type": "ephemeral"}  # 90% cost reduction on cached reads
        })

    # Return the agent prompt and its skill blocks separately
    return agent_content, skill_content

Skill Validation (ensures no broken imports):

def validate_skills(self) -> List[str]:
    """Validate all @import directives resolve correctly."""
    errors = []

    for agent_file in self.agents_dir.glob("*.md"):
        content = agent_file.read_text()
        imports = re.findall(r'@import\s+(.+\.md)', content)

        for skill_path in imports:
            full_path = self.project_root / skill_path
            if not full_path.exists():
                errors.append(f"{agent_file.name}: Missing skill {skill_path}")

    return errors
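
For experimenting outside the executor class, the same check can be written as a standalone function and run against a throwaway directory. This is a sketch under assumptions: the names `find_broken_imports` and `demo` are hypothetical, and the regex is anchored to whole lines so that an @import mentioned inside prose is not treated as a directive, which differs slightly from the pattern above.

```python
import re
import tempfile
from pathlib import Path
from typing import List

# Anchored to a full line, unlike the unanchored pattern in the method above.
IMPORT_RE = re.compile(r"^@import\s+(\S+\.md)\s*$", re.MULTILINE)


def find_broken_imports(project_root: Path) -> List[str]:
    """Return one error string per @import that does not resolve to a file."""
    errors: List[str] = []
    agents_dir = project_root / ".claude" / "agents"
    for agent_file in sorted(agents_dir.glob("*.md")):
        content = agent_file.read_text(encoding="utf-8")
        for skill_path in IMPORT_RE.findall(content):
            if not (project_root / skill_path).is_file():
                errors.append(f"{agent_file.name}: Missing skill {skill_path}")
    return errors


def demo() -> List[str]:
    """Build a tiny project with one valid and one broken import."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / ".claude" / "agents").mkdir(parents=True)
        (root / ".claude" / "skills" / "development").mkdir(parents=True)
        skill = root / ".claude" / "skills" / "development" / "testing-strategies.md"
        skill.write_text("# Testing Strategies\n", encoding="utf-8")
        agent = root / ".claude" / "agents" / "code-reviewer.md"
        agent.write_text(
            "# Code Reviewer\n"
            "@import .claude/skills/development/testing-strategies.md\n"
            "@import .claude/skills/development/missing-skill.md\n",
            encoding="utf-8",
        )
        return find_broken_imports(root)


print(demo())
```

Only the import pointing at the file that was never created should be reported.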

Consequences

Positive

80% Smaller Agent Files: 5K lines → 1K lines (skills separated)
90% Cost Reduction on Skills: Prompt caching discounts repeated skill reads by 90%
DRY Principle: Update knowledge in one place (e.g., security patterns)
Composability: Mix skills per agent (e.g., architect = dev + ops + meta)
Progressive Disclosure: Load only relevant knowledge per task
Versioning: Track skill versions independently (v1.2.0)
Reusability: Same skill used by 5+ agents

Negative

Indirection: Must follow @import to see full agent context
Cache Dependency: Requires Claude SDK v0.1.4+ with prompt caching
Validation Overhead: Must check all imports resolve correctly
Learning Curve: Developers must understand skills system

Trade-offs

Considered Alternatives:

1. Monolithic Agent Prompts (Original)
   ❌ 5K+ line files
   ❌ 95% duplication across agents
   ❌ No caching benefits
   ✅ Simple (everything in one file)
2. External Knowledge Base
   ✅ Centralized knowledge
   ❌ Requires KB search per query (50-200ms)
   ❌ Retrieval may miss relevant context
   ❌ More complex architecture
3. Skills System with @import (Chosen)
   ✅ Modular, reusable
   ✅ 90% cost savings via caching
   ✅ Progressive disclosure
   ⚠️ Requires import validation
4. Python Package Imports
   ✅ Familiar to developers
   ❌ Skills are not code (Markdown)
   ❌ Breaks prompt caching
   ❌ Runtime overhead
5. LangChain Tools
   ✅ Composable tools
   ❌ Vendor lock-in (LangChain)
   ❌ Not designed for knowledge (more for actions)
   ❌ No caching benefits

Why We Chose Skills with @import:
- Maximizes prompt caching benefits (90% savings)
- Simple, declarative syntax (Markdown)
- Works with Claude SDK native features
- Clear separation of concerns (agent vs skills)

Skill Organization Principles

1. Domain-Based Hierarchy

Skills organized by domain:
- development/ - Coding best practices
- github-automation/ - GitHub workflows
- operations/ - DevOps and deployment
- meta/ - Meta-skills (agent communication, cost optimization)

2. SKILL.md Convention

Each domain has a SKILL.md overview:

<!-- .claude/skills/development/SKILL.md -->

# Development Skills

This domain contains coding best practices and patterns.

## Available Skills

- [python-best-practices.md](./python-best-practices.md) - PEP 8, type hints, async
- [testing-strategies.md](./testing-strategies.md) - pytest, mocking, coverage
- [error-handling.md](./error-handling.md) - Exception patterns
- [performance-optimization.md](./performance-optimization.md) - Profiling, caching

## Usage

Import specific skills in the agent prompt body, after the frontmatter:
@import .claude/skills/development/python-best-practices.md

3. Skill Size Guidelines

Target: 1-3K tokens per skill (fits in single cache block)
Maximum: 5K tokens (split if larger)
Minimum: 200 tokens (merge if smaller)
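
These thresholds are easy to check mechanically. The sketch below is illustrative only: it assumes a crude estimate of about four characters per token, which is not a real tokenizer, and the function names are hypothetical.

```python
def approx_tokens(text: str) -> int:
    """Very rough token estimate: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def size_advice(tokens: int) -> str:
    """Classify a skill's size against the guidelines above."""
    if tokens < 200:
        return "merge"        # below the 200-token minimum
    if tokens > 5_000:
        return "split"        # above the 5K-token maximum
    if 1_000 <= tokens <= 3_000:
        return "on target"    # inside the 1-3K target range
    return "acceptable"       # within limits but outside the target range


for sample in (150, 2_000, 4_000, 6_000):
    print(sample, size_advice(sample))
```

A check like this could run alongside import validation, with a proper tokenizer swapped in for `approx_tokens` if exact counts matter.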

4. Versioning

Skills follow semantic versioning:
- v1.0.0 - Initial version
- v1.1.0 - Add new patterns (backward compatible)
- v2.0.0 - Breaking changes (rename, restructure)
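
As a small illustration of the scheme, the kind of change between two skill versions can be derived from the version strings alone. This is a minimal sketch that assumes plain MAJOR.MINOR.PATCH strings with an optional leading "v". The helper names are hypothetical.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'v1.2.0' or '1.2.0' into (1, 2, 0)."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Return 'major', 'minor', or 'patch' for an upgrade from old to new."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    if new_v[0] != old_v[0]:
        return "major"   # breaking change, e.g. rename or restructure
    if new_v[1] != old_v[1]:
        return "minor"   # backward-compatible additions
    return "patch"


print(bump_kind("v1.0.0", "v1.1.0"))
print(bump_kind("v1.1.0", "v2.0.0"))
```

A "major" result is the signal that agents importing the skill may need review before picking up the new version.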

Cost Impact

Before Skills System:

Code review (10 calls/day):
  Agent prompt: 5K tokens × 10 = 50K tokens
  No caching (different content each time)
  Cost: 50K × $3.00/MTok = $0.15/day

After Skills System:

Code review (10 calls/day):
  Agent prompt: 1K tokens × 10 = 10K tokens (80% reduction)
  Skills: 4K tokens × 1 (cached, loaded once)
  Cache reads: 4K tokens × 9 × 0.1 (90% discount)
  Cost: (10K + 4K + 3.6K) × $3.00/MTok = $0.053/day

Savings: $0.097/day = $35.40/year per agent
With 10 agents: $354/year
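
The estimate above can be reproduced in a few lines, which makes it easy to rerun with different call volumes or prices. The sketch follows the estimate as written: it takes the $3.00/MTok price, 10 calls per day, and the 0.1 cache-read factor from the figures above, and does not model any cache-write surcharge or cache expiry between calls. The function names are hypothetical.

```python
PRICE_PER_MTOK = 3.00     # USD per million input tokens, from the estimate above
CALLS_PER_DAY = 10
CACHE_READ_FACTOR = 0.1   # cached reads billed at 10% (the 90% discount)


def daily_cost_before(prompt_tokens: int = 5_000) -> float:
    """Monolithic prompt sent in full on every call."""
    return prompt_tokens * CALLS_PER_DAY * PRICE_PER_MTOK / 1_000_000


def daily_cost_after(agent_tokens: int = 1_000, skill_tokens: int = 4_000) -> float:
    """Small agent prompt every call; skills paid in full once, then read from cache."""
    agent = agent_tokens * CALLS_PER_DAY
    first_load = skill_tokens
    cached_reads = skill_tokens * (CALLS_PER_DAY - 1) * CACHE_READ_FACTOR
    return (agent + first_load + cached_reads) * PRICE_PER_MTOK / 1_000_000


before = daily_cost_before()
after = daily_cost_after()
yearly_savings = (before - after) * 365
print(f"before=${before:.3f}/day after=${after:.3f}/day saved=${yearly_savings:.2f}/year")
```

Run unrounded, the yearly saving per agent comes out at about $35.48, a few cents above the $35.40 quoted above, which multiplies the already rounded $0.097/day.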

Implementation Timeline

October 8: Analyzed agent prompt duplication (95% overlap)
October 10: Designed skills hierarchy (4 domains)
October 12: Implemented @import directive parser
October 14: Added prompt caching for skills (90% savings)
October 16: Migrated 5 agents to skills system
October 18: Validated cost savings ($354/year)
October 20: Completed migration (10 agents)

See ADR-003: Cost Tracking for measuring cost savings
See ADR-004: Model Selection for complementary cost optimizations

Validation

Tested skills system with:
- ✅ 10 agents using skills (80% smaller prompts)
- ✅ 20+ skill files across 4 domains
- ✅ Prompt caching (90% cost reduction)
- ✅ Import validation (no broken links)
- ✅ Skill versioning (semantic versions tracked)
- ✅ Production use (2+ weeks stable)

Migration Path

Converting Monolithic Agent to Skills:

Extract Common Patterns:

# Identify duplicated content: list the agent files that mention a topic
grep -l "PEP 8" .claude/agents/*.md
# Found in 5 agents → Extract to skill

Create Skill File:

<!-- .claude/skills/development/python-best-practices.md -->
---
skill_name: python-best-practices
domain: development
version: 1.0.0
cache: true
---

# Python Best Practices
(... content extracted from agents)

Update Agent Files:

<!-- Before -->
# Code Reviewer
You must follow PEP 8...
(... 500 lines of Python rules)

<!-- After -->
# Code Reviewer
@import .claude/skills/development/python-best-practices.md

Validate:

python scripts/validate_skills.py
# ✅ All imports resolve correctly

References

Implementation: src/orchestrator/claude_sdk_executor.py (import parser)
Skills Directory: .claude/skills/
Validation Script: scripts/validate_skills.py
Documentation: .claude/skills/README.md
Agent Examples: .claude/agents/code-reviewer.md