AI Agent Guide¶

Complete guide to using the AI Documentation Agent with iterative refinement.

Overview¶

The AI Agent (src/langgraph_agent.py) is the default implementation of the documentation agent, built using LangGraph. It uses an iterative critique-refine cycle to generate high-quality documentation that improves with each iteration.

Note: This guide covers the LangGraph-based implementation. The project also includes an original AIAgent implementation with a different architectural approach.

How It Works¶

The AI Agent follows a sophisticated workflow, which is implemented as a stateful graph using LangGraph:

graph TD
    A[Start] --> B[Analyze Codebase]
    B --> C[Detect Project Type]
    C --> D[Find & Prioritize Files]
    D --> E[Read File Contents]
    E --> F[Generate Initial Documentation]
    F --> G[Self-Critique]
    G --> H{Quality Acceptable?}
    H -->|No| I[Refine Documentation]
    I --> G
    H -->|Yes| J{Max Iterations?}
    J -->|No| I
    J -->|Yes| K[Save Documentation]
    K --> L[Report Metrics]
    L --> M[End]

    classDef processNode fill:#1e3a8a,stroke:#3b82f6,color:#fff
    classDef decisionNode fill:#065f46,stroke:#10b981,color:#fff
    classDef endNode fill:#7c2d12,stroke:#f97316,color:#fff

    class A,B,C,D,E,F,G,I,K,L processNode
    class H,J decisionNode
    class M endNode

Key Features¶

The agent doesn't settle for the first draft:

Iteration 1:

Draft → "Basic project structure documented"
Critique → "Missing deployment section, unclear component relationships"

Iteration 2:

Refined → "Added deployment guide, clarified architecture"
Critique → "Good improvement, but examples needed"

Iteration 3:

Refined → "Added code examples and usage patterns"
Critique → "Excellent quality, comprehensive coverage"
→ Accept ✓

2. Self-Critique¶

The agent evaluates its own work against quality criteria:

Completeness - All sections covered?
Clarity - Easy to understand?
Accuracy - Technically correct?
Organization - Well-structured?
Examples - Sufficient code samples?
Deployment - Setup instructions clear?

3. Smart File Analysis¶

Prioritizes important files based on project type:

Frontend Projects:

Priority 1: package.json, index.html, App.tsx
Priority 2: vite.config.ts, webpack.config.js
Priority 3: Component files, utilities

Backend Projects:

Priority 1: pom.xml, requirements.txt, go.mod
Priority 2: application.properties, config files
Priority 3: Controllers, services, models

Usage¶

Basic Usage¶

# Quick launcher (uses default settings)
python run.py

# Analyze specific project
python run.py --directory ~/my-project

# Custom output name
python run.py --directory ~/my-app --output my_documentation

Command-Line Options¶

Directory Options¶

# Analyze current directory
python run.py

# Analyze specific directory (relative)
python run.py --directory ./my-app

# Analyze specific directory (absolute)
python run.py --directory /home/user/projects/my-app

Model Options¶

# Use default model from .env
python run.py

# Override with specific model
python run.py --model codellama

# Fast model for quick docs
python run.py --model llama2:7b

# High-quality model
python run.py --model llama2:13b

Output Options¶

# Default format (Markdown)
python run.py

# HTML output
python run.py --format html

# PDF output (requires wkhtmltopdf)
python run.py --format pdf

# Custom filename
python run.py --output my_project_documentation

Quality Options¶

# Quick documentation (2 iterations)
python run.py --iterations 2 --max-files 20

# Standard quality (default)
python run.py --iterations 3 --max-files 30

# Maximum quality
python run.py --iterations 5 --max-files 100

Project Type¶

# Auto-detect (default)
python run.py

# Explicitly specify frontend
python run.py --project-type frontend

# Explicitly specify backend
python run.py --project-type backend

# Mixed project
python run.py --project-type mixed

Debugging¶

# Enable verbose logging
python run.py --verbose

# Check logs
tail -f ai_agent.log

Configuration¶

The agent respects all environment variables in .env:

Critical Settings¶

# Model selection
MODEL_NAME=codellama

# API endpoint
OLLAMA_API_URL=http://localhost:11434/api/generate

# Timeouts
API_TIMEOUT=300

# Retry behavior
MAX_RETRIES=3
RETRY_DELAY=2

# Quality threshold (0.0-1.0)
CRITIQUE_THRESHOLD=0.8

See Configuration Guide for details.

Workflow Examples¶

Example 1: Quick Project Documentation¶

Goal: Fast documentation for a small project

python run.py \
  --directory ./small-app \
  --iterations 2 \
  --max-files 15 \
  --model llama2:7b

Result: Basic but complete documentation in ~2 minutes

Example 2: Production-Ready Documentation¶

Goal: Comprehensive, high-quality documentation

python run.py \
  --directory ~/important-project \
  --iterations 5 \
  --max-files 100 \
  --model codellama \
  --format pdf \
  --output project_documentation \
  --verbose

Result: Thorough, refined documentation in ~15 minutes

Example 3: Backend API Documentation¶

Goal: Document REST API with endpoints

python run.py \
  --directory ./api-server \
  --project-type backend \
  --model codellama \
  --max-files 50 \
  --iterations 4 \
  --format html

Result: API-focused documentation with endpoints documented

Example 4: Frontend Component Library¶

Goal: Document React component library

python run.py \
  --directory ./component-library \
  --project-type frontend \
  --max-files 75 \
  --iterations 4 \
  --format html

Result: Component-centric documentation with usage examples

Understanding Agent Behavior¶

Iteration Process¶

Each iteration consists of:

Generate/Refine - Create or improve documentation
Critique - AI evaluates the documentation
Decision - Accept or continue refining

Console Output:

INFO - Iteration 1/3 - Generating documentation...
INFO - Iteration 1/3 - Critiquing documentation...
INFO - Critique: Missing deployment section. Adding refinement...
INFO - Iteration 2/3 - Refining documentation...
INFO - Iteration 2/3 - Critiquing documentation...
INFO - Critique: Good quality. Minor improvements needed.
INFO - Iteration 3/3 - Refining documentation...
INFO - Iteration 3/3 - Critiquing documentation...
INFO - Critique: Excellent quality. Documentation complete.
INFO - Documentation accepted.

File Selection¶

Discovery Process: 1. Scan directory recursively 2. Filter supported file types 3. Exclude ignored directories 4. Detect project type 5. Prioritize important files 6. Limit to max-files

Priority Scoring:

High Priority (analyzed first):
  - package.json, requirements.txt, pom.xml
  - index.html, App.tsx/jsx, main.py
  - Configuration files

Medium Priority:
  - Core modules and components
  - Services and utilities
  - API routes and controllers

Lower Priority:
  - Test files
  - Utility functions
  - Helper modules

Quality Metrics¶

The agent tracks:

Iterations Used - How many refinement cycles
Files Analyzed - Number of files processed
Generation Time - Total time spent
API Calls Made - Number of LLM requests
Final Quality Score - Self-assessed quality (0.0-1.0)

Advanced Features¶

Caching (Future)¶

The agent is designed to cache responses:

# Enable caching in .env
ENABLE_CACHING=true

Benefits: - Faster subsequent runs - Reduced API calls - Lower costs - Consistent results

Note: Full caching implementation coming soon

Custom Prompts¶

You can customize the prompts by editing src/langgraph_agent.py:

def _build_critique_prompt(self, documentation: str) -> str:
    return f"""Your custom critique prompt...

    Evaluation criteria:
    1. Your specific requirement
    2. Your quality standard
    3. Your documentation style

    {documentation}
    """

Error Recovery¶

The agent handles errors gracefully:

API Timeout → Retry with exponential backoff
Connection Error → Retry up to MAX_RETRIES
File Read Error → Skip file, continue with others
Quality Not Met → Continue refining or accept at max iterations

Troubleshooting¶

Poor Quality Output¶

Symptoms: - Documentation lacks detail - Missing sections - Unclear explanations

Solutions:

# Increase iterations
python run.py --iterations 5

# Use better model
python run.py --model codellama

# Analyze more files for context
python run.py --max-files 100

# Specify project type
python run.py --project-type backend

Slow Performance¶

Symptoms: - Takes too long to complete - Frequent timeouts

Solutions:

# Reduce files
python run.py --max-files 20

# Use faster model
python run.py --model llama2:7b

# Reduce iterations
python run.py --iterations 2

# Increase timeout in .env
API_TIMEOUT=600

API Errors¶

Symptoms: - Connection refused - Timeout errors - Model not found

Solutions:

# Ensure Ollama is running
ollama serve

# Check model is available
ollama list

# Pull required model
ollama pull llama2:7b

# Verify .env configuration
cat .env | grep OLLAMA_API_URL

File Discovery Issues¶

Symptoms: - No files found - Important files skipped

Solutions:

# Use verbose mode
python run.py --verbose

# Check directory path
python run.py --directory /absolute/path

# Review ignored directories
# Edit IGNORED_DIRECTORIES in src/doc_generator.py

Best Practices¶

1. Start with Defaults¶

# First run - use defaults
python run.py --directory ./my-project

Evaluate the output, then adjust settings.

2. Choose Appropriate Iterations¶

2 iterations → Quick docs, small projects
3 iterations → Standard quality, most projects (default)
5 iterations → Maximum quality, important projects

3. Match Model to Task¶

llama2:7b → Fast, good for testing
mistral → Balanced quality/speed
codellama → Best for code documentation
llama2:13b → Maximum quality (slow)

4. Monitor and Adjust¶

# Enable verbose mode
python run.py --verbose

# Watch logs in real-time
tail -f ai_agent.log

5. Specify Project Type¶

Auto-detection is good, but explicit is better:

python run.py --project-type backend

Performance Tips¶

For Speed¶

python run.py \
  --model llama2:7b \
  --iterations 2 \
  --max-files 20

For Quality¶

python run.py \
  --model codellama \
  --iterations 5 \
  --max-files 100

For Balance¶

python run.py \
  --model mistral \
  --iterations 3 \
  --max-files 50

Next Steps¶

Simple Generator Guide - Quick, non-iterative generation
Command Reference - Complete command documentation
Configuration - Detailed configuration options
Complete Guide - All features in depth

AI Agent Guide¶

Overview¶

How It Works¶

Key Features¶

1. Iterative Refinement¶

2. Self-Critique¶

3. Smart File Analysis¶

Usage¶

Basic Usage¶

Command-Line Options¶

Directory Options¶

Model Options¶

Output Options¶

Quality Options¶

Project Type¶

Debugging¶

Configuration¶

Critical Settings¶

Workflow Examples¶

Example 1: Quick Project Documentation¶

Example 2: Production-Ready Documentation¶

Example 3: Backend API Documentation¶

Example 4: Frontend Component Library¶

Understanding Agent Behavior¶

Iteration Process¶

File Selection¶

Quality Metrics¶

Advanced Features¶

Caching (Future)¶

Custom Prompts¶

Error Recovery¶

Troubleshooting¶

Poor Quality Output¶

Slow Performance¶

API Errors¶

File Discovery Issues¶

Best Practices¶

1. Start with Defaults¶

2. Choose Appropriate Iterations¶

3. Match Model to Task¶

4. Monitor and Adjust¶

5. Specify Project Type¶

Performance Tips¶

For Speed¶

For Quality¶

For Balance¶

Next Steps¶