Configuration

Complete guide to configuring the AI Documentation Agent for optimal performance.

Configuration File

The agent reads its configuration from a .env file located in the project root.

Creating Your Configuration

# Copy the example configuration
cp config/.env.example .env

# Edit with your preferred editor
nano .env  # or vim, code, notepad++, etc.
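
For reference, this is roughly how such a file is read from Python. The sketch assumes the python-dotenv package and standard environment lookups; the agent's actual loader may differ:

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

api_url = os.getenv("OLLAMA_API_URL", "https://ollama.com/api/generate")
model_name = os.getenv("MODEL_NAME", "gpt-oss:120b-cloud")
timeout = int(os.getenv("API_TIMEOUT", "300"))

print(f"Using {model_name} at {api_url} (timeout: {timeout}s)")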

Environment Variables

Ollama API Configuration

OLLAMA_API_URL

The URL endpoint for the Ollama API.

Default: https://ollama.com/api/generate
Local: http://localhost:11434/api/generate

# For local Ollama installation
OLLAMA_API_URL=http://localhost:11434/api/generate

# For remote Ollama server
OLLAMA_API_URL=https://your-ollama-server.com/api/generate

# For cloud-hosted Ollama
OLLAMA_API_URL=https://ollama.com/api/generate

Local vs Remote

  • Local → Faster, free, private, requires local installation
  • Remote → Accessible anywhere, may have costs, requires network

OLLAMA_API_KEY

Optional API key for authenticated Ollama instances.

Default: Empty (not required for local Ollama)

# Leave empty for local installation
OLLAMA_API_KEY=

# Set for authenticated remote servers
OLLAMA_API_KEY=your_api_key_here
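
To illustrate how these two settings work together, here is a minimal Python sketch of a generate request using the requests library. The bearer-token header is an assumption for authenticated servers; check your server's actual auth scheme:

import os
import requests

url = os.getenv("OLLAMA_API_URL", "http://localhost:11434/api/generate")
api_key = os.getenv("OLLAMA_API_KEY", "")

# Attach an Authorization header only when a key is configured.
# Bearer auth is an assumption; your server may expect a different scheme.
headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}

payload = {"model": "llama2:7b", "prompt": "Say hello", "stream": False}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["response"])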

MODEL_NAME

The LLM model to use for documentation generation.

Default: gpt-oss:120b-cloud

# Fast and efficient (recommended for testing)
MODEL_NAME=llama2:7b

# Better quality
MODEL_NAME=mistral

# Best for code documentation
MODEL_NAME=codellama

# High quality, slower
MODEL_NAME=llama2:13b

# Cloud model (requires internet)
MODEL_NAME=gpt-oss:120b-cloud
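
Before settling on a model, you can confirm it is actually installed. This sketch queries a local Ollama instance's /api/tags endpoint (the same listing that ollama list shows):

import os
import requests

# /api/tags lists the models installed on an Ollama server.
tags = requests.get("http://localhost:11434/api/tags").json()
installed = [m["name"] for m in tags["models"]]

model = os.getenv("MODEL_NAME", "gpt-oss:120b-cloud")
if model not in installed:
    print(f"{model} is not installed; run: ollama pull {model}")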

Model Comparison:

| Model | Speed | Quality | RAM | Best For |
|-------|-------|---------|-----|----------|
| llama2:7b | ⚡⚡⚡ | ⭐⭐⭐ | 4-8 GB | Quick docs, testing |
| mistral | ⚡⚡ | ⭐⭐⭐⭐ | 8 GB | Balanced quality/speed |
| codellama | ⚡⚡ | ⭐⭐⭐⭐⭐ | 8 GB | Code documentation |
| llama2:13b | ⚡ | ⭐⭐⭐⭐⭐ | 16 GB | Maximum quality |

Choosing a Model

  • First time? Use llama2:7b
  • Production docs? Use codellama
  • Limited RAM? Use llama2:7b
  • Best quality? Use llama2:13b or codellama

API_TIMEOUT

Maximum time (in seconds) to wait for API responses.

Default: 300 (5 minutes)

# Quick timeout for fast models
API_TIMEOUT=180

# Standard timeout
API_TIMEOUT=300

# Extended timeout for large projects
API_TIMEOUT=600

# Maximum timeout for comprehensive docs
API_TIMEOUT=900
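
The timeout is enforced on the client side. In Python's requests library (assumed here for illustration), it maps to the timeout argument, and hitting it raises an exception rather than stopping the server's work:

import os
import requests

url = os.getenv("OLLAMA_API_URL", "http://localhost:11434/api/generate")
payload = {"model": "llama2:7b", "prompt": "Say hello", "stream": False}
timeout = int(os.getenv("API_TIMEOUT", "300"))

try:
    response = requests.post(url, json=payload, timeout=timeout)
    print(response.json()["response"])
except requests.exceptions.Timeout:
    print(f"No response within {timeout}s; consider raising API_TIMEOUT")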

Timeout Considerations

  • Larger models need more time
  • More files = longer processing
  • Complex code needs a longer timeout
  • Network latency affects remote APIs

Agent Behavior Configuration

MAX_RETRIES

Number of retry attempts for failed API calls.

Default: 3

# Minimal retries (fail fast)
MAX_RETRIES=1

# Standard retries
MAX_RETRIES=3

# Maximum reliability
MAX_RETRIES=5

RETRY_DELAY

Base delay in seconds between retries (uses exponential backoff).

Default: 2

# Quick retries
RETRY_DELAY=1

# Standard delay
RETRY_DELAY=2

# Conservative delay (network issues)
RETRY_DELAY=5

Retry Pattern (exponential backoff):

  • 1st retry: wait RETRY_DELAY seconds
  • 2nd retry: wait RETRY_DELAY * 2 seconds
  • 3rd retry: wait RETRY_DELAY * 4 seconds
  • And so on, doubling each time
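
A minimal sketch of this backoff loop in Python, using the requests library; the agent's internal retry logic may differ in detail:

import os
import time
import requests

url = os.getenv("OLLAMA_API_URL", "http://localhost:11434/api/generate")
payload = {"model": "llama2:7b", "prompt": "Say hello", "stream": False}
max_retries = int(os.getenv("MAX_RETRIES", "3"))
retry_delay = int(os.getenv("RETRY_DELAY", "2"))

for attempt in range(max_retries + 1):
    try:
        response = requests.post(url, json=payload, timeout=300)
        response.raise_for_status()
        break  # success
    except requests.exceptions.RequestException as err:
        if attempt == max_retries:
            raise  # retries exhausted
        wait = retry_delay * (2 ** attempt)  # RETRY_DELAY, then *2, *4, ...
        print(f"Attempt {attempt + 1} failed ({err}); retrying in {wait}s")
        time.sleep(wait)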

ENABLE_CACHING

Enable response caching for faster subsequent runs.

Default: true
Status: Future feature (not yet implemented)

ENABLE_CACHING=true

CRITIQUE_THRESHOLD

Quality threshold (0.0-1.0) for accepting documentation.

Default: 0.8

# Lenient (faster, lower quality)
CRITIQUE_THRESHOLD=0.6

# Balanced
CRITIQUE_THRESHOLD=0.8

# Strict (slower, higher quality)
CRITIQUE_THRESHOLD=0.9
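
Conceptually, the threshold gates a generate-critique loop: documentation that scores below it is revised and re-scored. The sketch below shows the idea using hypothetical generate_docs and score_quality placeholders, not the agent's actual internals:

import os

CRITIQUE_THRESHOLD = float(os.getenv("CRITIQUE_THRESHOLD", "0.8"))

def generate_docs(source: str) -> str:
    return f"Documentation for {source}"  # placeholder generator

def score_quality(docs: str) -> float:
    return 0.85  # placeholder critique score in [0.0, 1.0]

docs = generate_docs("example.py")
for _ in range(5):  # bounded number of revision passes
    if score_quality(docs) >= CRITIQUE_THRESHOLD:
        break  # quality is acceptable; stop revising
    docs = generate_docs("example.py")  # revise and re-score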

Configuration Profiles

Development Profile

Fast iterations, quick feedback:

OLLAMA_API_URL=http://localhost:11434/api/generate
MODEL_NAME=llama2:7b
API_TIMEOUT=180
MAX_RETRIES=2
RETRY_DELAY=1
CRITIQUE_THRESHOLD=0.7

Use with:

python run.py --max-files 20 --iterations 2

Production Profile

High quality, comprehensive documentation:

OLLAMA_API_URL=http://localhost:11434/api/generate
MODEL_NAME=codellama
API_TIMEOUT=600
MAX_RETRIES=5
RETRY_DELAY=3
CRITIQUE_THRESHOLD=0.9

Use with:

python run.py --iterations 5 --max-files 100

Cloud Profile

Using remote Ollama service:

OLLAMA_API_URL=https://ollama.com/api/generate
OLLAMA_API_KEY=your_api_key
MODEL_NAME=gpt-oss:120b-cloud
API_TIMEOUT=600
MAX_RETRIES=5
RETRY_DELAY=3
CRITIQUE_THRESHOLD=0.8

Testing Profile

Fast, minimal resource usage:

OLLAMA_API_URL=http://localhost:11434/api/generate
MODEL_NAME=llama2:7b
API_TIMEOUT=120
MAX_RETRIES=1
RETRY_DELAY=1
CRITIQUE_THRESHOLD=0.6

Use with:

python run.py --max-files 10 --iterations 1

Command-Line Overrides

Command-line options override .env settings:

# Model override
python run.py --model mistral

# Format override
python run.py --format html

# Multiple overrides
python run.py --model codellama --iterations 5 --max-files 100
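
The precedence (CLI flag beats .env value beats built-in default) can be expressed in a few lines. This is an illustrative argparse sketch, not the actual run.py, and it assumes the .env values are already loaded into the environment:

import argparse
import os

parser = argparse.ArgumentParser()
# Fall back to the environment value, then to a built-in default.
parser.add_argument("--model", default=os.getenv("MODEL_NAME", "gpt-oss:120b-cloud"))
args = parser.parse_args()

print(f"Effective model: {args.model}")  # a CLI flag wins when provided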

Verification

Check Configuration

# View current .env
cat .env

# Test configuration
python run.py --help

# Verify Ollama connection
curl http://localhost:11434/api/tags

Validate Settings

# List available models
ollama list

# Check if specific model is available
ollama list | grep llama2:7b

# Pull missing model
ollama pull llama2:7b

Advanced Configuration

Multiple Environments

Create different configuration files:

# Development
cp .env .env.dev

# Production
cp .env .env.prod

# Testing
cp .env .env.test

Use with:

# Copy desired config before running
cp .env.prod .env
python run.py
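
To preview what a profile would apply before copying it over, python-dotenv (assumed installed) can parse any env file without touching the environment:

from dotenv import dotenv_values

# Parse a profile file without modifying os.environ.
for key, value in dotenv_values(".env.prod").items():
    print(f"{key}={value}")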

Custom Model Settings

For self-hosted Ollama with custom models:

# Create custom model
ollama create my-docs-model -f Modelfile

# Use in configuration
MODEL_NAME=my-docs-model

Performance Tuning

For Speed

MODEL_NAME=llama2:7b
API_TIMEOUT=180
MAX_RETRIES=2
CRITIQUE_THRESHOLD=0.7

Command: python run.py --max-files 20 --iterations 2

For Quality

MODEL_NAME=codellama
API_TIMEOUT=900
MAX_RETRIES=5
CRITIQUE_THRESHOLD=0.9

Command: python run.py --max-files 100 --iterations 5

For Reliability

MAX_RETRIES=5
RETRY_DELAY=5
API_TIMEOUT=600

Troubleshooting Configuration

Configuration Not Loading

Settings not applied

Check:

1. .env file exists in the project root
2. No syntax errors in .env
3. Environment variables are properly formatted

# Verify .env location
ls -la .env

# Check for syntax errors
cat .env
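
If the file exists but settings still are not applied, check that every expected key actually parses. With python-dotenv (assumed here), a malformed line simply comes back missing, which makes gaps easy to spot:

from pathlib import Path
from dotenv import dotenv_values

assert Path(".env").exists(), ".env not found in the current directory"

values = dotenv_values(".env")
for key in ["OLLAMA_API_URL", "MODEL_NAME", "API_TIMEOUT"]:
    if key not in values:
        print(f"Missing or malformed: {key}")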

Model Not Found

Model 'xxx' not found

Solution:

# List available models
ollama list

# Pull the model
ollama pull llama2:7b

# Update .env
MODEL_NAME=llama2:7b

Connection Issues

Cannot connect to Ollama

Solution:

# Check if Ollama is running
ollama list

# Start Ollama if needed
ollama serve

# Verify URL in .env
OLLAMA_API_URL=http://localhost:11434/api/generate

Timeout Issues

Request timeout

Solutions:

1. Increase timeout: API_TIMEOUT=600
2. Reduce files: --max-files 20
3. Use a faster model: MODEL_NAME=llama2:7b
4. Reduce iterations: --iterations 2

Configuration Best Practices

1. Start Conservative

# Begin with fast, reliable settings
MODEL_NAME=llama2:7b
API_TIMEOUT=300
MAX_RETRIES=3

2. Monitor Performance

# Use verbose mode to see timing
python run.py --verbose

# Check logs
tail -f ai_agent.log

3. Optimize Based on Results

  • Timeout often? → Increase API_TIMEOUT
  • Poor quality? → Use better model
  • Slow? → Reduce files or iterations
  • Unreliable? → Increase retries

4. Document Your Settings

# Add comments to your .env
# This configuration is optimized for:
# - Project type: Backend APIs
# - Typical size: 50-100 files
# - Quality level: Production
MODEL_NAME=codellama
# ... rest of config

Next Steps

✅ Configuration complete!

Configuration Reference

All Variables

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| OLLAMA_API_URL | URL | https://ollama.com/api/generate | API endpoint |
| OLLAMA_API_KEY | String | Empty | API key (optional) |
| MODEL_NAME | String | gpt-oss:120b-cloud | LLM model name |
| API_TIMEOUT | Integer | 300 | Timeout in seconds |
| MAX_RETRIES | Integer | 3 | Retry attempts |
| RETRY_DELAY | Integer | 2 | Base retry delay (seconds) |
| ENABLE_CACHING | Boolean | true | Enable caching (not yet implemented) |
| CRITIQUE_THRESHOLD | Float | 0.8 | Quality threshold |

Valid Values

  • Model Names: any model available in ollama list
  • Timeouts: 60-1800 seconds (1 min - 30 min)
  • Retries: 0-10 attempts
  • Delays: 1-10 seconds
  • Threshold: 0.0-1.0
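
These ranges can be enforced with a quick sanity check before a run; an illustrative sketch (the agent may validate differently):

import os

def check(name: str, default: str, lo: float, hi: float) -> None:
    value = float(os.getenv(name, default))
    if not lo <= value <= hi:
        print(f"{name}={value} is outside the valid range [{lo}, {hi}]")

check("API_TIMEOUT", "300", 60, 1800)
check("MAX_RETRIES", "3", 0, 10)
check("RETRY_DELAY", "2", 1, 10)
check("CRITIQUE_THRESHOLD", "0.8", 0.0, 1.0)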