Database Architecture

Prompt Alchemy uses SQLite as its primary storage engine: a lightweight, reliable, self-contained database that requires no external services or dependencies.

Overview

The database stores all prompt data, metadata, metrics, and learning information in a single SQLite file located at ~/.prompt-alchemy/prompts.db by default.

Recent Schema Enhancements (v1.1.0)

  • Enhanced ModelMetadata: Comprehensive tracking of model usage, costs, and performance
  • Complete Storage API: Fully implemented CRUD operations for all entities
  • Improved Search: Text-based search capabilities across content and metadata
  • Better Lifecycle Tracking: Enhanced prompt lifecycle and usage analytics

Schema Design

Core Tables

prompts - Enhanced prompt storage

CREATE TABLE prompts (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    phase TEXT NOT NULL,
    provider TEXT NOT NULL,
    model TEXT NOT NULL,
    temperature REAL,
    max_tokens INTEGER,
    actual_tokens INTEGER,
    tags TEXT,
    parent_id TEXT,
    session_id TEXT NOT NULL,
    
    -- Lifecycle management
    source_type TEXT NOT NULL,              -- How prompt was created
    enhancement_method TEXT,                -- How it was improved
    relevance_score REAL DEFAULT 0.0,      -- Dynamic relevance (0.0-1.0)
    usage_count INTEGER DEFAULT 0,         -- Times accessed/used
    generation_count INTEGER DEFAULT 0,    -- Prompts this generated
    last_used_at DATETIME,                 -- Last access timestamp
    
    -- Original input tracking
    original_input TEXT,                   -- Original user input
    persona_used TEXT,                     -- Persona used for generation
    target_model_family TEXT,             -- Target model family
    target_use_case TEXT,                 -- Target use case
    
    -- Embedding support
    embedding BLOB,
    embedding_model TEXT,
    embedding_provider TEXT,
    
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (parent_id) REFERENCES prompts(id)
);

model_metadata - Enhanced generation metadata

CREATE TABLE model_metadata (
    id TEXT PRIMARY KEY,                    -- UUID identifier
    prompt_id TEXT NOT NULL,               -- Associated prompt ID
    generation_model TEXT NOT NULL,        -- Model used for generation
    generation_provider TEXT NOT NULL,     -- Provider used for generation
    embedding_model TEXT,                  -- Model used for embeddings
    embedding_provider TEXT,               -- Provider used for embeddings
    model_version TEXT,                    -- Specific model version
    api_version TEXT,                      -- API version used
    processing_time INTEGER NOT NULL,      -- Processing time in milliseconds
    input_tokens INTEGER NOT NULL,         -- Number of input tokens
    output_tokens INTEGER NOT NULL,        -- Number of output tokens
    total_tokens INTEGER NOT NULL,         -- Total tokens (input + output)
    cost REAL,                            -- Cost in USD if available
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (prompt_id) REFERENCES prompts(id)
);
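
For example, this metadata makes provider-level cost reporting a single query (a sketch against the schema above):

-- Total generations, token usage, and estimated spend per provider
SELECT generation_provider,
       COUNT(*)            AS generations,
       SUM(total_tokens)   AS tokens,
       ROUND(SUM(cost), 4) AS total_cost_usd
FROM model_metadata
GROUP BY generation_provider
ORDER BY total_cost_usd DESC;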

metrics - Performance tracking

CREATE TABLE metrics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    prompt_id TEXT NOT NULL,
    token_usage INTEGER,
    response_time_ms INTEGER,
    success_rate REAL,
    user_rating INTEGER,
    feedback TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (prompt_id) REFERENCES prompts(id)
);

enhancement_history - Version control

CREATE TABLE enhancement_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    prompt_id TEXT NOT NULL,
    updated_content TEXT NOT NULL,
    update_reason TEXT,
    updated_by TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (prompt_id) REFERENCES prompts(id)
);

Learning Tables

learning_patterns - Adaptive learning data

CREATE TABLE learning_patterns (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    pattern_type TEXT NOT NULL,
    pattern_data TEXT NOT NULL,
    confidence_score REAL DEFAULT 0.0,
    usage_count INTEGER DEFAULT 0,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

user_feedback - Feedback collection

CREATE TABLE user_feedback (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    prompt_id TEXT NOT NULL,
    rating INTEGER CHECK (rating >= 1 AND rating <= 5),
    feedback_type TEXT,
    feedback_text TEXT,
    session_id TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (prompt_id) REFERENCES prompts(id)
);
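
As an illustration, feedback can be aggregated per prompt to surface top-rated content (a sketch against the schema above; the threshold of three ratings is arbitrary):

-- Top-rated prompts by average user feedback
SELECT p.id, p.phase,
       ROUND(AVG(f.rating), 2) AS avg_rating,
       COUNT(f.id)             AS ratings
FROM prompts p
JOIN user_feedback f ON f.prompt_id = p.id
GROUP BY p.id, p.phase
HAVING COUNT(f.id) >= 3
ORDER BY avg_rating DESC
LIMIT 10;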

Data Types and Constraints

Text Fields

  • id: UUID format (e.g., 550e8400-e29b-41d4-a716-446655440000)
  • phase: One of prima-materia, solutio, coagulatio
  • provider: One of openai, anthropic, google, openrouter, ollama
  • persona_used: One of code, writing, analysis, generic
  • tags: Comma-separated values (e.g., api,backend,sql)

Numeric Fields

  • relevance_score: 0.0 to 1.0 (higher is better)
  • usage_count: Non-negative integer
  • token_usage: Non-negative integer
  • response_time_ms: Non-negative integer
  • user_rating: 1 to 5 (integer)

Binary Fields

  • embedding: Vector embeddings as BLOB (typically 1536 dimensions for OpenAI)
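
To make these conventions concrete, here is a minimal insert that satisfies the NOT NULL constraints; the specific values (including the source_type) are hypothetical:

-- Hypothetical example row; values chosen to match the conventions above
INSERT INTO prompts (id, content, phase, provider, model,
                     session_id, source_type, tags, persona_used)
VALUES ('550e8400-e29b-41d4-a716-446655440000',
        'Design a REST endpoint for user login',
        'prima-materia',          -- one of the three phase values
        'openai',                 -- one of the supported providers
        'gpt-4o',                 -- model identifier is free-form text
        'session-001',            -- hypothetical session ID
        'generated',              -- hypothetical source_type value
        'api,backend,sql',        -- comma-separated tags
        'code');                  -- one of the persona values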

Indexes

The database includes several indexes for optimal query performance:

-- Primary search indexes
CREATE INDEX idx_prompts_phase ON prompts(phase);
CREATE INDEX idx_prompts_provider ON prompts(provider);
CREATE INDEX idx_prompts_persona ON prompts(persona_used);
CREATE INDEX idx_prompts_created_at ON prompts(created_at);
CREATE INDEX idx_prompts_relevance ON prompts(relevance_score);

-- Tag search index
CREATE INDEX idx_prompts_tags ON prompts(tags);

-- Foreign key indexes
CREATE INDEX idx_model_metadata_prompt_id ON model_metadata(prompt_id);
CREATE INDEX idx_metrics_prompt_id ON metrics(prompt_id);
CREATE INDEX idx_enhancement_history_prompt_id ON enhancement_history(prompt_id);
CREATE INDEX idx_user_feedback_prompt_id ON user_feedback(prompt_id);

-- Learning indexes
CREATE INDEX idx_learning_patterns_type ON learning_patterns(pattern_type);
CREATE INDEX idx_learning_patterns_confidence ON learning_patterns(confidence_score);
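
Whether a given query actually hits one of these indexes can be checked with SQLite's EXPLAIN QUERY PLAN:

-- Confirm the phase index is used for a typical filter
EXPLAIN QUERY PLAN
SELECT id, content FROM prompts WHERE phase = 'solutio';
-- Healthy output mentions: SEARCH prompts USING INDEX idx_prompts_phase (phase=?)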

Data Relationships

Prompt Hierarchy

prompts (parent_id) → prompts (id)
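
Since parent_id is self-referential, a prompt's full refinement chain can be walked with a recursive CTE, for example:

-- Walk the refinement chain from a given prompt back to its root
WITH RECURSIVE lineage(id, parent_id, content, depth) AS (
    SELECT id, parent_id, content, 0 FROM prompts WHERE id = :prompt_id
    UNION ALL
    SELECT p.id, p.parent_id, p.content, l.depth + 1
    FROM prompts p JOIN lineage l ON p.id = l.parent_id
)
SELECT id, content, depth FROM lineage ORDER BY depth;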

Metadata Relationships

prompts (id) → model_metadata (prompt_id)
prompts (id) → metrics (prompt_id)
prompts (id) → enhancement_history (prompt_id)
prompts (id) → user_feedback (prompt_id)
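
These one-to-many links make per-prompt reporting a straightforward join, for example:

-- Combine generation metadata and runtime metrics for recent prompts
SELECT p.id, p.phase, mm.generation_model, mm.total_tokens,
       m.response_time_ms, m.user_rating
FROM prompts p
LEFT JOIN model_metadata mm ON mm.prompt_id = p.id
LEFT JOIN metrics m        ON m.prompt_id  = p.id
WHERE p.created_at >= DATE('now', '-7 days')
ORDER BY p.created_at DESC;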

Database Operations

Migration Management

Run database migrations to update schema:

# Preview migration without changes
prompt-alchemy migrate --dry-run

# Run migration with custom batch size
prompt-alchemy migrate --batch-size 25

# Force migration (skip safety checks)
prompt-alchemy migrate --force

Backup and Restore

# Export all data
prompt-alchemy export --format sql > backup.sql

# Export specific data
prompt-alchemy export --format json --since 2024-01-01 > recent_prompts.json

# Import data
prompt-alchemy import --file backup.sql

Database Maintenance

# Analyze database performance
prompt-alchemy db analyze

# Optimize database (VACUUM)
prompt-alchemy db optimize

# Check database integrity
prompt-alchemy db integrity-check

# Compact database
prompt-alchemy db compact
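
The CLI commands above presumably wrap standard SQLite statements; the same operations can be run directly in a SQLite shell:

-- Rebuild statistics used by the query planner
ANALYZE;

-- Reclaim free pages and defragment the file
VACUUM;

-- Verify on-disk structure; returns 'ok' when healthy
PRAGMA integrity_check;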

Performance Considerations

Query Optimization

  1. Use indexes: Queries on phase, provider, persona, and created_at are optimized
  2. Limit results: Use --limit flag to restrict result sets
  3. Filter early: Apply filters before semantic search operations (see the sketch after this list)
  4. Batch operations: Use batch commands for multiple operations
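
A sketch combining points 1-3: filter on indexed columns and cap the result set before any expensive semantic re-ranking:

-- Indexed filters (phase, created_at) applied first, result set capped
SELECT id, content, embedding
FROM prompts
WHERE phase = 'coagulatio'
  AND created_at >= DATE('now', '-30 days')
ORDER BY relevance_score DESC
LIMIT 50;   -- semantic re-ranking then runs over 50 rows, not the whole table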

Storage Optimization

  1. Embedding compression: Store embeddings efficiently
  2. Regular cleanup: Remove old or ineffective prompts
  3. Database maintenance: Run periodic VACUUM operations
  4. Archive old data: Move old data to separate archive tables (see the sketch below)
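
A minimal archiving sketch, assuming an archive table mirroring prompts (the prompts_archive name is hypothetical):

-- Hypothetical archive table with the same shape as prompts
CREATE TABLE IF NOT EXISTS prompts_archive AS
    SELECT * FROM prompts WHERE 0;

-- Move prompts not used in the last year
INSERT INTO prompts_archive
    SELECT * FROM prompts
    WHERE last_used_at < DATE('now', '-1 year');

DELETE FROM prompts
    WHERE last_used_at < DATE('now', '-1 year');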

Memory Usage

  • Default cache size: 1000 prompts in memory
  • Embedding cache: Configurable size limit
  • Connection handling: Single SQLite connection per process (no pooling required)

Security Considerations

Data Protection

  1. File permissions: Database file should be readable only by the user
  2. Encryption: Consider filesystem-level encryption for sensitive data
  3. Backup security: Encrypt backup files containing API keys
  4. Access control: Limit database file access to authorized users

API Key Storage

API keys are stored in the configuration file, not the database:

  • Location: ~/.prompt-alchemy/config.yaml
  • Permissions: 600 (user read/write only)
  • Format: Plain text (consider environment variables for production)

Monitoring and Analytics

Built-in Metrics

# View database statistics
prompt-alchemy metrics --database

# Check table sizes
prompt-alchemy db stats

# Monitor query performance
prompt-alchemy db performance

Custom Queries

Connect directly to the database for custom analysis:

# Open SQLite shell
sqlite3 ~/.prompt-alchemy/prompts.db

# Example queries
SELECT phase, COUNT(*) FROM prompts GROUP BY phase;
SELECT provider, AVG(relevance_score) FROM prompts GROUP BY provider;
SELECT DATE(created_at), COUNT(*) FROM prompts GROUP BY DATE(created_at);

Troubleshooting

Common Issues

Database locked errors:

# Check for other processes
lsof ~/.prompt-alchemy/prompts.db

# Restart the application
pkill prompt-alchemy

Corrupted database:

# Check integrity
prompt-alchemy db integrity-check

# Recover from a SQL backup (re-import into a fresh database)
mv ~/.prompt-alchemy/prompts.db ~/.prompt-alchemy/prompts.db.corrupt
prompt-alchemy import --file backup.sql

Performance issues:

# Analyze query performance
prompt-alchemy db analyze

# Optimize database
prompt-alchemy db optimize

Logs and Debugging

Database operations are logged to:

  • Location: ~/.prompt-alchemy/logs/prompt-alchemy.log
  • Level: DEBUG for database operations
  • Rotation: Automatic daily rotation

Migration Strategy

Version Compatibility

  • Backward compatibility: New versions maintain compatibility with existing databases
  • Automatic migration: Schema updates are applied automatically
  • Rollback support: Older application versions can still read databases migrated by newer versions
  • Data preservation: All existing data is preserved during migrations

Migration Process

  1. Pre-migration backup: Automatic backup before schema changes
  2. Validation: Verify data integrity after migration
  3. Rollback: Automatic rollback on migration failure
  4. Notification: Clear feedback on migration status
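
SQLite itself provides a lightweight slot for recording the schema version; whether Prompt Alchemy uses user_version or a dedicated migrations table is an implementation detail, but the general pattern is:

-- Read the current schema version (0 on a fresh database)
PRAGMA user_version;

-- After applying a migration, record the new version
PRAGMA user_version = 2;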

Next Steps