Building a Flexible AI Model Selection System: Gateways, Providers, and Smart Routing
When building AI-powered applications, you quickly face a fundamental question: how do you let users choose their AI models while maintaining control, flexibility, and cost efficiency?
We needed a system that could:
- Support multiple AI providers (OpenAI, Anthropic, Groq, DeepSeek, etc.)
- Route through gateways (Vercel AI Gateway, OpenRouter) for centralized monitoring
- Allow direct provider access for users with their own API keys
- Respect user preferences while enforcing organization policies
- Gracefully fall back when configurations are missing
- Handle edge cases like models with nested naming conventions
This post walks through how we solved this problem.
Architecture Overview
Our AI model selection system has three main layers: routing (gateways vs. direct providers), a cascading preference hierarchy that picks the model, and provider mappings that pick the route for that model. The sections below cover each in turn.
Gateways vs Direct Providers: Understanding the Difference
AI Gateways
Think of an AI Gateway as a smart proxy that sits between your application and AI providers. We primarily use Vercel AI Gateway.
How gateway routing works:
The gateway uses a simple prefix system to route requests:
// These all go through the same gateway endpoint:
gateway("openai/gpt-4o") // → Routes to OpenAI
gateway("anthropic/claude-3") // → Routes to Anthropic
gateway("groq/llama-3.3-70b") // → Routes to Groq
When to use a gateway:
- You want unified monitoring across providers
- You need centralized cost tracking
- You want automatic provider fallbacks
- You prefer a single API key for all providers
Direct Providers
Direct providers connect straight to the AI service without an intermediary:
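For example, a direct OpenAI connection with the AI SDK looks like this (a minimal sketch, assuming the @ai-sdk/openai package):

import { createOpenAI } from "@ai-sdk/openai";

// No gateway hop: requests go straight to api.openai.com
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const model = openai("gpt-4o");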
When to use direct providers:
- Your organization has direct contracts with providers
- You need provider-specific features not available through gateways
- You want lower latency (no gateway hop)
- You need to use the provider's native API exactly
The Model Selection Hierarchy
One of our key innovations is a cascading preference system. When a request comes in, we check multiple levels to determine which model to use:
Priority 1: Organization Enforced Preference
When an organization sets enforceForMembers: true, ALL team members must use that model. This cannot be overridden by user preferences.
Use Case: Compliance requirements - "All chat interactions must use Claude for safety reasons"
Priority 2-3: Agent-Specific Preferences
Users and organizations can set preferences for specific agents. For example: "Use GPT-4o for the Code Review Agent"
Priority 4-5: Task Type Preferences
Preferences can be set by task category:
- SIMPLE: Quick tasks like title generation
- COMPLEX: Document generation, detailed analysis
- REASONING: Tasks requiring deep logical thinking
- CHAT: Interactive conversations
- TOOL_CALLING: Tasks requiring function/tool execution
Priority 6-7: Defaults
System defaults and hardcoded fallbacks ensure the system always works, even with no configuration.
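In code, the cascade is a short-circuiting lookup that returns the first match. A simplified sketch of getEffectiveModelPreference (the find* helpers are hypothetical placeholders for the real database queries):

type TaskType = "SIMPLE" | "COMPLEX" | "REASONING" | "CHAT" | "TOOL_CALLING";

interface ModelPreference {
  model: { id: string; providerMappings: unknown[] };
  overrideProvider?: string;
  source: string; // which level of the hierarchy matched
}

async function getEffectiveModelPreference(
  userId: string,
  organizationId: string | undefined,
  taskType: TaskType,
  agentId?: string
): Promise<ModelPreference | null> {
  // 1. Organization-enforced preference beats everything
  if (organizationId) {
    const enforced = await findOrgEnforcedPreference(organizationId, taskType);
    if (enforced) return enforced;
  }

  // 2-3. Agent-specific preferences (user first, then org)
  if (agentId) {
    const agentPref =
      (await findUserAgentPreference(userId, agentId)) ??
      (organizationId ? await findOrgAgentPreference(organizationId, agentId) : null);
    if (agentPref) return agentPref;
  }

  // 4-5. Task-type preferences (user first, then org)
  const taskPref =
    (await findUserTaskPreference(userId, taskType)) ??
    (organizationId ? await findOrgTaskPreference(organizationId, taskType) : null);
  if (taskPref) return taskPref;

  // 6-7. Nothing matched: the caller falls through to system defaults
  // and hardcoded fallbacks
  return null;
}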
Provider Mappings: Same Model, Multiple Routes
Here's where it gets interesting. The same logical model (like "GPT-4o") can be accessed through different providers:
When we select a model, we also need to pick which route to use. Our priority:
- User's explicit override (if they specified a provider for this model)
- User's default provider (their configured gateway/direct provider)
- First available mapping (fallback)
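In code, this is a first-match scan over the model's mappings. A simplified sketch (the caller in selectModelDynamic below merges the override and the user's default into a single preferred value):

interface ProviderMapping {
  provider: string;        // e.g. "GROQ", "VERCEL_GATEWAY"
  providerModelId: string; // e.g. "groq/llama-3.3-70b-versatile"
}

function selectProviderMapping(
  mappings: ProviderMapping[],
  preferredProvider?: string // override first, then the user's default provider
): ProviderMapping {
  return (
    mappings.find((m) => m.provider === preferredProvider) || // 1-2. explicit preference
    mappings[0]                                               // 3. first available mapping
  );
}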
Real-World Scenarios
Let me walk through some concrete examples:
Scenario 1: Simple Direct Access
Setup: User has OpenAI API key configured directly.
Result: Direct API call to api.openai.com
Scenario 2: Gateway with Multiple Providers
Setup: User has Vercel AI Gateway with OpenAI, Anthropic, and Groq enabled.
Result: Call through Vercel Gateway to Groq
Scenario 3: The Tricky Nested Prefix Case
This one caused us some headaches! Groq hosts some OpenAI open-source models with interesting names:
Groq's Model: openai/gpt-oss-120b
(This is an OpenAI model running on Groq's infrastructure)
The Problem: Our internal model IDs prefix the hosting provider, so this model is stored as groq/openai/gpt-oss-120b. Sent to the gateway as-is, the name carries two provider prefixes, and the gateway doesn't recognize it.
The Solution: Strip the redundant groq/ prefix before the name reaches the gateway, so the gateway receives the ID it actually knows: openai/gpt-oss-120b.
The code:
function normalizeModelForGateway(modelName: string): string {
// Handle "groq/openai/gpt-oss-*" pattern
if (modelName.startsWith("groq/") && modelName.includes("/", 5)) {
return modelName.slice(5); // Remove "groq/"
}
return modelName;
}
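A quick sanity check of both cases:

normalizeModelForGateway("groq/openai/gpt-oss-120b");     // → "openai/gpt-oss-120b"
normalizeModelForGateway("groq/llama-3.3-70b-versatile"); // → unchanged (single prefix)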
Scenario 4: Organization Enforcement
Setup: Organization mandates Claude for all chat (compliance requirement).
Result: Claude used regardless of user preference
Scenario 5: Same Provider via Multiple Routes (Multi-Gateway)
Setup: User has Groq configured as a direct provider AND through multiple gateways. This is common when organizations want flexibility between direct access and gateway features.
User Configuration:
├── Direct Provider: GROQ (API Key: gsk_xxx, isDefault: false)
├── Vercel AI Gateway (API Key: vck_xxx, isDefault: true, enabledProviders: [groq, openai])
└── Cloudflare AI Gateway (API Key: cf_xxx, isDefault: false, enabledProviders: [groq])
Model: llama-3.3-70b-versatile
Available Mappings:
├── GROQ → "llama-3.3-70b-versatile"
├── VERCEL_GATEWAY → "groq/llama-3.3-70b-versatile"
└── CLOUDFLARE_AI → "@cf/meta/llama-3.3-70b-instruct"
How does routing work?
The key is the isDefault: true flag:
- The system queries for the user's default provider config
- Vercel Gateway has isDefault: true → it's selected as preferredProvider
- The model mapping for VERCEL_GATEWAY is used: groq/llama-3.3-70b-versatile
- The request goes: App → Vercel Gateway → Groq API
Overriding the default route:
Users can force a specific route using overrideProvider in their model preference:
-- Force direct Groq for llama models
INSERT INTO user_model_preference
(user_id, task_type, model_id, override_provider)
VALUES
('user123', 'CHAT', 'llama-3.3-70b', 'GROQ');
Now the request goes: App → Groq API (bypassing gateway)
Decision Matrix:
| Default Provider | Override | Route Used | Path |
|-----------------|----------|------------|------|
| VERCEL_GATEWAY | (none) | Vercel Gateway | App → Vercel → Groq |
| VERCEL_GATEWAY | GROQ | Direct Groq | App → Groq |
| VERCEL_GATEWAY | CLOUDFLARE_AI | Cloudflare | App → Cloudflare → Groq |
| GROQ | (none) | Direct Groq | App → Groq |
Why have multiple routes to the same provider?
| Route | Use Case |
|-------|----------|
| Direct Groq | Lower latency, no gateway overhead |
| Via Vercel Gateway | Centralized logging, cost tracking, fallbacks |
| Via Cloudflare AI | Edge caching, geographic routing |
Scenario 6: Agent with Task-Specific Provider Override
Setup: User has both Vercel AI Gateway (default) and OpenAI Direct configured. User sets OpenAI Direct as the override provider for TOOL_CALLING tasks.
The Key Innovation:
When a user sets an overrideProvider for a specific task type, the system now:
- Bypasses token exchange: instead of using the default provider's token, it passes the override provider's API key directly
- Uses the correct gateway URL: it routes to the override provider's endpoint (e.g., api.openai.com instead of the Vercel Gateway)
- Falls back gracefully: if the override provider has no API key configured, it falls back to the default provider
Database configuration:
-- User's provider configs
INSERT INTO user_cloud_provider_config VALUES
('config1', 'user123', 'VERCEL_GATEWAY', TRUE, TRUE, '{"apiKey": "vck_xxx"}'),
('config2', 'user123', 'OPENAI_DIRECT', FALSE, TRUE, '{"apiKey": "sk-xxx"}');
-- Task-specific override for TOOL_CALLING
INSERT INTO user_model_preference
(user_id, task_type, model_id, override_provider)
VALUES
('user123', 'TOOL_CALLING', 'gpt-4o', 'OPENAI_DIRECT');
Result:
- CHAT tasks → Vercel Gateway (default)
- TOOL_CALLING tasks → OpenAI Direct (override)
How Agent Provider Routing Works
The system has two modes for providing AI credentials to agents:
Mode 1: Token Exchange (Default Provider)
When there's no override, the agent exchanges a short-lived token for the default provider's credentials, so it never handles the raw API key.
Mode 2: Direct Pass-Through (Override Provider)
When the user has set an override provider, the decrypted API key is passed directly in the agent's configuration. A sketch of both modes:
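This is a condensed sketch of how the two modes fit together (resolveAgentCredentials and its helpers are illustrative, not our actual API; the ai_api_key field matches the precedence example later in the post):

// Sketch only: helper functions are hypothetical
async function resolveAgentCredentials(userId: string, taskType: TaskType) {
  const pref = await getEffectiveModelPreference(userId, undefined, taskType);

  if (pref?.overrideProvider) {
    // Mode 2: direct pass-through of the override provider's key
    const apiKey = await getDecryptedApiKey(userId, pref.overrideProvider);
    if (apiKey) {
      return { ai_api_key: apiKey, mode: "direct-pass-through" };
    }
    // No key configured for the override → fall through to Mode 1
  }

  // Mode 1: token exchange; the agent only ever sees a short-lived
  // credential for the user's default provider, never the raw key
  const exchanged = await exchangeTokenForDefaultProvider(userId);
  return { ai_api_key: exchanged.apiKey, mode: "token-exchange" };
}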
Why Two Modes?
| Mode | When Used | Benefit |
|------|-----------|---------|
| Token Exchange | Default provider, no override | Secure - agents never see raw API keys |
| Direct Pass-Through | Override provider set | Respects user's task-specific routing preference |
The direct pass-through is necessary because the token exchange endpoint only knows about the default provider. When users configure different providers for different task types (e.g., OpenAI Direct for tool calling), the override provider's credentials must be passed explicitly.
Implementation Deep Dive
The Model Selection Function
async function selectModelDynamic(
options: {
taskType: "SIMPLE" | "COMPLEX" | "REASONING" | "CHAT" | "TOOL_CALLING";
complexity?: "simple" | "medium" | "complex";
requiresToolCalling?: boolean;
preferredProvider?: string;
},
context: {
userId: string;
organizationId?: string;
agentId?: string;
}
): Promise<ModelSelectionResult> {
// 1. Check database for effective preference
const preference = await getEffectiveModelPreference(
context.userId,
context.organizationId,
options.taskType,
context.agentId
);
if (preference?.model) {
// 2. Select the right provider mapping
const mapping = selectProviderMapping(
preference.model.providerMappings,
preference.overrideProvider || options.preferredProvider
);
return {
modelId: preference.model.id,
providerModelId: mapping.providerModelId,
provider: mapping.provider,
source: preference.source
};
}
// 3. Fall back to system default
const defaultModel = await getTaskDefaultModel(options.taskType);
if (defaultModel) {
return { /* default model result */ };
}
// 4. Last resort: hardcoded fallback
return getHardcodedFallback(options.taskType);
}
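An example call, with an illustrative result (the concrete values here are hypothetical):

const result = await selectModelDynamic(
  { taskType: "CHAT" },
  { userId: "user123", organizationId: "org456" }
);
// e.g. → {
//   modelId: "llama-3.3-70b",
//   providerModelId: "groq/llama-3.3-70b-versatile",
//   provider: "VERCEL_GATEWAY",
//   source: "USER_TASK_PREFERENCE"
// }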
The Model Instantiation Function
function getModel(
modelName: string,
context?: { apiKey?: string }
): LanguageModel {
// 1. Normalize for gateway (handle nested prefixes)
const normalized = normalizeModelForGateway(modelName);
// 2. Route based on configuration
if (context?.apiKey) {
// Custom API key provided (per-user or per-org)
const gateway = createGateway({ apiKey: context.apiKey });
return gateway(formatModelName(normalized));
}
if (globalGatewayKey) {
// Use global gateway from environment
return globalGateway(formatModelName(normalized));
}
  // 3. Fall back to direct providers
  const provider = extractProvider(modelName);
  switch (provider) {
    case "groq": return groqProvider(extractModelName(modelName));
    case "openai": return openaiProvider(extractModelName(modelName));
    case "anthropic": return anthropicProvider(extractModelName(modelName));
    default:
      throw new Error(`No gateway key and no direct provider for: ${modelName}`);
  }
}
Database Design
Our schema supports the full flexibility of the system:
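A condensed sketch of the two core tables (column names follow the INSERT examples earlier in this post; the real schema has more fields, constraints, and org-level equivalents):

-- Sketch: columns inferred from the examples above, not the full schema
CREATE TABLE user_cloud_provider_config (
  id         TEXT PRIMARY KEY,
  user_id    TEXT NOT NULL,
  provider   TEXT NOT NULL,    -- e.g. 'VERCEL_GATEWAY', 'OPENAI_DIRECT', 'GROQ'
  is_default BOOLEAN NOT NULL, -- at most one default route per user
  is_enabled BOOLEAN NOT NULL,
  config     JSONB NOT NULL    -- encrypted API key, enabled providers, etc.
);

CREATE TABLE user_model_preference (
  user_id           TEXT NOT NULL,
  task_type         TEXT NOT NULL, -- SIMPLE | COMPLEX | REASONING | CHAT | TOOL_CALLING
  model_id          TEXT NOT NULL,
  override_provider TEXT           -- optional route override for this task type
);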
Default Models by Task Type
We've carefully selected defaults optimized for each task type:
| Task Type | Default Model | Provider | Why |
|-----------|--------------|----------|-----|
| SIMPLE | llama-3.1-8b-instant | Groq | Ultra-fast inference |
| COMPLEX | llama-3.3-70b-versatile | Groq | Balance of speed and capability |
| REASONING | deepseek-r1-distill-llama-70b | Groq | Specialized for deep thinking |
| CHAT | llama-3.3-70b-versatile | Groq | Natural conversations |
| TOOL_CALLING | openai/gpt-oss-120b | Groq | Reliable function calling |
| EMBEDDING | text-embedding-3-small | OpenAI | Industry standard |
| IMAGE | dall-e-3 | OpenAI | Best quality |
Lessons Learned
1. Gateway Model Names Are Tricky
Gateways use provider/model format, but what happens when the model name itself contains a /? We learned this the hard way with Groq's openai/gpt-oss-120b model.
Solution: Normalize model names before sending to gateway.
2. Preference Hierarchies Need Clear Priority
Without clear priority rules, you get conflicts. "User wants X, org wants Y, system default is Z" - who wins?
Solution: Document and implement a strict priority order. Organization enforcement trumps everything.
3. Fallbacks Are Essential
What happens when the database is down? When a user hasn't configured anything? You need graceful degradation.
Solution: Multiple fallback layers - system defaults → env vars → hardcoded values.
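A minimal sketch of that last layer, assuming env-var overrides named like the FALLBACK_* variables in the configuration reference below (the real getHardcodedFallback returns a full selection result rather than a bare string):

// Sketch: env override first, then a hardcoded constant
const HARDCODED_FALLBACKS: Record<string, string> = {
  SIMPLE: "groq/llama-3.1-8b-instant",
  COMPLEX: "groq/llama-3.3-70b-versatile",
  // ...one entry per task type
};

function getFallbackModel(taskType: string): string {
  return (
    process.env[`FALLBACK_${taskType}_MODEL`] ?? // e.g. FALLBACK_SIMPLE_MODEL
    HARDCODED_FALLBACKS[taskType]
  );
}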
4. Provider Mappings Enable Flexibility
The same model through different providers might have different names, pricing, or capabilities.
Solution: Separate the "what" (model) from the "how" (provider) in your database design.
5. Token Exchange Has Limits
We initially used token exchange for all agent API key resolution. But token exchange only knows about the default provider - it can't respect task-specific overrides.
Problem:
User selects: "GPT-4o via OpenAI Direct" for TOOL_CALLING
Token exchange returns: Vercel Gateway config (default)
Agent routes: GPT-4o through Vercel Gateway (wrong!)
Solution: For override providers, bypass token exchange and pass the decrypted API key directly in the agent's configurable. This allows agents to use the user's task-specific provider preference.
6. Order of Precedence Matters in Unified Servers
Our unified server was always overwriting ai_api_key with the token exchange result, ignoring any directly-passed keys.
The Bug:
// Bad: Always uses token exchange, ignores passed keys
ai_api_key: exchangedCredentials?.apiKey,
The Fix:
// Good: Direct pass-through takes precedence over token exchange
ai_api_key:
configurable.ai_api_key || // Override provider's key (direct)
context.ai_api_key || // From CopilotKit context
exchangedCredentials?.apiKey, // Token exchange (fallback)
Configuration Quick Reference
Environment Variables
# Primary Gateway (recommended)
AI_GATEWAY_API_KEY=vck_xxxxx # Vercel AI Gateway key
# Direct Provider Fallbacks (optional)
OPENAI_API_KEY=sk-xxxxx # Direct OpenAI
ANTHROPIC_API_KEY=sk-ant-xxxxx # Direct Anthropic
GROQ_API_KEY=gsk_xxxxx # Direct Groq
# Fallback Models (override defaults)
FALLBACK_SIMPLE_MODEL=groq/llama-3.1-8b-instant
FALLBACK_COMPLEX_MODEL=groq/llama-3.3-70b-versatile
Conclusion
Building a flexible AI model selection system requires thinking about:
- Multiple access paths (gateways vs direct providers)
- Preference hierarchies (user → org → system)
- Provider abstraction (same model, different routes)
- Edge cases (nested names, missing configs, enforcement)
- Graceful degradation (fallbacks at every level)
The result is a system that gives users control while maintaining organizational oversight and operational reliability.
What's Next?
We're continuing to improve this system:
- ✅ Task-specific provider routing: Users can now route different task types through different providers (e.g., OpenAI Direct for tool calling, Anthropic for reasoning)
- ✅ Agent provider override: LangGraph agents respect user's override provider preferences
- Cost optimization: Automatic routing to cheaper providers for simple tasks
- Latency-based routing: Choose providers based on response time
- A/B testing: Easy model comparison for quality evaluation
- Usage analytics: Per-user and per-org cost tracking
Have questions about implementing something similar? Reach out - we're happy to discuss!
See also: Technical Architecture Documentation
