Image Models

A guide to image model configuration and pricing on the ChatBotKit platform

Image models in ChatBotKit are configured to generate images from text prompts and, in some cases, from image inputs as well. They share the same core identification parameters as language models but differ in that they are priced per generation rather than per token.

Understanding the configuration fields helps you choose the right model for your visual generation tasks and accurately predict costs across different providers and quality tiers.

Core Identification Parameters

Every image model is identified through the same core parameters as language models:

Provider: Identifies the organization or service that supplies the model (e.g., 'openai', 'bedrock', 'openrouter', 'vercel'). Different providers offer models with varying generation quality, style characteristics, and pricing.

Family: Groups related models together (e.g., 'gpt-image', 'dalle', 'gemini'). Models in the same family share an underlying architecture but may differ in quality tier or speed.

Features: An array specifying the model's capabilities. Image models typically have an empty features array or use feature flags to indicate support for image input alongside text.
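As a rough illustration, the three identification fields could be modeled as a small record. The field names provider, family, and features come from the descriptions above. The class name, the helper function, and the 'image-input' flag string are hypothetical and are not taken from the ChatBotKit catalog.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical flag name, used only for illustration.
IMAGE_INPUT_FLAG = "image-input"


@dataclass
class ImageModelIdentity:
    """Core identification fields shared with language models."""

    provider: str  # e.g. 'openai', 'bedrock'
    family: str  # e.g. 'gpt-image', 'dalle'
    features: List[str] = field(default_factory=list)  # often empty


def accepts_image_input(model: ImageModelIdentity) -> bool:
    """Return True if the features array advertises image input."""
    return IMAGE_INPUT_FLAG in model.features


# A text-only model leaves the features array empty.
text_only = ImageModelIdentity(provider="openai", family="dalle")

# A model that also takes image input carries the (assumed) flag.
editable = ImageModelIdentity(
    provider="openai", family="gpt-image", features=[IMAGE_INPUT_FLAG]
)
```

Checking the features array before sending an image input is one way to avoid requests that a text-only model cannot serve.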

Pricing Configuration

Image model pricing is based on the cost per generation request rather than token consumption. The pricing structure uses the following fields:

tokenRatio: The base cost multiplier for generation requests. Higher values reflect more capable or higher-quality models.

inputTokenRatio (when specified): A separate multiplier for the input side of a request. This applies to models that accept image inputs in addition to text prompts, where input processing carries its own cost.

outputTokenRatio (when specified): A separate multiplier for output generation. When both the input and output ratios are provided, they override the base tokenRatio for more accurate cost calculation.

inputPrice (when specified): A fixed price component for input processing, expressed in cost units per 1,000 requests. Used alongside outputPrice for models with distinct input and output pricing.

outputPrice (when specified): A fixed price component for the generated image output. This is the primary pricing field for most image models and represents the cost per image generated. Multiply it by the expected number of images to estimate total generation costs.
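The override rule and the price arithmetic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not ChatBotKit's billing logic: the helper names, the plain-dictionary model shape, and the sample numbers are all hypothetical, and how the platform combines ratios and prices into a final charge is not specified here.

```python
from typing import Dict, Tuple


def resolve_ratios(model: Dict[str, float]) -> Tuple[float, float]:
    """Return (input_ratio, output_ratio) for a model record.

    When both inputTokenRatio and outputTokenRatio are present they take
    precedence; otherwise the base tokenRatio is used for both sides.
    """
    if "inputTokenRatio" in model and "outputTokenRatio" in model:
        return model["inputTokenRatio"], model["outputTokenRatio"]
    base = model["tokenRatio"]
    return base, base


def estimate_costs(
    model: Dict[str, float], images: int, input_requests: int = 0
) -> Dict[str, float]:
    """Estimate the input and output price components separately.

    Assumes outputPrice is a per-image cost and inputPrice is quoted
    per 1,000 requests, as the field descriptions state.
    """
    output_cost = model.get("outputPrice", 0.0) * images
    input_cost = model.get("inputPrice", 0.0) * input_requests / 1000
    return {"input": input_cost, "output": output_cost}


# Sample values chosen for easy arithmetic, not real prices.
sample_model = {"tokenRatio": 2.0, "inputPrice": 2.0, "outputPrice": 0.25}
ratios = resolve_ratios(sample_model)
costs = estimate_costs(sample_model, images=400, input_requests=500)
```

Keeping the two components separate makes it easier to see how much of an edit workflow's cost comes from image inputs rather than from the generated images.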

Visibility and Lifecycle Management

The same visibility and lifecycle fields from language models apply to image models:

visible: Determines whether the model appears in user-facing model selection interfaces. Hidden models are accessible via the API but do not appear in dropdowns or pickers.

deprecated: Indicates whether the model is deprecated and should be avoided for new projects. Deprecated models continue to function for existing integrations during the transition period.

Regional Configuration

region: The primary region where the model is hosted ('us' or 'eu'). This affects latency and data residency for generation requests.

availableRegions: An array of all regions where the model can be accessed. Choose a region that meets your latency requirements and data residency compliance obligations.
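A region check against these two fields might look like the sketch below. The "id" key and the sample catalog entries are hypothetical, and falling back to the primary region when availableRegions is missing is an assumption made for this example rather than documented platform behavior.

```python
from typing import Dict, List


def models_in_region(models: List[Dict], region: str) -> List[Dict]:
    """Keep models whose availableRegions includes the requested region."""
    result = []
    for model in models:
        # Assumption: if availableRegions is absent, use the primary region.
        regions = model.get("availableRegions") or [model.get("region")]
        if region in regions:
            result.append(model)
    return result


# Hypothetical catalog entries for illustration only.
catalog = [
    {"id": "example-model-a", "region": "us", "availableRegions": ["us", "eu"]},
    {"id": "example-model-b", "region": "us", "availableRegions": ["us"]},
    {"id": "example-model-c", "region": "eu"},
]

eu_model_ids = [m["id"] for m in models_in_region(catalog, "eu")]
```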

Metadata and Classification

tags: An array of strings for categorizing models (e.g., 'beta', 'experimental'). Tags help identify model maturity and any special capabilities or testing status.

addedDate: The date when the model was added to the platform. More recent models typically reflect the latest provider capabilities and pricing.

Choosing the Right Image Model

When selecting and configuring an image model for your use case:

  1. Match the provider to your quality needs: Different providers and families excel at different visual styles and fidelity levels. Evaluate output quality before committing.

  2. Balance cost and quality: Models with a higher outputPrice typically produce higher-fidelity results, but that fidelity may not be necessary for every task. Consider lower-cost models for drafts or high-volume generation.

  3. Account for input pricing: Models that accept image inputs carry an additional input cost. Factor both inputPrice and outputPrice into your cost estimates for edit workflows.

  4. Select appropriate regions: Choose models available in regions that meet your latency and data residency requirements.

  5. Monitor deprecated status: Avoid deprecated models for new projects. Existing integrations that use them will continue to work during transition periods.
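For a draft or high-volume workload where the lowest outputPrice is preferred, the cost, region, and deprecation checks in the list above could be combined roughly as follows. The function name, the "id" key, the optional price ceiling, and the catalog values are hypothetical. Output quality still has to be evaluated by hand, since none of the configuration fields measures it.

```python
from typing import Dict, List, Optional


def pick_cheapest_model(
    models: List[Dict],
    region: str,
    max_output_price: Optional[float] = None,
) -> Optional[Dict]:
    """Return the lowest-outputPrice model that passes the filters."""
    candidates = []
    for model in models:
        if model.get("deprecated", False):
            continue  # skip deprecated models for new projects
        if region not in model.get("availableRegions", []):
            continue  # respect latency and data residency constraints
        if max_output_price is not None and model["outputPrice"] > max_output_price:
            continue  # stay within the per-image budget
        candidates.append(model)
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["outputPrice"])


# Hypothetical catalog; the prices are placeholders, not real rates.
catalog = [
    {"id": "example-draft", "outputPrice": 0.5, "deprecated": False,
     "availableRegions": ["us", "eu"]},
    {"id": "example-premium", "outputPrice": 2.0, "deprecated": False,
     "availableRegions": ["us", "eu"]},
    {"id": "example-legacy", "outputPrice": 0.25, "deprecated": True,
     "availableRegions": ["us", "eu"]},
    {"id": "example-us-only", "outputPrice": 0.125, "deprecated": False,
     "availableRegions": ["us"]},
]

eu_choice = pick_cheapest_model(catalog, region="eu")
```

In this sample, the deprecated entry is skipped even though it is cheaper, and the US-only entry is excluded from the EU selection.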