# AI Models
Configure the AI models used for chat, embeddings, and other features in ZenSearch.
## Overview
The AI Models settings allow you to:
- View available models
- Add new model configurations
- Set default models
- Monitor model usage
## Accessing Model Settings

1. Click Settings in the sidebar
2. Select the AI Models tab
## Available Models

### Model Types
| Type | Purpose |
|---|---|
| Chat | Conversational AI responses |
| Embedding | Document vectorization |
| Reranker | Result reranking |
### Supported Providers
| Provider | Models |
|---|---|
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus |
| Cohere | Command, Command-R |
| Custom | OpenAI-compatible endpoints |
## Adding Models

### Add a New Model

1. Click Add Model
2. Select the provider
3. Choose the model
4. Enter an API key (if required)
5. Click Add
### Configuration Fields
| Field | Description |
|---|---|
| Provider | Model provider (OpenAI, Anthropic, etc.) |
| Model | Specific model name |
| API Key | Provider API key |
| Endpoint | Custom endpoint URL (if applicable) |
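As an illustration, a custom-provider entry might combine these fields as follows. The JSON shape here is an assumption for readability; ZenSearch's actual storage format is not specified.

```json
{
  "provider": "custom",
  "model": "local-llama",
  "api_key": "YOUR_API_KEY",
  "endpoint": "http://localhost:8000/v1"
}
```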
## Default Models

### Setting Defaults

Set a default model for each use case:

1. Find the model in the list
2. Click Set as Default
3. Select the use case (Chat, Embedding, or Reranker)
### Default Assignment
| Use Case | Recommendation |
|---|---|
| Chat | GPT-4o or Claude 3.5 Sonnet |
| Embedding | text-embedding-3-small |
| Reranker | Cohere rerank-v3 |
## Model Usage

### Viewing Usage

Navigate to the Model Usage tab to see:
- Tokens consumed per model
- Cost breakdown
- Usage over time
- Per-team breakdown
### Usage Metrics
| Metric | Description |
|---|---|
| Input Tokens | Tokens sent to model |
| Output Tokens | Tokens received from model |
| Total Cost | Estimated cost |
| Request Count | Number of API calls |
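Total Cost is derived from the token counts and per-token pricing. A minimal sketch of that arithmetic (the model name and prices below are placeholders, not ZenSearch's or any provider's actual rates):

```python
# Estimate request cost from token counts and per-million-token prices.
# Prices here are illustrative placeholders; check your provider's pricing page.
PRICES_PER_MILLION = {
    "example-chat-model": {"input": 2.50, "output": 10.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    p = PRICES_PER_MILLION[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 1,000 input + 500 output tokens at the placeholder rates:
cost = estimate_cost("example-chat-model", 1000, 500)  # 0.0075 USD
```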
## Testing Models

### Test Connection

Before saving, test the model:

1. Click Test Connection
2. Wait for verification
3. Check for errors
### Test Results
| Result | Meaning |
|---|---|
| Success | Model is accessible |
| Auth Error | API key is invalid |
| Network Error | Cannot reach endpoint |
| Model Error | Model not available |
## Custom Endpoints

### OpenAI-Compatible APIs

For local or self-hosted models:

```
Provider: Custom
Endpoint: http://localhost:8000/v1
Model: local-llama
API Key: (optional)
```
### Supported Endpoints
- Ollama
- LM Studio
- vLLM
- Text Generation Inference
## Best Practices

### Model Selection
- Use GPT-4o or Claude for complex queries
- Use faster models for simple tasks
- Consider cost vs. quality tradeoffs
- Test models before production use
### API Key Security
- Never share API keys
- Rotate keys periodically
- Use separate keys per environment
- Monitor for unauthorized usage
## Troubleshooting

### Model Not Responding
- Verify API key is valid
- Check provider status page
- Test connection in settings
- Review rate limits
### High Costs
- Review model usage dashboard
- Consider using smaller models
- Optimize query complexity
- Set usage limits
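"Set usage limits" can be as simple as a budget check before dispatching a request. An illustrative sketch (the function and default limit are hypothetical, not a ZenSearch API):

```python
def within_budget(spent_usd: float, next_request_usd: float,
                  monthly_limit_usd: float = 100.0) -> bool:
    """Return True if the next request would stay within the monthly limit."""
    return spent_usd + next_request_usd <= monthly_limit_usd

within_budget(99.50, 0.40)   # True: 99.90 is under the 100.00 limit
within_budget(99.50, 0.75)   # False: 100.25 exceeds it
```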
## Next Steps
- Guardrails - Configure safety features
- API Keys - Manage API access