Advanced Search

ZenSearch provides powerful search capabilities combining semantic understanding with precise filtering. Master these features to find exactly what you need.

Search Architecture

Hybrid Search

ZenSearch uses a hybrid approach combining:

Dense Embeddings

Semantic understanding: Finds conceptually similar content
Meaning over keywords: "car" matches "automobile"
Context awareness: Understands intent behind queries

Sparse Embeddings

Keyword precision: Exact term matching
Technical terms: Catches specific jargon
Names and codes: Finds exact identifiers

Fusion

Results from both methods are merged into a single ranked list, so a document can rank well on semantic similarity, on exact term matches, or on both.
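The fusion algorithm is not named here. As an illustration only, the sketch below merges two ranked lists with reciprocal rank fusion (RRF), a common choice for hybrid search. The function name, the document IDs, and the constant k=60 are assumptions made for the example, not ZenSearch's documented behavior.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional value, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical semantic results
sparse_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical keyword results
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists (doc_a and doc_c here) accumulate score from each, so they tend to rise above documents found by only one method.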

Retrieval Pipeline

Query
  ↓
Intent Classification
  ↓
Query Expansion (optional)
  ↓
Permission Filtering
  ↓
Hybrid Search (Dense + Sparse)
  ↓
Context Decay (time-based weighting)
  ↓
Faceted Filtering
  ↓
Cross-Encoder Reranking
  ↓
Citation Grounding
  ↓
Context Enrichment
  ↓
Results

Query Types

Natural Language Questions

Ask questions as you would to a colleague:

"What is our policy on remote work?"
"How do I configure the database connection?"
"Who is responsible for the Q4 budget?"

Keyword Searches

Use specific terms for precision:

"API authentication token"
"error code 5001"
"employee handbook 2024"

Combined Queries

Mix natural language with specific terms:

"How do I fix error ERR_CONNECTION_REFUSED?"
"What are the steps to deploy version 2.3.1?"

Query Expansion

ZenSearch can automatically expand your query to improve results:

How It Works

Your query is analyzed
Alternative phrasings are generated
Multiple searches run in parallel
Results are merged and deduplicated

Example

Original: "How to fix login issues"

Expanded:
- "login issues troubleshooting"
- "authentication problems resolution"
- "sign in errors fix"
- "login failure solutions"
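The four steps above can be sketched roughly as follows. This is a minimal illustration, assuming a hypothetical search_fn that returns (doc_id, score) pairs and an in-memory stand-in for the index; the real expansion and merge logic is not described here.

```python
from concurrent.futures import ThreadPoolExecutor


def expanded_search(queries, search_fn):
    """Run search_fn for every query variant in parallel, then merge.

    search_fn(query) must return a list of (doc_id, score) pairs.
    Duplicates are collapsed, keeping each document's best score.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        result_lists = list(pool.map(search_fn, queries))

    best = {}
    for results in result_lists:
        for doc_id, score in results:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Best score first; ties broken by document ID.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical in-memory index standing in for a real search backend.
FAKE_INDEX = {
    "How to fix login issues": [("kb-12", 0.71), ("kb-40", 0.55)],
    "login issues troubleshooting": [("kb-12", 0.80), ("kb-07", 0.62)],
    "sign in errors fix": [("kb-07", 0.58), ("kb-33", 0.49)],
}


def fake_search(query):
    return FAKE_INDEX.get(query, [])


merged = expanded_search(list(FAKE_INDEX), fake_search)
print(merged)
```

Keeping the best score per document is one simple deduplication policy; summing or rank-based fusion would be equally reasonable alternatives.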

Faceted Search

Available Facets

Facet	Description	Example Values
Topics	Content categories	Technology, Finance, HR
Departments	Organizational units	Engineering, Sales, Marketing
Languages	Document language	English, Spanish, French
Sentiments	Content tone	Positive, Neutral, Negative
Date Range	Creation/modification	Last 7/30/90 days, Custom

Using Facets

Perform a search
View facets in the sidebar
Click to filter by facet values
Combine multiple facets
Clear filters to broaden results

Dynamic Facets

Facets update based on current results:

Counts reflect filtered results
Unavailable facets are hidden
Values sorted by relevance
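To make the facet behavior concrete, here is a small sketch of filtering a result set and recomputing facet counts from what remains. The field names (topic, department, language) and both helper functions are hypothetical, and relevance-based sorting of values is not modeled.

```python
from collections import Counter


def apply_filters(documents, filters):
    """Keep documents matching every selected facet (AND across facets)."""
    return [
        doc for doc in documents
        if all(doc.get(name) in allowed for name, allowed in filters.items())
    ]


def facet_counts(documents, facet_fields):
    """Count facet values over the current result set.

    Facets with no values in the results are omitted, mirroring
    the "unavailable facets are hidden" behavior.
    """
    counts = {}
    for name in facet_fields:
        counter = Counter(doc[name] for doc in documents if doc.get(name))
        if counter:
            counts[name] = dict(counter)
    return counts


results = [
    {"id": 1, "topic": "Technology", "department": "Engineering"},
    {"id": 2, "topic": "Finance", "department": "Sales"},
    {"id": 3, "topic": "Technology", "department": "Sales"},
]
filtered = apply_filters(results, {"topic": {"Technology"}})
facets = facet_counts(filtered, ["topic", "department", "language"])
print(facets)
```

After filtering on Topics, the Departments counts shrink to match the two remaining documents, and the language facet disappears because no result carries that field.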

Context Decay (Time-Based Weighting)

When enabled, ZenSearch applies time-based score weighting so that more recent documents rank higher than stale ones, all else being equal. This helps search results reflect current information rather than outdated content.

How It Works

Each document's relevance score is adjusted by a decay factor based on the document's age. The decay follows a half-life model: after one half-life period, a document's time-based weight drops to 50%.
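A half-life model corresponds to the weight 0.5 ** (age / half_life). The sketch below shows the arithmetic. The function names are illustrative, and multiplying the weight directly into the relevance score is an assumption about how the adjustment is applied.

```python
def decay_weight(age_days, half_life_days=180.0):
    """Time-based weight under a half-life model: 0.5 ** (age / half_life)."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    # Negative ages (for example, clock skew) are treated as zero.
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def decayed_score(relevance, age_days, half_life_days=180.0):
    # Assumption: the weight multiplies the relevance score directly.
    return relevance * decay_weight(age_days, half_life_days)


for age in (0, 180, 360):
    print(age, decay_weight(age))
```

With a 180-day half-life, a brand-new document keeps its full weight, a 180-day-old document is weighted at one half, and a 360-day-old document at one quarter.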

Configuration

Context decay is configurable per collection:

Setting	Description	Default
Enabled	Whether time decay is active	false
Half-life	Duration after which weight drops to 50%	180 days

Use Cases

News or announcements: Short half-life (7-14 days) to surface the latest updates
Policies and procedures: Longer half-life (180+ days) since these change infrequently
Code documentation: Medium half-life (60-90 days) to balance freshness with stability

Note: Context decay adjusts ranking, not visibility. Older documents still appear in results if they are highly relevant; they simply rank lower than equally relevant newer documents.

Multi-Modal Search (Image Content)

ZenSearch extracts and indexes images embedded in documents, making visual content searchable alongside text. When a document contains images (diagrams, charts, screenshots, photos), each image is processed through a vision model that generates a natural-language description.

How It Works

During document parsing, images are extracted as IMAGE structural units
A vision model analyzes each image and produces a text description
The description is embedded and indexed alongside the document's text content
When you search, image descriptions are included in the retrieval pipeline

What This Means for You

Search for "architecture diagram" and find images matching that description
Charts and graphs are described with their data points and trends
Screenshots of UIs are described with their visible elements
All image content participates in hybrid search (dense + sparse)

Supported Formats

Images in the following document types are automatically extracted:

PDF files (embedded images and figures)
Word documents (.docx)
PowerPoint presentations (.pptx)

Provider requirements

Image description requires a vision-capable model. The default zen-mini mapping uses a vision-capable provider in cloud deployments. Self-hosters who switch to a chat provider that does not support vision (e.g., Groq) will see images skipped — text content is still indexed normally. To re-enable image description in that case, point the vision-describer model at a vision-capable provider (OpenAI, Anthropic, Bedrock, Azure).

Lite edition

The lite parser (used in the Developer Edition by default) does not extract images and does not run OCR. PDFs that look like scans are flagged with a needs_ocr marker so you can route them to a richer parser later. Switch to the full parser if image search matters to your deployment — see Self-Hosting for the trade-offs.

Citation Grounding

After generating a response, ZenSearch verifies that each citation points to a source that actually supports the cited claim. This post-synthesis verification step catches misattributed citations and corrects or removes them.

How It Works

The AI generates a response with inline citations [1], [2], etc.
A grounding check compares each cited claim against the actual source content
Misattributed citations are corrected or removed
The final response contains only verified source attributions
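The grounding check itself is not specified here. The sketch below illustrates the verify, correct, or remove flow using a deliberately simple token-overlap measure as a stand-in; a real system would likely use a stronger model. All names and the 0.6 threshold are assumptions.

```python
import re


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(claim, source_text):
    """Fraction of the claim's tokens that also occur in the source."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(source_text)) / len(claim_tokens)


def ground_citations(claims, sources, threshold=0.6):
    """Check each (claim, cited_id) pair against the source texts.

    Returns (claim, source_id) pairs: the original citation if supported,
    a better-supported source if one clears the threshold (corrected),
    or None if no source supports the claim (removed).
    """
    grounded = []
    for claim, cited_id in claims:
        if support(claim, sources.get(cited_id, "")) >= threshold:
            grounded.append((claim, cited_id))
            continue
        best_id = max(sources, key=lambda sid: support(claim, sources[sid]),
                      default=None)
        if best_id is not None and support(claim, sources[best_id]) >= threshold:
            grounded.append((claim, best_id))
        else:
            grounded.append((claim, None))
    return grounded


sources = {
    1: "Employees may work remotely up to three days per week.",
    2: "Expense reports are due by the fifth of each month.",
}
claims = [
    ("Employees may work remotely three days per week", 1),
    ("Expense reports are due by the fifth", 1),  # misattributed
    ("The office closes at noon on Fridays", 2),  # unsupported
]
checked = ground_citations(claims, sources)
print([source_id for _, source_id in checked])
```

In this example the first citation is kept, the second is corrected to point at source 2, and the third is dropped because neither source supports it.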

Benefits

Higher trust in cited sources
Reduced hallucination in attribution
Each citation points to content that genuinely supports the claim

Progressive Retrieval

When the retrieval pipeline detects that initial search results have low confidence or insufficient coverage for a query, it autonomously fetches additional context. The system re-queries with refined terms, broader scope, or alternative phrasings to improve answer quality.

This is particularly useful for:

Niche or highly specific queries where the first pass returns few results
Questions that span multiple topics or collections
Queries where the answer requires synthesizing information from several documents

Progressive retrieval requires no action from you. You may see a brief "Searching for more context..." indicator while additional sources are being gathered.

Cross-Encoder Reranking

What Is Reranking?

After initial retrieval, a cross-encoder model reranks results for better precision:

Initial retrieval: Fast, broad search
Reranking: Deep analysis of top candidates
Final order: Most relevant results first
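The two-stage pattern can be sketched as below. The overlap_score function is only a cheap stand-in for a cross-encoder, which would score each query and document pair jointly with a model. The function names, the top_n cutoff, and the sample documents are assumptions.

```python
def rerank(query, candidates, score_pair, top_n=3):
    """Re-score the first top_n candidates with score_pair(query, text).

    candidates is a list of (doc_id, text) in first-stage order. Documents
    beyond top_n keep their first-stage order after the reranked head.
    """
    head, tail = candidates[:top_n], candidates[top_n:]
    # sorted() is stable, so ties keep their first-stage order.
    rescored = sorted(head, key=lambda c: -score_pair(query, c[1]))
    return rescored + tail


def overlap_score(query, text):
    # Stand-in scorer (assumption): fraction of query tokens found in the text.
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


first_stage = [
    ("a", "database backup schedule"),
    ("b", "configure the database connection string"),
    ("c", "connection pooling overview"),
    ("d", "office seating chart"),
]
final = rerank("configure database connection", first_stage, overlap_score)
print([doc_id for doc_id, _ in final])
```

Only the top candidates are re-scored because pairwise scoring is expensive; the broad first stage keeps the candidate set small enough for the deeper analysis.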

Benefits

More accurate relevance scores
Better handling of complex queries
Improved result ordering

Coverage Information

Understanding Coverage

Search results include coverage metrics showing completeness:

Full coverage: All relevant content found
Partial coverage: Some content may be missing
Warnings: Potential gaps in results

Coverage Indicators

Results: 15 documents found
Coverage: 94% (3 semantic units pending indexing)

⚠️ Some content from GitHub connector is still syncing

Search Modes

Chat Mode

Best for:

Questions needing synthesized answers
Multi-turn conversations
Research and exploration

Features:

AI-generated responses
Source citations
Follow-up capability

Search Mode

Best for:

Finding specific documents
Browsing available content
Detailed filtering

Features:

Document list results
Faceted filtering
Preview snippets

Scope and Collections

Collection Scoping

Control search boundaries:

Scope	Use Case
All Collections	Company-wide search
Single Collection	Department-specific search
Multiple Collections	Cross-functional research

Setting Scope

Click the Scope dropdown
Select collections to include
View document counts
Search within selection

Answer Shape

Query Classification

ZenSearch classifies queries to optimize responses:

Shape	Description	Example
Enumerative	List of items	"What tools do we use?"
Procedural	Step-by-step	"How do I submit expenses?"
Exploratory	Open-ended	"Tell me about our products"
Comparative	Comparison	"Compare Plan A vs Plan B"

Response Formatting

Responses are formatted based on query shape:

Enumerative: Bulleted lists
Procedural: Numbered steps
Exploratory: Comprehensive overview
Comparative: Tables and comparisons

Meta-Questions

About Your Knowledge Base

Ask meta-questions about your indexed content:

"What topics are covered in our documentation?"
"Give me an overview of the engineering wiki"
"What data sources are connected?"
"Show me statistics about our content"

Meta-Question Indicators

Meta-questions are indicated with badges:

Overview
Topics
Data Sources
Statistics
Capabilities

Search Tips

Effective Queries

Strategy	Example
Be specific	"Q4 2024 sales report" vs "sales"
Add context	"Python API authentication" vs "authentication"
Use timeframes	"last quarter", "2024", "recent"
Name specifics	Include project, team, or person names

Refining Results

Start broad, then narrow with facets
Try alternative phrasings
Use both chat and search modes
Check suggested related queries

Interpreting Results

Indicator	Meaning
High relevance	Strong match to query
Multiple citations	Synthesized from several sources
Recent date	Current information
Verified source	From authoritative connector

Permissions and Access

Search-Time Filtering

ZenSearch enforces permissions at search time:

Query is received
User's access rights are checked
Only accessible documents are searched
Results exclude unauthorized content
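As a rough illustration of filtering before search, the sketch below checks access per document and passes only permitted documents to retrieval. The User and Document classes are hypothetical and model only the user, group, and public cases; team and domain access are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    groups: set = field(default_factory=set)


@dataclass
class Document:
    doc_id: str
    public: bool = False
    allowed_users: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def can_access(user, doc):
    """True if the document is public or shared with the user or a group."""
    return (
        doc.public
        or user.user_id in doc.allowed_users
        or bool(user.groups & doc.allowed_groups)
    )


def searchable_documents(user, documents):
    """Filter before search so unauthorized content never enters ranking."""
    return [doc for doc in documents if can_access(user, doc)]


alice = User("alice", groups={"engineering"})
docs = [
    Document("handbook", public=True),
    Document("eng-runbook", allowed_groups={"engineering"}),
    Document("payroll", allowed_groups={"finance"}),
    Document("alice-review", allowed_users={"alice"}),
]
visible = [doc.doc_id for doc in searchable_documents(alice, docs)]
print(visible)
```

Filtering ahead of retrieval, rather than trimming the final list, means restricted documents cannot influence ranking, snippets, or result counts.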

Permission Types

Type	Description
User	Individual access rights
Group	Team or group membership
Team	Workspace access
Domain	Organization-wide
Public	No restrictions

Performance

Speed Optimization

ZenSearch optimizes for fast results:

Cached embeddings
Indexed metadata
Parallel searches
Incremental updates

Large Result Sets

For queries with many results:

Pagination available
"Load more" functionality
Result count displayed
Coverage information shown

Troubleshooting

No Results

Check collection scope
Broaden search terms
Remove filters
Verify content is indexed

Irrelevant Results

Add more specific terms
Use facet filters
Try different phrasing
Check query intent

Slow Searches

Narrow collection scope
Simplify complex queries
Check for large pending syncs
Use specific filters

Next Steps

Ask & Chat - Main search interface
Agents - AI-powered research
Evaluation - Search quality metrics
API - Search API reference
