Key Concepts

Understanding these core concepts will help you get the most out of ZenSearch.

Data Sources & Connectors

Connectors

A connector is a configured connection to an external data source. ZenSearch supports 17+ connector types:

Cloud Storage: S3, Google Drive, SharePoint, Azure Blob
Collaboration Tools: Confluence, Notion, Slack
Development Tools: GitHub, Jira
CRM & Business Systems: Salesforce, HubSpot, SAP
Databases: PostgreSQL, MySQL, ClickHouse, MS SQL
Web: Web Crawler

Each connector:

Authenticates with your data source
Syncs content on a schedule or via webhooks
Maintains permissions from the source platform

Collections

A collection is a logical grouping of documents from one or more connectors. Collections help you:

Organize content by topic, department, or project
Control which content is searched
Apply different embedding models
Manage access permissions

Example setup:

Engineering Collection
├── GitHub (code repositories)
├── Confluence (technical docs)
└── Jira (tickets and issues)

Sales Collection
├── Salesforce (CRM data)
├── Google Drive (presentations)
└── HubSpot (marketing content)
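As a rough illustration, the example setup above can be modeled as named groups of connectors. This is a hypothetical data model, not ZenSearch's configuration format: the `Collection` class and the `collections_for` helper are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical model of a collection: a named group of connectors."""
    name: str
    connectors: list[str] = field(default_factory=list)


# Mirrors the example setup above.
collections = [
    Collection("Engineering", ["GitHub", "Confluence", "Jira"]),
    Collection("Sales", ["Salesforce", "Google Drive", "HubSpot"]),
]


def collections_for(connector: str) -> list[str]:
    """Return the names of all collections that include the given connector."""
    return [c.name for c in collections if connector in c.connectors]


print(collections_for("Jira"))
```

Because a collection only references connectors, the same connector could appear in more than one collection in this model.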

Documents & Semantic Units

Documents

A document represents a single piece of content from a data source: a file, page, message, or record. Documents are:

Parsed to extract text and metadata
Classified by type and content
Indexed for retrieval

Semantic Units (SUs)

ZenSearch breaks documents into Semantic Units: meaningful chunks of content optimized for AI retrieval. This process:

Segments content into logical sections
Preserves context and relationships
Generates embeddings for semantic search
Maintains links to source documents
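The idea of splitting a document into units that keep a link to their source can be sketched as follows. ZenSearch's actual segmentation logic is not described here, so this sketch assumes a simple blank-line split as a stand-in. The `SemanticUnit` class and `split_into_units` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SemanticUnit:
    doc_id: str  # link back to the source document
    index: int   # position of the unit within the document
    text: str    # the chunk content


def split_into_units(doc_id: str, content: str) -> list[SemanticUnit]:
    """Split on blank lines; a simplified stand-in for real segmentation."""
    paragraphs = [p.strip() for p in content.split("\n\n") if p.strip()]
    return [SemanticUnit(doc_id, i, p) for i, p in enumerate(paragraphs)]


units = split_into_units("doc-1", "Intro paragraph.\n\nSetup steps.\n\n\nFAQ.")
for unit in units:
    print(unit.index, unit.text)
```

In a real system each unit would also carry an embedding; that step is omitted here.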

Search & Retrieval

Hybrid Search

ZenSearch uses hybrid search combining:

Dense embeddings: Semantic understanding of meaning
Sparse embeddings: Keyword matching for precision
Fusion algorithms: Merging the dense and sparse result lists into a single ranking
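The fusion algorithm ZenSearch uses is not named here. One common choice for merging ranked lists is reciprocal rank fusion (RRF), sketched below as an illustration only. The function name, the document IDs, and the constant `k = 60` (a conventional default for RRF) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["doc_a", "doc_b", "doc_c"]   # hypothetical dense (semantic) ranking
sparse = ["doc_b", "doc_d", "doc_a"]  # hypothetical sparse (keyword) ranking
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why `doc_b` ends up ahead of `doc_a` in this example.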

Search Modes

Mode	Description	Best For
Chat	Conversational AI with streaming responses	Questions, research, exploration
Search	Traditional search results with faceted filtering	Finding specific documents

Faceted Search

Filter results by:

Topics/Categories: Auto-extracted document topics
Departments: Organizational categories
Languages: Document language
Date Ranges: When content was created/modified
Sentiment: Positive, neutral, or negative content
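Conceptually, facets narrow a result set by metadata. The sketch below filters an in-memory list of results. The field names (`department`, `language`, `modified`) and the `apply_facets` function are hypothetical and do not reflect ZenSearch's API.

```python
from datetime import date

# Hypothetical search results with facet metadata attached.
results = [
    {"title": "Q3 roadmap", "department": "Engineering", "language": "en",
     "modified": date(2024, 9, 1)},
    {"title": "Pricing FAQ", "department": "Sales", "language": "en",
     "modified": date(2024, 3, 15)},
    {"title": "Onboarding", "department": "Engineering", "language": "de",
     "modified": date(2023, 11, 2)},
]


def apply_facets(items, department=None, language=None, modified_after=None):
    """Keep items matching every facet that is set; None means no filter."""
    kept = []
    for item in items:
        if department is not None and item["department"] != department:
            continue
        if language is not None and item["language"] != language:
            continue
        if modified_after is not None and item["modified"] <= modified_after:
            continue
        kept.append(item)
    return kept


filtered = apply_facets(results, department="Engineering",
                        modified_after=date(2024, 1, 1))
print([item["title"] for item in filtered])
```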

AI Agents

What are Agents?

Agents are AI-powered assistants that can:

Execute multi-step research tasks
Use tools to search, query, and analyze
Maintain conversation context
Provide comprehensive answers

Agent Tools

Built-in tools available to agents:

Tool	Description
`search_documents`	Search across collections
`get_document`	Retrieve full document content
`summarize_document`	Generate document summaries
`search_database_schema`	Discover database structure
`query_database`	Execute read-only SQL queries
`get_table_info`	Get table columns and types
`search_knowledge_graph`	Find entity relationships
`calculate`	Perform calculations

Agent Modes

Auto: Uses the agent automatically when a query is complex
Research: Always uses the agent, with a planning step
Off: Direct chat without agent capabilities
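The three modes amount to a routing decision, which might be sketched like this. How ZenSearch judges a query to be complex is not documented here, so `looks_complex` is a placeholder heuristic, and all names in the sketch are hypothetical.

```python
from enum import Enum


class AgentMode(Enum):
    AUTO = "auto"
    RESEARCH = "research"
    OFF = "off"


def looks_complex(query: str) -> bool:
    """Placeholder heuristic (an assumption): long or multi-part questions."""
    return len(query.split()) > 12 or " and " in query.lower()


def use_agent(mode: AgentMode, query: str) -> bool:
    """Decide whether a query is routed to the agent."""
    if mode is AgentMode.RESEARCH:
        return True   # always use the agent
    if mode is AgentMode.OFF:
        return False  # direct chat only
    return looks_complex(query)  # AUTO: decide per query


print(use_agent(AgentMode.AUTO, "Compare Q1 and Q2 churn by region"))
```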

Permissions & Access Control

Team Roles

Role	Capabilities
Owner	Full control, delete team, transfer ownership
Admin	Manage members, connectors, collections
Editor	Create/edit connectors, run sync jobs
Viewer	Read-only, search and chat
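A role check based on the table above could look like the following sketch. It assumes each role inherits the capabilities of the roles below it, which the table suggests but does not state. The capability names are hypothetical labels, not ZenSearch identifiers.

```python
# Roles from least to most privileged.
ROLE_ORDER = ["viewer", "editor", "admin", "owner"]

# Capabilities each role adds on top of the roles below it (assumed inheritance).
OWN_CAPABILITIES = {
    "viewer": {"search", "chat"},
    "editor": {"edit_connectors", "run_sync"},
    "admin": {"manage_members", "manage_connectors", "manage_collections"},
    "owner": {"delete_team", "transfer_ownership"},
}


def capabilities(role: str) -> set[str]:
    """Union of the role's own capabilities and those of every lower role."""
    level = ROLE_ORDER.index(role)
    caps: set[str] = set()
    for lower in ROLE_ORDER[: level + 1]:
        caps |= OWN_CAPABILITIES[lower]
    return caps


def can(role: str, capability: str) -> bool:
    """Return True if the role holds the capability."""
    return capability in capabilities(role)


print(can("editor", "run_sync"), can("viewer", "run_sync"))
```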

Document-Level Permissions

ZenSearch syncs permissions from source platforms:

User permissions: Individual access rights
Group permissions: Team or group access
Domain permissions: Organization-wide access
Public access: Anyone can view

Permissions are enforced at search time, so users see only content they are authorized to access.
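Search-time enforcement can be pictured as a filter applied to candidate results before they are returned. The sketch below covers the four permission types listed above. The `User` and `Acl` classes and the sample data are hypothetical, not ZenSearch's internal model.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    email: str
    groups: set[str] = field(default_factory=set)


@dataclass
class Acl:
    users: set[str] = field(default_factory=set)    # individual access
    groups: set[str] = field(default_factory=set)   # team or group access
    domains: set[str] = field(default_factory=set)  # organization-wide access
    public: bool = False                            # anyone can view


def can_view(user: User, acl: Acl) -> bool:
    """A user may view a document if any one permission type grants access."""
    domain = user.email.rsplit("@", 1)[-1]
    return (
        acl.public
        or user.email in acl.users
        or bool(user.groups & acl.groups)
        or domain in acl.domains
    )


def filter_results(user: User, results: list[tuple[str, Acl]]) -> list[str]:
    """Drop every result the user is not authorized to see."""
    return [doc_id for doc_id, acl in results if can_view(user, acl)]


alice = User("alice@example.com", {"eng"})
candidates = [
    ("handbook", Acl(public=True)),
    ("design-doc", Acl(groups={"eng"})),
    ("payroll", Acl(users={"bob@example.com"})),
    ("all-hands", Acl(domains={"example.com"})),
]
visible = filter_results(alice, candidates)
print(visible)
```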

Processing Pipeline

When you connect a data source, content flows through:

Collection → Parsing → Structure Analysis → Projection → Vectorization → Classification

Collection: Fetches content from the source (the ingestion stage, not the Collections grouping described earlier)
Parsing: Extracts text and metadata
Structure Analysis: Identifies document structure
Projection: Creates semantic units
Vectorization: Generates embeddings
Classification: Categorizes content
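The stages above run in a fixed order, each receiving the output of the previous one. The sketch below shows that composition with placeholder stages that only record that they ran. The function names and the dictionary-based item are assumptions, and no real parsing or embedding takes place.

```python
from typing import Callable

Stage = Callable[[dict], dict]

STAGE_NAMES = [
    "Collection",
    "Parsing",
    "Structure Analysis",
    "Projection",
    "Vectorization",
    "Classification",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that appends its name to the item's trace."""
    def stage(item: dict) -> dict:
        return {**item, "trace": item.get("trace", []) + [name]}
    return stage


PIPELINE = [make_stage(name) for name in STAGE_NAMES]


def run_pipeline(item: dict) -> dict:
    """Pass the item through every stage in order."""
    for stage in PIPELINE:
        item = stage(item)
    return item


processed = run_pipeline({"source": "confluence/page-1"})
print(" -> ".join(processed["trace"]))
```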

Guardrails & Safety

ZenSearch includes built-in safety features:

Input Guardrails

Content moderation
Prompt injection detection
PII detection
Length validation
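Two of the input checks above, length validation and PII detection, can be illustrated with a minimal sketch. The character limit and the email-only PII pattern are assumptions chosen for the example. Production guardrails would be considerably more thorough.

```python
import re

MAX_INPUT_CHARS = 2000  # hypothetical limit for this sketch
# Simplistic email pattern used as a stand-in for PII detection.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def check_input(text: str) -> list[str]:
    """Return a list of issue labels; an empty list means the input passed."""
    issues = []
    if not text.strip():
        issues.append("empty")
    if len(text) > MAX_INPUT_CHARS:
        issues.append("too_long")
    if EMAIL_RE.search(text):
        issues.append("possible_pii:email")
    return issues


print(check_input("Please email jane@example.com about the refund."))
```

Returning labels rather than raising an error lets the caller decide whether to block, redact, or just log each issue.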

Output Guardrails

Hallucination detection
Toxicity filtering
Relevance checking

Next Steps

Now that you understand the key concepts:

Your First Search: Practice searching effectively
Core Features: Explore the full feature set
Connect More Sources: Add additional data sources
