Search & Retrieval

Understand the retrieval data hierarchy, hybrid dense/sparse search, and high-level RAG search.

The platform organizes documents into a logical hierarchy and queries them with hybrid search, which combines dense (semantic) and sparse (keyword) matching.

Our platform offers two endpoints depending on your needs:

Retrieval API (/retrieve): A low-level endpoint returning raw scored chunks. Best for building custom pipelines.
Search API (/search): A high-level RAG agent that retrieves chunks and synthesizes an answer using an LLM.

🏗️ The Retrieval Data Hierarchy

To achieve precise retrieval at scale, the platform organizes information into a nested hierarchy. This structure allows the RAG engine to isolate relevant context while maintaining a clear link back to the source material.

 ┌───────────────────────────────────────────────────────────┐
 │  COLLECTION (Search Scope & Index Config)                 │
 │  ┌─────────────────────────────────────────────────────┐  │
 │  │  DOCUMENT (Source Metadata & Reference)             │  │
 │  │  ┌───────────────────────────────────────────────┐  │  │
 │  │  │  CHUNKS (Scored Searchable Blocks)            │  │  │
 │  │  │  ┌───────────┐ ┌───────────┐ ┌─────────────┐  │  │  │
 │  │  │  │ Block A   │ │ Block B   │ │ ...         │  │  │  │
 │  │  │  │ • Text    │ │ • Text    │ │ • Dense Vec │  │  │  │
 │  │  │  │ • Vectors │ │ • Vectors │ │ • Sparse Vec│  │  │  │
 │  │  │  └───────────┘ └───────────┘ └─────────────┘  │  │  │
 │  │  └───────────────────────────────────────────────┘  │  │
 │  └─────────────────────────────────────────────────────┘  │
 └───────────────────────────────────────────────────────────┘

Collections: The primary search boundary. Every query is scoped to one or more Collections, which define the underlying vector index and model configurations.
Documents: Logical representations of uploaded files (e.g., a 50-page PDF). Documents hold metadata and provide the context needed for citations.
Chunks (Blocks): The atomic unit of retrieval. Documents are split into semantic fragments that fit within LLM context windows. Each chunk contains the original text and the mathematical vectors (Dense + Sparse) used for hybrid matching.
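As a rough illustration, the hierarchy above can be modeled with three nested record types. This is only a sketch for reasoning about the structure: the class names, field names, and the term-to-weight encoding of the sparse vector are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Chunk:
    """Atomic unit of retrieval: the original text plus both vector types."""
    text: str
    dense_vector: List[float]        # semantic embedding
    sparse_vector: Dict[str, float]  # term -> weight (hypothetical encoding)


@dataclass
class Document:
    """An uploaded file: holds metadata for citations and owns its chunks."""
    document_id: str
    metadata: Dict[str, str] = field(default_factory=dict)
    chunks: List[Chunk] = field(default_factory=list)


@dataclass
class Collection:
    """Search boundary: every query is scoped to one or more collections."""
    collection_id: str
    index_config: Dict[str, str] = field(default_factory=dict)
    documents: List[Document] = field(default_factory=list)


def iter_chunks(collection: Collection) -> Iterator[Tuple[Document, Chunk]]:
    """Yield (document, chunk) pairs so each chunk keeps a link to its source."""
    for document in collection.documents:
        for chunk in document.chunks:
            yield document, chunk


# Illustrative data only; identifiers and vectors are made up.
handbook = Document(
    document_id="doc_handbook",
    metadata={"filename": "handbook.pdf"},
    chunks=[
        Chunk("Employees accrue leave monthly.", [0.12, 0.80], {"leave": 1.4}),
        Chunk("Unused leave carries over.", [0.10, 0.75], {"leave": 1.1}),
    ],
)
hr = Collection("col_hr", {"embedding_model": "example-model"}, [handbook])
pairs = [(doc.document_id, chunk.text) for doc, chunk in iter_chunks(hr)]
```

Walking the collection from the top always yields a chunk together with its parent document, which is the link that citations rely on.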

🔍 Hybrid Search (Dense & Sparse)

To return accurate results across different kinds of questions, the retrieval engine implements Hybrid Search. This approach runs two distinct searches over the same query and merges their scores, combining the benefits of meaning-based search and exact keyword matching.

 ┌──────────────────────────────────────────────┐
 │               Hybrid Search                  │
 │                                              │
 │                  User Query                  │
 │                       │                      │
 │          ┌────────────┴────────────┐         │
 │          ▼                         ▼         │
 │  ┌─────────────────┐    ┌─────────────────┐  │
 │  │  Dense Vector   │    │  Sparse BM25    │  │
 │  │  Search         │    │  Search         │  │
 │  │  (semantic)     │    │  (keyword)      │  │
 │  └────────┬────────┘    └────────┬────────┘  │
 │           │                      │           │
 │           └──────────┬───────────┘           │
 │                      ▼                       │
 │           ┌────────────────────┐             │
 │           │   Score Merge      │             │
 │           │  (≥ threshold)     │             │
 │           └────────┬───────────┘             │
 │                    ▼                         │
 │               Top Chunks                     │
 └──────────────────────────────────────────────┘

Dense Semantic Search (Vector Indexing): Your query is converted into a vector that captures its overall meaning. This retrieves relevant information even when the query and the source use different wording (e.g., searching "how do I start" returns "Quick Setup Guide").
Sparse Search (BM25 Keyword Matching): Traditional keyword search. This ensures that exact terms, such as specific serial numbers, error codes, system rules, or variable names, are matched reliably.
Score Merging: The retrieval system normalizes and blends dense and sparse scores. Any chunks scoring below your workspace's configured scoreThreshold are filtered out, returning only highly relevant blocks.
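The merge step can be sketched in a few lines. The platform's exact normalization and blending formula are not described here, so this example assumes min-max normalization and a weighted sum; `dense_weight` and the function names are hypothetical, and `score_threshold` stands in for the workspace's scoreThreshold.

```python
def min_max_normalize(scores):
    """Rescale raw scores to the 0..1 range so both searches are comparable."""
    if not scores:
        return {}
    lowest, highest = min(scores.values()), max(scores.values())
    if highest == lowest:
        # All scores equal: treat every hit as a full match.
        return {chunk_id: 1.0 for chunk_id in scores}
    span = highest - lowest
    return {chunk_id: (s - lowest) / span for chunk_id, s in scores.items()}


def merge_scores(dense, sparse, dense_weight=0.5, score_threshold=0.3):
    """Blend normalized dense and sparse scores and drop low-scoring chunks.

    A chunk missing from one result set contributes 0.0 for that side.
    Returns (chunk_id, score) pairs sorted from best to worst.
    """
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    merged = {}
    for chunk_id in set(dense_n) | set(sparse_n):
        score = (dense_weight * dense_n.get(chunk_id, 0.0)
                 + (1.0 - dense_weight) * sparse_n.get(chunk_id, 0.0))
        if score >= score_threshold:
            merged[chunk_id] = score
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Raw scores live on different scales (cosine similarity vs. BM25).
dense_hits = {"a": 0.9, "b": 0.5, "c": 0.1}
sparse_hits = {"a": 2.0, "c": 12.0, "d": 7.0}
top_chunks = merge_scores(dense_hits, sparse_hits,
                          dense_weight=0.6, score_threshold=0.35)
```

In this example chunk "a" wins on semantic similarity and chunk "c" on keyword overlap, so both survive the threshold, while "b" and "d" are filtered out. Other fusion strategies, such as reciprocal rank fusion, would fit the same slot.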

🛠️ Retrieval API (Low-Level)

The /retrieve endpoint gives you direct access to the Hybrid Search engine. It returns raw chunks (blocks) and their match scores, without any LLM involvement. This is ideal if you are building your own generative pipeline or only need search results without a generated answer. For a detailed technical reference, see the API Retrieval Reference.

Execute a Retrieval Query

Submit a query to a specific collection to retrieve the most relevant fragments.

curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/col/{collection_id}/retrieve" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the company leave policy?",
    "group_by_document": true,
    "limit": 10
  }'
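The same request can be issued from Python using only the standard library. This is a minimal sketch: the function names are hypothetical, and because the response schema is not described in this section, the code assumes only that the body is JSON and returns it as parsed.

```python
import json
import urllib.request

API_BASE = "https://api.axelered.com/v1"


def build_retrieve_request(workspace_id, collection_id, api_key, query,
                           group_by_document=True, limit=10):
    """Build the POST request for /retrieve without sending it."""
    url = f"{API_BASE}/w/{workspace_id}/col/{collection_id}/retrieve"
    payload = {
        "query": query,
        "group_by_document": group_by_document,
        "limit": limit,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def retrieve(request, timeout=30):
    """Send the request and return the parsed body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder identifiers; substitute your own workspace, collection, and key.
request = build_retrieve_request(
    "WORKSPACE_ID", "COLLECTION_ID", "YOUR_API_KEY",
    "What is the company leave policy?",
)
# results = retrieve(request)  # uncomment to perform the network call
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before any network traffic occurs.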

Key Parameters:

query: The search string used for both dense and sparse matching.
group_by_document: If true, returns results grouped by their parent Document, including the best-matching chunks within each document. If false, returns a flat list of chunks.
limit: The maximum number of results to return.
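To make the difference between the two result shapes concrete, the sketch below groups a flat list of scored chunks by parent document on the client side. The keys `document_id`, `text`, and `score` are assumptions for illustration; the actual response fields are defined in the API Retrieval Reference.

```python
from collections import defaultdict


def group_chunks_by_document(chunks):
    """Group flat chunk hits by parent document, best document first.

    Each chunk is a dict with hypothetical keys: document_id, text, score.
    """
    by_document = defaultdict(list)
    for chunk in chunks:
        by_document[chunk["document_id"]].append(chunk)

    grouped = []
    for document_id, doc_chunks in by_document.items():
        # Order each document's chunks from best to worst match.
        ranked = sorted(doc_chunks, key=lambda c: c["score"], reverse=True)
        grouped.append({
            "document_id": document_id,
            "best_score": ranked[0]["score"],
            "chunks": ranked,
        })
    # Rank documents by their single best-matching chunk.
    grouped.sort(key=lambda d: d["best_score"], reverse=True)
    return grouped


flat_hits = [
    {"document_id": "d1", "text": "Leave accrues monthly.", "score": 0.4},
    {"document_id": "d2", "text": "Leave policy overview.", "score": 0.9},
    {"document_id": "d1", "text": "Unused leave carries over.", "score": 0.7},
]
grouped_hits = group_chunks_by_document(flat_hits)
```

Ranking documents by their best chunk is one reasonable choice; whether the platform ranks grouped results this way is not stated here.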

🤖 Search API (High-Level RAG)

The /search endpoint uses an AI Agent to execute a complete Retrieval-Augmented Generation (RAG) loop. It automatically runs a Hybrid Search, injects the top chunks into the LLM context, and streams back a synthesized answer with citations. For a detailed technical reference, see the API Search Reference.

Perform a RAG Search

Execute a natural language query that retrieves context and generates an answer in a single call.

curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize the Q4 financials based on the uploaded reports.",
    "collection_id": "{collection_id}",
    "stream": true
  }'
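A streamed call can also be consumed from Python with the standard library. This is a sketch under stated assumptions: the wire format of the stream (for example server-sent events or newline-delimited JSON) is not specified in this section, so the code simply yields raw non-empty lines and leaves parsing to the caller. All function names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.axelered.com/v1"


def build_search_request(workspace_id, api_key, query, collection_id,
                         stream=True):
    """Build the POST request for /search without sending it."""
    payload = {"query": query, "collection_id": collection_id, "stream": stream}
    return urllib.request.Request(
        f"{API_BASE}/w/{workspace_id}/search",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def stream_lines(response):
    """Yield non-empty decoded lines from any iterable of byte lines."""
    for raw_line in response:
        line = raw_line.decode("utf-8").rstrip("\r\n")
        if line:
            yield line


def search(request, timeout=60):
    """Send the request and yield the answer line by line as it arrives."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        yield from stream_lines(response)


# Placeholder identifiers; substitute your own workspace, collection, and key.
request = build_search_request(
    "WORKSPACE_ID", "YOUR_API_KEY",
    "Summarize the Q4 financials based on the uploaded reports.",
    "COLLECTION_ID",
)
# for line in search(request):  # uncomment to perform the network call
#     print(line)
```

Because `search` is a generator, output can be displayed as soon as each line arrives instead of waiting for the full answer.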