Technical reference for the semantic search endpoint.

POST /v1/search

Perform semantic search across indexed documents. Returns relevant chunks without LLM processing.

Authentication

  • API Key or Frontend Token

Content-Type

http
Content-Type: application/json

Request Body

json
{ "datasetId": "string", "question": "string", "filters": [] }

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| datasetId | string | Yes* | Dataset to search (*optional with frontend token) |
| question | string | Yes | Search query or question |
| filters | array | No | Qdrant filters for metadata-based filtering |
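The parameter rules above can be sketched as a small request-body builder. This is only an illustration, not part of any official SDK: `build_search_body` and its `using_frontend_token` flag are hypothetical names. The only things taken from this reference are the three field names and the rule that `datasetId` may be omitted with a frontend token.

```python
from typing import Any, Optional


def build_search_body(
    question: str,
    dataset_id: Optional[str] = None,
    filters: Optional[list[dict[str, Any]]] = None,
    using_frontend_token: bool = False,  # hypothetical flag, not an API field
) -> dict[str, Any]:
    """Assemble a /v1/search JSON body, enforcing the required-field rules."""
    if not question or not question.strip():
        raise ValueError("question is required")
    # datasetId is required unless a frontend token already scopes the request.
    if dataset_id is None and not using_frontend_token:
        raise ValueError("datasetId is required when using an API key")

    body: dict[str, Any] = {"question": question}
    if dataset_id is not None:
        body["datasetId"] = dataset_id
    if filters:
        # filters is optional, so it is only sent when non-empty.
        body["filters"] = filters
    return body
```

Validating locally avoids spending a round trip on a request that would come back as 400 Bad Request.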

Request Example

bash
curl -X POST https://api.easyrag.com/v1/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "datasetId": "my-dataset",
    "question": "What is the refund policy?"
  }'

Response (200)

json
{
  "success": true,
  "data": [
    {
      "score": 0.8924,
      "pageContent": "To reset your password, navigate to the login page...",
      "metadata": {
        "fileId": "f7a3b2c1-4d5e",
        "originalName": "user-manual.pdf",
        "customerId": "user_abc123",
        "datasetId": "my-dataset",
        "department": "support"
      }
    },
    {
      "score": 0.8567,
      "pageContent": "For security purposes, passwords must be reset...",
      "metadata": {
        "fileId": "a1b2c3d4-5e6f",
        "originalName": "security-policy.pdf",
        "customerId": "user_abc123",
        "datasetId": "my-dataset"
      }
    }
  ]
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Always true on success |
| data | array | Array of search results (typically 5-10) |
| data[].score | number | Similarity score (0-1, higher is better) |
| data[].pageContent | string | Text chunk from document |
| data[].metadata | object | Metadata about source document |
| data[].metadata.fileId | string | Source file identifier |
| data[].metadata.originalName | string | Source filename |
| data[].metadata.customerId | string | Customer who owns the data |
| data[].metadata.datasetId | string | Dataset containing the chunk |
| data[].metadata.* | any | Additional custom metadata from upload |
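One way to consume these fields is to map each result onto a small typed object. The sketch below assumes the response has already been decoded from JSON into a dict. `SearchResult` and `parse_results` are illustrative names, not part of the API, and treating every undocumented metadata key as custom upload metadata is an assumption based on the table above.

```python
from dataclasses import dataclass, field
from typing import Any

# Metadata keys documented in the response fields table.
KNOWN_METADATA_KEYS = {"fileId", "originalName", "customerId", "datasetId"}


@dataclass
class SearchResult:
    score: float
    page_content: str
    file_id: str
    original_name: str
    extra: dict[str, Any] = field(default_factory=dict)


def parse_results(payload: dict[str, Any]) -> list[SearchResult]:
    """Convert a decoded /v1/search response into SearchResult objects."""
    results = []
    for item in payload.get("data", []):
        meta = item.get("metadata", {})
        results.append(
            SearchResult(
                score=float(item["score"]),
                page_content=item["pageContent"],
                file_id=meta.get("fileId", ""),
                original_name=meta.get("originalName", ""),
                # Assumption: anything outside the documented keys is custom metadata.
                extra={k: v for k, v in meta.items() if k not in KNOWN_METADATA_KEYS},
            )
        )
    return results
```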

Empty Results

json
{ "success": true, "data": [] }

Filtering

Apply metadata filters to narrow search results.

Filter Structure

json
{
  "filters": [
    {
      "key": "metadata_field",
      "match": { "value": "exact_value" }
    }
  ]
}
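Because every filter entry has the same `key` / `match.value` shape, a tiny helper can build the list from plain key-value pairs. This is a convenience sketch only: `match_filter` and `filters_from_mapping` are hypothetical names, and only the exact-match form shown above is covered.

```python
from typing import Any, Mapping


def match_filter(key: str, value: Any) -> dict[str, Any]:
    """Build one exact-match filter entry in the documented structure."""
    return {"key": key, "match": {"value": value}}


def filters_from_mapping(conditions: Mapping[str, Any]) -> list[dict[str, Any]]:
    """Turn {"field": value, ...} into a filters array, one entry per field."""
    return [match_filter(key, value) for key, value in conditions.items()]
```

For example, `filters_from_mapping({"department": "HR"})` produces the `filters` array used in the single-filter request.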

Filter Examples

Single Filter

json
{
  "datasetId": "company-docs",
  "question": "vacation policy",
  "filters": [
    { "key": "department", "match": { "value": "HR" } }
  ]
}

Multiple Filters (AND)

json
{
  "datasetId": "legal-docs",
  "question": "contract terms",
  "filters": [
    { "key": "department", "match": { "value": "legal" } },
    { "key": "year", "match": { "value": 2024 } },
    { "key": "status", "match": { "value": "active" } }
  ]
}
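With multiple filters, a chunk is returned only if it satisfies every entry. The sketch below models that AND rule locally against a metadata dict, which can be handy for unit-testing a filter set before sending it. It is an approximation: real filtering happens server-side in Qdrant, and `matches_all` is a hypothetical helper that only covers exact matches.

```python
from typing import Any


def matches_all(metadata: dict[str, Any], filters: list[dict[str, Any]]) -> bool:
    """Return True only if the metadata satisfies every exact-match filter."""
    return all(
        # A missing key never matches; a present key must equal the filter value.
        f["key"] in metadata and metadata[f["key"]] == f["match"]["value"]
        for f in filters
    )
```

An empty filter list matches everything, which mirrors a request sent without `filters`.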

Boolean Filters

json
{
  "datasetId": "documents",
  "question": "active documents",
  "filters": [
    { "key": "isArchived", "match": { "value": false } }
  ]
}

User Isolation

json
{
  "datasetId": "shared-dataset",
  "question": "my documents",
  "filters": [
    { "key": "userId", "match": { "value": "user_123" } }
  ]
}

Request Examples

JavaScript

javascript
const response = await fetch('https://api.easyrag.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    datasetId: 'my-dataset',
    question: 'What are the key features?'
  })
});

// Error responses carry no "data" field, so check the status before destructuring
if (!response.ok) {
  throw new Error(`Search failed with status ${response.status}`);
}

const { data } = await response.json();
console.log(`Found ${data.length} results`);

Python

python
import requests

response = requests.post(
    'https://api.easyrag.com/v1/search',
    headers={
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    },
    json={
        'datasetId': 'my-dataset',
        'question': 'What are the key features?'
    }
)
# Raise on 4xx/5xx instead of indexing into an error body
response.raise_for_status()

data = response.json()
results = data['data']

With Filters

javascript
const response = await fetch('https://api.easyrag.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    datasetId: 'company-docs',
    question: 'vacation policy',
    filters: [
      { key: 'department', match: { value: 'HR' } },
      { key: 'year', match: { value: 2024 } }
    ]
  })
});
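The same request can be issued with only the Python standard library. The sketch below separates building the request from sending it, so the headers and body can be inspected without network access. `build_request` and `search` are illustrative names, not an official client. The URL, headers, and body fields are the ones shown in the examples above.

```python
import json
import urllib.request
from typing import Any, Optional

SEARCH_URL = "https://api.easyrag.com/v1/search"


def build_request(
    api_key: str,
    dataset_id: str,
    question: str,
    filters: Optional[list[dict[str, Any]]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/search request."""
    body: dict[str, Any] = {"datasetId": dataset_id, "question": question}
    if filters:
        body["filters"] = filters
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(
    api_key: str,
    dataset_id: str,
    question: str,
    filters: Optional[list[dict[str, Any]]] = None,
    timeout: float = 10.0,  # assumed client-side timeout, not an API setting
) -> list[dict[str, Any]]:
    """Send the request and return the "data" array of result chunks."""
    request = build_request(api_key, dataset_id, question, filters)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["data"]
```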

Error Responses

400 Bad Request

Missing required fields

json
{ "error": "datasetId and question are required" }

Dataset mismatch

json
{ "error": "datasetId mismatch between token and request" }

401 Unauthorized

json
{ "error": "Missing API key or token" }

or

json
{ "error": "Invalid API key or token" }

402 Payment Required

json
{
  "error": "INSUFFICIENT_CREDITS",
  "message": "You are out of credits. Please top up to continue.",
  "details": {
    "required": 1,
    "available": 0
  }
}

500 Internal Server Error

json
{ "error": "Internal error" }
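A client can fold these error bodies into a single exception type. The following sketch assumes the status code and decoded JSON body are already available. `SearchError` and `check_response` are hypothetical names, and preferring `message` over `error` for the human-readable text is a choice made here, not something the API mandates.

```python
from typing import Any, Optional


class SearchError(Exception):
    """Raised for any non-200 response from /v1/search."""

    def __init__(self, status: int, code: str, message: str,
                 details: Optional[dict[str, Any]] = None) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code          # value of the "error" field
        self.message = message    # "message" if present, else the "error" text
        self.details = details or {}

    @property
    def out_of_credits(self) -> bool:
        return self.status == 402


def check_response(status: int, body: dict[str, Any]) -> dict[str, Any]:
    """Return the body unchanged on 200, otherwise raise SearchError."""
    if status == 200:
        return body
    code = str(body.get("error", "Unknown error"))
    message = str(body.get("message", code))
    raise SearchError(status, code, message, body.get("details"))
```

Callers can then branch on `err.status` or `err.out_of_credits` instead of inspecting raw JSON.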

Technical Details

Embedding Model

  • Model: OpenAI text-embedding-3-small
  • Dimensions: 1536
  • Languages: 100+ supported

Vector Search

  • Database: Qdrant
  • Distance: Cosine similarity
  • Results: Top 5-10 chunks (configurable server-side)
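For intuition about the distance metric, cosine similarity compares the direction of two embedding vectors and ignores their length. The sketch below computes it in plain Python on toy vectors. It is illustrative only: the service computes this inside Qdrant on 1536-dimensional embeddings. Raw cosine similarity ranges from -1 to 1, and how the service arrives at the documented 0-1 score is not specified here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```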

Ranking

Results are ordered by similarity score (descending):

  • 0.9+: Highly relevant
  • 0.8-0.9: Very relevant
  • 0.7-0.8: Relevant
  • < 0.7: Possibly relevant
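These bands translate directly into a lookup that a UI can use to label results. In this sketch `relevance_band` is a hypothetical helper, and assigning the boundary values (exactly 0.9, 0.8, 0.7) to the higher band is an assumption, since the ranges above do not say which side a boundary belongs to.

```python
def relevance_band(score: float) -> str:
    """Map a similarity score to the descriptive bands listed above."""
    # Assumption: boundary scores fall into the higher band.
    if score >= 0.9:
        return "Highly relevant"
    if score >= 0.8:
        return "Very relevant"
    if score >= 0.7:
        return "Relevant"
    return "Possibly relevant"
```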

Chunk Size

Chunks typically contain 200-400 words, depending on:

  • Upload chunkSize parameter (default 300 tokens)
  • Upload chunkOverlap parameter (default 20 tokens)
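To see how the two parameters interact, here is a minimal sliding-window chunker. It is a sketch under stated assumptions: the service's actual tokenizer and splitting strategy are not documented here, so the input is just a pre-split list standing in for tokens, and `chunk_tokens` is a hypothetical function. The defaults match the values listed above.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def chunk_tokens(
    tokens: Sequence[T],
    chunk_size: int = 300,
    chunk_overlap: int = 20,
) -> list[list[T]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: list[list[T]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks
```

With the defaults, each window starts 280 tokens after the previous one, so consecutive chunks repeat 20 tokens of context.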

Billing

  • Cost: 0.1 credit per search (1 unit)
  • Charged: Before processing (fails if insufficient credits)
  • Refund: None for searches returning zero results

| Operation | Cost |
| --- | --- |
| 1 search | 0.1 credit |
| 10 searches | 1 credit |
| 100 searches | 10 credits |
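The pricing is linear, so cost and remaining capacity can be estimated client-side. The sketch below uses `Decimal` to avoid floating-point drift on values such as 0.1. `search_cost` and `affordable_searches` are hypothetical helpers, and the only rate assumed is the 0.1 credit per search stated above.

```python
from decimal import Decimal

CREDITS_PER_SEARCH = Decimal("0.1")  # from the billing section above


def search_cost(searches: int) -> Decimal:
    """Total credits consumed by a number of searches."""
    if searches < 0:
        raise ValueError("searches must be non-negative")
    return CREDITS_PER_SEARCH * searches


def affordable_searches(balance: Decimal) -> int:
    """How many whole searches a credit balance can pay for."""
    if balance < 0:
        raise ValueError("balance must be non-negative")
    return int(balance // CREDITS_PER_SEARCH)
```

Since credits are charged before processing and zero-result searches are not refunded, every request counts toward the total regardless of what it returns.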

Rate Limits

  • Limit: 1000 requests/minute per customer
  • Shared: With query endpoint
  • Headers: Response includes rate limit information
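To stay under the limit proactively, a client can throttle itself. The sketch below is a simple sliding-window counter with an injectable clock. It is an assumption-laden illustration: the server's actual windowing algorithm and the names of its rate limit headers are not documented here, and `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds interval."""

    def __init__(
        self,
        max_requests: int = 1000,      # documented per-customer limit
        window_seconds: float = 60.0,  # one minute
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent: deque[float] = deque()  # timestamps of recent requests

    def try_acquire(self) -> bool:
        """Record a request and return True, or return False if over the limit."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True
```

Because the limit is shared with the query endpoint, a single limiter instance should guard calls to both.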

Use Cases

Document Discovery

Find relevant documents without AI generation:

javascript
const { data } = await search('company-docs', 'meeting notes Q4 2024');

const files = [...new Set(data.map(r => r.metadata.originalName))];
console.log('Relevant files:', files);
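A small extension of the same idea is to rank files by their best-matching chunk rather than just listing names. The sketch below works on an already-decoded `data` array. `files_by_best_score` is a hypothetical helper, and ranking a file by its single highest chunk score is one possible heuristic, not something the API prescribes.

```python
from typing import Any


def files_by_best_score(results: list[dict[str, Any]]) -> list[tuple[str, float]]:
    """Return (filename, best score) pairs, highest score first."""
    best: dict[str, float] = {}
    for result in results:
        name = result["metadata"]["originalName"]
        # Keep the highest score seen for each source file.
        best[name] = max(best.get(name, 0.0), result["score"])
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)
```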

Custom RAG Pipeline

Get context for your own LLM:

javascript
const { data } = await search('knowledge-base', userQuestion);

const context = data
  .map(r => r.pageContent)
  .join('\n\n');

const answer = await yourLLM.complete({
  system: 'Answer based on context',
  messages: [
    { role: 'system', content: context },
    { role: 'user', content: userQuestion }
  ]
});
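When feeding chunks to your own LLM, it is often useful to cap the context size. The sketch below joins chunks in ranked order until a character budget is reached. `build_context` and its `max_chars` default are assumptions for illustration. A real pipeline would more likely budget in model tokens, which this sketch does not attempt.

```python
from typing import Any


def build_context(
    results: list[dict[str, Any]],
    max_chars: int = 4000,      # assumed budget, tune for your model
    separator: str = "\n\n",
) -> str:
    """Join pageContent values in order, stopping before the budget is exceeded."""
    parts: list[str] = []
    used = 0
    for result in results:
        text = result["pageContent"]
        # The separator only costs characters between chunks, not before the first.
        cost = len(text) + (len(separator) if parts else 0)
        if used + cost > max_chars:
            break
        parts.append(text)
        used += cost
    return separator.join(parts)
```

Since results arrive ordered by score, truncating from the end drops the least relevant chunks first.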

Multi-Tenant Search

Search user-specific documents:

javascript
const { data } = await search('shared-dataset', query, {
  filters: [
    { key: 'userId', match: { value: currentUserId } }
  ]
});

Comparison: Search vs Query

| Feature | /v1/search | /v1/query |
| --- | --- | --- |
| Returns | Raw chunks | AI-generated answer |
| LLM | No | Yes (GPT-4) |
| Streaming | No | Yes |
| Cost | 0.1 credit | 0.1 credit |
| Use Case | Custom chat, discovery | Ready answers |
| Speed | ~500ms | ~2-5s |

Use /v1/search when:

  • Building custom UI
  • Need raw chunks
  • Using your own LLM
  • Want multiple results

Use /v1/query when:

  • Want ready answers
  • Building standard chat
  • Don't need custom prompts

Best Practices

1. Use Natural Language

javascript
// ✅ Good: Natural question
await search('docs', 'How do I reset my password?');

// ❌ Bad: Keywords only
await search('docs', 'password reset');

2. Apply Filters Server-Side

javascript
// ✅ Good: Backend enforces filters
app.post('/api/search', authenticateUser, async (req, res) => {
  const results = await easyragSearch({
    datasetId: 'shared-docs',
    question: req.body.question,
    filters: [
      { key: 'userId', match: { value: req.user.id } }
    ]
  });
  res.json(results);
});

// ❌ Bad: Client controls filters
app.post('/api/search', async (req, res) => {
  const results = await easyragSearch({
    datasetId: 'shared-docs',
    question: req.body.question,
    filters: req.body.filters // Client can change!
  });
  res.json(results);
});
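If a backend does want to accept some client-supplied filters, one option is to sanitize them and always append the isolation filter itself. The sketch below shows that pattern. `enforce_user_filter`, the `allowed_keys` allowlist, and the `userId` default are illustrative assumptions, and the authenticated user id must come from the server's own session or token handling.

```python
from typing import Any, Collection, Optional


def enforce_user_filter(
    client_filters: Optional[list[Any]],
    user_id: str,
    allowed_keys: Collection[str] = frozenset(),  # keys clients may filter on
    isolation_key: str = "userId",
) -> list[dict[str, Any]]:
    """Keep only allowlisted client filters, then add the server-side user filter."""
    safe: list[dict[str, Any]] = []
    for item in client_filters or []:
        if not isinstance(item, dict):
            continue  # ignore malformed entries
        key = item.get("key")
        # Clients may never set the isolation key or any non-allowlisted key.
        if isinstance(key, str) and key in allowed_keys and key != isolation_key:
            safe.append(item)
    safe.append({"key": isolation_key, "match": {"value": user_id}})
    return safe
```

With an empty allowlist this reduces to the "good" example above: the request carries only the server-chosen `userId` filter.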

3. Handle Empty Results

javascript
const { data } = await search(datasetId, question);

if (data.length === 0) {
  console.log('No results found');
} else if (data[0].score < 0.7) {
  console.log('Results may not be very relevant');
}

4. Show Source Information

javascript
data.forEach(result => {
  console.log(`Score: ${result.score.toFixed(3)}`);
  console.log(`From: ${result.metadata.originalName}`);
  console.log(`Content: ${result.pageContent}`);
  console.log('---');
});

TypeScript Definition

typescript
interface SearchRequest {
  datasetId: string;
  question: string;
  filters?: Filter[];
}

interface Filter {
  key: string;
  match: {
    value: string | number | boolean;
  };
}

interface SearchResponse {
  success: true;
  data: SearchResult[];
}

interface SearchResult {
  score: number;
  pageContent: string;
  metadata: {
    fileId: string;
    originalName: string;
    customerId: string;
    datasetId: string;
    [key: string]: any; // Custom metadata
  };
}

Notes

  • Search uses semantic similarity, not keyword matching
  • Results are ordered by relevance (highest score first)
  • Number of results depends on server-side configuration
  • Empty datasets return empty array (not error)
  • Filters are applied before similarity search
  • Metadata fields are indexed for fast filtering
  • Frontend tokens automatically scope to their dataset
  • Search doesn't modify or process content

Related Endpoints

  • POST /v1/query - AI-generated answers
  • POST /v1/files/upload - Upload searchable files
  • GET /v1/files - List indexed files