Searching Documents

Find relevant information in your data

Learn how to search your documents using semantic similarity to find relevant information.

What is Semantic Search?

Semantic search finds information based on meaning, not just exact keywords. This means:

  • "How do I reset my password?" matches "To recover your account credentials..."
  • "refund policy" matches "money-back guarantee within 30 days"
  • Works across different phrasings and languages

Unlike traditional keyword search, semantic search understands context and intent.

How It Works

When you search:

  1. Your question is converted to a vector embedding
  2. Vector similarity finds the most relevant document chunks
  3. Results are ranked by relevance score (0-1, higher is better)
  4. Metadata from your uploads is included
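
Steps 2 and 3 can be illustrated with a small sketch. It assumes cosine similarity as the metric and uses made-up three-dimensional vectors; this guide does not state which embedding model or similarity function the service uses, and the `cosineSimilarity` and `rankChunks` names are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query vector, sort descending, keep the top k.
function rankChunks(queryVector, chunks, k = 3) {
  return chunks
    .map(chunk => ({
      pageContent: chunk.pageContent,
      score: cosineSimilarity(queryVector, chunk.vector)
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy vectors for illustration only; real embeddings have many more dimensions.
const chunks = [
  { pageContent: 'Money-back guarantee within 30 days', vector: [0.0, 0.2, 0.9] },
  { pageContent: 'To recover your account credentials...', vector: [0.9, 0.1, 0.0] }
];

const ranked = rankChunks([1, 0, 0], chunks);
console.log(ranked[0].pageContent);
```

The chunk whose vector points in nearly the same direction as the query vector ranks first, regardless of whether the two texts share any words.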

Basic Search

Using cURL

bash
curl -X POST https://api.easyrag.com/v1/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "datasetId": "my-documents",
    "question": "What are the key features?"
  }'

Using JavaScript

javascript
const response = await fetch('https://api.easyrag.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    datasetId: 'my-documents',
    question: 'What are the key features?'
  })
});

const { data } = await response.json();
console.log('Found', data.length, 'results');

Response

json
{
  "success": true,
  "data": [
    {
      "score": 0.892,
      "pageContent": "The key features include automatic document processing, semantic search, and AI-powered answers...",
      "metadata": {
        "fileId": "f7a3b2c1-4d5e",
        "originalName": "features.pdf",
        "datasetId": "my-documents"
      }
    },
    {
      "score": 0.856,
      "pageContent": "Our platform offers three main capabilities: document upload, intelligent search, and question answering...",
      "metadata": {
        "fileId": "a1b2c3d4-5e6f",
        "originalName": "overview.pdf",
        "datasetId": "my-documents"
      }
    }
  ]
}

Understanding Results

Score

The score field indicates relevance (0-1):

  • 0.9+ - Highly relevant, almost exact match
  • 0.8-0.9 - Very relevant, strong semantic similarity
  • 0.7-0.8 - Relevant, related content
  • Below 0.7 - Possibly relevant, weaker match
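
As a minimal sketch, the bands above can be turned into a label and a cutoff for display purposes. The helper names `relevanceLabel` and `filterByScore` are hypothetical, and assigning the boundary values (0.7, 0.8, 0.9) to the higher band is an assumption, since the ranges above overlap at their edges.

```javascript
// Map a relevance score to the bands described in this guide.
// Assumption: boundary values belong to the higher band.
function relevanceLabel(score) {
  if (score >= 0.9) return 'Highly relevant';
  if (score >= 0.8) return 'Very relevant';
  if (score >= 0.7) return 'Relevant';
  return 'Possibly relevant';
}

// Keep only results at or above a minimum score.
function filterByScore(results, minScore = 0.7) {
  return results.filter(result => result.score >= minScore);
}

const sample = [{ score: 0.892 }, { score: 0.856 }, { score: 0.41 }];
console.log(sample.map(result => relevanceLabel(result.score)));
console.log(filterByScore(sample).length);
```

A fixed cutoff is only a starting point; the right threshold depends on your documents and is worth tuning against real queries.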

Page Content

The actual text chunk from your document:

javascript
result.pageContent // "The key features include automatic document processing..."

This is typically 200-400 words (depending on your chunk size).

Metadata

Information about the source document:

javascript
result.metadata
// {
//   fileId: "f7a3b2c1-4d5e",
//   originalName: "features.pdf",
//   datasetId: "my-documents",
//   // ... any custom metadata you added during upload
// }

Filtering Results

Use metadata filters to narrow your search:

javascript
const response = await fetch('https://api.easyrag.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    datasetId: 'company-docs',
    question: 'vacation policy',
    filters: [
      { key: 'department', match: { value: 'HR' } },
      { key: 'year', match: { value: 2024 } }
    ]
  })
});

This searches only within HR documents from 2024.

See Filtering with Metadata for more details.
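
When filters come from application state, it can help to build the array from a plain object. The following is a sketch only: `buildFilters` is a hypothetical helper, not part of the API, and it produces just the `{ key, match: { value } }` shape shown in the request above.

```javascript
// Convert { department: 'HR', year: 2024 } into the filters array
// used in the search request. Entries with undefined values are skipped.
function buildFilters(criteria) {
  return Object.entries(criteria)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => ({ key, match: { value } }));
}

const filters = buildFilters({ department: 'HR', year: 2024, region: undefined });
console.log(JSON.stringify(filters));
```

The resulting array can be passed as the `filters` field of the request body.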

React Example

Here's a complete search component:

javascript
import { useState } from 'react';

function DocumentSearch({ datasetId, apiKey }) {
  const [query, setQuery] = useState('');
  const [results, setResults] = useState([]);
  const [loading, setLoading] = useState(false);

  const handleSearch = async (e) => {
    e.preventDefault();
    if (!query.trim()) return;

    setLoading(true);
    try {
      const response = await fetch('https://api.easyrag.com/v1/search', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${apiKey}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ datasetId, question: query })
      });

      // Treat non-2xx responses as failures instead of reading `data` from an error body
      if (!response.ok) {
        throw new Error(`Search failed: ${response.status}`);
      }

      const { data } = await response.json();
      setResults(data ?? []);
    } catch (error) {
      console.error('Search failed:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      <form onSubmit={handleSearch}>
        <input
          type="text"
          value={query}
          onChange={(e) => setQuery(e.target.value)}
          placeholder="Search your documents..."
          style={{
            width: '100%',
            padding: '12px',
            fontSize: '16px',
            border: '2px solid #ddd',
            borderRadius: '8px'
          }}
        />
        <button
          type="submit"
          disabled={loading}
          style={{
            marginTop: '10px',
            padding: '12px 24px',
            background: '#4f46e5',
            color: 'white',
            border: 'none',
            borderRadius: '8px',
            cursor: loading ? 'wait' : 'pointer'
          }}
        >
          {loading ? 'Searching...' : 'Search'}
        </button>
      </form>

      <div style={{ marginTop: '20px' }}>
        {results.map((result, i) => (
          <div
            key={i}
            style={{
              padding: '16px',
              margin: '10px 0',
              border: '1px solid #ddd',
              borderRadius: '8px'
            }}
          >
            <div style={{ fontSize: '12px', color: '#666', marginBottom: '8px' }}>
              Score: {result.score.toFixed(3)} • {result.metadata.originalName}
            </div>
            <div>{result.pageContent}</div>
          </div>
        ))}
      </div>
    </div>
  );
}

export default DocumentSearch;

Advanced Features

Search with Auto-Retry

javascript
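// Assumes `apiKey` is defined in the enclosing scope.
async function searchWithRetry(datasetId, question, maxRetries = 3) {
  // Linear backoff: 1s, 2s, 3s, ...
  const wait = (attempt) => new Promise(r => setTimeout(r, 1000 * (attempt + 1)));

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const isLastAttempt = attempt === maxRetries - 1;
    let response;

    try {
      response = await fetch('https://api.easyrag.com/v1/search', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${apiKey}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ datasetId, question })
      });
    } catch (error) {
      // Network failure: retry unless this was the last attempt
      if (isLastAttempt) throw error;
      await wait(attempt);
      continue;
    }

    if (response.ok) {
      return await response.json();
    }

    // Retry on 500 errors
    if (response.status === 500 && !isLastAttempt) {
      await wait(attempt);
      continue;
    }

    // Other statuses (for example 4xx) will not succeed on retry, so fail immediately
    throw new Error(`Search failed: ${response.status}`);
  }
}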

Highlight Matching Text

javascript
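// Escape characters that have special meaning in HTML
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Escape characters that have special meaning in a regular expression
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function highlightText(text, query) {
  const words = query.trim().split(/\s+/).filter(Boolean).map(escapeRegExp);
  if (words.length === 0) return escapeHtml(text);

  // One case-insensitive pattern for all words; splitting on a capturing
  // group places the matched text at the odd indexes of the result
  const regex = new RegExp(`(${words.join('|')})`, 'gi');
  return text
    .split(regex)
    .map((part, i) => (i % 2 === 1 ? `<mark>${escapeHtml(part)}</mark>` : escapeHtml(part)))
    .join('');
}

// Usage (the text is HTML-escaped above, so only <mark> tags are injected)
<div
  dangerouslySetInnerHTML={{
    __html: highlightText(result.pageContent, query)
  }}
/>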

Group Results by File

javascript
function groupByFile(results) {
  const grouped = {};

  results.forEach(result => {
    const fileName = result.metadata.originalName;
    if (!grouped[fileName]) {
      grouped[fileName] = [];
    }
    grouped[fileName].push(result);
  });

  return grouped;
}

// Usage
const grouped = groupByFile(results);
Object.entries(grouped).forEach(([fileName, chunks]) => {
  console.log(`${fileName}: ${chunks.length} matches`);
});

Common Use Cases
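
The examples in this section call a `search(datasetId, question, options)` helper that this guide does not define. One possible sketch follows, assuming the helper resolves to the `data` array of the response and forwards an optional `filters` array. The `createSearch` factory and the environment variable name in the usage comment are hypothetical.

```javascript
// Hypothetical helper: returns a search(datasetId, question, options) function
// bound to an API key. The returned function resolves to the response's `data` array.
function createSearch(apiKey, baseUrl = 'https://api.easyrag.com') {
  return async function search(datasetId, question, options = {}) {
    const body = { datasetId, question };
    if (options.filters) body.filters = options.filters;

    const response = await fetch(`${baseUrl}/v1/search`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(body)
    });

    if (!response.ok) {
      throw new Error(`Search failed: ${response.status}`);
    }

    const { data } = await response.json();
    return data ?? [];
  };
}

// Usage (the environment variable name is an assumption):
// const search = createSearch(process.env.EASYRAG_API_KEY);
// const results = await search('help-docs', 'how to reset password');
```

Binding the key once keeps it out of every call site and makes the helper easy to stub in tests.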

Document Discovery

Help users find relevant documents:

javascript
const results = await search('company-docs', 'meeting notes from last week');
const files = [...new Set(results.map(r => r.metadata.originalName))];
console.log('Relevant documents:', files);

FAQ Search

Build a self-service help center:

javascript
const results = await search('help-docs', 'how to reset password');

// Show top result as "best answer" (guard against an empty result list)
const bestMatch = results[0];
if (bestMatch && bestMatch.score > 0.8) {
  console.log('Answer:', bestMatch.pageContent);
}

Multi-Tenant Search

Search only a user's documents:

javascript
await search('shared-dataset', query, {
  filters: [
    { key: 'userId', match: { value: currentUserId } }
  ]
});

Search vs Query

Use /v1/search when:

  • Building your own UI
  • Need raw chunks for processing
  • Want to show multiple results
  • Building a custom chat interface

Use /v1/query when:

  • Want a ready-to-use answer
  • Need AI-generated responses
  • Building standard Q&A

See Asking Questions for the query endpoint.

Best Practices

1. Write Natural Questions

javascript
// ✅ Good: Natural language
"How do I reset my password?"
"What are the refund policies?"
"Explain the installation process"

// ❌ Bad: Keywords only
"password reset"
"refund"
"install"

Natural-language questions tend to return better results because embeddings capture the context around each word, not just the word itself.

2. Handle Empty Results

javascript
// `search` resolves to the array of results, as in the examples above
const results = await search(datasetId, question);

if (results.length === 0) {
  console.log('No results found. Try different keywords.');
} else if (results[0].score < 0.7) {
  console.log('Results may not be very relevant.');
}

3. Show Source Information

Always show users where results came from:

javascript
{results.map((r, i) => (
  // Several chunks can come from the same file, so fileId alone is not a unique key
  <div key={`${r.metadata.fileId}-${i}`}>
    <small>From: {r.metadata.originalName}</small>
    <p>{r.pageContent}</p>
  </div>
))}

4. Use Filters for Multi-Tenant

javascript
// ✅ Good: Filter by user
await search('shared-docs', query, {
  filters: [{ key: 'userId', match: { value: userId } }]
});

// ❌ Bad: No filtering - shows all users' data
await search('shared-docs', query);

Billing

Each search costs 0.1 credit, the same as a request to /v1/query.

Operation       Cost
1 search        0.1 credit
10 searches     1 credit
100 searches    10 credits
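
For budgeting, the arithmetic can be wrapped in a small helper. This is a sketch based only on the 0.1-credit figure above; `estimateCredits` is a hypothetical name, and any plan-specific pricing is out of scope.

```javascript
// 0.1 credit per search means 10 searches per credit.
const SEARCHES_PER_CREDIT = 10;

function estimateCredits(searchCount) {
  if (!Number.isInteger(searchCount) || searchCount < 0) {
    throw new RangeError('searchCount must be a non-negative integer');
  }
  // Dividing by 10 avoids rounding artifacts such as 3 * 0.1 = 0.30000000000000004.
  return searchCount / SEARCHES_PER_CREDIT;
}

console.log(estimateCredits(250)); // 25
```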

Troubleshooting

No Results

Problem: Search returns an empty array

Solutions:

  • Verify files were uploaded successfully
  • Check that datasetId is correct
  • Try broader search terms
  • List files to confirm they're indexed: GET /v1/files

Low Relevance Scores

Problem: All results have scores below 0.7

Solutions:

  • Try different phrasing
  • Use more descriptive questions
  • Check if content actually exists in your documents
  • Verify documents are in the correct language

Wrong Results

Problem: Results don't match the question

Solutions:

  • Add metadata filters to narrow scope
  • Check if question is too vague
  • Verify the right dataset is being searched
  • Review what content was actually indexed

Next Steps

API Reference

For technical details, see Search API Reference.