Filtering with Metadata

Narrow results using custom metadata

Add custom metadata to your files and use powerful filters to build multi-tenant applications, user-specific search, and advanced document organization.

What is Metadata Filtering?

Metadata filtering lets you:

  1. Tag files with custom information during upload
  2. Search specific subsets of your data
  3. Build multi-tenant apps with complete data isolation
  4. Organize documents by category, user, date, etc.

Think of it like adding labels to files, then searching only within files with specific labels.

Quick Example

javascript
// Upload a file with metadata
await uploadFile(file, {
  metadata: {
    'contract.pdf': {
      userId: 'user_123',
      department: 'legal',
      year: 2024
    }
  }
});

// Search only legal documents from 2024
await search('my-dataset', 'termination clause', {
  filters: [
    { key: 'department', match: { value: 'legal' } },
    { key: 'year', match: { value: 2024 } }
  ]
});

Adding Metadata to Files

During Upload

bash
curl -X POST https://api.easyrag.com/v1/files/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "datasetId=my-dataset" \
  -F 'metadata={"contract.pdf":{"userId":"user_123","department":"legal"}}' \
  -F "file=@contract.pdf"

JavaScript Example

javascript
const formData = new FormData();
formData.append('datasetId', 'my-dataset');
formData.append('file', file);

// Metadata matches filename
const metadata = {
  [file.name]: {
    userId: currentUserId,
    department: 'engineering',
    uploadedAt: new Date().toISOString(),
    isPublic: false
  }
};
formData.append('metadata', JSON.stringify(metadata));

await fetch('https://api.easyrag.com/v1/files/upload', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}` },
  body: formData
});

Metadata Matching

You can specify metadata by:

1. Filename (Recommended)

javascript
{
  "report.pdf": { userId: "user_123" },
  "invoice.xlsx": { userId: "user_456" }
}

2. File ID

javascript
{ "f7a3b2c1-4d5e": { category: "important" } }

3. Array Index

javascript
{ "0": { priority: "high" }, "1": { priority: "low" } }
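
When several files are uploaded together, the filename-keyed map can be built in a loop. The sketch below is illustrative only: `buildMetadataByFilename` is a hypothetical helper, not part of the EasyRAG API, and it assumes each file object exposes a `name` property, as browser `File` objects do.

```javascript
// Hypothetical helper (not part of the EasyRAG API): builds a
// filename-keyed metadata map for a batch of files.
function buildMetadataByFilename(files, sharedFields, perFileFields = {}) {
  const metadata = {};
  for (const file of files) {
    // Two files with the same name would silently overwrite each other.
    if (Object.hasOwn(metadata, file.name)) {
      throw new Error(`Duplicate filename: ${file.name}`);
    }
    // Per-file fields override the shared ones.
    metadata[file.name] = { ...sharedFields, ...(perFileFields[file.name] ?? {}) };
  }
  return metadata;
}

// Stand-ins for browser File objects; only `name` is used here.
const batch = [{ name: 'report.pdf' }, { name: 'invoice.xlsx' }];
const batchMetadata = buildMetadataByFilename(
  batch,
  { userId: 'user_123' },
  { 'invoice.xlsx': { department: 'finance' } }
);
console.log(JSON.stringify(batchMetadata, null, 2));
```

The result is ready to pass through `JSON.stringify` into the `metadata` form field. Rejecting duplicate filenames is a design choice of this sketch: keying by name cannot tell two same-named files apart.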

Using Filters in Search

Basic Filter

javascript
const response = await fetch('https://api.easyrag.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    datasetId: 'company-docs',
    question: 'vacation policy',
    filters: [
      { key: 'department', match: { value: 'HR' } }
    ]
  })
});

This searches only documents tagged with department: 'HR'.
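
The request above can be wrapped in a small builder so the URL, headers, and body are assembled in one place. `buildSearchRequest` is a hypothetical name; the endpoint, headers, and body fields are copied from the example above. Leaving `filters` out when none are given is an assumption of this sketch, not documented API behavior.

```javascript
const SEARCH_URL = 'https://api.easyrag.com/v1/search';

// Hypothetical helper: returns the arguments for fetch() without sending anything.
function buildSearchRequest(apiKey, datasetId, question, filters = []) {
  const body = { datasetId, question };
  // Assumption: an unfiltered search simply omits the `filters` field.
  if (filters.length > 0) {
    body.filters = filters;
  }
  return {
    url: SEARCH_URL,
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(body)
    }
  };
}

const hrRequest = buildSearchRequest('YOUR_API_KEY', 'company-docs', 'vacation policy', [
  { key: 'department', match: { value: 'HR' } }
]);
console.log(hrRequest.url);
console.log(hrRequest.options.body);
// To send it: const response = await fetch(hrRequest.url, hrRequest.options);
```

Separating request construction from the network call keeps the filter logic easy to unit test.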

Multiple Filters (AND Logic)

javascript
const response = await fetch('https://api.easyrag.com/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    datasetId: 'legal-docs',
    question: 'contract terms',
    filters: [
      { key: 'department', match: { value: 'legal' } },
      { key: 'year', match: { value: 2024 } },
      { key: 'status', match: { value: 'active' } }
    ]
  })
});

All filters must match (AND logic).
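
To make the AND semantics concrete, the sketch below evaluates a filter array against a metadata object locally. It only models the behavior described above. `matchesAllFilters` is a hypothetical function, and the strict-equality comparison (so `2024` and `'2024'` differ) is an assumption about how exact match treats types, not documented server behavior.

```javascript
// Local model of AND logic: a document matches only if every filter matches.
function matchesAllFilters(metadata, filters) {
  return filters.every(({ key, match }) => metadata[key] === match.value);
}

const contractMeta = { department: 'legal', year: 2024, status: 'active' };

console.log(matchesAllFilters(contractMeta, [
  { key: 'department', match: { value: 'legal' } },
  { key: 'year', match: { value: 2024 } }
])); // true

console.log(matchesAllFilters(contractMeta, [
  { key: 'department', match: { value: 'legal' } },
  { key: 'status', match: { value: 'archived' } }
])); // false: one failing filter excludes the document
```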

Filter Syntax

javascript
{ key: "metadata_field_name", match: { value: "exact_value" } }
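
Writing each `{ key, match: { value } }` object by hand gets repetitive. A minimal sketch of a converter from a plain object to the filter array follows; `toFilters` is a hypothetical helper name, not an EasyRAG function.

```javascript
// Hypothetical helper: turns { field: value } pairs into the filter syntax above.
function toFilters(criteria) {
  return Object.entries(criteria)
    // Skip undefined values so optional criteria can be passed through safely.
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => ({ key, match: { value } }));
}

console.log(JSON.stringify(toFilters({ department: 'legal', year: 2024 })));
// [{"key":"department","match":{"value":"legal"}},{"key":"year","match":{"value":2024}}]
```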

Common Patterns

Pattern 1: User Isolation

Each user only sees their own documents:

javascript
// Upload with userId
await upload(file, {
  metadata: {
    [file.name]: { userId: currentUserId }
  }
});

// Search user's documents
await search('shared-dataset', query, {
  filters: [
    { key: 'userId', match: { value: currentUserId } }
  ]
});

Pattern 2: Department Access

Organize by department:

javascript
// Upload
await upload(file, {
  metadata: {
    [file.name]: {
      department: 'engineering',
      team: 'backend'
    }
  }
});

// Search department docs
await search('company-docs', query, {
  filters: [
    { key: 'department', match: { value: 'engineering' } }
  ]
});

Pattern 3: Date-Based Filtering

Filter by year or month:

javascript
// Upload with date metadata
await upload(file, {
  metadata: {
    [file.name]: {
      year: 2024,
      month: 12,
      quarter: 'Q4'
    }
  }
});

// Search Q4 2024 documents
await search('reports', query, {
  filters: [
    { key: 'year', match: { value: 2024 } },
    { key: 'quarter', match: { value: 'Q4' } }
  ]
});
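
Because filters are exact-match, date fields like these have to be stored as discrete values at upload time. One way to derive them from a `Date` is sketched below; `dateMetadata` is a hypothetical helper, and using UTC rather than local time is a choice of this sketch.

```javascript
// Hypothetical helper: derives exact-match date fields from a Date.
function dateMetadata(date) {
  // getUTCMonth() is zero-based, so add 1 for a calendar month.
  const month = date.getUTCMonth() + 1;
  return {
    year: date.getUTCFullYear(),
    month,
    quarter: `Q${Math.ceil(month / 3)}`
  };
}

console.log(dateMetadata(new Date('2024-12-15T00:00:00Z')));
// { year: 2024, month: 12, quarter: 'Q4' }
```

Using UTC keeps the derived values the same regardless of the server's time zone, which matters for documents dated near a month or year boundary.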

Pattern 4: Boolean Flags

Use true/false for status:

javascript
// Upload with flags
await upload(file, {
  metadata: {
    [file.name]: {
      isPublic: false,
      isArchived: false,
      requiresApproval: true
    }
  }
});

// Search only non-archived docs
await search('documents', query, {
  filters: [
    { key: 'isArchived', match: { value: false } }
  ]
});

Pattern 5: Client/Project Tagging

Tag by customer or project:

javascript
// Upload
await upload(file, {
  metadata: {
    [file.name]: {
      clientId: 'acme_corp',
      projectId: 'project_123',
      contractType: 'NDA'
    }
  }
});

// Search client documents
await search('legal-docs', query, {
  filters: [
    { key: 'clientId', match: { value: 'acme_corp' } }
  ]
});

Multi-Tenant Strategies

Strategy 1: Dataset Per User (Simplest)

Create a unique dataset for each user:

javascript
const datasetId = `user-${userId}`;

// No filters needed - complete isolation
await upload(datasetId, file);
await search(datasetId, query);

Pros:

  • ✅ Complete data isolation
  • ✅ No filter complexity
  • ✅ Simpler security

Cons:

  • ❌ More datasets to manage
  • ❌ Can't share documents between users

Strategy 2: Shared Dataset + Filters (Flexible)

One dataset with user filters:

javascript
// Always filter by userId
await search('shared-dataset', query, {
  filters: [
    { key: 'userId', match: { value: currentUserId } }
  ]
});

Pros:

  • ✅ Single dataset
  • ✅ Easy to share documents
  • ✅ Flexible permissions

Cons:

  • ❌ Must apply the user filter on every search
  • ❌ Risk of data leakage if a filter is forgotten
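
One way to reduce the risk of a forgotten filter is to hand application code a search function that already has the user filter baked in. The sketch below assumes a `search(datasetId, question, { filters })` function shaped like the one used in this guide; `createUserScopedSearch` is a hypothetical wrapper, not part of the EasyRAG API.

```javascript
// Hypothetical wrapper: returns a search function that always filters by userId.
function createUserScopedSearch(searchFn, userId) {
  if (!userId) {
    throw new Error('userId is required for a scoped search');
  }
  return (datasetId, question, options = {}) => {
    // Drop any caller-supplied userId filter so the scope cannot be overridden.
    const extraFilters = (options.filters ?? []).filter((f) => f.key !== 'userId');
    return searchFn(datasetId, question, {
      ...options,
      filters: [{ key: 'userId', match: { value: userId } }, ...extraFilters]
    });
  };
}

// Stub standing in for the real search call; it just echoes the filters it received.
const echoSearch = (datasetId, question, options) => options.filters;

const searchAsUser = createUserScopedSearch(echoSearch, 'user_123');
console.log(JSON.stringify(searchAsUser('shared-dataset', 'vacation policy', {
  filters: [
    { key: 'userId', match: { value: 'someone_else' } },
    { key: 'year', match: { value: 2024 } }
  ]
})));
// [{"key":"userId","match":{"value":"user_123"}},{"key":"year","match":{"value":2024}}]
```

Creating the scoped function once per authenticated request means individual call sites cannot omit or replace the user filter.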

Strategy 3: Hierarchical (Organization → Department → User)

Multi-level organization:

javascript
// Upload with hierarchy
await upload(file, {
  metadata: {
    [file.name]: {
      organizationId: 'acme_corp',
      departmentId: 'engineering',
      userId: 'user_123'
    }
  }
});

// Search at different levels

// Organization-wide
await search('docs', query, {
  filters: [
    { key: 'organizationId', match: { value: 'acme_corp' } }
  ]
});

// Department-wide
await search('docs', query, {
  filters: [
    { key: 'organizationId', match: { value: 'acme_corp' } },
    { key: 'departmentId', match: { value: 'engineering' } }
  ]
});

// User-specific
await search('docs', query, {
  filters: [
    { key: 'userId', match: { value: 'user_123' } }
  ]
});
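
The three levels can be generated from a single helper so callers pick a scope instead of hand-writing filter arrays. `filtersForScope` below is a hypothetical sketch. Unlike the user-specific search above, it also includes the parent levels at user scope, which is a stricter choice that guards against user IDs that are not unique across organizations.

```javascript
// Field names ordered from the widest level to the narrowest.
const HIERARCHY = ['organizationId', 'departmentId', 'userId'];
const SCOPE_DEPTH = new Map([
  ['organization', 1],
  ['department', 2],
  ['user', 3]
]);

// Hypothetical helper: builds the filter array for a given scope.
function filtersForScope(scope, context) {
  const depth = SCOPE_DEPTH.get(scope);
  if (depth === undefined) {
    throw new Error(`Unknown scope: ${scope}`);
  }
  return HIERARCHY.slice(0, depth).map((key) => {
    if (context[key] === undefined) {
      throw new Error(`Missing ${key} for scope "${scope}"`);
    }
    return { key, match: { value: context[key] } };
  });
}

const context = { organizationId: 'acme_corp', departmentId: 'engineering', userId: 'user_123' };
console.log(JSON.stringify(filtersForScope('department', context)));
// [{"key":"organizationId","match":{"value":"acme_corp"}},{"key":"departmentId","match":{"value":"engineering"}}]
```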

Real-World Examples

SaaS Application

javascript
// Each user uploads to shared dataset
const uploadUserDoc = async (file, userId) => {
  const formData = new FormData();
  formData.append('datasetId', 'saas-documents');
  formData.append('file', file);
  formData.append('metadata', JSON.stringify({
    [file.name]: {
      userId,
      uploadedAt: Date.now(),
      isPrivate: true
    }
  }));

  return fetch('https://api.easyrag.com/v1/files/upload', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${apiKey}` },
    body: formData
  });
};

// User searches only their documents
const searchUserDocs = async (userId, query) => {
  return fetch('https://api.easyrag.com/v1/search', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      datasetId: 'saas-documents',
      question: query,
      filters: [
        { key: 'userId', match: { value: userId } }
      ]
    })
  });
};

Customer Support

javascript
// Upload support ticket
const uploadTicket = async (file, ticketData) => {
  const formData = new FormData();
  formData.append('datasetId', 'support-docs');
  formData.append('file', file);
  formData.append('metadata', JSON.stringify({
    [file.name]: {
      customerId: ticketData.customerId,
      ticketId: ticketData.ticketId,
      priority: ticketData.priority,
      status: 'open',
      category: ticketData.category
    }
  }));

  return fetch('https://api.easyrag.com/v1/files/upload', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${apiKey}` },
    body: formData
  });
};

// Search customer history
const searchCustomer = async (customerId) => {
  return fetch('https://api.easyrag.com/v1/search', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      datasetId: 'support-docs',
      question: 'What issues has this customer reported?',
      filters: [
        { key: 'customerId', match: { value: customerId } }
      ]
    })
  });
};

Legal Documents

javascript
// Upload contract
const uploadContract = async (file, contractData) => {
  const formData = new FormData();
  formData.append('datasetId', 'legal-contracts');
  formData.append('file', file);
  formData.append('metadata', JSON.stringify({
    [file.name]: {
      clientId: contractData.clientId,
      contractType: contractData.type,
      effectiveDate: contractData.effectiveDate,
      expiryDate: contractData.expiryDate,
      status: 'active'
    }
  }));

  return fetch('https://api.easyrag.com/v1/files/upload', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${apiKey}` },
    body: formData
  });
};

// Search active contracts by type
const searchContracts = async (contractType) => {
  return fetch('https://api.easyrag.com/v1/search', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      datasetId: 'legal-contracts',
      question: 'What are the termination clauses?',
      filters: [
        { key: 'contractType', match: { value: contractType } },
        { key: 'status', match: { value: 'active' } }
      ]
    })
  });
};

Best Practices

1. Always Include User/Owner ID

javascript
// ✅ Good
await upload(file, {
  metadata: {
    [file.name]: {
      userId: currentUserId,
      // ... other fields
    }
  }
});

// ❌ Bad - no ownership
await upload(file, {
  metadata: {
    [file.name]: {
      category: 'document'
    }
  }
});

2. Use Consistent Field Names

javascript
// ✅ Good - consistent camelCase
{
  userId: 'user_123',
  organizationId: 'org_456',
  departmentId: 'dept_789'
}

// ❌ Bad - inconsistent
{
  user_id: 'user_123',
  orgId: 'org_456',
  department: 'dept_789'
}

3. Keep Metadata Flat

javascript
// ✅ Good - flat structure
{
  userId: 'user_123',
  clientName: 'Acme Corp',
  clientId: 'client_456'
}

// ❌ Bad - nested (hard to filter)
{
  userId: 'user_123',
  client: {
    name: 'Acme Corp',
    id: 'client_456'
  }
}
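
If your application data is already nested, it can be flattened before upload. The sketch below is one possible approach, not an EasyRAG feature: `flattenMetadata` is a hypothetical helper that joins nested keys in camelCase and leaves arrays and other non-plain values untouched.

```javascript
// True only for plain object literals, so Dates, arrays, and null are left alone.
function isPlainObject(value) {
  return value !== null
    && typeof value === 'object'
    && Object.getPrototypeOf(value) === Object.prototype;
}

// Hypothetical helper: { client: { name } } becomes { clientName }.
function flattenMetadata(source, prefix = '', out = {}) {
  for (const [key, child] of Object.entries(source)) {
    const name = prefix
      ? prefix + key.charAt(0).toUpperCase() + key.slice(1)
      : key;
    if (isPlainObject(child)) {
      flattenMetadata(child, name, out);
    } else {
      out[name] = child;
    }
  }
  return out;
}

console.log(flattenMetadata({
  userId: 'user_123',
  client: { name: 'Acme Corp', id: 'client_456' }
}));
// { userId: 'user_123', clientName: 'Acme Corp', clientId: 'client_456' }
```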

4. Use Boolean for Binary States

javascript
// ✅ Good
{
  isPublic: true,
  isArchived: false,
  requiresApproval: false
}

// ❌ Bad
{
  visibility: 'public', // Use boolean instead
  archived: 'no',       // Use boolean instead
}

5. Think About Common Queries

javascript
// Design metadata for how you'll search
{
  userId: 'user_123',

  // Temporal
  year: 2024,
  month: 12,
  quarter: 'Q4',

  // Categorical
  department: 'legal',
  category: 'contract',

  // Status
  isActive: true,
  isPriority: false
}

Security Considerations

Always Enforce Filters Server-Side

javascript
// ✅ SECURE - Backend controls filters
app.post('/api/search', authenticateUser, async (req, res) => {
  const results = await search('shared-docs', req.body.question, {
    filters: [
      // Backend enforces this
      { key: 'userId', match: { value: req.user.id } }
    ]
  });
  res.json(results);
});

// ❌ INSECURE - Client controls filters
app.post('/api/search', async (req, res) => {
  const results = await search('shared-docs', req.body.question, {
    filters: req.body.filters // Client can change this!
  });
  res.json(results);
});

Validate Metadata on Backend

javascript
// ✅ SECURE
app.post('/api/upload', authenticateUser, async (req, res) => {
  // Ownership fields come from the authenticated session, never from the request body
  const metadata = {
    userId: req.user.id,
    organizationId: req.user.organizationId
  };
  await upload(req.file, { metadata });
  res.sendStatus(201); // Respond so the request does not hang
});

// ❌ INSECURE
app.post('/api/upload', async (req, res) => {
  const metadata = req.body.metadata; // Client controls!
  await upload(req.file, { metadata });
  res.sendStatus(201);
});
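
If clients are allowed to supply some descriptive fields, the backend can merge them with the trusted ownership fields through an allowlist. This is a minimal sketch under stated assumptions: `buildTrustedMetadata` and the field names in `ALLOWED_CLIENT_FIELDS` are hypothetical, and `user` stands for whatever your authentication middleware attaches to the request.

```javascript
// Fields a client may set. Everything else is dropped. (Example names only.)
const ALLOWED_CLIENT_FIELDS = new Set(['category', 'department', 'year']);

// Hypothetical helper: merges allowlisted client fields with server-owned fields.
function buildTrustedMetadata(clientFields, user) {
  const safe = {};
  for (const [key, value] of Object.entries(clientFields ?? {})) {
    // Keep metadata flat: accept only scalar values.
    const isScalar = ['string', 'number', 'boolean'].includes(typeof value);
    if (ALLOWED_CLIENT_FIELDS.has(key) && isScalar) {
      safe[key] = value;
    }
  }
  // Ownership fields are written last, so the client can never override them.
  return { ...safe, userId: user.id, organizationId: user.organizationId };
}

console.log(buildTrustedMetadata(
  { category: 'contract', userId: 'someone_else', nested: { a: 1 } },
  { id: 'user_123', organizationId: 'org_456' }
));
// { category: 'contract', userId: 'user_123', organizationId: 'org_456' }
```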

Troubleshooting

Filters Not Working

Problem: Search ignores filters

Solutions:

  1. Verify metadata was saved:

    javascript
    const file = await getFile(fileId);
    console.log(file.extraMeta);
  2. Check filter syntax:

    javascript
    // ✅ Correct
    { key: 'userId', match: { value: 'user_123' } }

    // ❌ Wrong
    { userId: 'user_123' }
  3. Ensure field names match exactly (case-sensitive)
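
The checks above can be automated by comparing a file's stored metadata with the filters you are sending. `explainFilterMismatches` below is a hypothetical debugging helper that runs entirely on your side. Treating a type difference (for example `2024` versus `'2024'`) as a mismatch is an assumption about exact-match behavior, so use its output as a hint rather than a verdict.

```javascript
// Hypothetical debugging helper: lists reasons a filter may not match stored metadata.
function explainFilterMismatches(metadata, filters) {
  const problems = [];
  for (const { key, match } of filters) {
    if (!Object.hasOwn(metadata, key)) {
      // Look for the same name in a different case, a common cause of misses.
      const similar = Object.keys(metadata)
        .find((k) => k.toLowerCase() === key.toLowerCase());
      problems.push(similar
        ? `"${key}" not found; did you mean "${similar}"?`
        : `"${key}" not found in metadata`);
    } else if (typeof metadata[key] !== typeof match.value) {
      problems.push(
        `"${key}" is a ${typeof metadata[key]} in metadata but a ${typeof match.value} in the filter`
      );
    } else if (metadata[key] !== match.value) {
      problems.push(`"${key}" has a different value in metadata`);
    }
  }
  return problems;
}

const storedMeta = { userId: 'user_123', year: 2024 };
console.log(explainFilterMismatches(storedMeta, [
  { key: 'userid', match: { value: 'user_123' } },
  { key: 'year', match: { value: '2024' } }
]));
```

An empty array means every filter matches the metadata as stored.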

Metadata Not Appearing

Problem: Files don't have metadata

Solutions:

  1. Stringify metadata:

    javascript
    formData.append('metadata', JSON.stringify(metadata));
  2. Match filename:

    javascript
    const metadata = { [file.name]: { userId: '123' } };

Filter Reference

Supported Filters

| Type           | Syntax                                   | Example                                                         |
| Exact Match    | { key: 'field', match: { value: 'x' } }  | { key: 'userId', match: { value: 'user_123' } }                 |
| Multiple (AND) | Array of filters                         | [{key: 'a', match: {value: 1}}, {key: 'b', match: {value: 2}}]  |

Note: Currently only exact match is supported. For range queries or OR logic, contact support.
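
Until OR logic is available, one client-side approximation is to run one exact-match search per value and combine the results yourself. The sketch below is an assumption-heavy illustration, not a supported feature: `expandAnyOf` and `searchAnyOf` are hypothetical helpers, `searchFn` stands for a `search(datasetId, question, { filters })` function like the one used in this guide, and the results are returned unmerged because the response shape is not described here.

```javascript
// Hypothetical helper: one AND-filter array per candidate value.
function expandAnyOf(key, values, baseFilters = []) {
  // A Set removes duplicate values so the same search is not issued twice.
  return [...new Set(values)].map((value) => [
    ...baseFilters,
    { key, match: { value } }
  ]);
}

// Hypothetical helper: runs the searches in parallel and pairs each result
// with the filters that produced it. Merging is left to the caller.
async function searchAnyOf(searchFn, datasetId, question, key, values, baseFilters = []) {
  const filterSets = expandAnyOf(key, values, baseFilters);
  const results = await Promise.all(
    filterSets.map((filters) => searchFn(datasetId, question, { filters }))
  );
  return results.map((result, i) => ({ filters: filterSets[i], result }));
}

console.log(JSON.stringify(
  expandAnyOf('department', ['legal', 'HR'], [{ key: 'year', match: { value: 2024 } }])
));
```

Each value costs one request, so this only suits small value lists.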