Building Multi-Tenant RAG Applications
Best practices for building secure, scalable multi-tenant RAG applications.
Multi-tenancy is crucial for SaaS applications, but it presents unique challenges when building RAG systems. In this guide, we'll explore best practices for building secure, scalable multi-tenant RAG applications.
What is Multi-Tenancy in RAG?
Multi-tenancy means serving multiple customers (tenants) from a single application instance while keeping their data completely isolated. In RAG systems, this includes:
- Documents: Each tenant's files are separate
- Vector embeddings: Tenant data must not leak across boundaries
- Search results: Users only see their own data
- Metadata: Custom fields per tenant
Architecture Patterns
Pattern 1: Shared Collection with Filtering
The most cost-effective approach for most applications.
```javascript
// Store all tenants in one vector collection
const vectorDb = new QdrantDb({
  collection: 'shared-collection',
  customerId: 'customer_123',
  datasetId: 'dataset_456'
});

// Filter is applied automatically
const results = await vectorDb.search(query);
```
Pros:
- Simple to manage
- Cost-effective (1 collection for all tenants)
- Scales to 100,000+ tenants
- Lower memory overhead
Cons:
- All customers share resources
- One slow query can affect others
- Isolation is logical rather than physical, which some customers see as a risk
When to use: Start-ups to mid-scale SaaS (< 10,000 customers)
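The automatic filter in Pattern 1 presumably resolves to a Qdrant payload filter on the tenant fields. The following is a minimal sketch of what such a filter could look like, assuming vectors are stored with customerId and datasetId payload fields. buildTenantFilter is a hypothetical helper, not part of the QdrantDb wrapper shown above.

```javascript
// Hypothetical helper: builds a Qdrant-style payload filter that scopes
// a search to one tenant and, optionally, one dataset.
function buildTenantFilter(customerId, datasetId) {
  if (typeof customerId !== 'string' || customerId.length === 0) {
    // Fail closed: an unscoped search would read across tenants.
    throw new Error('customerId is required');
  }

  const must = [{ key: 'customerId', match: { value: customerId } }];
  if (datasetId) {
    must.push({ key: 'datasetId', match: { value: datasetId } });
  }
  return { must };
}
```

Throwing on a missing customer ID, rather than returning an empty filter, means a bug upstream results in an error instead of a cross-tenant search.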
Pattern 2: Per-Customer Collections
Complete isolation with dedicated vector collections per customer.
```javascript
// Each customer gets their own collection
const collectionName = `customer_${customerId}`;
const vectorDb = new QdrantDb({ collection: collectionName });
```
Pros:
- Stronger isolation (hardware-level only if each collection runs on dedicated nodes)
- Easy to identify resource usage
- Can move customers independently
- No noisy neighbor issues
Cons:
- Management complexity
- Higher costs (5-10x more)
- Collection limits (Qdrant: ~1000)
- Wasted resources for small customers
When to use: Enterprise-focused, < 100 customers, high-value accounts
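Because the collection name in Pattern 2 is derived from the customer ID, it is worth validating the ID before interpolating it. The sketch below assumes IDs consist of letters, digits, underscores, and hyphens. That character set and the collectionNameFor helper are assumptions for illustration, not a documented Qdrant rule.

```javascript
// Hypothetical helper: derives a per-customer collection name from a
// validated customer ID.
const CUSTOMER_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/; // assumed ID format

function collectionNameFor(customerId) {
  if (typeof customerId !== 'string' || !CUSTOMER_ID_PATTERN.test(customerId)) {
    throw new Error('Invalid customer ID');
  }
  return `customer_${customerId}`;
}
```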
Pattern 3: Hybrid (Recommended at Scale)
Combine both approaches for optimal cost and performance.
```javascript
async function getCollectionStrategy(customerId) {
  const customer = await getCustomer(customerId);

  // Enterprise: dedicated collection
  if (customer.tier === 'enterprise' || customer.vectorCount > 1000000) {
    return { collection: `customer_${customerId}`, useFilters: false };
  }

  // Everyone else: shared with filters
  return { collection: 'shared-collection', useFilters: true };
}
```
Pros:
- Best of both worlds
- 95% of customers in cost-effective shared
- 5% enterprise customers get isolation
- Scales to 100,000+ total customers
Cons:
- More complex logic
- Need migration path
- Two codepaths to maintain
When to use: Scaling SaaS with enterprise tier (10,000+ customers)
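One way to keep the two codepaths of the hybrid pattern close together is to turn the strategy object into a search request in a single place. This is a sketch under the assumption that the strategy has the shape { collection, useFilters } and that the underlying search accepts a Qdrant-style vector, limit, and filter. buildSearchRequest is a hypothetical name.

```javascript
// Hypothetical helper: converts a collection strategy into the collection
// name and search parameters for one tenant.
function buildSearchRequest(strategy, customerId, vector, limit = 10) {
  const params = { vector, limit };

  if (strategy.useFilters) {
    // Shared collection: the tenant filter is mandatory.
    params.filter = {
      must: [{ key: 'customerId', match: { value: customerId } }]
    };
  }
  // Dedicated collection: the collection itself is the tenant boundary.
  return { collection: strategy.collection, params };
}
```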
Security Best Practices
1. Always Filter by Customer ID
Never trust a customer ID supplied in the request; always derive it from the authenticated session or token.
```javascript
// ❌ INSECURE - user controls customerId
app.post('/search', async (req, res) => {
  const { customerId, query } = req.body; // User could spoof this!
  const results = await search(customerId, query);
  res.json(results);
});

// ✅ SECURE - customerId from auth
app.post('/search', authenticate, async (req, res) => {
  const customerId = req.user.id; // From verified token/session
  const { query } = req.body;
  const results = await search(customerId, query);
  res.json(results);
});
```
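The authenticate middleware used in the secure route is not shown in this guide. A minimal sketch of one possible shape follows, assuming a bearer token and a verifyToken function supplied by your auth library that throws on an invalid token and otherwise returns claims containing customerId. Both names are assumptions.

```javascript
// Hypothetical factory for an Express-style middleware. `verifyToken` is
// injected so the sketch does not depend on a specific auth library.
function makeAuthenticate(verifyToken) {
  return function authenticate(req, res, next) {
    const header = req.headers.authorization || '';
    const [scheme, token] = header.split(' ');
    if (scheme !== 'Bearer' || !token) {
      return res.status(401).json({ error: 'Missing bearer token' });
    }

    let claims;
    try {
      claims = verifyToken(token);
    } catch (err) {
      return res.status(401).json({ error: 'Invalid token' });
    }

    // Only verified claims populate req.user; req.body is never consulted.
    req.user = { id: claims.customerId };
    return next();
  };
}
```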
2. Server-Side Filtering
Always apply tenant filters on the server; never trust client-side filters.
```javascript
// Server enforces tenant isolation
export function makeVectorDb({ customerId, datasetId }) {
  return new QdrantDb({
    collection: 'shared-collection',
    // These are baked into the DB instance
    customerId, // From auth
    datasetId   // From auth or request
  });
}

// All queries are automatically scoped
const vectorDb = makeVectorDb({ customerId, datasetId });
const results = await vectorDb.search(query);
// Qdrant only searches this customer's vectors
```
3. Dataset-Level Access Control
Allow customers to segment their own data.
```javascript
// Upload with metadata
await client.upload('dataset-123', file, {
  metadata: {
    'document.pdf': {
      userId: 'user_456',
      department: 'legal',
      confidential: true
    }
  }
});

// Search with filters
const results = await client.search('dataset-123', query, {
  filters: [
    { key: 'department', match: { value: 'legal' } },
    { key: 'confidential', match: { value: false } }
  ]
});
```
4. Validate Dataset Ownership
Ensure users can only access their own datasets.
```javascript
app.post('/search', authenticate, async (req, res) => {
  const customerId = req.user.id;
  const { datasetId, query } = req.body;

  // Verify ownership; a missing dataset is treated the same as a foreign
  // one so the response does not reveal which dataset IDs exist
  const dataset = await getDataset(datasetId);
  if (!dataset || dataset.customerId !== customerId) {
    return res.status(403).json({ error: 'Access denied' });
  }

  // Now safe to search
  const results = await search(customerId, datasetId, query);
  res.json(results);
});
```
Performance Optimization
1. Index Payload Fields
Create indexes for all fields you filter on.
```javascript
// Qdrant needs indexes for efficient filtering
await qdrantClient.createPayloadIndex(collection, {
  field_name: 'customerId',
  field_schema: 'keyword'
});

await qdrantClient.createPayloadIndex(collection, {
  field_name: 'datasetId',
  field_schema: 'keyword'
});
```
Without indexes, filters are slow (full collection scan).
2. Monitor Per-Tenant Usage
Track usage to identify problem customers.
```javascript
// Log query metadata
const queryTime = endTime - startTime;
await logQuery({
  customerId,
  datasetId,
  queryTime,
  resultsCount: results.length,
  timestamp: new Date()
});

// Alert on slow queries (over 5 seconds)
if (queryTime > 5000) {
  await alertSlowQuery(customerId, datasetId);
}
```
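A fixed threshold catches individual slow queries, while a per-tenant percentile shows whether a customer is slow consistently. The sketch below is a hypothetical in-memory tracker using a nearest-rank 95th percentile. In production these samples would more likely go to a metrics backend, and createLatencyTracker is an assumed name.

```javascript
// Hypothetical in-memory tracker of query latency per tenant.
function createLatencyTracker() {
  const samples = new Map(); // customerId -> array of query times in ms

  return {
    record(customerId, ms) {
      if (!samples.has(customerId)) samples.set(customerId, []);
      samples.get(customerId).push(ms);
    },

    // Nearest-rank 95th percentile for one tenant, or null with no samples.
    p95(customerId) {
      const values = samples.get(customerId);
      if (!values || values.length === 0) return null;
      const sorted = [...values].sort((a, b) => a - b);
      const rank = Math.ceil((sorted.length * 95) / 100);
      return sorted[rank - 1];
    }
  };
}
```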
3. Rate Limiting Per Tenant
Prevent one customer from overwhelming your system.
```javascript
import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: async (req) => {
    const customer = await getCustomer(req.user.id);
    return customer.tier === 'enterprise' ? 1000 : 100;
  },
  keyGenerator: (req) => req.user.id // Per customer
});

// Run authentication first so req.user is populated for the limiter
app.use('/api', authenticate, limiter);
```
Data Isolation Patterns
Pattern 1: Separate Datasets
Customers manage multiple isolated datasets.
```javascript
// Customer has multiple datasets
const datasets = [
  'customer-123-public',
  'customer-123-private',
  'customer-123-archived'
];

// Each dataset is fully isolated
await client.upload('customer-123-private', sensitiveFile);
await client.search('customer-123-private', query);
```
Use case: Different projects, departments, or security levels
Pattern 2: User-Level Isolation
Sub-tenant isolation within a customer.
```javascript
// Upload with user metadata
await client.upload('customer-dataset', file, {
  metadata: {
    'file.pdf': {
      userId: 'user_789',
      sharedWith: ['user_790', 'user_791']
    }
  }
});

// Search only user's documents
const results = await client.search('customer-dataset', query, {
  filters: [
    { key: 'userId', match: { value: currentUserId } }
  ]
});
```
Use case: Multi-user SaaS apps, collaborative platforms
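The search in Pattern 2 matches only on userId, so documents that list the current user in sharedWith would not be returned. At the Qdrant level, one way to include them is a should clause (logical OR), relying on the fact that a match condition on an array payload field is satisfied when any element equals the value. Whether the filters option of client.search accepts an OR clause is not stated in this guide, so treat the helper below as an assumption-laden sketch; buildUserVisibilityFilter is a hypothetical name.

```javascript
// Hypothetical helper: Qdrant-style filter matching documents the user owns
// or that have been shared with the user.
function buildUserVisibilityFilter(userId) {
  if (typeof userId !== 'string' || userId.length === 0) {
    throw new Error('userId is required');
  }
  return {
    should: [
      { key: 'userId', match: { value: userId } },     // owned by the user
      { key: 'sharedWith', match: { value: userId } }  // user is in the array
    ]
  };
}
```

This filter only covers visibility within a tenant; it still has to be combined with the server-side tenant filter.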
Pattern 3: Hierarchical Isolation
Nested tenant structures.
```javascript
// Organization → Team → User hierarchy
await client.upload('shared', file, {
  metadata: {
    'file.pdf': {
      orgId: 'org_123',
      teamId: 'team_456',
      userId: 'user_789'
    }
  }
});

// Filter at any level
const orgResults = await client.search('shared', query, {
  filters: [
    { key: 'orgId', match: { value: 'org_123' } }
  ]
});

const teamResults = await client.search('shared', query, {
  filters: [
    { key: 'orgId', match: { value: 'org_123' } },
    { key: 'teamId', match: { value: 'team_456' } }
  ]
});
```
Use case: Enterprise apps with complex org structures
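The per-level filter lists above can be generated from a single scope object, which also makes it possible to reject scopes that skip a level (for example a teamId without an orgId). This is a minimal sketch using the same { key, match } filter shape as the examples above; buildHierarchyFilters is a hypothetical helper.

```javascript
// Levels from broadest to narrowest.
const LEVELS = ['orgId', 'teamId', 'userId'];

// Hypothetical helper: builds the filter list for a scope such as
// { orgId, teamId }, requiring every level above the narrowest one given.
function buildHierarchyFilters(scope) {
  const filters = [];
  let gap = false;

  for (const key of LEVELS) {
    const value = scope[key];
    if (value === undefined) {
      gap = true;
      continue;
    }
    if (gap) {
      throw new Error(`${key} was given without its parent level`);
    }
    filters.push({ key, match: { value } });
  }

  if (filters.length === 0) {
    throw new Error('orgId is required');
  }
  return filters;
}
```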
Billing and Usage Tracking
Track Credits Per Customer
```javascript
// Charge for operations
async function chargeCredits({ customerId, operation, amount }) {
  const customer = await getCustomer(customerId);

  if (customer.credits < amount) {
    throw new Error('INSUFFICIENT_CREDITS');
  }

  await updateCustomer(customerId, {
    credits: customer.credits - amount,
    usage: {
      ...customer.usage, // keep the counters for other operations
      [operation]: (customer.usage[operation] || 0) + 1
    }
  });
}

// Apply to operations
await chargeCredits({
  customerId,
  operation: 'upload',
  amount: 1 // 1 credit per file
});

await chargeCredits({
  customerId,
  operation: 'query',
  amount: 0.1 // 0.1 credits per query
});
```
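Fractional amounts such as 0.1 credits accumulate floating-point error in JavaScript (0.1 * 3 is not exactly 0.3), so balances can drift over many queries. One common workaround is to store balances as integer units. The sketch below assumes 1 unit equals 0.1 credit and an account shaped like { units, usage }; applyCharge and UNITS_PER_CREDIT are hypothetical names. It does not address concurrent charges, which would still need a transaction or atomic update in the database.

```javascript
const UNITS_PER_CREDIT = 10; // assumed: 1 unit = 0.1 credit

// Hypothetical pure function: returns a new account state after charging
// `units` for `operation`, without mutating the input.
function applyCharge(account, operation, units) {
  if (!Number.isInteger(units) || units <= 0) {
    throw new Error('units must be a positive integer');
  }
  if (account.units < units) {
    throw new Error('INSUFFICIENT_CREDITS');
  }
  return {
    ...account,
    units: account.units - units,
    usage: {
      ...account.usage,
      [operation]: (account.usage[operation] || 0) + 1
    }
  };
}
```

With this scheme an upload costs 1 * UNITS_PER_CREDIT units and a query costs 1 unit.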
Usage Analytics
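```javascript
// Track vector count per customer
async function getCustomerUsage(customerId) {
  const vectorCount = await vectorDb.count({
    filter: {
      must: [{ key: 'customerId', match: { value: customerId } }]
    }
  });

  // Assumes `db` is a Firestore instance, where count() returns an
  // aggregate query that must be executed with get()
  const snapshot = await db.collection('files')
    .where('customerId', '==', customerId)
    .count()
    .get();
  const fileCount = snapshot.data().count;

  return { vectorCount, fileCount };
}
```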
Testing Multi-Tenancy
Test Data Isolation
```javascript
describe('Multi-tenancy', () => {
  it('should not return other customers data', async () => {
    // Upload for customer A
    await client.upload('customer-a-dataset', fileA, {
      metadata: { 'file.pdf': { customerId: 'customer-a' } }
    });

    // Upload for customer B
    await client.upload('customer-b-dataset', fileB, {
      metadata: { 'file.pdf': { customerId: 'customer-b' } }
    });

    // Search as customer A
    const clientA = new EasyRAG(customerAToken);
    const resultsA = await clientA.search('customer-a-dataset', 'test');

    // Should only see customer A's data
    expect(resultsA.data.every(r =>
      r.metadata.customerId === 'customer-a'
    )).toBe(true);
  });
});
```
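One caveat with an every-based assertion is that it also passes when the result list is empty, so a search that silently returns nothing would look like perfect isolation. A small helper can require at least one result and report exactly which results leaked. This is a sketch assuming each result carries metadata.customerId as in the test above; checkIsolation is a hypothetical name.

```javascript
// Hypothetical test helper: verifies that a non-empty result list contains
// only the given tenant's data, and lists any offending results.
function checkIsolation(results, customerId) {
  const leaks = results.filter(
    (r) => !r.metadata || r.metadata.customerId !== customerId
  );
  return {
    ok: results.length > 0 && leaks.length === 0,
    leaks
  };
}
```

In the test above, this could be used as expect(checkIsolation(resultsA.data, 'customer-a').ok).toBe(true).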
Load Testing Per Tenant
```javascript
// Simulate multiple tenants
// baseUrl is the server under test, e.g. 'http://localhost:3000'
async function loadTest(baseUrl) {
  const tenants = Array.from({ length: 100 }, (_, i) => ({
    id: `customer-${i}`,
    token: generateToken(`customer-${i}`)
  }));

  // Concurrent queries from all tenants
  await Promise.all(
    tenants.map(tenant =>
      fetch(`${baseUrl}/api/search`, {
        method: 'POST', // fetch defaults to GET, which cannot carry a body
        headers: {
          Authorization: `Bearer ${tenant.token}`,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ query: 'test' })
      })
    )
  );
}
```
Migration Strategy
Moving from Single to Multi-Tenant
```javascript
// 1. Add tenant metadata to existing vectors
// Assumes a Firestore-style `db`; adapt the iteration to your database client
async function backfillTenantMetadata() {
  const files = await db.collection('files').get();

  for (const doc of files.docs) {
    const file = doc.data();
    await vectorDb.updatePayload({
      points: file.vectorIds,
      payload: {
        customerId: file.customerId,
        datasetId: file.datasetId
      }
    });
  }
}

// 2. Update queries to use filters
// Before:
//   const results = await vectorDb.search(query);
// After:
const results = await vectorDb.search(query, {
  filter: {
    must: [
      { key: 'customerId', match: { value: customerId } }
    ]
  }
});
```
Moving Customers Between Collections
```javascript
async function migrateCustomer(customerId) {
  const sourceCollection = 'shared-collection';
  // Same naming scheme as the per-customer collections above
  const targetCollection = `customer_${customerId}`;

  // 1. Create new collection
  await createCollection(targetCollection);

  // 2. Copy vectors
  const vectors = await getCustomerVectors(customerId, sourceCollection);
  await insertVectors(targetCollection, vectors);

  // 3. Verify the copy before touching the source
  const copied = await getCustomerVectors(customerId, targetCollection);
  if (copied.length !== vectors.length) {
    throw new Error(`Migration incomplete for ${customerId}`);
  }

  // 4. Update references
  await updateCustomerMetadata(customerId, {
    collection: targetCollection
  });

  // 5. Delete from source
  await deleteCustomerVectors(customerId, sourceCollection);
}
```
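The verification step before deleting from the source can go beyond comparing counts by checking that every source point ID exists in the target. A minimal sketch follows, assuming the point IDs of both collections can be listed as arrays; findMissingIds is a hypothetical helper.

```javascript
// Hypothetical helper: returns the source point IDs that are absent from
// the target collection. An empty array means every point was copied.
function findMissingIds(sourceIds, targetIds) {
  const present = new Set(targetIds);
  return sourceIds.filter((id) => !present.has(id));
}
```

Deleting from the shared collection would then proceed only when findMissingIds returns an empty array.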
Conclusion
Multi-tenancy in RAG systems requires careful consideration of:
- Architecture: Choose the right pattern for your scale
- Security: Server-side filtering, auth-based isolation
- Performance: Proper indexing, monitoring, rate limiting
- Testing: Verify data isolation, load test per tenant
- Economics: Balance cost vs. isolation needs
Start simple with shared collections and filters. Evolve to hybrid as you scale. Most SaaS companies never need per-customer collections.
Resources
Questions? Reach out at support@easyrag.com
