RAG (Retrieval-Augmented Generation) System in SmartAdmin
Overview
SmartAdmin implements a RAG system that augments AI responses with context retrieved from local Markdown documents, so that answers are grounded in the organization's own procedures rather than in the model's general knowledge alone.
Architecture
Core Components
1. RAG Handler (rag-handler.js)
- Primary Controller: Orchestrates the entire RAG pipeline
- Initialization: The initializeRAGSystem function sets up FastEmbed and LanceDB
- Document Management: Handles adding, searching, and refreshing the knowledge base
- Query Processing: Converts user queries into vector searches
2. FastEmbed Engine (fastembed-handler.js)
- Model: BGE-base-en (768-dimensional embeddings)
- Alternative Models: BGE-small-en (384D), BGE-large-en (1024D)
- Functionality: Converts text into dense vector representations
- Similarity Metrics: Multiple calculation methods for document relevance
3. LanceDB Vector Store (lancedb-handler.js)
- Database: Apache Arrow-based vector database
- Schema: Structured storage for embeddings with metadata
- Search: Efficient similarity search with distance thresholds
- Storage Location: ~/Documents/smartadmin/shelf_db/
Knowledge Base Structure
Document Storage
- Source Directory: ~/Documents/smartadmin/shelf/
- Format: Markdown files (.md)
- Content: Administrative procedures, policies, SOPs
- Processing: Automatic description extraction from markdown
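The source does not show how a description is extracted from a markdown file. A minimal sketch, assuming the description is simply the first paragraph of body text; extractDescription is a hypothetical helper for illustration, not a function from SmartAdmin:

```javascript
// Hypothetical helper: returns the first paragraph of body text in a
// markdown string, skipping headings and leading blank lines.
function extractDescription(markdown) {
  const paragraph = [];
  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    const isHeading = line.startsWith('#');
    if (line === '' || isHeading) {
      // A blank line or heading ends the paragraph once content was found.
      if (paragraph.length > 0) break;
      continue;
    }
    paragraph.push(line);
  }
  return paragraph.join(' ');
}

const sample = [
  '# Delivery Address Change SOP',
  '',
  'Steps for changing a delivery',
  'address after checkout.',
  '',
  '## Procedure',
  '1. Verify the order.',
].join('\n');

console.log(extractDescription(sample));
```

Embedding a short description instead of the whole file keeps each vector focused on what the document is about; the full file content is read later, when context is assembled.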
Vector Database Schema
// LanceDB Schema
{
  id: 'string',            // Unique document identifier
  text: 'string',          // Document description/content
  vector: 'float32[768]',  // 768-dimensional embedding
  file: 'string'           // Source file path
}
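A record can be checked against this schema before it is written, which catches a mismatched embedding model early (for example, a 384-dimensional vector from BGE-small-en). The following is a sketch only; validateRecord and EMBEDDING_DIM are hypothetical names, not part of SmartAdmin or LanceDB:

```javascript
const EMBEDDING_DIM = 768; // dimension of BGE-base-en, as stated in the schema

// Hypothetical validator: returns a list of problems, empty when the record is valid.
function validateRecord(record) {
  const problems = [];
  for (const field of ['id', 'text', 'file']) {
    if (typeof record[field] !== 'string') {
      problems.push(`${field} must be a string`);
    }
  }
  const v = record.vector;
  const isNumeric =
    (Array.isArray(v) || v instanceof Float32Array) &&
    Array.from(v).every(x => Number.isFinite(x));
  if (!isNumeric) {
    problems.push('vector must contain only finite numbers');
  } else if (v.length !== EMBEDDING_DIM) {
    problems.push(`vector must have ${EMBEDDING_DIM} dimensions, got ${v.length}`);
  }
  return problems;
}

const record = {
  id: 'sop-1',
  text: 'Steps for changing a delivery address.',
  vector: new Array(384).fill(0.1), // wrong size on purpose
  file: 'address-change.md',
};
console.log(validateRecord(record));
```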
RAG Workflow
1. Document Ingestion
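// From rag-handler.js
async function addDocument(text, id = null, filepath = '') {
  // Generate embeddings using FastEmbed
  const embeddingGen = await embedSentence(fastEmbedModel, [text]);
  // Take the single vector out of the first batch.
  // Assumption: embedSentence returns FastEmbed's async generator of batches.
  let vector = null;
  for await (const batch of embeddingGen) {
    vector = Array.from(batch[0]);
    break;
  }
  // Store in LanceDB with metadata
  const item = { id, text, vector, file: filepath };
  return await addItem(item);
}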
2. Library Refresh Process
// Automatic knowledge base updates
async function refreshLibrary() {
  // Load markdown files from shelf directory
  const latestKnowledge = await loadShelfData();
  // Process each document
  for (const item of latestKnowledge) {
    // Assumption: the source file path doubles as a stable document ID
    const thisId = item.file;
    await addDocument(item.description, thisId, item.file);
  }
}
3. Query Processing
// Vector similarity search
async function searchRelevantDocs(query, limit = 5, distanceThreshold = null) {
  // Convert query to vector
  const queryVector = await embedSentence(fastEmbedModel, [query]);
  // Search similar documents
  return await searchSimilar(queryVector, limit, 'embeddings', distanceThreshold);
}
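To make the roles of limit and distanceThreshold concrete, here is a brute-force, in-memory sketch of the same kind of search. It assumes Euclidean (L2) distance and is not how LanceDB executes a query; searchInMemory and l2Distance are hypothetical names:

```javascript
// Euclidean distance between two equal-length numeric arrays.
function l2Distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Hypothetical stand-in for a vector search: score every row, drop rows
// beyond the threshold, sort nearest first, and keep at most `limit` rows.
function searchInMemory(rows, queryVector, limit = 5, distanceThreshold = null) {
  return rows
    .map(row => ({ ...row, distance: l2Distance(row.vector, queryVector) }))
    .filter(row => distanceThreshold === null || row.distance <= distanceThreshold)
    .sort((x, y) => x.distance - y.distance)
    .slice(0, limit);
}

const rows = [
  { id: 'a', vector: [1, 0] },
  { id: 'b', vector: [0, 1] },
  { id: 'c', vector: [0.9, 0.1] },
];
const hits = searchInMemory(rows, [1, 0], 2, 0.5);
console.log(hits.map(h => h.id)); // [ 'a', 'c' ]
```

Note that a smaller distance means a closer match, so a threshold is an upper bound: row b sits about 1.41 away from the query and is filtered out before the limit is applied.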
Integration with AI System
Context Enhancement Pipeline
The RAG system integrates with ai-manager.js to enhance AI responses:
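const fs = require('fs');
const path = require('path');

async function generateResponse(message, options = {}) {
  // Get up to 3 documents within a distance of 0.5 from the query
  const relatedDocs = await searchRelevantDocs(message, 3, 0.5);
  // Build enhanced context from the full text of each source file
  const contextParts = [];
  for (const doc of relatedDocs) {
    const content = fs.readFileSync(doc.file, 'utf-8');
    const filename = path.basename(doc.file);
    contextParts.push(`[Source: ${filename}]\n${content}\n`);
  }
  const context = contextParts.join('\n');
  // Inject context into user query
  const contextMessage = `Context info:\n${context}\n\nUser Query:\n${message}`;
  // contextMessage is then passed to the model (not shown in this excerpt)
}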
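An indexed file may have been moved or deleted by the time it is read, and a search may return nothing. The sketch below shows one way the context step could tolerate both cases. It assumes each search result carries a file path, as in the schema; buildContextMessage and its injectable readFile parameter are hypothetical:

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper. `readFile` is injectable so the function can be
// exercised without touching the disk.
function buildContextMessage(
  docs,
  message,
  readFile = file => fs.readFileSync(file, 'utf-8')
) {
  const contextParts = [];
  for (const doc of docs) {
    try {
      const content = readFile(doc.file);
      contextParts.push(`[Source: ${path.basename(doc.file)}]\n${content}\n`);
    } catch (err) {
      // Skip documents that were indexed but can no longer be read.
      console.warn(`Skipping ${doc.file}: ${err.message}`);
    }
  }
  // With no usable context, pass the query through unchanged.
  if (contextParts.length === 0) return message;
  return `Context info:\n${contextParts.join('\n')}\n\nUser Query:\n${message}`;
}

// Usage with a fake reader that knows only one of the two files.
const fakeReader = file => {
  if (file.endsWith('refund.md')) return 'Refund steps';
  throw new Error('file not found');
};
const prompt = buildContextMessage(
  [{ file: 'shelf/refund.md' }, { file: 'shelf/missing.md' }],
  'How do I process a refund?',
  fakeReader
);
console.log(prompt);
```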
Similarity Metrics
Multi-Metric Evaluation
The system implements six similarity calculation methods in fastembed-handler.js:
- Cosine Similarity: Primary metric for document relevance
- Euclidean Distance: Geometric distance measurement
- Manhattan Distance: L1 distance calculation
- Dot Product Similarity: Vector multiplication scoring
- Jaccard Similarity: Binary vector comparison
- Enhanced Cosine Similarity: Threshold-based filtering (≥0.7)
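The six metrics can be written in a few lines each. The following are textbook definitions, not the code from fastembed-handler.js. Two details are assumptions: Jaccard treats a component as set when it is greater than zero, and the enhanced variant returns null for scores under the threshold.

```javascript
const dotProduct = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
const norm = a => Math.sqrt(dotProduct(a, a));

// Cosine of the angle between a and b; 0 when either vector is all zeros.
function cosineSimilarity(a, b) {
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dotProduct(a, b) / denom;
}

// Straight-line (L2) distance.
const euclideanDistance = (a, b) =>
  Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));

// Sum of absolute component differences (L1).
const manhattanDistance = (a, b) =>
  a.reduce((sum, x, i) => sum + Math.abs(x - b[i]), 0);

// Jaccard on binarized vectors (assumption: a component is "set" when > 0).
function jaccardSimilarity(a, b) {
  let intersection = 0;
  let union = 0;
  for (let i = 0; i < a.length; i++) {
    const inA = a[i] > 0;
    const inB = b[i] > 0;
    if (inA && inB) intersection++;
    if (inA || inB) union++;
  }
  return union === 0 ? 0 : intersection / union;
}

// Cosine with a cutoff (assumption: scores under the threshold yield null).
function thresholdedCosine(a, b, threshold = 0.7) {
  const score = cosineSimilarity(a, b);
  return score >= threshold ? score : null;
}

console.log(cosineSimilarity([1, 0], [1, 0]));  // 1
console.log(euclideanDistance([0, 0], [3, 4])); // 5
console.log(manhattanDistance([0, 0], [3, 4])); // 7
```

Similarities grow with relevance while distances shrink, so rankings built on Euclidean or Manhattan distance sort ascending and the others sort descending.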
Example Output
📊 Cosine Similarity ("delivery policy"):
Rank | Score | Document
----------------------------------------
1 | 0.9598 | Delivery Address Change SOP
2 | 0.9333 | Time Window Modification Rules
3 | 0.8749 | Customer Service Guidelines
Database Management
Retrieval Integration
The RAG system can trigger a database query when a retrieved document contains a "## Database Query" section:
// From database-handler.js integration
async function checkDatabaseRetrievalRelated(content) {
  if (content.includes("## Database Query")) {
    const sqlQuery = await extractSQLFromMarkdown(content);
    const returnData = await executeQuery(sqlQuery);
    return { isDBRelated: true, data: returnData };
  }
  // No query section: report that explicitly instead of returning undefined
  return { isDBRelated: false, data: null };
}
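The source does not show extractSQLFromMarkdown. A sketch of one possible approach follows, assuming the query sits in a fenced sql block somewhere after the "## Database Query" heading. The name extractSqlSketch and the restriction to SELECT statements are assumptions made here, not documented SmartAdmin behavior:

```javascript
// Built at runtime so this example nests cleanly inside a markdown file.
const FENCE = '`'.repeat(3);

// Hypothetical extractor: returns the SQL text, or null when no usable
// read-only query is found.
function extractSqlSketch(content) {
  const headingIndex = content.indexOf('## Database Query');
  if (headingIndex === -1) return null;

  const afterHeading = content.slice(headingIndex);
  const open = afterHeading.indexOf(FENCE + 'sql');
  if (open === -1) return null;

  // The query body starts on the line after the opening fence.
  const bodyStart = afterHeading.indexOf('\n', open) + 1;
  if (bodyStart === 0) return null;
  const close = afterHeading.indexOf(FENCE, bodyStart);
  if (close === -1) return null;

  const sql = afterHeading.slice(bodyStart, close).trim();
  // Assumption: only read-only statements should be run from documents.
  return /^select\b/i.test(sql) ? sql : null;
}

const doc = [
  '# Delivery Issues Report',
  '',
  '## Database Query',
  FENCE + 'sql',
  "SELECT id, status FROM orders WHERE status = 'delayed';",
  FENCE,
].join('\n');

console.log(extractSqlSketch(doc));
```

Because the statement comes from a file on disk and is executed as written, limiting it to read-only queries is a cheap safeguard against a stray or malicious edit to the shelf directory.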
Performance Features
Efficient Processing
- Lazy Loading: Models initialized on first use
- Caching: Vector embeddings cached for quick retrieval
- Batch Processing: Multiple documents processed simultaneously
- Distance Thresholds: Configurable relevance filtering
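Lazy loading and caching can both be expressed as small wrappers around existing functions. The sketch below shows the general pattern under the assumption that model loading and embedding are asynchronous; lazy and cachedEmbedder are hypothetical names, and the stub loader stands in for real model initialization:

```javascript
// Hypothetical: wraps a loader so it runs at most once, on first use.
// Concurrent callers share the same pending promise.
function lazy(loader) {
  let promise = null;
  return () => {
    if (promise === null) promise = Promise.resolve(loader());
    return promise;
  };
}

// Hypothetical: caches embeddings by input text so repeated queries
// skip the model. Failed computations are evicted so they can be retried.
function cachedEmbedder(embed) {
  const cache = new Map();
  return text => {
    if (!cache.has(text)) {
      const pending = Promise.resolve(embed(text));
      cache.set(text, pending);
      pending.catch(() => cache.delete(text));
    }
    return cache.get(text);
  };
}

// Usage with stubs in place of a real model.
let modelLoads = 0;
const getModel = lazy(async () => {
  modelLoads++;
  return { embed: text => [text.length] }; // stand-in for a real embedding
});

let embedCalls = 0;
const embed = cachedEmbedder(async text => {
  embedCalls++;
  const model = await getModel();
  return model.embed(text);
});

embed('delivery policy');
embed('delivery policy'); // served from the cache
embed('refund policy');
```

Storing the promise rather than the resolved value means two requests for the same text that arrive before the first finishes still trigger only one computation.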
Real-time Updates
- Auto-refresh: Library updates when new documents are added
- File Monitoring: Detects changes in shelf directory
- Incremental Updates: Only processes modified documents
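One common way to process only modified documents is to compare a content hash against the hash recorded on the previous run. The source does not say which mechanism SmartAdmin uses, so the following is a sketch of that general technique; planIncrementalUpdate and hashContent are hypothetical names:

```javascript
const crypto = require('crypto');

// SHA-256 of the file content, as a hex string.
const hashContent = text =>
  crypto.createHash('sha256').update(text, 'utf8').digest('hex');

// files: Array<{ file, content }>, previousHashes: Map<file, hash>.
// Returns which files need re-embedding, which were deleted, and the
// hash map to persist for the next run.
function planIncrementalUpdate(files, previousHashes) {
  const changed = [];
  const nextHashes = new Map();
  for (const { file, content } of files) {
    const hash = hashContent(content);
    nextHashes.set(file, hash);
    // New files have no previous hash, so they count as changed too.
    if (previousHashes.get(file) !== hash) changed.push(file);
  }
  const removed = [...previousHashes.keys()].filter(file => !nextHashes.has(file));
  return { changed, removed, nextHashes };
}

const firstRun = planIncrementalUpdate(
  [
    { file: 'refund.md', content: 'Refund steps v1' },
    { file: 'delivery.md', content: 'Delivery rules' },
  ],
  new Map()
);
const secondRun = planIncrementalUpdate(
  [
    { file: 'refund.md', content: 'Refund steps v2' },
    { file: 'returns.md', content: 'Return rules' },
  ],
  firstRun.nextHashes
);
console.log(secondRun.changed, secondRun.removed);
```

Only the files in changed need new embeddings, and the entries in removed can be deleted from the vector store, which keeps refresh cost proportional to what actually changed.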
Configuration & Initialization
Startup Sequence
// From main.js
await initializeRAGSystem(); // Initialize FastEmbed + LanceDB
await refreshLibrary(); // Load knowledge base
Status Monitoring
// Get RAG system status
// `table` is the open LanceDB table handle (assumed to be module-level state)
async function getRAGStatus() {
  const allItems = await table.query().toArray();
  return {
    itemCount: allItems.length,
    itemIds: allItems.map(item => item.id)
  };
}
Use Cases
1. Policy Consultation
- Query: "What's the delivery address change policy?"
- RAG Response: Retrieves relevant SOP documents
- AI Output: Policy-aware response with specific procedures
2. Administrative Guidance
- Query: "How do I process a refund?"
- RAG Response: Finds refund processing guidelines
- AI Output: Step-by-step instructions based on documentation
3. Data-Driven Responses
- Query: "Show recent orders with delivery issues"
- RAG Response: Triggers database query via markdown instructions
- AI Output: Live data analysis with contextual insights
Benefits
Enhanced Accuracy
- Contextual Responses: AI answers grounded in actual documentation
- Reduced Hallucination: Factual information from verified sources
- Policy Compliance: Responses aligned with organizational procedures
Knowledge Management
- Centralized Repository: Single source of truth for procedures
- Version Control: Track document changes and updates
- Searchable Knowledge: Vector-based semantic search capabilities
Scalability
- Modular Design: Easy to add new knowledge domains
- Performance Optimization: Efficient vector operations
- Multi-Model Support: Compatible with different embedding models
With this RAG implementation, SmartAdmin goes beyond a general-purpose chatbot: it answers administrative questions from the organization's own documents and, where a document calls for it, from live database queries.