Deployment Guide
About this document
This page covers tested deployment scenarios for the Chat-with-RAG system, including Docker setup, production considerations, and security best practices.
Note: If you landed here directly (for example from documentation hosting or search), start with the repository README to see how to run the system locally and try the interactive demo.
Table of Contents
- Deployment Overview
- Docker Deployment (Supported)
- Production Considerations
- Security Best Practices
- Backup and Maintenance
Deployment Overview
The chat-with-rag system consists of several components:
Core Components
- Web Application (FastAPI backend + frontend)
- Vector Database (Qdrant)
- LLM Providers (OpenAI, Gemini, or custom)
- Optional External Services (weather, airports, web search)
Currently Supported Deployment Scenarios
- Local Development - Docker Compose on single machine ✅
- Single-Node Docker - All services on one host ✅
- Basic Production - Docker with environment variables ✅
Not Yet Tested/Documented
- Multi-Node deployments (separate app/database servers)
- Cloud platforms (AWS, Google Cloud, Azure)
- Load balancers (nginx, HAProxy)
- Container orchestration (Kubernetes)
- Managed cloud services
Note: The system is designed to be cloud-compatible, but specific cloud deployment guides are not yet tested or documented. Contributions welcome!
Docker Deployment (Supported)
Quick Start (Development)
# Clone and setup
git clone https://github.com/vrraj/chat-with-rag.git
cd chat-with-rag
# Run the bootstrap script
bash scripts/rag_setup.sh
Manual Docker Setup
# Create environment file
cp .env.example .env
# Edit .env with your API keys
# Start services
make start
# Seed sample data
make seed
Production Docker Compose
For production, create a docker-compose.prod.yml:
version: '3.8'

services:
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - GEMINI_API_KEY=${GEMINI_API_KEY}
      - QDRANT_HOST=qdrant
      - QDRANT_PORT=6333
    depends_on:
      - qdrant
    restart: unless-stopped
    volumes:
      - ./logs:/app/logs

  qdrant:
    image: qdrant/qdrant:v1.14.1
    ports:
      - "6333:6333"
    volumes:
      - qdrant_storage:/qdrant/storage
    restart: unless-stopped

volumes:
  qdrant_storage:
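With that file in place, the stack can be started and checked as follows (the file name matches the example above; project and container names depend on your environment):

```shell
# Start the production stack in the background
docker compose -f docker-compose.prod.yml up -d

# Confirm both services are running
docker compose -f docker-compose.prod.yml ps
```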
Production Considerations
Environment Variables
# Required
OPENAI_API_KEY=your_openai_key_here
GEMINI_API_KEY=your_gemini_key_here
# Optional
QDRANT_HOST=localhost
QDRANT_PORT=6333
DEBUG=false
LOG_LEVEL=INFO
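Before starting the stack, it can help to fail fast when a required key is missing. A minimal sketch, assuming bash and the variable names listed above (`require_var` is a hypothetical helper, not part of the project):

```shell
#!/usr/bin/env bash
# Fail fast if a required environment variable is unset or empty.
require_var() {
  if [ -z "${!1:-}" ]; then
    echo "Missing required variable: $1" >&2
    return 1
  fi
}

# usage, before `make start`:
#   require_var OPENAI_API_KEY && require_var GEMINI_API_KEY
```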
Resource Requirements
Minimum (Development/Small Dataset):
- CPU: 2 cores
- Memory: 4GB RAM
- Storage: 1GB (for Qdrant data + models)
Recommended (Production/Large Dataset):
- CPU: 4 cores
- Memory: 8GB RAM
- Storage: 10GB SSD (for large knowledge bases)
Storage Scaling:
- Small datasets (~100 documents): ~100MB storage
- Medium datasets (~1K documents): ~500MB storage
- Large datasets (~10K documents): ~2-5GB storage
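The figures above can be sanity-checked with a back-of-the-envelope estimate. A sketch assuming float32 vectors, and illustrative values of ~10 chunks per document and 1536-dimensional embeddings (both depend on your chunking strategy and embedding model):

```shell
# Rough vector-storage estimate: docs * chunks/doc * dimensions * 4 bytes (float32)
estimate_vector_bytes() {
  echo $(( $1 * $2 * $3 * 4 ))
}

estimate_vector_bytes 1000 10 1536   # ~61 MB of raw vectors for 1K documents
```

Payloads, indexes, and Qdrant's own overhead add to this, which is why the guideline above budgets ~500MB for 1K documents.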
Health Checks
# Check application health
curl http://localhost:8000/health
# Check Qdrant health
curl http://localhost:6333/healthz
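In scripts (for example a post-deploy check), a short retry loop avoids racing the containers at startup. A minimal sketch in bash; `wait_for` is a hypothetical helper, not part of the project:

```shell
# Retry a command up to N times, one second apart; succeed on the first success.
wait_for() {
  local retries=$1; shift
  local i
  for (( i = 1; i <= retries; i++ )); do
    "$@" && return 0
    sleep 1
  done
  return 1
}

# usage: wait_for 30 curl -fsS http://localhost:8000/health
```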
Security Best Practices
API Key Management
# Use environment variables, never commit keys
echo ".env" >> .gitignore
# In production, use secret management:
# - Docker secrets
# - Environment variable injection
# - Cloud secret managers (when supported)
Network Security
# Use Docker networks for internal communication
docker network create app-network
# Don't expose Qdrant to public internet
# Only expose the web application (port 8000)
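In a production compose file, this means dropping Qdrant's `ports:` mapping so it is reachable only on the internal Docker network. A hedged fragment (service and volume names follow the compose example above):

```yaml
services:
  qdrant:
    image: qdrant/qdrant:v1.14.1
    # no "ports:" section - the app reaches Qdrant by service name
    # over the shared Docker network (QDRANT_HOST=qdrant)
    volumes:
      - qdrant_storage:/qdrant/storage
    restart: unless-stopped
```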
SSL/TLS
For production, consider:
- Reverse proxy with SSL termination
- Let’s Encrypt certificates
- Internal network encryption
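A minimal reverse-proxy sketch, assuming nginx with certificates already issued (for example via Let's Encrypt's certbot); the server name and certificate paths are placeholders:

```nginx
server {
    listen 443 ssl;
    server_name example.com;  # placeholder - use your domain

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    location / {
        # Forward to the app container; only port 8000 is exposed internally
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```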
Backup and Maintenance
Data Backup
# Backup Qdrant data
docker exec qdrant-server tar -czf /tmp/backup.tar.gz -C /qdrant/storage .
docker cp qdrant-server:/tmp/backup.tar.gz ./backups/
# Backup configuration
tar -czf config_backup.tar.gz .env docker-compose.prod.yml
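Timestamped names make pruning old backups easier, and a restore is simply the reverse of the backup. A sketch assuming the same `qdrant-server` container name as above (`backup_name` is a hypothetical helper):

```shell
# Generate a timestamped backup filename, e.g. qdrant-backup-20240101-120000.tar.gz
backup_name() {
  printf 'qdrant-backup-%s.tar.gz' "$(date +%Y%m%d-%H%M%S)"
}

# usage:
#   f=$(backup_name)
#   docker exec qdrant-server tar -czf "/tmp/$f" -C /qdrant/storage .
#   docker cp "qdrant-server:/tmp/$f" ./backups/
# restore (stop the app first, then unpack into the storage directory):
#   docker cp "./backups/$f" qdrant-server:/tmp/
#   docker exec qdrant-server tar -xzf "/tmp/$f" -C /qdrant/storage
```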
Log Management
# Configure logging in production
LOG_LEVEL=INFO
LOG_FILE=/app/logs/app.log
# Rotate logs
logrotate -f /etc/logrotate.d/chat-with-rag
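The logrotate call above expects a config file for this app. A hedged example for `/etc/logrotate.d/chat-with-rag`, assuming the `LOG_FILE` path shown above (rotation frequency and retention are illustrative):

```
/app/logs/app.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
```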
Updates
# Update application
git pull origin main
docker compose down
docker compose build
docker compose up -d
# Re-seed if needed (note: overwrites document_index and document_index_gemini collections)
make seed
Future Deployment Options
The following deployment scenarios are architecturally possible but not yet documented:
- Cloud Platforms: AWS ECS, Google Cloud Run, Azure Container Instances
- Load Balancing: nginx, HAProxy, cloud load balancers
- Container Orchestration: Kubernetes, Docker Swarm
- Managed Services: Qdrant Cloud, managed databases
Contributions: If you successfully deploy to any of these platforms, please consider contributing deployment documentation!
Troubleshooting
Common Issues
- Qdrant connection failed
- Check if Qdrant is running: docker ps | grep qdrant
- Verify network connectivity
- API key errors
- Ensure keys are properly set in environment
- Check for typos in key names
- Memory issues
- Increase available RAM
- Monitor Qdrant memory usage
Debug Commands
# Check container logs (container names may vary; list them with `docker ps`)
docker logs chat-with-rag-app-1
docker logs chat-with-rag-qdrant-1
# Check resource usage
docker stats
# Test API endpoints
curl http://localhost:8000/api/config