Troubleshooting Guide
About this document
This page provides troubleshooting guidance for the Chat-with-RAG system, including common issues, error solutions, and debugging steps.
Note: If you landed here directly (for example from documentation hosting or search), start with the repository README to see how to run the system locally and try the interactive demo.
Table of Contents
- Environment Setup Issues
- Docker & Infrastructure Issues
- API & Connection Issues
- Embedding & Indexing Issues
- Chat & Retrieval Issues
- Performance Issues
- Configuration Issues
Environment Setup Issues
Python Environment Issues
Issue: ModuleNotFoundError or ImportError after installation
Solutions:
# Ensure virtual environment is activated
source venv/bin/activate # On Windows: venv\Scripts\activate
# Reinstall dependencies
pip install -r requirements.txt
# Verify installation
python -c "import backend.main; print('Import successful')"
Issue: SSL Certificate errors on macOS
Solution:
# Install certificates for Python
open "/Applications/Python 3.12/Install Certificates.command"
API Key Issues
Issue: 401 Unauthorized or Invalid API key errors
Solutions:
- Verify the API key in .env: cat .env | grep API_KEY
- Check API key format and permissions:
- OpenAI: Ensure key has appropriate permissions and budget limits
- Gemini: Verify quota limits in Google Cloud Console
- Test API key manually:
# OpenAI test
curl -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models
# Gemini test
curl -H "x-goog-api-key: $GEMINI_API_KEY" https://generativelanguage.googleapis.com/v1/models
Docker & Infrastructure Issues
Docker Desktop Not Running
Issue: Cannot connect to the Docker daemon
Solutions:
- Start Docker Desktop manually
- Verify Docker status:
docker --version
docker compose version
- On Linux, add your user to the docker group:
sudo usermod -aG docker $USER  # Log out and back in
Qdrant Connection Issues
Issue: Connection refused to Qdrant
Solutions:
- Check if the Qdrant container is running:
docker ps | grep qdrant
- Restart services:
make stop
make start
- Check Qdrant logs:
docker logs qdrant_container_name
Container Updates After Code Changes
Issue: Services not reflecting recent code changes after git pull
Solution: If you’ve pulled new changes, rebuild the containers to pick up service updates:
docker compose up --build
Port Conflicts
Issue: Port 8000 or 6333 already in use
Solutions:
- Find the process using the port:
lsof -i :8000  # or :6333 for Qdrant
- Kill the process or change ports in docker-compose.yml
API & Connection Issues
CORS Issues
Issue: CORS policy errors in browser
Solutions:
- Verify ALLOWED_ORIGINS in .env:
echo "ALLOWED_ORIGINS=http://localhost:8000,http://127.0.0.1:8000" >> .env
- Check origin enforcement in backend logs
SSE Streaming Issues
Issue: Streaming not working or connection drops
Solutions:
- Check if show_processing_steps is enabled
- Verify the browser supports SSE (most modern browsers do)
- Check network tab in browser dev tools for connection status
- Ensure no proxy/firewall blocking SSE connections
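When debugging drops, it also helps to confirm that the raw stream is well-formed SSE: events are separated by blank lines and payloads arrive on "data:" lines. The sketch below is a minimal, standalone parser for inspecting captured stream text; it is illustrative and not part of the backend:

```python
def parse_sse_data(raw: str) -> list[str]:
    """Extract the payload of each event's `data:` lines from raw SSE text.

    Events are separated by blank lines; multiple data lines within one
    event are joined with newlines, per the SSE format.
    """
    events, current = [], []
    for line in raw.splitlines():
        if line.startswith("data:"):
            current.append(line[5:].lstrip())
        elif line == "" and current:
            events.append("\n".join(current))
            current = []
    if current:  # stream ended without a trailing blank line
        events.append("\n".join(current))
    return events

print(parse_sse_data("data: step 1\n\ndata: step 2\n\n"))  # ['step 1', 'step 2']
```

If captured traffic parses into far fewer events than the backend logged, a proxy is likely buffering or truncating the stream.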
Timeout Issues
Issue: Requests timing out during processing
Solutions:
- Increase the timeout in the client:
requests.post(url, json=payload, timeout=120)  # Increase from the default 30s
- Check whether processing is actually slow (check logs)
- Consider reducing top_k or enabling estimate mode first
Embedding & Indexing Issues
Embedding Dimension Mismatch
Issue: Dimension mismatch errors during indexing or retrieval
Solutions:
- Check the embedding model configuration in backend/core/config.py
- Ensure all documents use the same embedding model
- Verify the domain configuration matches the collection:
# Check the active domain and corresponding model
from backend.core.config import settings
print(f"Domain: {settings.active_domain}")
print(f"Model: {settings.embedding_model_key}")
print(f"Collection: {settings.collection_name}")
- Clear and re-index if changing models:
python scripts/qdrant_scripts/qdrant_ops.py --delete-collection document_index
make seed
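A quick way to catch a mismatch before it surfaces as a Qdrant error is to compare vector lengths against the collection's expected dimension. The helper below is a standalone sketch, not project code; the expected dimension must come from your collection info (e.g. 1536 for OpenAI's text-embedding-3-small):

```python
def find_dim_mismatches(vectors: list[list[float]], expected_dim: int) -> list[int]:
    """Return the indices of vectors whose length differs from expected_dim."""
    return [i for i, v in enumerate(vectors) if len(v) != expected_dim]

# Second vector came from a different embedding model:
batch = [[0.1] * 1536, [0.2] * 768]
print(find_dim_mismatches(batch, 1536))  # [1]
```

Running this over a batch before upserting tells you exactly which documents were embedded with the wrong model.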
Gemini Embedding Normalization
Issue: Poor retrieval quality with Gemini embeddings
Solutions:
- Ensure gemini_embedding_normalize=true in the configuration
- Test normalization consistency:
python scripts/test_gemini_normalization.py
- Compare OpenAI vs Gemini embeddings:
python scripts/test_gemini_embed_retrieval.py
- Consider using OpenAI embeddings if issues persist
Rate Limiting
Issue: Rate limit exceeded from embedding providers
Solutions:
- Reduce embedding_batch_size in the config
- Add delays between batches
- Check provider rate limits and upgrade plan if needed
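Adding delays between batches is usually done with exponential backoff on failed attempts. The sketch below is a generic pattern, not the project's actual retry logic; embed_batch is a placeholder for your provider call:

```python
import time

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponentially growing delays (in seconds), capped to avoid long stalls."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

def embed_with_retry(embed_batch, batch, retries: int = 4):
    """Call embed_batch(batch), sleeping between attempts on failure."""
    delays = backoff_delays(retries)
    for attempt, delay in enumerate(delays):
        try:
            return embed_batch(batch)
        except Exception:
            if attempt == len(delays) - 1:
                raise  # out of retries; surface the provider error
            time.sleep(delay)

print(backoff_delays(4))  # [1.0, 2.0, 4.0, 8.0]
```

With the defaults this waits 1s, 2s, then 4s between attempts, which is enough to ride out most short rate-limit windows.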
Memory Issues During Indexing
Issue: Out of memory errors with large documents
Solutions:
- Reduce default_chunk_size in the configuration (try 300-500)
- Process documents in smaller batches
- Enable estimate mode first to preview costs
- Reduce embedding_batch_size if processing many documents
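Processing in smaller batches can be as simple as slicing the document list before embedding. A minimal, generic sketch (not the project's indexing code):

```python
from typing import Iterator

def batched(items: list, size: int) -> Iterator[list]:
    """Yield successive slices of at most `size` items, so only one
    batch is held for embedding at a time."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

docs = [f"doc-{i}" for i in range(5)]
print([len(b) for b in batched(docs, 2)])  # [2, 2, 1]
```

Pairing this with a reduced embedding_batch_size keeps peak memory proportional to one batch rather than the whole corpus.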
Chat & Retrieval Issues
No Results Found
Issue: Empty results from retrieval
Solutions:
- Lower score_threshold (try 0.2-0.3)
- Increase top_k (try 10-20)
- Check if documents are actually indexed:
python scripts/qdrant_scripts/qdrant_ops.py --list-titles --limit 10
- Verify the collection name matches the indexed data
- Check the active_domain configuration - ensure you're using the correct domain/collection
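To see why lowering the threshold matters: retrieval typically drops any hit scoring below score_threshold, so a threshold set too high returns nothing even when near-misses exist. A toy illustration of that filtering rule (the scores are made up; this is not the project's retrieval code):

```python
def apply_threshold(hits: list[tuple[str, float]], score_threshold: float):
    """Keep only hits whose similarity score meets the threshold."""
    return [(doc, score) for doc, score in hits if score >= score_threshold]

hits = [("glacier formation", 0.41), ("alpine flora", 0.28), ("tides", 0.12)]
print(apply_threshold(hits, 0.5))   # [] -- too strict, looks like "no results"
print(apply_threshold(hits, 0.25))  # the first two hits survive
```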
Poor Quality Results
Issue: Retrieved documents not relevant
Solutions:
- Enable query rewrite: "enable_query_rewrite": true
- Adjust rewrite_confidence_threshold (try 0.6-0.8)
- Check the document chunking strategy:
  - Reduce default_chunk_size (try 300-500)
  - Adjust default_chunk_overlap (try 50-100)
- Consider re-indexing with a different chunk size
- Enable reranking if not already enabled
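Chunk size and overlap interact: the overlap preserves context that would otherwise be cut at chunk boundaries. A simplified character-based chunker showing the effect (real chunking may be token-based; this is only a sketch):

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij'] -- each chunk repeats 2 chars of its neighbor
```

Smaller chunks give more precise matches; larger overlap reduces the chance that an answer is split across two chunks.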
Tool Calling Issues
Issue: Tools not being called or failing
Solutions:
- Ensure "use_tools": true in params
- Check tool configurations in the backend
- Verify API keys for external services (weather, airports)
- Check tool execution logs
- Verify max_tool_passes allows sufficient tool loops
Reasoning Model Issues
Issue: Reasoning models not working or showing reasoning
Solutions:
- Use correct model keys:
  - OpenAI: openai:reasoning_o3-mini or openai:reasoning_gpt-5-mini
  - Gemini: gemini:openai-3-flash-preview or gemini:openai-reasoning-2.5-flash
- Set correct reasoning parameters:
  - OpenAI: "reasoning_effort": "low" | "medium" | "high"
  - Gemini: "thinking_level": "minimal" | "low" | "medium" | "high"
- Enable debug_thoughts=true to see reasoning output
- Check the model registry for reasoning support
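Putting the two parameter styles together, a request must carry the reasoning key that matches the model's provider. The helper below is a hypothetical convenience, using only the key names listed above:

```python
def reasoning_params(model_key: str, level: str = "low") -> dict:
    """Return request params with the provider's reasoning knob:
    "reasoning_effort" for openai:* models, "thinking_level" for gemini:*."""
    if model_key.startswith("openai:"):
        return {"model": model_key, "reasoning_effort": level}
    if model_key.startswith("gemini:"):
        return {"model": model_key, "thinking_level": level}
    raise ValueError(f"Unknown provider prefix in {model_key!r}")

print(reasoning_params("openai:reasoning_o3-mini", "medium"))
# {'model': 'openai:reasoning_o3-mini', 'reasoning_effort': 'medium'}
```

Sending "reasoning_effort" to a Gemini model (or vice versa) is a common cause of the parameter being silently ignored.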
Query Rewrite Issues
Issue: Query rewrite not working or always rejected
Solutions:
- Ensure "enable_query_rewrite": true in params
- Check rewrite_confidence_threshold (default 0.7, try 0.6)
- Verify conversation history exists (rewrite needs context)
- Check the rewrite_tail_turns and rewrite_summary_turns settings
- Look for the rewrite display in the response to see why it was rejected
Session Management Issues
Issue: Session-based chat not maintaining context
Solutions:
- Ensure a consistent session_id across requests
- Check session manager logs for session creation/deletion
- Verify session TTL settings (chunk_manager_idle_ttl_seconds)
- Test with a simple session first to isolate the issue
Performance Issues
Slow Response Times
Issue: Chat responses taking too long
Solutions:
- Enable estimate mode to preview costs/time
- Reduce top_k and max_output_tokens
- Use faster models for non-critical stages
- Enable caching where possible
High Memory Usage
Issue: System running out of memory
Solutions:
- Reduce batch sizes in the configuration:
  - embedding_batch_size: try 50-100
  - default_chunk_size: try 300-500
- Process documents sequentially instead of in parallel
- Monitor memory usage during indexing
- Consider using smaller models for non-critical stages
Configuration Issues
Invalid Configuration
Issue: Application fails to start with config errors
Solutions:
- Check the configuration syntax:
python -c "from backend.core.config import settings; print('Config OK')"
- Validate environment variables in .env
- Check for typos in configuration keys
Domain/Collection Issues
Issue: No search results or poor results due to wrong domain/collection
Common Scenarios:
- Indexed data with OpenAI but searching with Gemini (or vice versa)
- active_domain doesn't match the collection that contains your data
- Collection exists but has the wrong embedding dimensions
Solutions:
- Check the current domain configuration:
from backend.core.config import settings
print(f"Active domain: {settings.active_domain}")
print(f"Collection: {settings.collection_name}")
print(f"Embedding model: {settings.embedding_model_key}")
- Verify domain mappings:
from backend.core.config import settings
for domain, config in settings.DOMAIN_EMBEDDING_CONFIG.items():
    print(f"{domain}: {config['collection_name']} -> {config['embedding_model_key']}")
- Check what's actually indexed:
python scripts/qdrant_scripts/qdrant_ops.py --list-collections
python scripts/qdrant_scripts/qdrant_ops.py --count-chunks --collection document_index
python scripts/qdrant_scripts/qdrant_ops.py --count-chunks --collection document_index_gemini
- Switch domain if needed:
  - Mountains domain: uses OpenAI embeddings (document_index)
  - Oceans domain: uses Gemini embeddings (document_index_gemini)
  - Change active_domain in backend/core/config.py or set the ACTIVE_DOMAIN environment variable
- Re-index with the correct model if domain and collection mismatch:
# Switch to the correct domain first, then re-seed
export ACTIVE_DOMAIN=oceans  # or mountains
make seed
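A small consistency check can catch the mismatch scenarios above before a costly re-index. The mapping below mirrors the shape of DOMAIN_EMBEDDING_CONFIG shown earlier; treat the concrete values as assumptions about this deployment:

```python
# Assumed shape, mirroring the DOMAIN_EMBEDDING_CONFIG structure above
DOMAIN_CONFIG = {
    "mountains": {"collection_name": "document_index",
                  "embedding_model_key": "openai"},
    "oceans":    {"collection_name": "document_index_gemini",
                  "embedding_model_key": "gemini"},
}

def check_domain(active_domain: str, actual_collection: str) -> str:
    """Report whether the active domain points at the collection being queried."""
    expected = DOMAIN_CONFIG[active_domain]["collection_name"]
    if expected == actual_collection:
        return f"OK: {active_domain} -> {actual_collection}"
    return (f"MISMATCH: domain {active_domain!r} expects {expected!r} "
            f"but you are querying {actual_collection!r}")

print(check_domain("oceans", "document_index"))  # flags the mismatch
```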
Getting Help
Debug Mode
Enable debug logging for more detailed information:
# Add to .env
DEBUG_VERBOSE=true
DEBUG_LOG_KEYS=true
PROMPT_REGISTRY_LOG_FULL=1
Log Locations
- Application logs: Console output or configured log file
- Docker logs: docker logs <container_name>
- Qdrant logs: docker logs qdrant_container
Common Debug Commands
# Check indexed documents
python scripts/qdrant_scripts/qdrant_ops.py --list-titles --limit 10
# Test API connections
python scripts/api_smoke_test_openai.py
python scripts/api_smoke_test_gemini.py
# Check collections
python scripts/qdrant_scripts/qdrant_ops.py --list-collections
# Test embedding generation
python scripts/embedding_compare.py
# Test Gemini embeddings specifically
python scripts/test_gemini_client_embeddings.py
# Test reasoning models
python scripts/test_gemini_reasoning.py
# Test chunked history manager
python scripts/test_chunked_history_manager.py
# Check Qdrant collection info
python scripts/qdrant_scripts/qdrant_ops.py --collection-info document_index
When to Ask for Help
If you’ve tried the above solutions and still have issues:
- Check the GitHub Issues for similar problems
- Create a new issue with:
- Error messages (full stack traces)
- Configuration details (remove sensitive info)
- Steps to reproduce
- System information (OS, Docker version, etc.)
Prevention Tips
- Always test with estimate mode first before large indexing operations
- Monitor API usage and costs regularly
- Keep backups of important configurations
- Document custom configurations for your specific use case
- Regular updates to stay current with fixes and improvements