Troubleshooting Guide
About this document
This page provides troubleshooting guidance for the Chat-with-RAG system, including common issues, error solutions, and debugging steps.
Note: If you landed here directly (for example from documentation hosting or search), start with the repository README to see how to run the system locally and try the interactive demo.
Table of Contents
- Environment Setup Issues
- Docker & Infrastructure Issues
- API & Connection Issues
- Embedding & Indexing Issues
- Chat & Retrieval Issues
- Performance Issues
- Configuration Issues
Environment Setup Issues
Python Environment Issues
Issue: ModuleNotFoundError or ImportError after installation
Solutions:
# Ensure virtual environment is activated
source venv/bin/activate # On Windows: venv\Scripts\activate
# Reinstall dependencies
pip install -r requirements.txt
# Verify installation
python -c "import backend.main; print('Import successful')"
Issue: SSL Certificate errors on macOS
Solution:
# Install certificates for Python
open "/Applications/Python 3.12/Install Certificates.command"
API Key Issues
Issue: 401 Unauthorized or Invalid API key errors
Solutions:
- Verify the API key in .env: cat .env | grep API_KEY
- Check API key format and permissions:
- OpenAI: Ensure key has appropriate permissions and budget limits
- Gemini: Verify quota limits in Google Cloud Console
- Test API key manually:
# OpenAI test
curl -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models
# Gemini test
curl -H "x-goog-api-key: $GEMINI_API_KEY" https://generativelanguage.googleapis.com/v1/models
Docker & Infrastructure Issues
Docker Desktop Not Running
Issue: Cannot connect to the Docker daemon
Solutions:
- Start Docker Desktop manually
- Verify Docker status:
docker --version
docker compose version
- On Linux, add your user to the docker group:
sudo usermod -aG docker $USER  # Log out and back in
Qdrant Connection Issues
Issue: Connection refused to Qdrant
Solutions:
- Check if the Qdrant container is running:
docker ps | grep qdrant
- Restart services:
make stop
make start
- Check Qdrant logs:
docker logs qdrant_container_name
Container Updates After Code Changes
Issue: Services not reflecting recent code changes after git pull
Solution: If you’ve pulled new changes, rebuild the containers to pick up service updates:
docker compose up --build
Port Conflicts
Issue: Port 8000 or 6333 already in use
Solutions:
- Find the process using the port:
lsof -i :8000  # or :6333 for Qdrant
- Kill the process or change ports in docker-compose.yml
API & Connection Issues
CORS Issues
Issue: CORS policy errors in browser
Solutions:
- Verify ALLOWED_ORIGINS in .env:
echo "ALLOWED_ORIGINS=http://localhost:8000,http://127.0.0.1:8000" >> .env
- Check origin enforcement in backend logs
SSE Streaming Issues
Issue: Streaming not working or connection drops
Solutions:
- Check if show_processing_steps is enabled
- Verify the browser supports SSE (most modern browsers do)
- Check network tab in browser dev tools for connection status
- Ensure no proxy/firewall blocking SSE connections
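When debugging drops, it also helps to confirm that the raw stream is well-formed SSE: events are separated by blank lines and payloads arrive on "data:" lines. The sketch below is a minimal, standalone parser for inspecting captured stream text; it is illustrative and not part of the backend:

```python
def parse_sse_data(raw: str) -> list[str]:
    """Extract the payload of each event's `data:` lines from raw SSE text.

    Events are separated by blank lines; multiple data lines within one
    event are joined with newlines, per the SSE format.
    """
    events, current = [], []
    for line in raw.splitlines():
        if line.startswith("data:"):
            current.append(line[5:].lstrip())
        elif line == "" and current:
            events.append("\n".join(current))
            current = []
    if current:  # stream ended without a trailing blank line
        events.append("\n".join(current))
    return events

print(parse_sse_data("data: step 1\n\ndata: step 2\n\n"))  # ['step 1', 'step 2']
```

If captured traffic parses into far fewer events than the backend logged, a proxy is likely buffering or truncating the stream.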
Timeout Issues
Issue: Requests timing out during processing
Solutions:
- Increase the timeout in the client:
requests.post(url, json=payload, timeout=120)  # Increase from the default 30s
- Check whether processing is actually slow (check logs)
- Consider reducing top_k or enabling estimate mode first
Embedding & Indexing Issues
Embedding Dimension Mismatch
Issue: Dimension mismatch errors during indexing or retrieval
Solutions:
- Check the embedding model configuration in backend/core/config.py
- Ensure all documents use the same embedding model
- Verify the domain configuration matches the collection:
# Check the active domain and corresponding model
from backend.core.config import settings
print(f"Domain: {settings.active_domain}")
print(f"Model: {settings.embedding_model_key}")
print(f"Collection: {settings.collection_name}")
- Clear and re-index if changing models:
python scripts/qdrant_scripts/qdrant_ops.py --delete-collection document_index
make seed
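A quick way to catch a mismatch before it surfaces as a Qdrant error is to compare vector lengths against the collection's expected dimension. The helper below is a standalone sketch, not project code; the expected dimension must come from your collection info (e.g. 1536 for OpenAI's text-embedding-3-small):

```python
def find_dim_mismatches(vectors: list[list[float]], expected_dim: int) -> list[int]:
    """Return the indices of vectors whose length differs from expected_dim."""
    return [i for i, v in enumerate(vectors) if len(v) != expected_dim]

# Second vector came from a different embedding model:
batch = [[0.1] * 1536, [0.2] * 768]
print(find_dim_mismatches(batch, 1536))  # [1]
```

Running this over a batch before upserting tells you exactly which documents were embedded with the wrong model.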
Gemini Embedding Normalization
Issue: Poor retrieval quality with Gemini embeddings
Solutions:
- Ensure gemini_embedding_normalize=true in the configuration
- Test normalization consistency:
python scripts/test_gemini_normalization.py
- Compare OpenAI vs Gemini embeddings:
python scripts/test_gemini_embed_retrieval.py
- Consider using OpenAI embeddings if issues persist
Rate Limiting
Issue: Rate limit exceeded from embedding providers
Solutions:
- Reduce embedding_batch_size in the config
- Add delays between batches
- Check provider rate limits and upgrade plan if needed
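Adding delays between batches is usually done with exponential backoff on failed attempts. The sketch below is a generic pattern, not the project's actual retry logic; embed_batch is a placeholder for your provider call:

```python
import time

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponentially growing delays (in seconds), capped to avoid long stalls."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

def embed_with_retry(embed_batch, batch, retries: int = 4):
    """Call embed_batch(batch), sleeping between attempts on failure."""
    delays = backoff_delays(retries)
    for attempt, delay in enumerate(delays):
        try:
            return embed_batch(batch)
        except Exception:
            if attempt == len(delays) - 1:
                raise  # out of retries; surface the provider error
            time.sleep(delay)

print(backoff_delays(4))  # [1.0, 2.0, 4.0, 8.0]
```

With the defaults this waits 1s, 2s, then 4s between attempts, which is enough to ride out most short rate-limit windows.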
Memory Issues During Indexing
Issue: Out of memory errors with large documents
Solutions:
- Reduce default_chunk_size in the configuration (try 300-500)
- Process documents in smaller batches
- Enable estimate mode first to preview costs
- Reduce embedding_batch_size if processing many documents
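Processing in smaller batches can be as simple as slicing the document list before embedding. A minimal, generic sketch (not the project's indexing code):

```python
from typing import Iterator

def batched(items: list, size: int) -> Iterator[list]:
    """Yield successive slices of at most `size` items, so only one
    batch is held for embedding at a time."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

docs = [f"doc-{i}" for i in range(5)]
print([len(b) for b in batched(docs, 2)])  # [2, 2, 1]
```

Pairing this with a reduced embedding_batch_size keeps peak memory proportional to one batch rather than the whole corpus.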
Chat & Retrieval Issues
No Results Found
Issue: Empty results from retrieval
Solutions:
- Lower score_threshold (try 0.2-0.3)
- Increase top_k (try 10-20)
- Check if documents are actually indexed:
python scripts/qdrant_scripts/qdrant_ops.py --list-titles --limit 10
- Verify the collection name matches the indexed data
- Check the active_domain configuration - ensure you're using the correct domain/collection
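To see why lowering the threshold matters: retrieval typically drops any hit scoring below score_threshold, so a threshold set too high returns nothing even when near-misses exist. A toy illustration of that filtering rule (the scores are made up; this is not the project's retrieval code):

```python
def apply_threshold(hits: list[tuple[str, float]], score_threshold: float):
    """Keep only hits whose similarity score meets the threshold."""
    return [(doc, score) for doc, score in hits if score >= score_threshold]

hits = [("glacier formation", 0.41), ("alpine flora", 0.28), ("tides", 0.12)]
print(apply_threshold(hits, 0.5))   # [] -- too strict, looks like "no results"
print(apply_threshold(hits, 0.25))  # the first two hits survive
```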
Poor Quality Results
Issue: Retrieved documents not relevant
Solutions:
- Enable query rewrite: "enable_query_rewrite": true
- Adjust rewrite_confidence_threshold (try 0.6-0.8)
- Check the document chunking strategy:
  - Reduce default_chunk_size (try 300-500)
  - Adjust default_chunk_overlap (try 50-100)
- Consider re-indexing with a different chunk size
- Enable reranking if not already enabled
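Chunk size and overlap interact: the overlap preserves context that would otherwise be cut at chunk boundaries. A simplified character-based chunker showing the effect (real chunking may be token-based; this is only a sketch):

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij'] -- each chunk repeats 2 chars of its neighbor
```

Smaller chunks give more precise matches; larger overlap reduces the chance that an answer is split across two chunks.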
Tool Calling Issues
Issue: Tools not being called or failing
Solutions:
- Ensure "use_tools": true in params
- Check tool configurations in the backend
- Verify API keys for external services (weather, airports)
- Check tool execution logs
- Verify max_tool_passes allows sufficient tool loops
Reasoning Model Issues
Issue: Reasoning models not working or showing reasoning
Solutions:
- Use correct model keys:
  - OpenAI: openai:reasoning_o3-mini or openai:reasoning_gpt-5-mini
  - Gemini: gemini:openai-3-flash-preview or gemini:openai-reasoning-2.5-flash
- Set correct reasoning parameters:
  - OpenAI: "reasoning_effort": "low" | "medium" | "high"
  - Gemini: "thinking_level": "minimal" | "low" | "medium" | "high"
- Enable debug_thoughts=true to see reasoning output
- Check the model registry for reasoning support
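Putting the two parameter styles together, a request must carry the reasoning key that matches the model's provider. The helper below is a hypothetical convenience, using only the key names listed above:

```python
def reasoning_params(model_key: str, level: str = "low") -> dict:
    """Return request params with the provider's reasoning knob:
    "reasoning_effort" for openai:* models, "thinking_level" for gemini:*."""
    if model_key.startswith("openai:"):
        return {"model": model_key, "reasoning_effort": level}
    if model_key.startswith("gemini:"):
        return {"model": model_key, "thinking_level": level}
    raise ValueError(f"Unknown provider prefix in {model_key!r}")

print(reasoning_params("openai:reasoning_o3-mini", "medium"))
# {'model': 'openai:reasoning_o3-mini', 'reasoning_effort': 'medium'}
```

Sending "reasoning_effort" to a Gemini model (or vice versa) is a common cause of the parameter being silently ignored.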
Query Rewrite Issues
Issue: Query rewrite not working or always rejected
Solutions:
- Ensure "enable_query_rewrite": true in params
- Check rewrite_confidence_threshold (default 0.7, try 0.6)
- Verify conversation history exists (rewrite needs context)
- Check the rewrite_tail_turns and rewrite_summary_turns settings
- Look for the rewrite display in the response to see why it was rejected
Session Management Issues
Issue: Session-based chat not maintaining context
Solutions:
- Ensure a consistent session_id across requests
- Check session manager logs for session creation/deletion
- Verify session TTL settings (chunk_manager_idle_ttl_seconds)
- Test with a simple session first to isolate the issue
Performance Issues
Slow Response Times
Issue: Chat responses taking too long
Solutions:
- Enable estimate mode to preview costs/time
- Reduce top_k and max_output_tokens
- Use faster models for non-critical stages
- Enable caching where possible
High Memory Usage
Issue: System running out of memory
Solutions:
- Reduce batch sizes in the configuration:
  - embedding_batch_size: try 50-100
  - default_chunk_size: try 300-500
- Process documents sequentially instead of in parallel
- Monitor memory usage during indexing
- Consider using smaller models for non-critical stages
Configuration Issues
Invalid Configuration
Issue: Application fails to start with config errors
Solutions:
- Check the configuration syntax:
python -c "from backend.core.config import settings; print('Config OK')"
- Validate environment variables in .env
- Check for typos in configuration keys
Domain/Collection Issues
Issue: No search results or poor results due to wrong domain/collection
Common Scenarios:
- Indexed data with OpenAI but searching with Gemini (or vice versa)
- active_domain doesn't match the collection that contains your data
- Collection exists but has the wrong embedding dimensions
Solutions:
- Check the current domain configuration:
from backend.core.config import settings
print(f"Active domain: {settings.active_domain}")
print(f"Collection: {settings.collection_name}")
print(f"Embedding model: {settings.embedding_model_key}")
- Verify domain mappings:
from backend.core.config import settings
for domain, config in settings.DOMAIN_EMBEDDING_CONFIG.items():
    print(f"{domain}: {config['collection_name']} -> {config['embedding_model_key']}")
- Check what's actually indexed:
python scripts/qdrant_scripts/qdrant_ops.py --list-collections
python scripts/qdrant_scripts/qdrant_ops.py --count-chunks --collection document_index
python scripts/qdrant_scripts/qdrant_ops.py --count-chunks --collection document_index_gemini
- Switch domain if needed:
  - Mountains domain: uses OpenAI embeddings (document_index)
  - Oceans domain: uses Gemini embeddings (document_index_gemini)
  - Change active_domain in backend/core/config.py or set the ACTIVE_DOMAIN environment variable
- Re-index with the correct model if domain and collection mismatch:
# Switch to the correct domain first, then re-seed
export ACTIVE_DOMAIN=oceans  # or mountains
make seed
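A small consistency check can catch the mismatch scenarios above before a costly re-index. The mapping below mirrors the shape of DOMAIN_EMBEDDING_CONFIG shown earlier; treat the concrete values as assumptions about this deployment:

```python
# Assumed shape, mirroring the DOMAIN_EMBEDDING_CONFIG structure above
DOMAIN_CONFIG = {
    "mountains": {"collection_name": "document_index",
                  "embedding_model_key": "openai"},
    "oceans":    {"collection_name": "document_index_gemini",
                  "embedding_model_key": "gemini"},
}

def check_domain(active_domain: str, actual_collection: str) -> str:
    """Report whether the active domain points at the collection being queried."""
    expected = DOMAIN_CONFIG[active_domain]["collection_name"]
    if expected == actual_collection:
        return f"OK: {active_domain} -> {actual_collection}"
    return (f"MISMATCH: domain {active_domain!r} expects {expected!r} "
            f"but you are querying {actual_collection!r}")

print(check_domain("oceans", "document_index"))  # flags the mismatch
```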
Getting Help
Debug Mode
Enable debug logging for more detailed information:
# Add to .env
DEBUG_VERBOSE=true
DEBUG_LOG_KEYS=true
PROMPT_REGISTRY_LOG_FULL=1
Log Locations
- Application logs: Console output or configured log file
- Docker logs: docker logs <container_name>
- Qdrant logs: docker logs qdrant_container
Common Debug Commands
# Check indexed documents
python scripts/qdrant_scripts/qdrant_ops.py --list-titles --limit 10
# Test API connections
python scripts/api_smoke_test_openai.py
python scripts/api_smoke_test_gemini.py
# Check collections
python scripts/qdrant_scripts/qdrant_ops.py --list-collections
# Test embedding generation
python scripts/embedding_compare.py
# Test Gemini embeddings specifically
python scripts/test_gemini_client_embeddings.py
# Test reasoning models
python scripts/test_gemini_reasoning.py
# Test chunked history manager
python scripts/test_chunked_history_manager.py
# Check Qdrant collection info
python scripts/qdrant_scripts/qdrant_ops.py --collection-info document_index
When to Ask for Help
If you’ve tried the above solutions and still have issues:
- Check the GitHub Issues for similar problems
- Create a new issue with:
- Error messages (full stack traces)
- Configuration details (remove sensitive info)
- Steps to reproduce
- System information (OS, Docker version, etc.)
Prevention Tips
- Always test with estimate mode first before large indexing operations
- Monitor API usage and costs regularly
- Keep backups of important configurations
- Document custom configurations for your specific use case
- Regular updates to stay current with fixes and improvements