Contextual RAG Chatbot with LlamaIndex, Ollama & PGVector

This project is an interactive chatbot built as part of an interview exercise, showcasing a modern Contextual RAG (Retrieval-Augmented Generation) pipeline.
It demonstrates document ingestion, vector storage, local LLM inference, dynamic routing, evaluation, and monitoring — all running locally using open-source tools.


Features

Document Ingestion & Storage

  • Supports PDFs, DOCX, and other document formats.
  • Stored in PostgreSQL (ragdb) with PGVector extension for embeddings.

Contextual RAG Pipeline

  • LlamaIndex used to build retrievers and query engines.
  • Ollama models (nomic-embed-text + llama3.2:3b) for embeddings and generation.
  • Dynamic Router: Chooses between RAG and direct LLM responses (for chit-chat / non-document queries).
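The routing decision can be sketched as a small classifier. The project itself builds the router with LlamaIndex; the keyword heuristic and function name below are illustrative assumptions, not the actual implementation:

```python
# Illustrative sketch of a dynamic router: decide whether a query should be
# answered via the RAG pipeline or sent directly to the LLM (chit-chat).
# The real project uses a LlamaIndex router; this heuristic is an assumption.

SMALL_TALK = {"hi", "hello", "hey", "thanks", "bye"}

def route_query(query: str) -> str:
    """Return 'llm' for chit-chat, 'rag' for document-grounded questions."""
    tokens = {t.strip("!?.,") for t in query.lower().split()}
    return "llm" if tokens & SMALL_TALK else "rag"

print(route_query("Hello there!"))                  # -> llm
print(route_query("What are the penalty levels?"))  # -> rag
```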

Interactive Query API

  • FastAPI backend with /chat/completions endpoint.
  • Compatible with Open WebUI for interactive chatbot usage.
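Because the endpoint follows the OpenAI chat-completions contract that Open WebUI expects, a response can be assembled with a small helper like the one below. This is a sketch of the response schema only; the helper name and any values beyond the standard fields are assumptions:

```python
import time
import uuid

def make_chat_completion(answer: str, model: str = "llama3.2:3b") -> dict:
    """Wrap a generated answer in the OpenAI-style chat-completions
    response schema that Open WebUI consumes from /chat/completions."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": answer},
                "finish_reason": "stop",
            }
        ],
    }

resp = make_chat_completion("There are three disciplinary levels.")
```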

Evaluation & Monitoring

  • RAGAs: Evaluates precision, recall, faithfulness, and answer relevance.
  • Arize Phoenix: Observability for prompts, RAG pipeline monitoring, and agent tracing.

Extensible Design

  • Ready for Crew.AI-based prompt optimization, rerankers, and agent orchestration.

Tech Stack

  • LLM & Embeddings: Ollama (nomic-embed-text, llama3.2:3b)
  • RAG Framework: LlamaIndex
  • Database: PostgreSQL with PGVector
  • Backend: FastAPI
  • Evaluation: RAGAs
  • Tracing & Monitoring: Arize Phoenix
  • Chatbot interface: Open WebUI

Getting Started

  1. Clone the repository

    git clone https://github.com/pmaske-aihub/rag-application.git
    cd rag-application
  2. Install Prerequisites

    ollama pull llama3.2:3b
    ollama pull nomic-embed-text

    Ensure PostgreSQL is installed, then create the database and enable the pgvector extension:

    CREATE DATABASE ragdb;
    \c ragdb
    CREATE EXTENSION IF NOT EXISTS vector;

    Ensure that Open WebUI is installed and running, either locally or via Docker Desktop. See How to install Open WebUI.

  3. Setup Environment

     python -m venv venv
     .\venv\Scripts\activate   # Windows
     source venv/bin/activate  # Linux/Mac
     pip install -r requirements.txt
  4. Run the FastAPI backend

    uvicorn src.api:app --host 0.0.0.0 --port 5601 --workers 4

Access the chatbot

In a web browser, open http://localhost:3000 to access the Open WebUI interface. Go to Admin Panel > Settings > Connections and add the locally running FastAPI app as a connection.


Create a new Workflow and select llama3.2:3b as the model.


Select New Chat and switch to the Custom RAG Pipeline.


For a quick test, use the Swagger UI at http://localhost:5601/docs.


Example Usage

POST /chat/completions

 {
   "model": "llama3.2:3b",
   "messages": [
     {
       "role": "user",
       "content": "Based on the penalties section, what are the different levels of disciplinary actions?"
     }
   ]
 }
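The same request can be issued from Python with the standard library. The URL matches the uvicorn command in Getting Started; start the backend before sending:

```python
import json
import urllib.request

# Build the OpenAI-style chat-completions request for the FastAPI backend.
payload = {
    "model": "llama3.2:3b",
    "messages": [
        {
            "role": "user",
            "content": "Based on the penalties section, what are the "
                       "different levels of disciplinary actions?",
        }
    ],
}

req = urllib.request.Request(
    "http://localhost:5601/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the backend running, send the request and print the answer:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```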

Evaluation (RAGAs)

python src/evaluate_rag.py

Outputs per-sample metrics: Context Precision / Recall, Faithfulness, Answer Relevancy.
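The per-sample scores can be rolled up into dataset-level averages with a few lines of Python. A minimal sketch, assuming the script yields one metric dict per sample; the metric keys follow RAGAs naming and the sample values below are made up for illustration:

```python
from statistics import mean

# Hypothetical per-sample RAGAs output; values are illustrative only.
per_sample = [
    {"context_precision": 0.9, "context_recall": 0.8,
     "faithfulness": 1.0, "answer_relevancy": 0.85},
    {"context_precision": 0.7, "context_recall": 0.9,
     "faithfulness": 0.9, "answer_relevancy": 0.80},
]

# Average each metric across samples for a quick dataset-level summary.
summary = {
    metric: round(mean(sample[metric] for sample in per_sample), 3)
    for metric in per_sample[0]
}
print(summary)
```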

Monitoring (Phoenix)

In a web browser, visit http://localhost:6006 to open the Phoenix dashboard, which shows query pipeline traces, latency breakdowns, and prompt optimizations.

Future Considerations

  • Crew.AI: Multi-agent prompt optimization.
  • Rerankers (e.g., Cohere / bge-reranker).
  • Docker deployment for portability.

Acknowledgements

This project was built as part of an interview technical exercise, showcasing an end-to-end RAG application with monitoring, evaluation, and extensibility in mind.
