This project is an interactive chatbot built as part of an interview exercise, showcasing a modern Contextual RAG (Retrieval-Augmented Generation) pipeline.
It demonstrates document ingestion, vector storage, local LLM inference, dynamic routing, evaluation, and monitoring — all running locally using open-source tools.
- Supports PDFs, DOCX, and other document formats.
- Stored in PostgreSQL (`ragdb`) with the PGVector extension for embeddings.
- LlamaIndex used to build retrievers and query engines.
- Ollama models (`nomic-embed-text` + `llama3.2:3b`) for embeddings and generation.
- Dynamic Router: Chooses between RAG and direct LLM responses (for chit-chat / non-document queries); a minimal sketch follows this list.
- FastAPI backend with a `/chat/completions` endpoint.
- Compatible with Open WebUI for interactive chatbot usage.
- RAGAs: Evaluates precision, recall, faithfulness, and answer relevance.
- Arize Phoenix: Observability for prompts, RAG pipeline monitoring, and agent tracing.
- Ready for Crew.AI-based prompt optimization, rerankers, and agent orchestration.
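
One way the routing step can be wired with LlamaIndex is a `RouterQueryEngine` that lets the LLM pick between a document-backed query engine and a plain chit-chat responder. The sketch below is illustrative only; the function and tool descriptions are assumptions, not the project's actual code:

```python
# Illustrative dynamic-routing sketch (assumed structure, not the project's
# actual implementation): an LLM selector picks the RAG engine for document
# questions and a plain chit-chat engine for everything else.
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool

def build_router(rag_query_engine, chitchat_query_engine):
    rag_tool = QueryEngineTool.from_defaults(
        query_engine=rag_query_engine,
        description="Answer questions grounded in the ingested documents.",
    )
    chitchat_tool = QueryEngineTool.from_defaults(
        query_engine=chitchat_query_engine,
        description="Handle greetings and small talk without retrieval.",
    )
    return RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[rag_tool, chitchat_tool],
    )
```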
- LLM & Embeddings: Ollama (`nomic-embed-text`, `llama3.2:3b`)
- RAG Framework: LlamaIndex
- Database: PostgreSQL with PGVector
- Backend: FastAPI
- Evaluation: RAGAs
- Tracing & Monitoring: Arize Phoenix
- Chatbot interface: Open WebUI
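
A minimal sketch of how these pieces fit together at ingestion time, assuming default local credentials, a placeholder data folder, and a placeholder table name (the project's actual scripts may wire this differently):

```python
# Illustrative wiring of Ollama + LlamaIndex + PGVector. Table name, user,
# password, and the "data" directory are placeholders, not the project's config.
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.postgres import PGVectorStore

embed_model = OllamaEmbedding(model_name="nomic-embed-text")
llm = Ollama(model="llama3.2:3b", request_timeout=120.0)

vector_store = PGVectorStore.from_params(
    database="ragdb",
    host="localhost",
    port="5432",
    user="postgres",          # placeholder credentials
    password="postgres",
    table_name="rag_chunks",  # placeholder table name
    embed_dim=768,            # nomic-embed-text produces 768-dim vectors
)

documents = SimpleDirectoryReader("data").load_data()  # PDFs, DOCX, ...
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=StorageContext.from_defaults(vector_store=vector_store),
    embed_model=embed_model,
)
query_engine = index.as_query_engine(llm=llm)
```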
- Clone the repository

  ```bash
  git clone https://github.com/pmaske-aihub/rag-application.git
  cd rag-application
  ```
- Install Prerequisites

  Pull the Ollama models:

  ```bash
  ollama pull llama3.2:3b
  ollama pull nomic-embed-text
  ```

  Ensure PostgreSQL is installed, then create the database and enable `pgvector`:

  ```sql
  CREATE DATABASE ragdb;
  \c ragdb
  CREATE EXTENSION IF NOT EXISTS vector;
  ```

  Ensure that Open WebUI is installed and running, either locally or via Docker Desktop. See How to install Open WebUI.
- Setup Environment

  ```bash
  python -m venv venv
  .\venv\Scripts\activate   # Windows
  source venv/bin/activate  # Linux/Mac
  pip install -r requirements.txt
  ```
- Run the FastAPI backend

  ```bash
  uvicorn src.api:app --host 0.0.0.0 --port 5601 --workers 4
  ```
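
The backend exposes an OpenAI-style `/chat/completions` route. The snippet below is only a simplified sketch of what such a handler looks like, not the project's actual `src/api.py`:

```python
# Simplified shape of an OpenAI-compatible chat endpoint (illustrative only;
# the real src/api.py wires the dynamic router / RAG pipeline in here).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    model: str
    messages: list[Message]

def run_pipeline(query: str) -> str:
    # Placeholder: the real app routes to the RAG query engine or straight to the LLM.
    return f"(answer to: {query})"

@app.post("/chat/completions")
def chat_completions(req: ChatRequest):
    answer = run_pipeline(req.messages[-1].content)
    return {
        "model": req.model,
        "choices": [
            {"index": 0, "message": {"role": "assistant", "content": answer}}
        ],
    }
```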
In the web browser, open http://localhost:3000 to reach the Open WebUI interface. Go to Admin Panel > Settings > Connections and add the locally running FastAPI app as a connection.
Create a new Workflow and select `llama3.2:3b` as the model.
Select New Chat and switch to the Custom RAG Pipeline.
For a quick test, use the Swagger UI at http://localhost:5601/docs.
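
The endpoint can also be exercised from Python (assuming the backend is running on port 5601 as started above):

```python
# Quick smoke test against the local backend (assumes port 5601 as configured above).
import requests

payload = {
    "model": "llama3.2:3b",
    "messages": [
        {
            "role": "user",
            "content": "Based on the penalties section, what are the different levels of disciplinary actions?",
        }
    ],
}
resp = requests.post("http://localhost:5601/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json())
```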
Example Usage
POST `/chat/completions`

```json
{
  "model": "llama3.2:3b",
  "messages": [
    {
      "role": "user",
      "content": "Based on the penalties section, what are the different levels of disciplinary actions?"
    }
  ]
}
```

To run the RAGAs evaluation:

```bash
python src/evaluate_rag.py
```

This outputs per-sample metrics: Context Precision / Recall, Faithfulness, Answer Relevancy.
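
For reference, a RAGAs run over a hand-built sample looks roughly like the sketch below; the actual `src/evaluate_rag.py` may build its dataset from the live pipeline and configure the judge models differently:

```python
# Rough shape of a RAGAs evaluation run (illustrative; the sample data below is
# made up, and the judge models point at local Ollama to keep evaluation offline).
from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, context_recall, faithfulness

samples = Dataset.from_dict({
    "question": ["What are the different levels of disciplinary actions?"],
    "answer": ["The policy lists a verbal warning, a written warning, and termination."],
    "contexts": [["Penalties section: verbal warning, written warning, termination."]],
    "ground_truth": ["Verbal warning, written warning, termination."],
})

result = evaluate(
    samples,
    metrics=[context_precision, context_recall, faithfulness, answer_relevancy],
    llm=ChatOllama(model="llama3.2:3b"),
    embeddings=OllamaEmbeddings(model="nomic-embed-text"),
)
print(result)              # aggregate score per metric
print(result.to_pandas())  # per-sample breakdown
```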
In the web browser, visit http://localhost:6006 to open the Phoenix dashboard, which shows query pipeline traces, latency breakdowns, and prompt optimizations.
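
Phoenix is typically started and wired into LlamaIndex from Python; a minimal setup sketch follows (the project's actual instrumentation may differ):

```python
# Start a local Phoenix instance (UI at http://localhost:6006) and instrument
# LlamaIndex so retrieval/LLM spans show up as traces.
# Assumes: pip install arize-phoenix openinference-instrumentation-llama-index
import phoenix as px
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from phoenix.otel import register

px.launch_app()                       # serve the Phoenix dashboard
tracer_provider = register()          # point OpenTelemetry exports at Phoenix
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)
```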
- Crew.AI: Multi-agent prompt optimization.
- Rerankers (e.g., Cohere / bge-reranker).
- Docker deployment for portability.
This project was built as part of a technical interview exercise, showcasing an end-to-end RAG application designed with monitoring, evaluation, and extensibility in mind.