Upload documents. Ask questions. Get AI-powered answers.
A full-stack document intelligence platform that transforms static documents into interactive knowledge bases using Retrieval-Augmented Generation (RAG). Users upload PDFs and text files, then query them through a natural-language chat interface powered by OpenAI's GPT-4.1-mini.
Key Features
- Intelligent Document Processing: Automatic text extraction, chunking, and semantic embedding generation
- Conversational AI Interface: Natural language queries with context-aware responses
- Event-Driven Architecture: Asynchronous processing pipeline using AWS EventBridge and SQS
- Vector Search: Fast semantic search over OpenAI embeddings stored in Pinecone
- Secure Authentication: User management and JWT validation via Clerk (see the verification sketch after this list)
- Real-time Chat: Persistent conversation history with document context
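Clerk issues the session JWT on the frontend; the backend only needs to verify it before serving user documents. Below is a minimal verification sketch in Python using PyJWT against a Clerk JWKS endpoint; the JWKS URL is a placeholder, and the repository's actual middleware (in the Go API and/or FastAPI service) may differ.

```python
import jwt  # PyJWT
from jwt import PyJWKClient

# Placeholder: replace with your Clerk instance's Frontend API domain.
JWKS_URL = "https://your-instance.clerk.accounts.dev/.well-known/jwks.json"
jwks = PyJWKClient(JWKS_URL)

def verify_session_token(token: str) -> dict:
    """Verify a Clerk-issued session JWT (RS256) and return its claims."""
    signing_key = jwks.get_signing_key_from_jwt(token)
    return jwt.decode(token, signing_key.key, algorithms=["RS256"])
```

Each API request would pass its Authorization bearer token through a check like this before any document or chat data is touched.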
Tech Highlights
Frontend: Next.js 14, shadcn/ui, TypeScript
Backend: Go (REST API), Python (FastAPI, async workers)
AI/ML: OpenAI GPT-4.1-mini, text-embedding-3-small
Infrastructure: Docker, LocalStack (S3, SQS, EventBridge), Pinecone, NeonDB
Architecture
Event-driven microservices architecture with async document processing; illustrative sketches of the main steps follow the list:
- Go API handles uploads and generates presigned S3 URLs
- EventBridge triggers SQS queue on document upload
- Python worker processes documents, generates embeddings, stores in Pinecone
- RAG service retrieves relevant chunks and generates contextual answers using GPT-4.1-mini
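For the upload step, the Go API returns a presigned S3 PUT URL so the browser uploads directly to S3 (LocalStack in development). To keep all examples in one language, the equivalent call is sketched in Python with boto3; the bucket, key, and endpoint are illustrative, not the repo's actual values.

```python
import boto3

# LocalStack's default edge endpoint; in production this would be real S3.
s3 = boto3.client("s3", endpoint_url="http://localhost:4566")

# Bucket and object key are placeholders.
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "documents", "Key": "user-123/report.pdf", "ContentType": "application/pdf"},
    ExpiresIn=900,  # URL valid for 15 minutes
)
```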
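Once EventBridge routes the S3 "object created" event to SQS, the Python worker picks it up, extracts text, chunks it, embeds each chunk, and upserts the vectors into Pinecone. The sketch below shows that flow under simplifying assumptions: plain-text files only (PDF extraction omitted), naive fixed-size chunking, and placeholder queue, bucket, and index names.

```python
import json
import boto3
from openai import OpenAI
from pinecone import Pinecone

QUEUE_URL = "http://localhost:4566/000000000000/document-events"  # placeholder
sqs = boto3.client("sqs", endpoint_url="http://localhost:4566")
s3 = boto3.client("s3", endpoint_url="http://localhost:4566")
openai_client = OpenAI()
index = Pinecone(api_key="...").Index("documents")  # index name is an assumption

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive fixed-size chunking with overlap between adjacent chunks."""
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def process(message: dict) -> None:
    # EventBridge wraps the S3 event details under "detail".
    detail = json.loads(message["Body"])["detail"]
    bucket, key = detail["bucket"]["name"], detail["object"]["key"]
    raw = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8", errors="ignore")
    chunks = chunk_text(raw)
    # One batched embeddings call for all chunks of the document.
    resp = openai_client.embeddings.create(model="text-embedding-3-small", input=chunks)
    index.upsert(vectors=[
        {"id": f"{key}-{i}", "values": item.embedding,
         "metadata": {"text": chunks[i], "source": key}}
        for i, item in enumerate(resp.data)
    ])

while True:
    received = sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20)
    for msg in received.get("Messages", []):
        process(msg)
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```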
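At query time, the RAG service embeds the question with the same embedding model, retrieves the top-matching chunks from Pinecone, and asks the chat model to answer using only that context. A minimal sketch, assuming the same placeholder index name and the GPT-4.1-mini model listed above:

```python
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
index = Pinecone(api_key="...").Index("documents")  # index name is an assumption

def answer(question: str, top_k: int = 5) -> str:
    # Embed the question with the same model used at ingest time.
    q_vec = client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    # Retrieve the most similar chunks and join them into a context block.
    matches = index.query(vector=q_vec, top_k=top_k, include_metadata=True).matches
    context = "\n\n".join(m.metadata["text"] for m in matches)
    # Ground the completion in the retrieved context.
    resp = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```

In the real service the persisted conversation history would also be passed in `messages`, so follow-up questions keep both chat and document context.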