A full-stack PDF document query system that lets users upload PDF documents and query their content using AI-powered semantic search. Built with a FastAPI backend and a React frontend.
- PDF Upload & Processing: Securely upload PDF documents for processing
- Semantic Search: Query documents using natural language
- AI-Powered Responses: Get accurate, contextual answers from your documents
- Vector Embeddings: Uses OpenAI embeddings to represent document chunks for semantic retrieval
- PostgreSQL with pgvector: Efficient similarity search capabilities
- Modern UI: Clean, responsive interface built with React and Tailwind CSS
- Backend: FastAPI with Python
- Frontend: React with TypeScript and Vite
- Database: PostgreSQL with pgvector extension
- AI/ML: OpenAI API for embeddings and text generation
- Styling: Tailwind CSS with custom design system
- Python 3.10+
- Node.js 18+
- PostgreSQL 14+ with pgvector extension
- OpenAI API key
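The semantic search feature listed above comes down to two steps: splitting a document into chunks and ranking those chunks by vector similarity to the query. The sketch below illustrates the idea with the standard library only. The function names, chunk sizes, and toy vectors are assumptions made for illustration; in this project the vectors would come from the OpenAI embeddings API and the ranking would be done by pgvector, and the actual chunking strategy may differ.

```python
import math


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (sizes are illustrative)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = sorted(
        ((cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)),
        reverse=True,
    )
    return [idx for _, idx in scored[:k]]
```

The overlap between adjacent chunks is a common way to avoid cutting a sentence in half at a chunk boundary; the right values depend on the documents and the embedding model.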
git clone <repository-url>
cd pdf-rag
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Copy the sample environment file:
cp sample.env .env
- Edit the .env file with your configuration:
# Database
DATABASE_URL=postgresql://username:password@localhost:5432/pdf_rag_db
# OpenAI
OPENAI_API_KEY=your_openai_api_key_here
# App Settings
ENVIRONMENT=development
LOG_LEVEL=INFO
- Create the PostgreSQL database:
CREATE DATABASE pdf_rag_db;
- Install the pgvector extension:
CREATE EXTENSION vector;
# From backend directory
python main.py
The backend will be available at http://localhost:8000
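With the vector extension installed in the database, nearest-neighbour lookups can be written in plain SQL using pgvector's cosine distance operator `<=>`. The sketch below shows one way the backend might build such a query. The table name `document_chunks`, its columns, and the helper names are hypothetical, and the `%s` placeholders assume a psycopg-style driver; the project's real schema may differ.

```python
def to_vector_literal(embedding: list[float]) -> str:
    """Format a list of floats as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Hypothetical schema: document_chunks(id, content, embedding vector).
# `<=>` is pgvector's cosine distance operator, so smaller means more similar.
NEAREST_CHUNKS_SQL = (
    "SELECT id, content "
    "FROM document_chunks "
    "ORDER BY embedding <=> %s::vector "
    "LIMIT %s"
)


def build_nearest_chunks_query(embedding: list[float], limit: int = 5) -> tuple[str, tuple[str, int]]:
    """Return the SQL text and its parameters, ready for cursor.execute(sql, params)."""
    return NEAREST_CHUNKS_SQL, (to_vector_literal(embedding), limit)
```

Passing the embedding as a bound parameter rather than interpolating it into the SQL string keeps the query safe from injection and lets the driver handle quoting.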
cd frontend
npm install
# or using bun
bun install
Create a .env file in the frontend directory:
VITE_API_BASE_URL=http://localhost:8000
npm run dev
# or using bun
bun dev
The frontend will be available at http://localhost:8080
You can also run the application using Docker:
# Build and run with docker-compose
docker-compose up --build
This will start both the backend and frontend services along with the database.
- API documentation: http://localhost:8000/docs
- Health check: http://localhost:8000/api/v1/health
- Statistics: http://localhost:8000/api/v1/stats
- Built with Vite for fast hot reloading
- Tailwind CSS for styling
- TypeScript for type safety
- React Query for API state management
- backend/app/main.py: FastAPI application entry point
- backend/app/routes/routes.py: API routes definition
- backend/app/controller/controller.py: Business logic controllers
- backend/app/services/db_interaction.py: Database operations
- frontend/src/App.tsx: Main application component
- frontend/src/pages/Index.tsx: Main page
- frontend/src/components/PDFUpload.tsx: PDF upload component
- POST /api/v1/ingest: Upload and process PDF documents
- POST /api/v1/query: Query documents with natural language
- GET /api/v1/health: Health check endpoint
- GET /api/v1/stats: System statistics
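As a rough illustration of calling the query endpoint from a script, the sketch below builds a JSON POST request with the standard library. The request body shape (a single `question` field) and the helper names are assumptions, since the payload schema is not documented here; check the interactive API documentation at /docs for the real field names.

```python
import json
import urllib.request

API_BASE_URL = "http://localhost:8000"  # default backend address from the setup steps


def build_query_request(question: str, base_url: str = API_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/query."""
    # The "question" field name is hypothetical; adjust it to the actual schema.
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (requires a running backend)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `send(build_query_request("What does section 2 cover?"))` would return the decoded response body.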
The frontend uses a custom design system with:
- Responsive layout
- Optimized for a dark theme
- Custom gradients and animations
- Accessible components built with Radix UI
cd backend
python -m pytest
cd frontend
npm test
- Set environment variables for production
- Configure PostgreSQL with proper security settings
- Set up reverse proxy (nginx recommended)
- Enable SSL/TLS certificates
- Configure CORS settings appropriately
- DATABASE_URL: PostgreSQL connection string
- OPENAI_API_KEY: OpenAI API key for embeddings
- ENVIRONMENT: Deployment environment (development/production)
- LOG_LEVEL: Logging level
- VITE_API_BASE_URL: Backend API base URL
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License.