🤖 Autonomous Research Agent

An intelligent AI agent that autonomously routes queries between web search and private knowledge bases, built with LangGraph's agentic workflow framework.

🎯 What Does This Do?

This project demonstrates an autonomous research agent that intelligently decides how to answer your questions:

Public Information → Searches the web using Tavily API
Private Information → Searches your local documents using RAG (Retrieval-Augmented Generation)

The agent uses a cyclic graph architecture (not a linear chain) to self-correct and re-assess retrieved data before generating final answers, making it more reliable and accurate than traditional chatbots.

🏗️ Technical Architecture

Core Components

Orchestration Framework: LangGraph (State Machine with Cyclic Graphs)
Language Model: Google Gemini 2.5 Flash
Vector Database: ChromaDB with Google Generative AI Embeddings (768-dimensional)
Web Search Tool: Tavily API (optimized for AI agents)
Memory System: RAG (Retrieval-Augmented Generation)

How It Works

User Query
    ↓
[Agent Node] → Analyzes intent
    ↓
[Router Logic] → Conditional edge decides:
    ├─→ Public info? → [Tavily Search Tool]
    └─→ Private info? → [RAG Retrieval Tool]
         ↓
    [Tool Node] → Executes search
         ↓
    [Agent Node] → Re-evaluates results (self-correction loop)
         ↓
    Final Answer

Key Feature: The agent can loop back to tools multiple times until it has sufficient information, enabling multi-step reasoning and verification.

🚀 Getting Started

Prerequisites

Python 3.11+
Google Gemini API Key
Tavily API Key

Installation

Clone the repository

git clone <your-repo-url>
cd "Research Agent"

Create and activate virtual environment

python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies

Option A: Using uv (Recommended - Fast & Modern)

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv pip install -r requirements.txt

Option B: Using pip (Traditional)

pip install -r requirements.txt

Set up environment variables

Create a .env file in the project root:

GEMINI_API_KEY=your_gemini_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here

Initial Setup & Testing

Test API connections
```
python test_gemini.py
python tools.py
```
This verifies your Gemini and Tavily API keys are working correctly.
Build the vector database
```
python database.py
```
This loads notes.txt and creates a ChromaDB vector database for RAG retrieval.
Run the agent
```
python agent.py
```

📁 Project Structure

Research Agent/
├── agent.py          # Main agent logic with LangGraph workflow
├── database.py       # Vector database setup and document ingestion
├── tools.py          # Tool definitions (Tavily search, RAG retrieval)
├── test_gemini.py    # API connection test
├── requirements.txt  # Python dependencies
├── notes.txt         # Your private knowledge base (customize this!)
├── chroma_db/        # Vector database storage (auto-generated)
├── .env              # API keys (not tracked in git)
└── README.MD         # This file

💡 Usage Examples

Example 1: Public Information Query

You: What's the latest news about AI?
🤖 Agent: [Searches web via Tavily and returns current information]

Example 2: Private Information Query

You: What's my project deadline?
🤖 Agent: [Searches your notes.txt via RAG and returns stored information]

Example 3: Mixed Query

You: Compare my favorite movie to current box office hits
🤖 Agent: [Uses RAG for your favorite, Tavily for current hits, then synthesizes]

🧠 Understanding LangGraph

Unlike traditional AI agents that run linearly, this project uses LangGraph to create a state machine with loops:

Traditional Agent (Linear)

User → LLM → Tool → Answer

LangGraph Agent (Cyclic)

User → [Agent Node] ⟷ [Tool Node] → Answer
           ↑____________↓
        (Self-correction loop)

State Management

The agent maintains conversation state using AgentState:

Tracks all messages (user questions, tool results, agent responses)
Uses add_messages reducer to append new messages without overwriting history
Enables multi-turn conversations with full context

🔧 Customization

Add Your Own Documents

Edit notes.txt with your private information
Run python database.py to rebuild the vector database
The agent will now search your custom knowledge base

Adjust Retrieval Settings

In agent.py, modify the retriever parameters:

retriever = db.as_retriever(search_kwargs={"k": 3})  # Return top 3 results

Change the LLM

Replace Gemini with another model:

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4")

🛠️ Key Dependencies

Package	Purpose
`langchain`	Core LLM framework
`langgraph`	State machine and agentic workflow orchestration
`langchain-google-genai`	Google Gemini integration
`tavily-python`	Web search optimized for AI agents
`chromadb`	Local vector database for semantic search
`python-dotenv`	Secure API key management

🔍 How RAG Works

RAG (Retrieval-Augmented Generation) enables semantic search over your documents:

Embedding: Text is converted to 768-dimensional vectors
Storage: Vectors are stored in ChromaDB
Search: User queries are embedded and compared using cosine similarity
Retrieval: Most similar documents are returned to the LLM

Example: Searching for "food" will find "pizza" and "burger" even if you didn't type those exact words, because they live in the same semantic "neighborhood."

🎓 Learning Resources

This project demonstrates:

✅ Agentic AI workflows with LangGraph
✅ Tool-calling and function execution
✅ RAG implementation with vector databases
✅ Conditional routing and decision-making
✅ State management in AI applications
✅ Self-correcting AI systems

📝 License

MIT License - Feel free to use this project for learning and development.

🤝 Contributing

Contributions welcome! Feel free to:

Add new tools (e.g., calculator, database queries)
Improve the routing logic
Enhance the RAG retrieval quality
Add conversation memory persistence

Built with ❤️ using LangGraph and Google Gemini

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 Autonomous Research Agent

🎯 What Does This Do?

🏗️ Technical Architecture

Core Components

How It Works

🚀 Getting Started

Prerequisites

Installation

Initial Setup & Testing

📁 Project Structure

💡 Usage Examples

Example 1: Public Information Query

Example 2: Private Information Query

Example 3: Mixed Query

🧠 Understanding LangGraph

Traditional Agent (Linear)

LangGraph Agent (Cyclic)

State Management

🔧 Customization

Add Your Own Documents

Adjust Retrieval Settings

Change the LLM

🛠️ Key Dependencies

🔍 How RAG Works

🎓 Learning Resources

📝 License

🤝 Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.DS_Store		.DS_Store
.gitignore		.gitignore
README.MD		README.MD
agent.py		agent.py
database.py		database.py
notes.txt		notes.txt
requirements.txt		requirements.txt
test_gemini.py		test_gemini.py
tools.py		tools.py

Folders and files

Latest commit

History

Repository files navigation

🤖 Autonomous Research Agent

🎯 What Does This Do?

🏗️ Technical Architecture

Core Components

How It Works

🚀 Getting Started

Prerequisites

Installation

Initial Setup & Testing

📁 Project Structure

💡 Usage Examples

Example 1: Public Information Query

Example 2: Private Information Query

Example 3: Mixed Query

🧠 Understanding LangGraph

Traditional Agent (Linear)

LangGraph Agent (Cyclic)

State Management

🔧 Customization

Add Your Own Documents

Adjust Retrieval Settings

Change the LLM

🛠️ Key Dependencies

🔍 How RAG Works

🎓 Learning Resources

📝 License

🤝 Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages