kaantopcuw/Spring-Docs-Scraper

🍃 Spring Docs Scraper

All Spring documentation in single Markdown files — perfect for AI assistants like Claude, ChatGPT, and NotebookLM.

Ever wished you could just throw the entire Spring Framework documentation at an AI and ask questions? Now you can.

📦 Pre-built Documentation

Good news: You don't need to run anything. Just grab the files from the output/ folder:

Reference Docs

| File | Size | Description |
| --- | --- | --- |
| spring-boot.md | 4.5 MB | Spring Boot reference |
| spring-framework.md | 3.1 MB | Core Spring Framework |
| spring-security.md | 2.1 MB | Security reference |
| spring-integration.md | 2.8 MB | Integration patterns |
| spring-data-*.md | various | JPA, MongoDB, Redis, etc. |
| spring-cloud-*.md | various | Gateway, Config, Stream, etc. |

... and 35+ more

API Docs (Javadoc)

| File | Size | Description |
| --- | --- | --- |
| api/spring-boot-api.md | ~1 MB | Spring Boot packages, classes with descriptions & links |
| api/spring-framework-api.md | | Core Framework API |
| api/spring-security-api.md | | Security API |

...

API Docs Format:

  • Each package with description
  • All classes/interfaces/enums listed with clickable links
  • Full descriptions with references to other classes
  • An AI assistant with web access can follow the links to read the documentation for a specific class

Just download and drop into your AI context window. Done.

🛠️ Build It Yourself

Want the freshest docs? Build from source:

# Clone
git clone https://github.com/kaantopcuw/Spring-Docs-Scraper.git
cd Spring-Docs-Scraper

# Build
./mvnw clean package -DskipTests

# Scrape reference docs
java --enable-preview -jar target/spring-docs-scraper-1.0.0.jar --project spring-boot

# Scrape API docs (Javadoc)
java --enable-preview -jar target/spring-docs-scraper-1.0.0.jar --api spring-boot

# Scrape both
java --enable-preview -jar target/spring-docs-scraper-1.0.0.jar --project spring-boot --with-api

# Scrape everything
java --enable-preview -jar target/spring-docs-scraper-1.0.0.jar --all --with-api

# See all available projects
java --enable-preview -jar target/spring-docs-scraper-1.0.0.jar --list

⚙️ Options

Reference Docs

| Flag | Description |
| --- | --- |
| -p, --project <id> | Scrape one project |
| --projects <a,b,c> | Scrape multiple projects |
| -a, --all | Scrape all 41 projects |

API Docs (Javadoc)

| Flag | Description |
| --- | --- |
| --api <id> | Scrape API docs for one project |
| --api-all | Scrape API docs for all projects |
| --with-api | Also scrape API docs when using --project/--all |

Common Options

| Flag | Description |
| --- | --- |
| -o, --output <dir> | Output directory (default: ./output) |
| -l, --list | Show available projects (with API availability) |
| --concurrent <n> | Max parallel requests (default: 5) |
| --delay <ms> | Delay between requests in milliseconds (default: 300) |
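
To illustrate what --concurrent and --delay control, here is a minimal sketch of a throttled fetch loop built on virtual threads and a Semaphore. It is not taken from the project source: the class name ThrottledFetcher, its method names, and the choice to sleep while holding a permit are all assumptions made for illustration. The actual fetch is passed in as a function, so the sketch runs without network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical helper, not part of the project's code.
final class ThrottledFetcher {
    private final Semaphore permits;  // caps in-flight requests, like --concurrent
    private final long delayMillis;   // pause before each request, like --delay

    ThrottledFetcher(int maxConcurrent, long delayMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.delayMillis = delayMillis;
    }

    // Runs `fetch` for every URL on its own virtual thread; results keep input order.
    <T> List<T> fetchAll(List<String> urls, Function<String, T> fetch) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<T>> futures = new ArrayList<>();
            for (String url : urls) {
                futures.add(executor.submit(() -> {
                    permits.acquire();              // wait for a free slot
                    try {
                        Thread.sleep(delayMillis);  // be polite to the server
                        return fetch.apply(url);
                    } finally {
                        permits.release();
                    }
                }));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                try {
                    results.add(future.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("Fetch failed", e.getCause());
                }
            }
            return results;
        }
    }
}
```

With the documented defaults this would be constructed as new ThrottledFetcher(5, 300). Blocking a virtual thread in acquire() or sleep() is cheap, which is why one thread per URL is a reasonable shape here.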

📋 Supported Projects

Core: Spring Framework, Spring Boot

Security: Spring Security, Authorization Server, Session, LDAP, Vault

Data: JPA, MongoDB, Redis, Elasticsearch, Cassandra, Neo4j, REST, JDBC/R2DBC

Cloud: Gateway, Config, Netflix, OpenFeign, Stream, Kubernetes, Circuit Breaker, Commons, Contract, Function, Task, Consul, Zookeeper, Vault, Bus

Messaging: Integration, Batch, Kafka, AMQP, Pulsar

Other: GraphQL, AI, Shell, Modulith, CLI

Use --list to see which projects have API docs available (marked with ✅)

🔧 Requirements

  • Java 25+ (uses Virtual Threads)
  • Maven 3.9+ (or use the included wrapper)

💡 How It Works

Reference Docs

  1. Fetches sitemap.xml from each Spring project
  2. Filters to current/stable version only (no SNAPSHOTs)
  3. Downloads each HTML page using Virtual Threads
  4. Strips navigation, headers, footers
  5. Converts to clean Markdown
  6. Removes Kotlin/Groovy examples (Java only)
  7. Outputs a single file per project with Table of Contents
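
Steps 1 and 2 above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the class name SitemapFilter is hypothetical, and the sketch only drops URLs containing "SNAPSHOT". How the tool identifies the current version is not described here, so that part is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the project's code.
final class SitemapFilter {
    // Returns every <loc> URL in the sitemap that does not point at a SNAPSHOT build.
    static List<String> stableUrls(String sitemapXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(sitemapXml.getBytes(StandardCharsets.UTF_8)));
            NodeList locs = doc.getElementsByTagName("loc");
            List<String> urls = new ArrayList<>();
            for (int i = 0; i < locs.getLength(); i++) {
                String url = locs.item(i).getTextContent().trim();
                if (!url.contains("SNAPSHOT")) {
                    urls.add(url);
                }
            }
            return urls;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse sitemap", e);
        }
    }
}
```

The resulting list is what a downloader would then fetch page by page before the HTML-to-Markdown conversion.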

API Docs

  1. Fetches package index from Javadoc (allpackages-index.html)
  2. Downloads each package-summary.html in parallel
  3. Extracts package descriptions
  4. For each class/interface:
    • Name with direct link to Javadoc page
    • Full description with clickable references
  5. Outputs to output/api/{project}-api.md

Example output:

## org.springframework.boot

*Source: https://docs.spring.io/spring-boot/api/java/org/springframework/boot/package-summary.html *

Core Spring Boot classes.

### Types

- **[`SpringApplication`](https://...SpringApplication.html)** — Class that can be used to bootstrap and launch a Spring application from a Java main method.
- **[`CommandLineRunner`](https://...CommandLineRunner.html)** — Interface used to indicate that a bean should run when it is contained within a [`SpringApplication`](https://...SpringApplication.html).
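
A package section in the format above could be rendered with a small helper like the one below. This is a sketch only: ApiMarkdown, TypeEntry, and renderPackage are hypothetical names chosen for illustration, and the layout is inferred from the example output rather than from the project source.

```java
import java.util.List;

// Hypothetical helper, not part of the project's code.
final class ApiMarkdown {
    // One class, interface, or enum listed on a package-summary page.
    record TypeEntry(String name, String url, String description) {}

    // Builds the Markdown section for one package, mirroring the example output.
    static String renderPackage(String packageName, String sourceUrl,
                                String description, List<TypeEntry> types) {
        StringBuilder md = new StringBuilder();
        md.append("## ").append(packageName).append("\n\n");
        md.append("*Source: ").append(sourceUrl).append(" *\n\n");
        md.append(description).append("\n\n");
        md.append("### Types\n\n");
        for (TypeEntry type : types) {
            md.append("- **[`").append(type.name()).append("`](")
              .append(type.url()).append(")** ")
              .append('\u2014')  // em dash separator, as in the example output
              .append(' ').append(type.description()).append('\n');
        }
        return md.toString();
    }
}
```

Keeping the type name inside a link means each entry stays one line long while still pointing at the full Javadoc page.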

📄 License

MIT — do whatever you want with it.


Built for developers who want to ask AI about Spring without copy-pasting 500 docs pages.
