Awesome AI Models Matrix 🧠

License: CC BY-NC 4.0

Research-based list of AI models, development tools, and automation resources. Use it to compare releases, pricing, benchmarks, and deployment options from official sources.

Document Version: 3.0 Last Updated: 2026-05-04 21:38 UTC Repository: https://github.com/ReadyPixels/AI_Models_Matrix

Contents

Models 🧠

Comprehensive documentation of Large Language Models (LLMs), Small Language Models (SLMs), and specialized AI models available today.

Frontier Models 🚀

State-of-the-art proprietary AI models with cutting-edge capabilities from leading AI labs.

Model Company Context GPQA Diamond Arena Elo SWE-bench Verified AIME 2025 Pricing Verified
GPT-5.5 OpenAI 1M 93.2% — — — $5.00 / $30.00 2026-04-26
GPT-5.5 Pro OpenAI 1M 95.1% — 92.3% 98.5% $15.00 / $60.00 2026-04-24
Claude Opus 4.7 Anthropic 1M 94.2% — 87.6% ~95% $5 / $25 2026-04-26
Claude Sonnet 4.6 Anthropic 1M 89.9% ~1438 (Text) / 1523 (Code) 79.6% ~95% $3 / $15 2026-04-26
GPT-5.3-Codex OpenAI 400K 91.5% — 85.0% — $1.75 / $14.00 2026-04-26
Gemini 3.1 Pro Google 1M 94.3% 1494 (Text) / 1455 (Code) 80.6% 100% $2 / $12 2026-04-26
Gemini 3 Deep Think Google 1M+ ~97% — ~58% — Ultra subscription 2026-04-26
GLM-5 Zhipu AI 200K 82.0% ~1451 (Text) / 1445 (Code) 77.8% 92.7% $1.00 / $3.20 2026-04-26
GLM-5.1 Zhipu AI 200K — — ~80.4% (est.) — $1.00 / $3.20 2026-04-26
MiniMax-M2.5 MiniMax 200K 85.2% — 80.2% 86.3% $0.30 / $1.20 2026-04-26
Kimi K2.6 Moonshot AI 256K 90.5% — 80.2% 96.4% $0.60 / $3.00 2026-04-26
DeepSeek-V4 DeepSeek 1M — — — — $0.30 / $0.50 2026-04-26
DeepSeek-V3.2 DeepSeek 164K 87.1% — 67.8% 89.3% $0.28 / $0.42 2026-04-26
Qwen3.5-Max Alibaba 128K 89.3% — 76.4% 91.3% Pay-per-token 2026-04-26
Gemini 3 Pro Google 1M+ 91.9% 1486 (Text) / 1438 (Code) 76.2% 98–100% Tiered pricing 2026-04-26
Gemini 3 Flash Google 10M 90.4% 1474 (Text) / 1438 (Code) 78.0% — $0.30 / $2.50 2026-04-26
Gemini 3.1 Flash-Lite Google 1M 86.9% 1432 — — $0.25 / $1.50 2026-04-26
GPT-5.4 OpenAI 1M 92.0% 1484 (Text) / 1457 (Code) ~80% 88% $2.50 / $15.00 2026-04-26
GPT-5.4 mini OpenAI 400K 87.5% — — — $0.75 / $4.50 2026-04-26
GPT-5.4 nano OpenAI 400K — — — — $0.20 / $1.25 2026-04-26
Step-3.5-Flash StepFun 256K 83.1% — 74.4% 97.3% Pay-per-token 2026-04-26
Mistral Large 3 Mistral AI 128K 43.9% — — — $0.50 / $1.50 2026-04-26
Claude Sonnet 4.5 Anthropic 200K 83.4% — 77.2% 87% $3 / $15 2026-04-26
Llama 4 Scout Meta 10M 57.2% — — — Free (self-host) 2026-04-26
Llama 4 Maverick Meta 128K 69.8% — — — Free (self-host) 2026-04-26
Grok 4 xAI 128K ~91.5% ~1493 (Text) — 100% $3 / $15 2026-04-26
Grok 4 Fast xAI 128K — — — — $0.20 / $1.50 2026-04-26
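Pricing in the table reads input / output in USD per million tokens (e.g. "$2.50 / $15.00"). A minimal sketch of estimating per-request cost from those two numbers — the token counts and model below are illustrative, not benchmarks:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in: float, price_out: float) -> float:
    """Estimate the USD cost of one request.

    price_in / price_out are USD per 1M tokens, as listed in the
    Pricing column of the table above.
    """
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: a 20K-token prompt with a 2K-token reply on GPT-5.4
# ($2.50 in / $15.00 out per 1M tokens).
cost = request_cost(20_000, 2_000, 2.50, 15.00)
print(f"${cost:.4f}")  # $0.0800
```

Output tokens usually dominate for long generations, which is why the output price matters more than the headline input price for chat and coding workloads.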

Top Models by Category

Category #1 #2 #3
Coding Claude Opus 4.7 GPT-5.5 Pro GPT-5.5
Reasoning Gemini 3 Deep Think GPT-5.5 Pro Qwen3-Max-Thinking
Open Source DeepSeek-V4 Qwen3.5-Max Llama 4
Cost Efficiency DeepSeek-V3.2 Grok 4 Fast GLM-4.7-FlashX
Context Window Gemini 3 Flash (10M) Llama 4 Scout (10M) Claude Opus 4.6 (1M)

Model Specifications 📋

Detailed technical specifications, pricing, and capabilities for all frontier models. Data as of April 2026.

Output Token Limits

Maximum output tokens per single API request.

Model Max Output Context Window Notes
Claude Opus 4.6 128K (300K via beta) 1M Extended output via output-128k-2025-02-19 beta header
Claude Opus 4.7 128K (300K via beta) 1M Extended output via output-128k-2025-02-19 beta header
Claude Sonnet 4.6 64K 1M —
Claude Sonnet 4.5 64K 200K —
GPT-5.4 128K 1.05M —
GPT-5.4 mini 128K 400K —
GPT-5.4 nano 128K 400K —
GPT-5.3-Codex 128K 400K —
Gemini 3.1 Pro 64K 1M —
Gemini 3 Pro 64K 2M —
Gemini 3 Flash 64K 1M —
Gemini 3.1 Flash-Lite 32K 1M —
DeepSeek-V4 — 1M Not publicly specified
DeepSeek-V3.2 8K / 64K (reasoner) 128K Reasoner mode unlocks 64K output
Qwen3.5-Max 65K 1M —
GLM-5 128K 200K —
GLM-5.1 131K 200K —
MiniMax-M2.5 131K 1M —
Kimi K2.6 — 256K Not publicly specified
Step-3.5-Flash 66K 256K —
Grok 4 — 256K Not publicly specified
Grok 4 Fast 30K 2M —
Mistral Large 3 32K 128K —
Llama 4 Scout 16K 10M —
Llama 4 Maverick 16K 1M —
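The extended-output beta noted for the Claude Opus rows is enabled per request via an `anthropic-beta` header carrying the token from the table. A hedged sketch of assembling such a request — the header/payload layout follows Anthropic's Messages API shape, and the model id is a hypothetical placeholder; verify both against current Anthropic docs:

```python
def build_extended_output_request(prompt: str, max_tokens: int) -> dict:
    """Build headers + JSON body for a long-output Claude request.

    The beta token "output-128k-2025-02-19" comes from the table above.
    """
    headers = {
        "x-api-key": "YOUR_API_KEY",                 # placeholder
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "output-128k-2025-02-19",  # unlocks extended output
        "content-type": "application/json",
    }
    body = {
        "model": "claude-opus-4-6",  # hypothetical id for the table's Opus 4.6
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return {"headers": headers, "json": body}

req = build_extended_output_request("Summarize this repository.", 200_000)
```

Without the beta header, requests asking for more than the standard 128K output limit would be rejected, so the header is the only change needed to an otherwise ordinary Messages call.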

Cached & Batch Pricing

Discounted pricing tiers for high-volume usage. All prices in USD per million tokens.

Model Standard Input Cached Input Batch Discount Notes
Claude Opus 4.6 $5.00 $0.50 (hit) / $6.25 (5m write) 50% off Batch: $2.50 in / $12.50 out
Claude Sonnet 4.6 $3.00 $0.30 (hit) / $3.75 (5m write) 50% off Batch: $1.50 in / $7.50 out
Claude Sonnet 4.5 $3.00 $0.30 (hit) / $3.75 (5m write) 50% off Batch: $1.50 in / $7.50 out
GPT-5.4 $2.50 $0.25 50% off Data residency +10%
GPT-5.4 mini $0.75 $0.075 50% off —
GPT-5.4 nano $0.20 $0.02 50% off —
GPT-5.3-Codex $1.75 $0.175 50% off —
Gemini 3.1 Pro $2.00 $0.20–$0.40 + $4.50/hr storage 50% off Tiered by input length
Gemini 3 Flash $0.50 $0.05 + $1.00/hr storage 50% off —
Gemini 3.1 Flash-Lite $0.025 $0.0025 + $0.25/hr storage 50% off Most affordable Google model
DeepSeek-V4 $0.30 $0.03 (90% off) Off-peak 50% off 75% discount until 2026-05-05: ~$0.035 in
DeepSeek-V3.2 $0.28 $0.028 — No formal batch API
Qwen3.5-Max $0.40 Available 50% off —
GLM-5 / GLM-5.1 $1.00 $0.20 — —
Grok 4 $3.00 $0.75 — —
Grok 4 Fast $0.20 $0.05 — —
Mistral Large 3 $0.50 — — No formal batch/cache tier
Step-3.5-Flash $0.10 — — —
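Cached-input discounts compound quickly for prompts with a large shared prefix (system prompts, few-shot examples, repo context). A quick sketch of the blended input price given a cache hit rate — this is simple expectation arithmetic over the table's prices, not any provider's actual billing logic:

```python
def blended_input_price(standard: float, cached: float, hit_rate: float) -> float:
    """Expected USD per 1M input tokens when `hit_rate` of tokens hit the cache."""
    assert 0.0 <= hit_rate <= 1.0
    return hit_rate * cached + (1.0 - hit_rate) * standard

# GPT-5.4: $2.50 standard, $0.25 cached. With 80% of input tokens cached:
print(f"${blended_input_price(2.50, 0.25, 0.8):.2f}")  # $0.70
```

At an 80% hit rate the effective input price drops to less than a third of list price, which is why agent loops that replay the same context benefit so much from caching.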

Speed & Latency

Output throughput and time-to-first-token from Artificial Analysis and provider benchmarks.

Model Output Speed (tok/s) TTFT Notes
Gemini 3.1 Flash-Lite ~250 ~2.1s Fastest budget Google model
Step-3.5-Flash 85–350 — Variable by provider; peak ~350 tok/s
Gemini 3 Flash ~193 ~4.16s —
MiniMax-M2.5 Lightning ~100 — Faster tier
GPT-5.3-Codex ~86 ~77.86s High TTFT due to extended reasoning
Grok 4 ~56 ~8.96s —
MiniMax-M2.5 Standard ~50 — —

Most frontier models (Claude Opus/Sonnet 4.6, GPT-5.4, Gemini 3.1 Pro, etc.) have not yet been benchmarked on Artificial Analysis as of April 2026.

Training Data Cutoffs

Knowledge cutoff dates: the point after which a model has no training data.

Model Training Cutoff Notes
Claude Sonnet 4.6 Jan 2026 Most recent cutoff among frontier models
Claude Opus 4.6 Aug 2025 Reliable knowledge: May 2025
GPT-5.4 / mini / nano Aug 31, 2025 —
GPT-5.3-Codex Aug 31, 2025 —
Grok 4 Fast Jul 2025 —
DeepSeek-V4 May 2025 —
Gemini 3.1 Flash-Lite Jan 2025 —
Gemini 3.1 Pro / 3 Pro / 3 Flash Jan 2025 —
Grok 4 ~Nov–Dec 2024 Approximate
DeepSeek-V3.2 Jul 2024 —
Llama 4 Scout / Maverick Aug 2024 —
DeepSeek-R1 ~Oct 2023 Based on base model

Models not listed (Qwen, GLM, MiniMax, Kimi, Step, Mistral): training cutoff not publicly disclosed.

Multilingual Support

Model Languages Details
Qwen3.5-Max 201 Largest language coverage
Llama 4 Scout 200 Pre-training languages
Qwen3-Max-Thinking 119 Qwen3 series
Gemini 3 Flash 100 91.8% MMMLU score across 100 languages
Gemini 3.1 Pro / 3 Pro 100+ —
Gemini 3.1 Flash-Lite 100 91.3% MMMLU score
Llama 4 Maverick 12 Output languages
Claude (all) Many English-optimized; broad multilingual
GPT-5.4 (all) Many Broad multilingual coverage
DeepSeek (all) Many Chinese + English focused
Grok (all) Many —
GLM-5 / GLM-5.1 Many 28.5T token training data

Structured Output & Function Calling

All frontier models support structured JSON output and function/tool calling except where noted.

Capability Supported Models Not Supported
Structured Output (JSON mode) All models listed in Frontier table Gemini 3 Deep Think (no API)
Function Calling / Tool Use All models listed in Frontier table Gemini 3 Deep Think (no API)

Gemini 3 Deep Think is available only via Gemini's in-app Think mode; there is no API access for structured output or function calling.
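Function calling works the same way across the API-accessible models: you pass a JSON-Schema description of each tool and the model returns a structured call. A minimal sketch in the widely used OpenAI-style tool format — the function name and fields here are invented for illustration:

```python
import json

# Hypothetical tool definition; only the outer structure
# ("type" / "function" / "parameters" holding a JSON Schema) is the
# standard OpenAI-style shape that most providers accept or mirror.
get_pricing_tool = {
    "type": "function",
    "function": {
        "name": "get_model_pricing",
        "description": "Look up input/output pricing for a model.",
        "parameters": {
            "type": "object",
            "properties": {
                "model": {
                    "type": "string",
                    "description": "Model name, e.g. 'GPT-5.4'",
                },
            },
            "required": ["model"],
        },
    },
}

# The definition is plain JSON, so it can be serialized for any provider SDK:
print(json.dumps(get_pricing_tool, indent=2))
```

Anthropic and Google use slightly different envelope keys (`input_schema`, `function_declarations`), but the inner JSON Schema describing parameters is the portable part.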

Regional Availability

Provider API Availability Cloud Partners Notes
Anthropic Global AWS Bedrock, GCP Vertex AI US-only inference at 1.1x via inference_geo
OpenAI Global Azure OpenAI Data residency endpoints +10% (post-3/5/26)
Google Global Google AI Studio, Vertex AI Some regional restrictions per Google terms
DeepSeek Global Azure (R1 only, select regions) China-based servers
Alibaba (Qwen) Global Alibaba Cloud Model Studio China-based; globally accessible
Zhipu AI (GLM) Global Z.AI API MIT license enables self-hosting anywhere
MiniMax Global MiniMax API β€”
Moonshot AI (Kimi) Global platform.kimi.ai MIT open-weight
xAI (Grok) US-focused Oracle OCI (East/Midwest/West) Limited non-US availability
Mistral Global Azure AI Foundry, AWS, GCP β€”
Meta (Llama) Global (self-host) All major cloud providers Llama 4 Community License
StepFun Global HuggingFace Apache 2.0 open-source

Free-Source Models 🆓

Self-hostable models with permissive licenses or open weights for privacy, cost control, and customization.

Model Company Params Context License
DeepSeek-V4 DeepSeek 1.6T / 49B active (MoE) 1M Open Weight
Qwen3.5-Max Alibaba 397B / 17B active (MoE) 262K Apache 2.0
Qwen3-Max-Thinking Alibaba 1T+ 128K Apache 2.0
Qwen3.6-27B Alibaba 27B dense 262K Apache 2.0
Qwen3.5-122B Alibaba 397B / 17B active (MoE) 262K Apache 2.0
Qwen3.5-27B Alibaba 27B dense 262K Apache 2.0
Mistral Large 3 Mistral AI 123B 128K Apache 2.0
Llama 4 Scout Meta 109B 10M Community
Llama 4 Maverick Meta 400B 128K Community
GPT-OSS-120B OpenAI 117B 128K Apache 2.0
GPT-OSS-20B OpenAI 21B 128K Apache 2.0
Qwen3-Coder Alibaba 480B 262K Apache 2.0
GLM-5.1 Zhipu AI 754B / 40B active (MoE) 200K MIT
GLM-4.7 Zhipu AI 400B+ MoE 205K Open Weight
Gemma 4 31B Google 31B dense 256K Apache 2.0
Gemma 4 27B Google 27B MoE (4B active) 256K Apache 2.0
Gemma 4 E4B Google 4B dense 256K Apache 2.0
Gemma 4 E2B Google 2B dense 256K Apache 2.0
Qwen3-Coder 7B Alibaba 7B dense 128K Apache 2.0
Qwen 2.5 Coder 32B Alibaba 32B dense 128K Apache 2.0
DeepSeek Coder-V2 DeepSeek 236B / 2.4B active 128K MIT
Step-3.5-Flash StepFun 196B / 11B active (MoE) 256K Open Weight
Yi-Coder 01.AI 9B / 1.5B 128K Apache 2.0
Lizzy-7B Flower Labs 7B — MIT
MiMo-V2.5 Xiaomi 309B / 15B active 262K MIT
MiMo-V2.5-Pro Xiaomi 1.02T / 42B active 1M MIT

Deployment Options

Local Inference Tools:

  • Ollama - Easy local deployment
  • LM Studio - User-friendly GUI
  • llama.cpp - Efficient CPU inference
  • vLLM - High-throughput serving
  • SGLang - Structured generation

Cloud Deployment:

  • Hugging Face Inference - Managed deployment
  • AWS SageMaker - Full control
  • Google Cloud Vertex - Integrated
  • RunPod - GPU rental
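Most of the local tools above (Ollama, LM Studio, vLLM, SGLang) expose an OpenAI-compatible `/v1/chat/completions` endpoint, so one small client covers all of them. A hedged sketch — the port shown is Ollama's default, and the model tag is a hypothetical example; check each tool's docs for its own defaults:

```python
import json
import urllib.request


def chat_payload(model: str, prompt: str) -> dict:
    """OpenAI-style chat body accepted by vLLM, Ollama, LM Studio, SGLang, etc."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def ask_local(prompt: str, model: str = "llama4-scout",
              base_url: str = "http://localhost:11434/v1") -> str:
    """Send the payload to a locally running OpenAI-compatible server.

    11434 is Ollama's default port; "llama4-scout" is a placeholder tag.
    Requires a server to actually be running.
    """
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(chat_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# ask_local("Hello")  # uncomment with a local server running
```

Because the wire format is shared, switching between a local Ollama model and a hosted endpoint is usually just a change of `base_url` and `model`.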

Coding Models 💻

Specialized AI models optimized for software development tasks.

SWE-bench Verified Leaderboard

Rank Model Company SWE-bench Verified
🥇 #1 GPT-5.5 Pro OpenAI 92.3%
🥈 #2 GPT-5.5 OpenAI 88.5%
🥉 #3 Claude Opus 4.7 Anthropic 87.6%
#4 Claude Opus 4.6 Anthropic 80.8%
#5 Gemini 3.1 Pro Google 80.6%
#6 MiniMax-M2.5 MiniMax 80.2%
#7 Kimi K2.6 Moonshot AI 80.2%
#8 GPT-5.4 OpenAI ~80%
#9 GPT-5.2 OpenAI 80.0%
#10 Claude Sonnet 4.6 Anthropic 79.6%
#11 Gemini 3 Flash Google 78.0%
#12 GLM-5 Zhipu AI 77.8%
#13 Claude Sonnet 4.5 Anthropic 77.2%

Commercial Coding Models

Model Developer Pricing Best For
Claude Opus 4.6 Anthropic $5 / $25 per 1M Agentic coding, complex tasks
GPT-5.5 Pro OpenAI $15.00 / $60.00 per 1M Highest benchmark coding
GPT-5.3-Codex OpenAI $1.75 / $14.00 per 1M Agentic coding, 7+ hour autonomy
Claude Haiku 4.5 Anthropic $1 / $5 per 1M Low-latency coding, sub-agents, computer use
GLM-5-Code Zhipu AI $1.20 / $5.00 per 1M Code generation, refactoring
MiniMax-M2.5 MiniMax $0.30 / $1.20 per 1M Code generation, refactoring
Claude Sonnet 4.5 Anthropic $3 / $15 per 1M Code review, refactoring
Codestral Mistral AI $0.30 / $0.90 per 1M Real-time completion
Grok 4 Fast xAI $0.20 / $1.50 per 1M Most used (50% share)

Open-Source Coding Models

Model Developer License Hardware
GPT-OSS-120B OpenAI Apache 2.0 80-160 GB VRAM
Qwen3-Coder Alibaba Apache 2.0 160-320 GB VRAM
DeepSeek-Coder-V2 DeepSeek MIT 48-80 GB VRAM
GLM-4.6 Zhipu AI Open Weight 80-160 GB VRAM
Phi-4 Microsoft MIT 24-48 GB VRAM

Reasoning Models 🧠

Models optimized for step-by-step reasoning, mathematical problem-solving, and complex logical inference.

AIME 2025 Leaderboard

Rank Model AIME 2025 ARC-AGI-2 Notes
🥇 #1 GPT-5.5 Pro 100% 78.5% Highest combined
🥈 #2 Gemini 3.1 Pro 100% 77.1% Highest combined reasoning
🥉 #3 GPT-5.2 100% 52.9% No tools needed
#4 Grok 4 100% — First-principles reasoning
#5 Claude Opus 4.6 99.8% 68.8% Near-perfect AIME
#6 Gemini 3 Pro 98–100% 31.1–45.1% With code execution
#7 Step-3.5-Flash 97.3% — Best efficiency ratio
#8 Kimi K2.6 96.4% — Strong multimodal reasoning
#9 Claude Sonnet 4.6 ~95% 58.3% Near-Opus performance
#10 GLM-5 92.7% — Thinking mode

Reasoning Model Details

Model Type Context Pricing
Gemini 3 Deep Think Reasoning 1M+ Ultra subscription
Qwen3-Max-Thinking Reasoning/Coding 128K $1.20 / $6.00
o3 / o1-Pro Reasoning 128K $2-150 / $8-600
GPT-5.5 Pro Reasoning 1M $15.00 / $60.00
Gemini 3 Pro General/Multimodal 1M+ $2 / $12
DeepSeek-R1 Reasoning 128K $0.50 / $2.15
Claude Sonnet 4.5 Hybrid 200K $3 / $15
GPT-Rosalind Life Sciences Reasoning 128K Pay-per-token (Research Preview)

Use Cases

  • Mathematical Problem Solving: Qwen3-Max-Thinking, GPT-5.5 Pro, Gemini 3 Pro
  • Scientific Analysis: Claude Opus 4.6, GPT-5.5, Gemini 3 Pro
  • Strategic Planning: o3/o1-Pro, Claude Sonnet 4.5, DeepSeek-R1
  • Code Debugging: Claude Sonnet 4.5, GPT-5.3-Codex, DeepSeek-V3.2

Multimodal Models 🎨

Models capable of processing and generating multiple types of content: text, images, audio, and video.

Leading Multimodal Models

Model Developer Context Key Features
GPT-5.4 OpenAI 1M Unified multimodal, audio
Gemini 3 Pro Google 1M+ Native multimodal, video
Claude Sonnet 4.5 Anthropic 200K Document understanding
Llama 4 Maverick Meta 128K Open multimodal
Nemotron 3 Nano Omni NVIDIA 30B (3B active) Vision, audio, language unified, 9x throughput

Vision Capabilities

Model MMMU / MMMU-Pro MathVista DocVQA
Gemini 3.1 Pro 95% (MMMU-Pro) — —
GPT-5.4 94% (MMMU-Pro) — —
Gemini 3 Pro 81% (MMMU-Pro) — —
Gemini 3 Flash 80% (MMMU-Pro) — —
Claude Sonnet 4.5 77.8% (MMMU) — —
Llama 4 Maverick 73.4% (MMMU) — —

Audio & Video

Model Speech-to-Text Text-to-Speech Video Input
Gemini 3 Pro ✅ ✅ ✅
GPT-5 ✅ ✅ ⚠️
Whisper v3 ✅ ❌ ✅

Image Generation

Model Developer License Best For
MAI-Image-2-Efficient Microsoft Proprietary Production-ready quality, 41% lower cost
Flux.1 Black Forest Labs Apache 2.0 High-fidelity art
Stable Diffusion 3.5 Stability AI Community License Fine-tuning
GLM-Image Zhipu AI (Z.ai) API Fast image generation
CogView-4 Zhipu AI (Z.ai) API Creative image generation
Firefly AI Assistant Adobe Public Beta (2026-04-27) Creative agent, 60+ tools, Photoshop/Premiere integration

Hardware Requirements 🖥️

Comprehensive hardware specifications for self-hosting AI models.

Quick Reference by Model Size

Model Params Q4 Size Min VRAM Rec VRAM Min RAM
Phi-4 14B 8 GB 24 GB 48 GB 32 GB
GPT-OSS-20B 21B 12 GB 24 GB 48 GB 32 GB
Llama 4 Scout 109B 66 GB 48 GB 80 GB 96 GB
GPT-OSS-120B 117B 70 GB 80 GB 160 GB 128 GB
DeepSeek-Coder-V2 236B 143 GB 48 GB 80 GB 192 GB
Llama 4 Maverick 400B 242 GB 160 GB 320 GB 320 GB
DeepSeek-V4 671B 404 GB 80 GB 320 GB 512 GB
Qwen3-Max-Thinking 1T+ 600+ GB 160 GB 640 GB 768 GB

By Hardware Tier

Consumer/Entry Level (24-48 GB VRAM):

  • Phi-4, GPT-OSS-20B, Yi-Coder, Qwen2.5-Coder
  • Recommended GPUs: RTX 3090 (24GB), RTX 4090 (24GB)

Professional (80-160 GB VRAM):

  • Llama 4 Scout, GPT-OSS-120B, DeepSeek-Coder-V2
  • Recommended GPUs: A100 80GB, 2x A100 40GB

Enterprise (320+ GB VRAM):

  • Llama 4 Maverick, GLM-4.7, DeepSeek-V4, Qwen3-Max-Thinking
  • Recommended GPUs: 4x A100 80GB, 8x A100 80GB

Quantization Explained

Level Bits Size vs FP16 Quality Use Case
FP16/BF16 16 100% Best Training
Q8_0 8 ~50% Excellent High-quality inference
Q4_K_M 4 ~25% Good Recommended for deployment
Q3_K_M 3 ~19% Fair Limited resources
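The Q4 sizes in the hardware table follow directly from parameter count times bits per weight. A back-of-the-envelope estimator — it ignores quantization overhead and mixed-precision layers, so real GGUF files run somewhat larger, as the table's numbers show:

```python
def quantized_size_gb(params_billion: float, bits: float) -> float:
    """Approximate weight size in decimal GB at a given quantization bit-width."""
    bytes_total = params_billion * 1e9 * bits / 8
    return bytes_total / 1e9

# Llama 4 Scout (109B) at 4 bits per weight:
print(round(quantized_size_gb(109, 4), 1))  # 54.5
# The table lists 66 GB for Scout at Q4: formats like Q4_K_M average
# closer to ~4.8 bits/weight once scales and higher-precision layers
# are included, which accounts for the gap.
```

The same arithmetic explains the FP16/Q8/Q4 size ratios in the quantization table above: halving the bit-width roughly halves the file.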

Comprehensive Benchmark Reference 📈

Detailed benchmark scores across all major evaluations. Scores are percentages (%) unless noted. Arena Elo scores are integers. — = not publicly reported. Data as of April 2026.

Full Benchmark Table

Model GPQA Diamond MMLU-Pro Arena Elo (Text) HLE SWE-bench Verified SWE-bench Pro LiveCodeBench AIME 2025 ARC-AGI-2 MMMU-Pro IFEval FrontierMath
Claude Opus 4.6 91.3% — 1500 40.0–53.0% 80.8% — — 99.8% 68.8% — — —
GPT-5.5 93.2% — 1495 42.1–55.0% 88.5% — — 99.9% 71.2% — — 52%
GPT-5.5 Pro 95.1% 96% 1520 48.5–62.0% 92.3% — — 100% 78.5% — 97% 58%
Claude Sonnet 4.6 89.9% — ~1438 33.2–49.0% 79.6% — — ~95% 58.3% — — —
Claude Sonnet 4.5 83.4% 88.0% — — 77.2% — — 87–100% — — — —
GPT-5.4 92.0% 94% 1484 36.6–41.6% ~80% 57.7% 84–88% 88% 73.3% 94% — 50% (Pro)
GPT-5.4 mini 87.5% — — — — 54.4% — — — — — —
GPT-5.3-Codex 91.5% — — — — 56.8% 85% — — — — —
GPT-5.2 92.4% — 1479 35.2% 80.0% 55.6% — 100% 52.9% — 95.6% ~40.3%
Gemini 3.1 Pro 94.3% 92% 1494 44.4–51.4% 80.6% 54.2–72% 71% 100% 77.1% 95% 95% —
Gemini 3 Pro 91.9–93.8% 83% 1486 37.5% 76.2% 43.3% 49% 98–100% 31.1–45.1% 81% 88% 38%
Gemini 3 Flash 90.4% 72% 1474 33.7% 78.0% 44% — — — 80% 85% —
Gemini 3 Deep Think ~97% 81% — 48.4% ~58% 63% 58% — 84.6% — — —
DeepSeek-V3.2 87.1% 85.0% — 25.1% 67.8% — — 89.3% — — — —
DeepSeek-R1 71.5% 84.0% — 8.5% 49.2% — 63.5% 70.0% — — — —
Qwen3.5-Max 89.3% — — — 76.4% — — 91.3% — 79% — —
Qwen3-Max-Thinking 86.1% — — 26.2% — — — — — — — —
GLM-5 82.0% — ~1451 10.4% 77.8% — — 92.7% — — — —
GLM-5.1 — — — — ~80.4% (est.) — — — — — — —
Kimi K2.6 90.5% 87.1% — 31.5–50.2% 80.2% — 85.0% 96.4% — 78.5% — —
MiniMax-M2.5 85.2% — — — 80.2% 55.4% — 86.3% — — — —
Step-3.5-Flash 83.1% — — — 74.4% — 86.4% 97.3% — — — —
Grok 4 ~91.5% 91.5% ~1493 50.7% — — — 100% — — — —
Llama 4 Maverick 69.8% 80.5% — — — — 43.4% — — — — —
Llama 4 Scout 57.2% 74.3% — — — — 32.8% — — — — —

FrontierMath Scores

FrontierMath is a benchmark of 350 original, exceptionally challenging mathematics problems created by expert mathematicians (Epoch AI). Problems span number theory, analysis, algebraic geometry, and category theory. Tier 4 problems can take research mathematicians multiple days.

Model Tiers 1–3 Tier 4 Source
GPT-5.4 Pro 50% ~36–38% Epoch AI
GPT-5.2 Pro ~40.3% 31% Epoch AI
Gemini 3 Pro 38% 19% Epoch AI
GPT-5.1 Thinking ~25% — llm-stats

Benchmark Glossary

Benchmark Description Source
GPQA Diamond Graduate-level science questions (PhD difficulty) Google Research
MMLU-Pro Extended multi-task language understanding (harder than MMLU) TIGER-Lab
Arena Elo Crowdsourced human preference ranking lmarena.ai
HLE Humanity's Last Exam (expert-level questions) Scale AI
SWE-bench Verified Real GitHub issue resolution (human-verified subset) SWE-bench
SWE-bench Pro More challenging subset of SWE-bench SWE-bench
LiveCodeBench Live competitive programming problems (not in training data) LiveCodeBench
AIME 2025 American Invitational Mathematics Examination MAA
ARC-AGI-2 Abstract reasoning challenge (fluid intelligence) ARC Prize
MMMU / MMMU-Pro Multi-discipline multimodal understanding MMMU
IFEval Instruction-following evaluation Google Research
FrontierMath Expert-level research mathematics (Epoch AI) Epoch AI

Development Tools 🛠️

AI-powered tools for software development, from IDEs and CLI tools to API providers and IDE extensions.

IDEs 💻

Integrated Development Environments with built-in AI capabilities.

Agentic IDEs

IDE Platform Version Release Date Pricing Key Features GitHub
Firebase Studio Web - - Free (3 workspaces, up to 30 with Google Developer Program) Cloud-based, Gemini, MCP 🔗
Lingma IDE (通义灵码) Windows, macOS - - Free (download) Built-in agent, MCP tool use, terminal command execution ❌
Tonkotsu Windows, macOS - - Free (during early access) Team of agents, workflow 🔗
OpenCode Windows, macOS, Linux - - Free (OSS) Terminal, desktop, IDE extension, multi-provider 🔗
Codex app Windows - 2026-03-04 00:00 UTC Included with Codex plans Multiple agents, isolated worktrees, reviewable diffs, CLI and IDE interop 🔗
Visual Studio Windows, macOS 17.14.12+, 18.1.0+ 2026-01-06 00:00 UTC Free / $250/yr Gemini 3 Flash integration, faster performance, zero-migration upgrades, real-time profiler agent ❌
IntelliJ IDEA Windows, macOS, Linux 2025.3.2 2026-01 Free / $149/yr Java 24 support, Kotlin K2 mode, performance and memory improvements ❌
IBM Bob Cross-platform GA (April 28, 2026) 2026-04-28 Free trial + Enterprise plans Multi-model orchestration, full SDLC, 45% productivity gain 🔗
PolyAI ADK PolyAI GA (April 22, 2026) 2026-04-22 Enterprise CX AI-native dev, Cursor/Claude Code integration 🔗
JAT Windows, macOS, Linux - 2026-04-14 Free (MIT) Self-contained agentic IDE, 20+ parallel agents, task management, unified environment 🔗

Native AI Editors

Editor Platform Version Release Date Pricing Key Features GitHub
Zed macOS, Windows, Linux 0.226.3 2026-03-03 00:00 UTC Free (OSS) + Copilot $10/mo Fast, collaboration, Gemini and Claude, Zeta AI, agent thread history, edit prediction providers, self-hosted OpenAI-compatible servers 🔗
Dyad Windows, macOS, Linux - - Free (OSS) Local generation, BYO keys 🔗
Memex macOS, Windows - - Freemium (Free + $10/mo) Agentic, browser↔desktop 🔗

VS Code Forks

IDE Platform Version Release Date Pricing Autonomous MCP GitHub
Cursor Windows, macOS, Linux 3.2 (May 1, 2026) 2026-05-01 00:00 UTC Freemium (Free + Pro $19/mo or $39/mo) ✅ ❌ ❌
Windsurf Windows, macOS, Linux 2.0.0 (May 3, 2026) 2026-05-03 00:00 UTC Freemium (Free + Pro) ✅ ✅ ❌
Trae macOS, Windows - - Free ❌ ❌ 🔗
PearAI Windows, macOS, Linux - - Free (OSS) ✅ ❌ 🔗
Void Windows, macOS, Linux - - Free (OSS) ✅ ✅ 🔗
Kiro Windows, macOS, Linux - - Free (Preview) ✅ ✅ 🔗
VS Code Agents Windows, macOS, Linux Insiders 2026-04-21 Free ✅ ✅ 🔗

Web-Based IDEs

IDE Platform Version Release Date Pricing Self-Hostable Best For GitHub
Replit 3 Web - - Free Starter, Core $20/mo, Pro $100/mo ❌ Learning/Prototyping ❌
Bolt.new Web - - Free, Pro $20-25/mo, Teams $30/user/mo ❌ Quick apps ❌
Bolt.diy Self-hosted - - Free (MIT), bring your own API ✅ Self-hosted 🔗
Lovable Web - - Free (5 credits/day), Pro $25/mo, Business $50/mo ❌ UI/Full-stack ❌
v0 Web - - Free ($5 credits/mo), Premium $20/mo, Teams $30/user ❌ React components ❌
Gitpod Web - - Free + Paid ❌ Cloud dev environments ❌
Rork Web - - Free & Paid (credits) ❌ Mobile apps (iOS/Android) ❌
Google Stitch Web - 2026-03 Free (Google account, 550 gen/mo) ❌ UI design, Figma/React export ❌
Google Antigravity Web - - Google AI Pro / Ultra Agent-first development with Gemini-powered coding ❌
Jules Web - 2025-05-20 00:00 UTC Free beta, higher limits on Google AI Pro / Ultra Async repo agent, reviewable diffs, GitHub integration ❌

CLI Tools 🖥️

Command-line AI tools for autonomous coding and terminal enhancement.

Autonomous Coding Agents

Tool Platform Pricing Key Features GitHub
Aider Windows, macOS, Linux Free Gold standard, Architect mode, thinking tokens 🔗
Claude Code 2.2.1+ macOS, Linux, Windows Free + API Fast mode for Opus 4.7, simple mode file editing, multi-session support 🔗
Codex CLI Windows, macOS, Linux Included Sandbox, approval modes 🔗
Junie CLI Windows, macOS, Linux Free (BYOK) LLM-agnostic, JetBrains IDE integration, MCP 🔗
Goose Windows, macOS, Linux Free (Apache-2.0) MCP, extensible, desktop app, 25+ providers 🔗
GPT-Pilot Windows, macOS, Linux Free Full dev team simulation 🔗
OpenHands Windows, macOS, Linux Free Cloud agents, MCP 🔗
Mentat Windows, macOS, Linux Free Multi-file coordination 🔗
SERA Linux, macOS Free (Apache 2.0) Open-source coding agent, 200K synthetic trajectories 🔗
AI Dev Kit Cross-platform Free 59 skills, 33 agents, TDD, security audit, CI/CD 🔗

Assisted CLI Tools

Tool Developer Pricing Best For
Gemini CLI Google Free Google ecosystem
Cursor CLI Cursor Free tier Terminal + IDE bridge
Qwen Code Alibaba Free Qwen optimization
Qodo CLI Qodo Free tier Testing and review

CLI Tools by Programming Language

AI coding CLI tools categorized by their primary language support. All tools below accept plain English prompts.

Tool Primary Languages Multi-Language Local LLM Cloud API Pricing GitHub
Aider Python, JS, TS, Go, Rust, Ruby, Java, C/C++ ✅ (100+ langs) ✅ (Ollama, LM Studio) ✅ Free (OSS) 🔗
Claude Code All (polyglot) ✅ ❌ Claude API ($3–$15/M) Free tool / API cost ❌
Codex CLI Python, JS, TS, Bash ✅ ❌ OpenAI API Free OSS / API cost 🔗
OpenHands Python, JS, TS, Go, Rust, Java ✅ ✅ ✅ Free (OSS) 🔗
Goose (Block) Polyglot (25+ providers) ✅ ✅ (Ollama, LM Studio) ✅ Free (OSS) 🔗
Continue Polyglot (VS Code, JetBrains) ✅ ✅ (Ollama, LM Studio) ✅ Free (OSS) 🔗
Qwen Code Python, JS, TS, Go, Java ✅ ✅ (Qwen models) ✅ Free (OSS) 🔗
Devstral CLI Python, JS, TS, Go, Rust ✅ ✅ Mistral API Free OSS model / API cost ❌
OpenCode Polyglot ✅ ✅ ✅ Free (OSS) 🔗
Mentat Python, JS, TS, Go ✅ ❌ OpenAI API Free (OSS) 🔗
Amp (Sourcegraph) All ✅ ❌ ✅ Free / Enterprise ❌

CLI for Programming Languages & Multiple Use

Purpose-built CLI tools for coding across specific languages or polyglot multi-stack workflows.

Tool Language Focus Platform Pricing Key Features GitHub
Aider Polyglot (Python, JS, TS, Go, Rust, any) All Free (BYOK) Git-native multi-file edits, Architect mode, repo maps, thinking tokens 🔗
Claude Code Polyglot All Free + API Computer use, sub-agents, CLAUDE.md skills, Opus 4.7, multi-session 🔗
Codex CLI Python, JS, TS All Free (OpenAI account) Sandbox execution, approval modes, OpenAI models 🔗
OpenHands Python, JS, TS, Go, Rust All Free (OSS) Full SDLC agent, MCP, local LLM via Ollama 🔗
Goose Polyglot All Free (Apache-2.0) 25+ providers, MCP extensions, desktop app 🔗
Continue Polyglot All Free (OSS) VS Code + JetBrains, custom models via Ollama/LM Studio 🔗
Qwen Code Python, JS, TS, Go All Free Optimized for Qwen3-Coder 480B, Apache 2.0 ❌
Mentat Polyglot All Free Multi-file coordination, context-aware diffs 🔗
AI Dev Kit Polyglot All Free 59 skills, 33 agents, TDD, security audit, CI/CD pipeline 🔗
Devstral CLI Python, JS, TS, Go All Free (Mistral free tier) Mistral's open coding model, OpenRouter free access ❌
Junie CLI Polyglot All Free (BYOK) LLM-agnostic, JetBrains IDE integration, MCP 🔗
SERA Python, JS, TS Linux, macOS Free (Apache 2.0) Open-source coding agent, 200K synthetic trajectories 🔗

Terminal Enhancers

Tool Platform Pricing Key Features
Warp Terminal macOS, Linux, Windows Free AI Agents, workflow sharing
Fig macOS, Linux Free Autocomplete, AI suggestions

IDE Add-ons 🧩

Extensions and plugins that add AI capabilities to existing IDEs.

Universal (Cross-Platform)

Add-on Platform Pricing Context Best For GitHub
GitHub Copilot VS Code, JetBrains, Vim Free / $10/mo / $39/mo Large General coding ❌
Supermaven VS Code, JetBrains, Neovim Free / $10/mo 1M Large codebases ❌
Codeium VS Code, JetBrains, Vim Free / $15/mo / $60/mo Medium Free alternative ❌
Continue VS Code, JetBrains Free (OSS) Custom Self-hosted 🔗
Cody VS Code, JetBrains, Web Free (discontinued) / Enterprise Starter $19/mo / Enterprise $59/mo Enterprise Code search 🔗
Tabnine VS Code, JetBrains, VS, Eclipse Free / $39/mo Local Privacy ❌
Tabby VS Code, JetBrains, Vim, Neovim Free (OSS) Self-hosted Self-hosted code completion 🔗

VS Code Specific

Add-on Pricing Autonomous MCP Best For GitHub
Codex Free (with ChatGPT Plus $20/mo or Pro $200/mo) ✅ ✅ OpenAI's official coding agent 🔗
Cline Free ✅ ✅ Full agent 🔗
GitHub Copilot (Agent Mode) $0 / $10 / $39/mo ⚠️ ❌ Guided agent workflows ❌
RooCode Free/Pro ⚠️ ❌ Complex tasks 🔗
Keploy OSS/Enterprise ❌ ❌ Testing ❌

JetBrains Specific

Add-on Pricing Claude Agent Best For
JetBrains AI Assistant $10/mo (Pro), $249/yr (Ultimate) ✅ Deep IDE integration
JetBrains Claude Agent Included in subscription ✅ Native agent

API Providers 🔌

Services for accessing AI models via API.

Model Labs (Direct)

Provider Models Pricing
OpenAI GPT-5.4, GPT-5.4 mini, GPT-5.4 nano, o3, Codex Pay-per-token
Anthropic Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 Pay-per-token
Alibaba Cloud Qwen3.5-Max, Qwen3-Coder, Qwen3.6-27B Pay-per-token / Coding Plan $50/mo
Gemini (Google) Gemini 3.1 Pro, 3 Pro, 3 Flash Pay-per-token
Z.ai (Zhipu AI) GLM-5, GLM-5.1, GLM-4.7, GLM-5-Code Pay-per-token
MiniMax MiniMax-M2.5/M2.7/M2 Pay-per-token
Cohere Command, Embed, Rerank Pay-per-token
AI21 Labs Jamba Pay-per-token
Perplexity Sonar / Sonar Pro / Sonar Reasoning Pro Pay-per-token + request fees
Moonshot AI Kimi (kimi-k2.5, kimi-k2-thinking) Pay-per-token
ByteDance (Volcengine) Doubao, Seed 1.6/2.0 Pay-per-token
Tencent (Hunyuan) Hunyuan, Hunyuan-a13b Pay-per-token
StepFun Step-3.5-Flash, Step-3.5 Pay-per-token (OpenRouter free)
PaleBlueDot AI Unified platform, 100+ models Token-based pricing
Osirus AI Unified platform + Agent Studio Free tier + paid plans
Logic Spec-driven managed agents Free tier + $49/mo
DeepSeek DeepSeek-V4/R1 Pay-per-token
Mistral AI Mistral Large 3, Codestral Pay-per-token
xAI Grok-4 Pay-per-token

Unified APIs & Aggregators

Provider Models Key Features
OpenRouter 200+ Crypto/fiat, rankings
Hugging Face Thousands Serverless inference

Inference Clouds

Provider Specialization Speed
Together AI Llama/Qwen/Mistral Fast
Fireworks AI FireAttention Low-latency, 6 free models
Groq LPU >500 T/s
Cerebras Wafer-Scale >2000 T/s
NVIDIA NIM 91 free endpoints, DGX Cloud Claimed up to 20× speedup on DGX Cloud

GPU Clouds

Provider Type Best For
RunPod GPU Rental Flexibility, cost-effective fine-tuning & inference
Replicate Model-as-a-Service Quick deployment, serverless inference
Vultr Global Cloud Hourly GPU instances
Hyperbolic Decentralized Crypto/Fiat payments
Cerebrium Serverless GPU Python-native ML inference & fine-tuning
Together AI AI-Native Cloud Fast, cost-effective inference & fine-tuning for open models
Modal Labs Serverless GPU Fine-tuning with LoRA, distributed training
Fireworks AI Inference & Fine-tuning Fast inference, RFT for model shaping
Databricks Mosaic AI Integrated ML Platform Enterprise fine-tuning, governed serving, RAG
NVIDIA DGX Cloud Managed AI Training Co-engineered clusters, maximum ROI for training
Vast.ai GPU Marketplace Serverless endpoints, diverse GPU options
DigitalOcean GPU Droplets Simple fine-tuning workflows, scalable GPU infrastructure

Automation 🤖

AI-powered tools for automating browser and desktop tasks.

Browser Automation 🌐

Tools and frameworks for AI-powered browser automation.

Standalone AI Browsers

Browser Platform Pricing Open Source Local AI Agent/Computer Use API Access Multi-Agent Parallel Sessions Best For GitHub
Perplexity Comet Windows, macOS, iOS, Android Free / Pro $20/mo ❌ ❌ ✅ ❌ ❌ ❌ Research + background tasks, voice mode, Computer Max agent ❌
ChatGPT Agent Mode Web, iOS, Android Plus $20/mo, Pro $200/mo ❌ ❌ ✅ ❌ ❌ ❌ Full computer use: browse, code, fill forms, book travel ❌
Dia macOS (M1+ / macOS 14+) Free / Pro $20/mo ❌ ❌ ⚠️ ❌ ❌ ❌ Tab intelligence, Skills, browsing history AI context ❌
Google Chrome (Auto Browse) Windows, macOS, Linux, ChromeOS Free / Gemini Pro $19.99/mo ❌ ❌ ✅ ❌ ❌ ❌ Gemini 3 built-in, auto browse agentic tasks (enterprise) ❌
Microsoft Edge (Copilot Agent) Windows, macOS, iOS, Android Free / Copilot Pro $20/mo ❌ ❌ ✅ ❌ ❌ ❌ Cross-tab context, voice commands, form automation, bookings ❌
Genspark Web, iOS, Android Free / Plus $25/mo / Pro $249/mo ❌ ✅ (169 local models) ✅ ❌ ✅ ✅ Super Agent, AI slides, AI websites, deep research, Call For Me ❌
Brave Leo (AI Browser) Windows, macOS, Linux, iOS, Android Free / Premium $14.99/mo ✅ (Chromium) ✅ (Leo local) ⚠️ ❌ ❌ ❌ Privacy-first, zero-log AI, Skills, Memories, local models ❌
SigmaOS (Airis) macOS Free / Pro (subscription) ❌ ❌ ⚠️ ❌ ❌ ❌ NL commands: "Book Airbnb in Iceland", cross-tab AI, YC-backed ❌
Opera Neon Windows, macOS $19.90/mo ❌ ❌ ✅ ❌ ❌ ❌ Agentic browsing, Aria assistant, built-in AI tools ❌
Opera One (Aria) Windows, macOS, Linux, iOS, Android Free ❌ ❌ ⚠️ ❌ ❌ ❌ Built-in Aria AI assistant, sidebar AI tools ❌
Firefox (AI Sidebar) Windows, macOS, Linux, iOS, Android Free ✅ ❌ ⚠️ ❌ ❌ ❌ AI Controls dashboard (v148+), ChatGPT/Claude/Mistral sidebars ❌
BrowserOS Linux, macOS Free ✅ ✅ ✅ ❌ ✅ ✅ Privacy-focused, built-in MCP, agentic 🔗
Manus AI Web (Cloud) Free 300 credits/day / Plus $20/mo / Pro $200/mo ❌ ❌ ✅ ❌ ✅ ✅ Cloud agent, full computer: code, deploy, files, search ❌
Sigma AI Browser Windows, macOS, Linux Free / Pro $29/mo ❌ ✅ ✅ ❌ ❌ ❌ Built-in local AI agent, offline, no tracking 🔗
Fellou Windows, macOS Free 4 tasks/day / Pro $20/mo ❌ ❌ ✅ ❌ ❌ ❌ Complex multi-step automation, agentic tasks 🔗
Arc Max macOS, Windows Free ❌ ❌ ⚠️ ❌ ❌ ❌ AI-enhanced browsing, pinch-to-summarize, Ask on Page ❌
Maxthon Windows, macOS, iOS, Android Free / Premium ❌ ❌ ⚠️ ❌ ❌ ❌ MaxAsk AI answers, built-in VPN, ad-blocker, resource sniffer ❌
ChatGPT Atlas macOS Free (with ChatGPT subscription) ❌ ❌ ✅ ❌ ❌ ❌ OpenAI integration, macOS computer use overlay 🔗
AnythingLLM Windows, macOS, Linux Free (OSS) ✅ ✅ ⚠️ ✅ (local API) ❌ ❌ All-in-one desktop AI, document chat, local + API 🔗
BrowserGPT iOS, Android Free / Premium ❌ ❌ ⚠️ ❌ ❌ ❌ Mobile-first AI browser ❌
Sidekick Browser Windows, macOS, Linux Free / Pro $10/mo ❌ ❌ ✅ ❌ ❌ ❌ AI assistant, natural language tab management, summarize, automate tasks ❌

Browser Extensions

Extension Pricing Free Multi-Agent Best For GitHub
Monica.im Freemium (Free + ~$9/mo) ❌ βœ… Chrome extension, no browser switch ❌
Harpa AI Free βœ… ❌ Automation recipes πŸ”—
MultiOn Free/Paid ⚠️ βœ… Complex tasks πŸ”—
NanoBrowser Free βœ… βœ… Local control, Ollama πŸ”—
Neobrowser Free (OSS) βœ… ❌ Local LLMs via Ollama, privacy-first, Chrome/Edge ❌
Open Operator Free βœ… ❌ Browserbase-powered, open NL browser control πŸ”—
Openator Free (OSS) βœ… ❌ Docker-based headless NL browser agent πŸ”—

Developer Libraries

Library Language Pricing Best For API Access Multi-Agent Parallel Sessions GitHub
Chrome DevTools MCP TypeScript Free (OSS) AI web debugging, 29 DevTools ❌ ❌ ❌ πŸ”—
Cloudflare Browser Run Cloud API Free Workers / $5+/mo CDP + MCP, WebMCP, Live View βœ… ❌ βœ… πŸ”—
Browser-use Python Free OSS / Cloud $29/mo Agentic automation, Workflow Use βœ… βœ… βœ… πŸ”—
Stagehand TypeScript/Python Free (OSS) Hybrid deterministic + AI, action caching βœ… ❌ βœ… πŸ”—
LaVague Python Free (OSS) NL to code ❌ ❌ βœ… πŸ”—
Skyvern Python Free tier / $29–$149/mo CV-based automation, Ollama support βœ… βœ… βœ… πŸ”—
Notte Python/Cloud Free tier / $29/mo+ Deterministic replay, demoβ†’script βœ… ❌ βœ… πŸ”—
Firecrawl Python / CLI Free tier / $49/mo+ LLM-powered crawling & scraping βœ… ❌ βœ… πŸ”—
Playwright MCP TypeScript Free (OSS) Cross-browser automation, VS Code βœ… ❌ βœ… πŸ”—
Langflow Python Free (OSS) / Cloud $29/mo Visual multi-agent & RAG workflows βœ… βœ… βœ… πŸ”—
LlamaIndex Python Free (OSS) / Cloud $29/mo Document-heavy RAG, retrieval quality βœ… βœ… βœ… πŸ”—
Haystack Python Free (OSS) / Cloud $49/mo Regulated deployments, structured pipelines βœ… βœ… βœ… πŸ”—
AgentQL TypeScript/Python Free (1K req/mo) / $49/mo / $149/mo Natural language web querying/automation βœ… βœ… βœ… πŸ”—
ScrapeGraphAI Python Free OSS / Cloud $29/mo Natural language web scraping βœ… βœ… βœ… πŸ”—
WebVoyager Python Free (OSS) Autonomous web browsing research ❌ ❌ βœ… πŸ”—

Cloud Automation

Service Platform Pricing Best For GitHub
ChatGPT agent ChatGPT Plus / Pro / Team Guided browser tasks, research, forms, and spreadsheets ❌
Project Mariner Google AI Ultra Included with Google AI Ultra Multi-step browser tasks, shopping, and reservations ❌
Skyvern Cloud Cloud API Paid Resilient automation πŸ”—
Browserbase Cloud API Paid Stealth mode, session recording ❌

Autonomous Agents — Plain English Prompts 🤖

Control a computer or cloud sandbox using plain English text — no coding required. Just describe the task and the agent handles everything: clicking, typing, navigating, running code, and completing multi-step workflows.

Legend: 🖥️ = runs on your physical computer | ☁️ = cloud/sandbox computer | 🌐 = controls a browser | 🔁 = multi-agent/parallel | 💬 = simple English prompt | 🔓 = free/open-source | 💰 = paid


☁️ Cloud Sandbox Computer Use (English Prompts)

These services run in a cloud sandbox (virtual Linux/Windows desktop), control the computer for you, and are driven purely by natural language instructions.

Agent Interface Pricing Multi-Agent Parallel Sessions Local LLM English Prompt GitHub
Manus AI Web dashboard Free (300 credits/day) / Plus $20/mo / Pro $200/mo βœ… βœ… ❌ βœ… ❌
ChatGPT Agent ChatGPT Web/App Plus $20/mo / Pro $200/mo ❌ ❌ ❌ βœ… ❌
Gemini Computer Use API / AI Studio Gemini Pro $19.99/mo / API metered ❌ ❌ ❌ βœ… ❌
Devin Web dashboard Core $20/mo ($2.25/ACU) / Team $500/seat/mo ❌ ❌ ❌ βœ… ❌
OpenHands Web UI / CLI Free OSS / Cloud Individual free βœ… βœ… βœ… (any API) βœ… πŸ”—
E2B Desktop Sandbox API / SDK Hobby free / Pro $150/mo ❌ βœ… (via code) βœ… βœ… πŸ”—
Cua (trycua) CLI / Python SDK Free (OSS) ❌ βœ… βœ… βœ… πŸ”—
Airtop Web dashboard / API Starter $26/mo (3 sessions) / Pro $80/mo (30 sessions) ❌ βœ… ❌ βœ… ❌
Skyvern Cloud Web dashboard / API Free 1K credits / Hobby $29/mo / Pro $149/mo ❌ βœ… βœ… (Ollama) βœ… πŸ”—
Convergence Proxy Web / API Free tier / Pro $20/mo (acquired by Salesforce) ❌ ❌ ❌ βœ… ❌
Amazon Nova Act API (AWS) Pay-per-use (AWS pricing) ❌ βœ… ❌ βœ… ❌
Project Mariner Google AI Ultra Included ($249.99/mo Ultra plan) ❌ ❌ ❌ βœ… ❌
Perplexity Computer Web dashboard Perplexity Pro $20/mo ❌ ❌ ❌ βœ… ❌
OpenAI Computer Use (API) API / ChatGPT $15/M input, $60/M output βœ… βœ… ❌ βœ… ❌

🖥️ Local Machine / Physical Computer Use

These agents run on your own machine, see your screen, and control your keyboard/mouse — no cloud required.

Agent Windows macOS Linux Dashboard/UI CLI API/LLM Multi-Agent Parallel Sessions Pricing GitHub
Claude Computer Use βœ… βœ… βœ… ❌ βœ… (API) Claude API ($3–$15/M) ❌ ❌ Claude API ($3–$15/M tokens) Commercial
Agent TARS (ByteDance) βœ… βœ… βœ… βœ… Web UI βœ… npx @agent-tars/cli@latest Any LLM ❌ βœ… Free (OSS) πŸ”—
UI-TARS Desktop (ByteDance) βœ… βœ… βœ… βœ… Desktop app ❌ UI-TARS-2 model ❌ ❌ Free (OSS) πŸ”—
Open Interpreter βœ… βœ… βœ… βœ… Web βœ… interpreter Any (OpenAI, Claude, local) ❌ ❌ Free (OSS) πŸ”—
Open-Interface βœ… βœ… βœ… ❌ βœ… GPT-4V / any vision LLM ❌ ❌ Free (OSS) πŸ”—
Agent S / S2 βœ… βœ… βœ… ❌ βœ… Any LLM API ❌ ❌ Free (OSS) πŸ”—
UFO (Microsoft) βœ… ❌ ❌ βœ… UI βœ… GPT-4V / Azure ❌ ❌ Free (OSS) πŸ”—
Windows-Use βœ… ❌ ❌ ❌ βœ… Any vision LLM ❌ ❌ Free (OSS) πŸ”—
Bytebot ❌ ❌ βœ… βœ… (Docker) βœ… Any LLM ❌ ❌ Free (OSS) πŸ”—
OpenCUA βœ… βœ… βœ… ❌ βœ… Any ❌ ❌ Free (OSS) πŸ”—
Khoj βœ… βœ… βœ… βœ… Web UI βœ… Any (Ollama, LM Studio, OpenAI) ❌ ❌ Free (OSS) / Cloud $10/mo πŸ”—

🌐 Browser-Only Agents

Control a browser with natural language — click, fill forms, scrape, automate. No script writing needed.

Agent Type Pricing Dashboard CLI Multi-Agent Parallel Sessions Local LLM GitHub
Browser-use OSS Python lib + Cloud Free OSS / Cloud: Free 3 sessions / Dev $29/mo / Business $299/mo βœ… Cloud ❌ βœ… βœ… βœ… (Ollama) πŸ”—
Stagehand OSS TypeScript Free (OSS) ❌ βœ… ❌ βœ… βœ… πŸ”—
NanoBrowser Chrome extension Free (OSS) βœ… Extension ❌ βœ… ❌ βœ… (Ollama) πŸ”—
Skyvern Python / Cloud Free tier / $29–$149/mo βœ… Cloud βœ… βœ… βœ… βœ… (Ollama) πŸ”—
Openator Python Free (OSS) ❌ βœ… ❌ βœ… βœ… πŸ”—
Open Operator Web UI Free βœ… ❌ ❌ ❌ ❌ πŸ”—
Airtop Web / API $26–$80/mo βœ… ❌ βœ… βœ… ❌ ❌
MultiOn API / Chrome ext Free / Paid βœ… ❌ βœ… ❌ ❌ πŸ”—

πŸ” Multi-Agent / Parallel Agent Platforms (Plain English Orchestration)

Coordinate multiple AI agents in parallel to complete complex workflows — driven by plain English goals.

Platform Type Dashboard CLI Cloud Local LLM Parallel Pricing GitHub
CrewAI Multi-agent OSS + Cloud βœ… AMP Studio βœ… crewai βœ… AMP βœ… βœ… Free OSS / Starter $99/mo / Pro $299/mo / Enterprise custom πŸ”—
AutoGen (Microsoft) Multi-agent conversations ⚠️ βœ… Python βœ… Azure βœ… βœ… Free (OSS) / Azure pay-per-token πŸ”—
LangGraph Stateful agent graphs βœ… LangSmith βœ… Python βœ… Cloud βœ… βœ… Free OSS / Professional $99/mo πŸ”—
OpenHands Dev-focused multi-agent βœ… Web UI βœ… βœ… Cloud βœ… βœ… Free (OSS + Cloud free tier) πŸ”—
OWL (Camel-AI) Distributed multi-agent ❌ βœ… Python ❌ βœ… βœ… Free (OSS) πŸ”—
Manus AI Cloud multi-agent βœ… Web ❌ βœ… ❌ βœ… Free 300 credits/day / $20–$200/mo ❌
n8n Workflow + AI agents βœ… Visual canvas βœ… n8n βœ… Cloud βœ… (Ollama node) βœ… Free OSS / Starter $24/mo / Pro $60/mo πŸ”—
Devin Software engineering βœ… Web ❌ βœ… ❌ ❌ Core $20/mo ($2.25/ACU) / Team $500/seat ❌
Smolagents (HuggingFace) Lightweight code agents ❌ βœ… Python ❌ βœ… ⚠️ Free (OSS) πŸ”—
Dify Visual LLM platform βœ… Web UI βœ… βœ… Cloud βœ… βœ… Free OSS / Cloud plans πŸ”—

Multi-Agent & Parallel Execution Summary

Tools supporting parallel agent orchestration (✅) vs single-agent only (❌):

Category Supports Parallel Agents Tools
Cloud Sandbox ✅ Manus AI, OpenHands, E2B Desktop Sandbox, Cua (trycua), Airtop, Skyvern Cloud, Amazon Nova Act, Perplexity Computer, OpenAI Computer Use (API)
Cloud Sandbox ❌ ChatGPT Agent, Gemini Computer Use, Devin, Convergence Proxy, Project Mariner
Local Machine ✅ Agent TARS, E2B Desktop Sandbox, Cua (trycua)
Local Machine ❌ Claude Computer Use, UI-TARS Desktop, Open Interpreter, Open-Interface, Agent S/S2, UFO, Windows-Use, Bytebot, OpenCUA, Khoj
Browser-Only ✅ Browser-use, Skyvern, Airtop, MultiOn
Browser-Only ❌ Stagehand, NanoBrowser, Openator, Open Operator
Developer Libraries ✅ Browser-use, Skyvern, Cloudflare Browser Run, Langflow, LlamaIndex, Haystack, AgentQL, ScrapeGraphAI, WebVoyager
Developer Libraries ❌ Chrome DevTools MCP, Stagehand, LaVague, Notte, Firecrawl, Playwright MCP
Multi-Agent Platforms ✅ CrewAI, AutoGen, LangGraph, OpenHands, OWL, Manus AI, n8n, Smolagents, Dify
Multi-Agent Platforms ❌ Devin

AI Infrastructure 🏗️

Tools, frameworks, and specialized models for building production AI systems — from embeddings and video generation to safety, evaluation, and model routing.

Embedding & Reranking Models 🧲

Specialized models for converting text (or images) into dense vector representations and for reranking retrieval results. Essential infrastructure for RAG pipelines and semantic search. Prices as of April 2026.

Embedding Models

Model Developer Dimensions Max Tokens Pricing Best For GitHub
text-embedding-3-small OpenAI 1,536 8,191 $0.02/1M tokens Cost-effective English embeddings β€”
text-embedding-3-large OpenAI 3,072 8,191 $0.13/1M tokens Highest-quality English retrieval β€”
Embed v4 Cohere 1,536 128K $0.12/1M (text), $0.47/1M (image) Multimodal text + image RAG β€”
voyage-3-large Voyage AI 256–2,048 (flex) 32K ~$0.18/1M tokens Highest-quality retrieval, long context β€”
jina-embeddings-v3 Jina AI 32–1,024 (flex) 8,192 API pay-per-use Multilingual, task-adaptive (LoRA heads) πŸ”—
BGE-M3 BAAI 1,024 8,192 Free (open-source) Multi-functional: dense + sparse + ColBERT πŸ”—
Nomic Embed v2 (MoE) Nomic AI 256–768 (flex) 512 Free (open-source) Multilingual, MoE efficiency (305M active) πŸ”—
text-embedding-005 Google (Vertex AI) 768 2,048 $0.10/1M tokens GCP-native semantic search β€”
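Whichever model produces the vectors, semantic search reduces to comparing them, most commonly with cosine similarity. A dependency-free sketch (the three-dimensional toy vectors below are illustrative stand-ins, not real model output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embedding-model output.
query = [0.1, 0.9, 0.2]
docs = {
    "doc_a": [0.1, 0.8, 0.3],
    "doc_b": [0.9, 0.1, 0.0],
}
# Rank documents by similarity to the query.
best = max(docs, key=lambda d: cosine_similarity(query, docs[d]))
print(best)  # doc_a
```

In a real pipeline the same comparison runs over thousands of stored vectors, usually inside a vector database; a reranker then re-scores the shortlist with a cross-encoder for higher precision.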

Reranking Models

Model Developer Max Tokens Pricing Best For GitHub
Rerank 4.0 Pro Cohere 32K $1.00/1K queries High-accuracy domain-specific reranking β€”
Rerank 4.0 Fast Cohere 32K $0.50/1K queries Low-latency production reranking β€”
rerank-2.5 Voyage AI 32K API pay-per-use Instruction-following, multilingual β€”
BGE Reranker v2-m3 BAAI 8,192 Free (open-source) Open-source cross-encoder reranking πŸ”—
Jina Reranker v2 Jina AI 8,192 API pay-per-use Multilingual, long-context reranking β€”

Video Generation Models 🎬

Text-to-video and image-to-video generation models for creating short clips from prompts. The field is moving rapidly — resolutions, durations, and pricing change frequently. Specs as of April 2026.

Model Developer Resolution Duration Pricing Open Source Best For GitHub
Sora 2 OpenAI Up to 1080p Up to 20s (Pro) $20–$200/mo via ChatGPT No Cinematic quality, long clips β€”
Veo 3 Google DeepMind 720p–1080p Up to 8s (extendable) ~$0.20–$0.40/s No Native audio + video, realistic physics β€”
Runway Gen-4 / Gen-4.5 Runway Up to 4K Up to 16s $12–$76/mo No Professional creative workflows β€”
Kling 2.0 Kuaishou 1080p Up to 10s Free / $5.99–$66/mo No Budget production, fast turnaround β€”
Pika 2.0 Pika Labs 1080p Up to 5s Free / $8–$58/mo No Social media, creative effects β€”
MiniMax Video-01 MiniMax 720p Up to 6s ~$0.40/video No Strong text-motion responsiveness β€”
HunyuanVideo Tencent 720p–2K Up to 16s Free (self-host; ~60GB VRAM) Yes (Apache 2.0) High per-frame fidelity, long clips πŸ”—
Wan 2.2 (14B) Alibaba 480p–1080p Up to 10s ~$0.10–$0.30/clip (API) Yes (Apache 2.0) Motion quality, VBench #1 benchmark πŸ”—
Mochi 1 Genmo 480p Up to 5.4s @ 30fps Free (open-source) Yes (Apache 2.0) High-quality open text-to-video πŸ”—
LTX Video Lightricks 720p Variable Free (open-source) Yes Fast generation, ComfyUI-native πŸ”—
CogVideoX Zhipu AI / Tsinghua 720p ~6s Free (open-source) Yes (Apache 2.0) Image-to-video quality, LoRA fine-tuning πŸ”—

Speech & TTS Models 🔊

Text-to-speech (TTS) and speech-to-text (STT / ASR) models for voice generation, transcription, and real-time audio. Prices as of April 2026.

Text-to-Speech (TTS)

Model Developer Languages Real-time Open Source Pricing Best For GitHub
ElevenLabs Turbo v2.5 ElevenLabs 29+ Yes No Free – $1,320/mo Best quality (4.8 MOS), instant voice cloning β€”
OpenAI TTS / TTS HD OpenAI 57 Yes No $15 / $30 per 1M chars Enterprise, seamless GPT integration β€”
Sesame CSM Sesame AI Labs English Yes Yes Free Conversational, emotionally expressive (4.7 MOS) πŸ”—
Kokoro-82M Hexgrad Multilingual Yes Yes (Apache 2.0) Free Tiny (82M params), CPU-runnable, near-commercial quality πŸ”—
Fish Audio S1 Fish Audio Multilingual Yes Yes Free / $0.016/1K chars (API) Voice cloning, multilingual fluency πŸ”—
Parler-TTS HuggingFace English No Yes (Apache 2.0) Free Style-controllable via text descriptions πŸ”—
XTTS v2 Coqui AI 17 Yes Yes (MPL 2.0) Free Best open-source multilingual, 6s voice cloning πŸ”—
Bark Suno AI 13+ No Yes (MIT) Free Expressive, non-verbal sounds, long-form audio πŸ”—

Speech-to-Text (STT / ASR)

Model Developer Languages Real-time Open Source Pricing Best For GitHub
Whisper large-v3 OpenAI 100+ No Yes (MIT) $0.006/min (API) Open-source multilingual baseline πŸ”—
GPT-4o Transcribe OpenAI 50+ Yes No $0.006/min High-accuracy managed STT β€”
Deepgram Nova-3 Deepgram 36+ Yes No $0.0043/min Ultra-low latency, production STT β€”
AssemblyAI Universal-2 AssemblyAI Multilingual Yes No $0.0025/min Accurate, feature-rich transcription β€”

AI Safety & Guardrails 🛡️

Tools and frameworks for detecting unsafe content, preventing prompt injection, validating outputs, and enforcing policy compliance in LLM-powered applications. As of April 2026.

Tool Developer Type Open Source Pricing Best For GitHub
Llama Guard 3 Meta Safety classifier (8B LLM) Yes (Meta license) Free / ~$0.02/1M tokens (API) Input/output safety classification, 8 languages πŸ”—
NeMo Guardrails NVIDIA Programmable guardrail toolkit (Colang DSL) Yes (Apache 2.0) Free Dialog safety, policy enforcement, LangChain-native πŸ”—
OpenAI Privacy Filter OpenAI PII detection & redaction Yes (Apache 2.0) Free (OSS) Detects & redacts personal info in text πŸ”—
Guardrails AI Guardrails AI Python validator framework Yes Free (OSS) Output validation, PII detection, hallucination guards πŸ”—
Amazon Bedrock Guardrails AWS Managed safety layer No Pay-per-use (AWS) AWS-native, zero-ops compliance and content filtering β€”
ShieldGemma 2 Google Safety classifier (open weights) Yes (open weights) Free Text safety (2B/9B/27B), image safety (4B) β€”
Rebuff Protect AI Prompt injection detector Yes Free Self-hardening anti-injection using vector memory πŸ”—
Lakera Guard Lakera Managed LLM security API No Free tier + Enterprise Runtime LLM security, <50ms latency, PII + injection β€”
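Most guardrail stacks layer model-based classifiers on top of simple pattern rules. A minimal sketch of the pattern-rule layer for PII redaction (toy regexes covering only emails and US-style phone numbers; this illustrates the technique, not any listed tool's API):

```python
import re

# Deliberately narrow toy patterns; production tools cover many more PII types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace matched PII spans with [TYPE] placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Reach me at jane.doe@example.com or 555-867-5309."))
# Reach me at [EMAIL] or [PHONE].
```

Regex rules are fast and deterministic but brittle; that is why tools like Llama Guard and ShieldGemma add an LLM classifier for context-dependent categories such as toxicity or jailbreak attempts.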

RAG Frameworks 🗂️

Frameworks and libraries for building Retrieval-Augmented Generation (RAG) pipelines — connecting LLMs to external knowledge sources. As of April 2026.

Framework Developer Language Key Features Open Source GitHub
LlamaIndex LlamaIndex Python 160+ data connectors, hybrid search, multi-agent support Yes (MIT) πŸ”—
LangChain LangChain AI Python / JS Chains, agents, memory, 50K+ integrations, LangGraph Yes (MIT) πŸ”—
RAGFlow InfiniFlow Python Visual workflow builder, deep document parsing (PDF/tables) Yes (Apache 2.0) πŸ”—
Haystack deepset Python Modular pipelines, enterprise-grade, built-in monitoring Yes (Apache 2.0) πŸ”—
Verba Weaviate Python No-code UI, Weaviate-native vector search Yes πŸ”—
Mem0 Mem0 AI Python / JS Persistent memory layer, graph memory, session recall Yes (Apache 2.0) πŸ”—
txtai NeuML Python All-in-one semantic search + workflow automation Yes (Apache 2.0) πŸ”—
R2R SciPhi Python Lightweight, low-latency, REST API, production-first Yes (MIT) πŸ”—
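Whatever the framework, the core RAG loop is the same: score documents against the query, keep the top-k, and prepend them to the prompt. A dependency-free sketch using word overlap in place of real embeddings (an illustration of the loop, not a production retriever):

```python
def score(query: str, doc: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_rag_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Retrieve the top-k documents and pack them into an LLM prompt."""
    top = sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]
    context = "\n".join(f"- {d}" for d in top)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The warranty covers parts for two years.",
    "Shipping takes three to five business days.",
    "Returns are accepted within thirty days.",
]
print(build_rag_prompt("How long does shipping take?", docs, k=1))
```

The frameworks above replace the toy `score` with embedding similarity (see the embedding models section) and add chunking, caching, reranking, and evaluation around this loop.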

Fine-tuning Platforms ⚙️

Tools and platforms for adapting pre-trained LLMs to specific tasks or domains via supervised fine-tuning, RLHF, LoRA/QLoRA, and related methods. Prices as of April 2026.

Platform Type Supported Models Pricing Best For GitHub
Unsloth OSS library Llama, Mistral, Gemma, Qwen, Phi, + more Free 2–5× faster training, 80% VRAM reduction via custom kernels πŸ”—
Axolotl OSS framework Most Hugging Face models Free Config-as-code (YAML), reproducibility, multi-GPU training πŸ”—
OpenAI Fine-tuning Managed API GPT-4o, GPT-4o-mini, GPT-3.5 Turbo GPT-4o-mini: $0.30/1M training tokens Managed, no infra, direct production deployment β€”
Google Vertex AI Managed cloud Gemini 2.5 Pro/Flash, Gemma 3 Gemini 2.5 Pro: $25/1M training tokens GCP-native, Gemini model access β€”
Predibase / LoRAX Cloud + OSS server Llama, Mistral, 50+ HF models Free tier + per-GPU pricing Multi-adapter serving: many LoRA adapters on one GPU πŸ”—
PEFT Hugging Face All Hugging Face models Free LoRA, QLoRA, prefix tuning, prompt tuning — full HF ecosystem πŸ”—
LLaMA-Factory Community 100+ models Free Web UI, low-code interface, beginner-friendly fine-tuning πŸ”—
torchtune PyTorch Llama, Gemma, Mistral, Phi Free PyTorch-native, composable training recipes πŸ”—

Evaluation & Observability 📊

Tools for tracing LLM calls, evaluating output quality, debugging RAG pipelines, and monitoring production AI systems. Prices as of April 2026.

Tool Developer Type Open Source Pricing Best For GitHub
LangSmith LangChain AI Tracing + evaluation platform No (enterprise self-host) Free (5K traces/mo), paid plans LangChain apps, chain + agent debugging β€”
Braintrust Braintrust Data Eval-first platform Partial (AI proxy OSS) Free (1M spans), enterprise CI/CD evals, dataset management, LLM-as-judge β€”
Helicone Helicone Proxy-based observability Yes Free tier, usage-based Cost tracking, request caching, drop-in API proxy πŸ”—
Arize Phoenix Arize AI OSS tracing + evaluation Yes Free (OSS); Arize Cloud paid RAG debugging, LLM-as-judge, local dev πŸ”—
Langfuse Langfuse Tracing + evaluation Yes (MIT) Free / self-host; cloud paid Open-source, 19K+ GitHub stars, OpenTelemetry πŸ”—
Ragas Ragas RAG evaluation framework Yes Free RAG-specific metrics: faithfulness, recall, precision πŸ”—
DeepEval Confident AI LLM evaluation framework Yes Free (OSS); cloud paid 14+ built-in metrics, pytest-style eval runner πŸ”—

MCP Ecosystem 🔌

The Model Context Protocol (MCP) is an open standard by Anthropic for connecting LLMs to external tools and data sources via a unified JSON-RPC 2.0 interface. It supports STDIO and Streamable HTTP transports. The community directory mcp.so lists 2,000+ servers.

MCP Clients: Claude Desktop, Claude Code, Cursor, Windsurf, VS Code (Copilot), Continue.dev, Zed, LibreChat, and more.
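Under the hood, every MCP interaction is a JSON-RPC 2.0 message. A representative request a client might send to enumerate a server's tools (message shape follows the MCP specification; the `id` value is arbitrary):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list",
  "params": {}
}
```

The server replies with a `result.tools` array describing each tool's name, description, and input schema, which the client then exposes to the model.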

Popular MCP Servers

Tool / Server Developer Category Open Source Best For GitHub
MCP Filesystem Anthropic / Community File I/O Yes (MIT) Read/write local files from any MCP client πŸ”—
MCP GitHub GitHub / Anthropic Code & DevOps Yes Repo management, issues, PRs, code search πŸ”—
MCP Slack Community Messaging Yes Slack workspace read/write interaction πŸ”—
MCP PostgreSQL Community Database Yes Read-only SQL queries against Postgres πŸ”—
MCP Google Drive Community Storage Yes Drive file access and search πŸ”—
MCP Docker Community DevOps Yes Container management and inspection πŸ”—
MCP Brave Search Brave Search Yes Web + local search via Brave API πŸ”—
MCP AWS AWS Labs Cloud Yes (Apache 2.0) AWS service integration πŸ”—
MCP Notion Community Productivity Yes Notion page and database access πŸ”—
FastMCP Community Framework Yes Python framework for building MCP servers fast πŸ”—
Context7 Upstash Dev Tools Yes Up-to-date library docs for AI coding assistants πŸ”—

Agent Skills & Registries 🎯

Modular capability packages that extend AI agents with specialized knowledge, workflows, and procedural instructions β€” without bloating model context.

skills.sh

skills.sh is the primary registry and package manager for Agent Skills — an open standard developed by Anthropic for packaging and distributing reusable agent capabilities. Skills follow a progressive disclosure pattern: agents load only a skill's name and description at startup, then pull full instructions only when a task matches, keeping context overhead minimal.

Install a skill in one command:

npx skills add owner/repo
Feature Detail
Standard Agent Skills (open, SKILL.md format) — developed by Anthropic, hosted on GitHub
Registry URL skills.sh
Total installs 90,989+ all-time
Compatible agents Claude Code, Cursor, Windsurf, VS Code Copilot, Continue.dev, Zed, and any MCP-compatible agent
License Open (skills are author-licensed; spec is open standard)
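A skill is essentially a directory with a SKILL.md file whose YAML frontmatter holds the name and description the agent loads up-front. A minimal hypothetical example (the skill name, body, and steps are invented for illustration; see the Agent Skills spec for exact frontmatter fields):

```markdown
---
name: changelog-writer
description: Drafts CHANGELOG entries from recent git commits in Keep a Changelog style.
---

# Changelog Writer

When the user asks for a changelog entry:

1. Run `git log --oneline` since the last tag.
2. Group commits into Added / Changed / Fixed.
3. Draft the entry and ask for confirmation before writing to CHANGELOG.md.
```

Only the frontmatter counts against startup context; the body is pulled in when a matching task arrives.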

Top Skills by Category

Skill Publisher Category Installs
find-skills vercel-labs/skills Discovery 1.3M
vercel-react-best-practices vercel-labs/agent-skills Frontend 366K
frontend-design anthropics/skills Design 361K
web-design-guidelines vercel-labs/agent-skills Design 291K
microsoft-foundry microsoft/azure-skills Cloud/Azure 286K
azure-ai microsoft/azure-skills AI/Cloud 276K
agent-browser vercel-labs/agent-browser Browser 229K
skill-creator anthropics/skills Meta 180K
browser-use browser-use/browser-use Automation 71.6K
systematic-debugging obra/superpowers Dev 78.5K
test-driven-development obra/superpowers Dev 68.0K
seo-audit coreyhaines31/marketingskills Marketing 95.4K
supabase-postgres-best-practices supabase/agent-skills Database 138K
playwright-best-practices currents-dev/playwright Testing 34.2K

Notable Publisher Ecosystems

Publisher Skills Count Focus
microsoft/azure-skills 19+ Azure cloud, AI, Kubernetes, cost optimization
vercel-labs/agent-skills 15+ React, Next.js, Tailwind, deployment
anthropics/skills 15+ Design, docs, coding, web artifacts
coreyhaines31/marketingskills 20+ SEO, marketing, content, analytics
obra/superpowers 12+ Dev workflows, parallel agents, TDD
firebase/agent-skills 10+ Firebase, Firestore, GenKit
larksuite/cli 13+ Lark workspace automation
pbakaus/impeccable 10+ Design polish, code quality

Model Routers & Load Balancers 🔀

Tools for routing LLM requests across multiple providers, models, and deployments — optimizing for cost, latency, quality, or reliability. Prices as of April 2026.

Tool Developer Key Features Open Source Pricing GitHub
LiteLLM BerriAI 100+ provider support, proxy server, load balancing, fallbacks, spend tracking Yes (MIT) Free (OSS) / $99/mo cloud πŸ”—
Portkey Portkey 250+ LLMs, AI gateway, guardrails, observability, virtual keys Yes (Apache 2.0) Free tier / $49/mo+ πŸ”—
OpenRouter OpenRouter 200+ model catalog, unified API, pay-per-use credit system No ~5% markup on provider cost β€”
RouteLLM LMSys Open-source router (strong vs. weak model) using classifier or matrix factorization Yes Free πŸ”—
Not Diamond Not Diamond Pre-trained + custom task-specific routers, cost/quality tradeoff No Free tier + enterprise β€”
Unify AI Unify Quality / cost / latency-aware routing across 100+ model deployments No Usage-based β€”
Semantic Router Aurelio AI Embedding-based semantic intent routing for agents and pipelines Yes Free πŸ”—
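All of these tools implement some variant of the same pattern: try a preferred model, fall back on failure, and record what happened. A provider-agnostic sketch of that fallback logic (the stub functions are placeholders, not any specific SDK):

```python
from typing import Callable

def route_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (provider_name, response)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real routers match specific error/timeout types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stubs standing in for real SDK calls.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("upstream timeout")

def cheap_fallback(prompt: str) -> str:
    return f"echo: {prompt}"

provider, answer = route_with_fallback(
    "hello", [("primary", flaky_primary), ("fallback", cheap_fallback)]
)
print(provider, answer)  # fallback echo: hello
```

Production routers like LiteLLM and Portkey add the hard parts on top of this loop: rate-limit-aware retries, per-key spend tracking, and health-based load balancing across deployments.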

Small Language Models (SLMs) 📱

Compact models designed for on-device inference, edge deployment, low-latency APIs, and resource-constrained environments. Generally defined as models under ~15B parameters. Specs as of April 2026.

Model Developer Params Context License Best For
Phi-4 Microsoft 14B 16K MIT Reasoning, math, code — STEM benchmark leader at class size
Phi-4-mini Microsoft 3.8B 128K MIT On-device STEM reasoning with long context
Phi-4-multimodal Microsoft 5.6B 128K MIT Vision + audio + text multimodal, edge deployment
Gemma 3 27B Google 27B 128K Apache 2.0 Top open model, multilingual (140+ languages)
Gemma 3 4B Google 4B 128K Apache 2.0 CPU inference, 140+ languages, mobile-friendly
Gemma 3 1B Google 1B 32K Apache 2.0 On-device, embedded, ultra-lightweight
SmolLM3 Hugging Face 3B 128K Apache 2.0 Efficient, tool use, multilingual, reasoning
Qwen2.5 3B Alibaba 3B 128K Apache 2.0 Asian and multilingual tasks, coding
Qwen2.5 7B Alibaba 7B 128K Apache 2.0 Strong multilingual baseline, function calling
Llama 3.2 3B Meta 3B 128K Llama 3.2 license General-purpose, on-device, Meta ecosystem
Llama 3.2 1B Meta 1B 128K Llama 3.2 license Lightweight edge inference, distillation target
Granite 3.3 8B IBM 8B 128K Apache 2.0 Enterprise tasks, tool use, business-domain
MiniCPM 3.0 ModelBest / Tsinghua 4B 32K Apache 2.0 Compact yet capable, mobile and edge
Danube 3 500M H2O.ai 500M 8K Apache 2.0 Ultra-lightweight on-device, IoT


Guides 📚

Tutorials, how-tos, and in-depth guides for getting the most out of AI models and tools.

Getting Started 🚀

A beginner-friendly introduction to AI models and how to start using them effectively.

Understanding LLMs

Concept Description
Parameters Model size in billions (B); more parameters generally means more capable
Context Window How much text the model can process at once (128K is standard)
Tokens Basic units of text (~0.75 words per token)
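The ~0.75 words-per-token rule of thumb makes quick capacity math easy. A rough estimator (heuristic only; real tokenizers such as OpenAI's tiktoken give exact counts):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~0.75 words per token => tokens ~= words / 0.75."""
    words = len(text.split())
    return round(words / 0.75)

# Will a 90,000-word document fit in a 128K-token context window?
doc_tokens = estimate_tokens("word " * 90_000)
print(doc_tokens, doc_tokens <= 128_000)  # 120000 True
```

The ratio varies by language and content type (code tokenizes denser than prose), so treat this as a sizing check, not a billing calculation.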

Accessing AI Models

Method Best For Setup Difficulty
Web Interfaces Quick experiments Easiest
API Access Building applications Easy
Self-Hosting Privacy, no API costs Medium-Hard
IDE Integration Daily coding Easy

Model Recommendations by Task

Task Free Option Premium Option
Chat Llama 4 (self-hosted) GPT-5.4, Claude Opus 4.6
Coding DeepSeek-Coder-V2 Claude Opus 4.6
Reasoning DeepSeek-R1 Gemini 3 Deep Think, o3
Long docs Llama 4 Scout Gemini 3 Flash
Vision Llama 4 Maverick GPT-5.4, Gemini 3 Pro

Free Models & APIs for Vibe Coding 💻

Vibe coding — describing what you want in natural language and letting AI generate the code — has exploded in 2026. The ecosystem splits into two tracks: free AI APIs you plug into your own editor/agent, and free vibe coding IDEs/platforms that bundle everything together.

Free AI APIs for Coding

These are the raw API endpoints you can use in tools like Cursor (BYOK), Cline, or any agent framework.

Provider Free Models Daily Limit Best For
Google Gemini API Gemini 2.5 Pro (100 req/day), Gemini 2.5 Flash (250 req/day), Gemini 2.5 Flash-Lite (1,000 req/day) Per-project limits Prototyping, large context (1M tokens), multimodal
Groq Cloud Llama 4 Scout, DeepSeek R1, Qwen3, GPT-OSS ~1,000-14,400 req/day Fast iteration, agentic workflows
OpenRouter 28+ free models including Qwen3 Coder 480B, Devstral 2, MiMo-V2-Flash, DeepSeek R1, GPT-OSS 120B, Llama 3.3 70B Varies by model Experimenting with many models
Cerebras Llama 3.3 70B, Qwen3 32B/235B, GPT-OSS 120B 1M tokens/day Batch tasks, raw speed (20× faster than GPUs)
Mistral AI Codestral-2508, Devstral, Mistral Large, Pixtral 1B tokens/month Code completion, FIM tasks
NVIDIA NIM 91 free endpoints including Chinese models Varies Production inference on DGX Cloud
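Most of these free endpoints are OpenAI-compatible, so one request shape works across them. A sketch that builds the JSON body for a chat completion (the model slug is illustrative; check each provider's catalog, and note you still need an API key when you actually send the request):

```python
import json

def build_chat_payload(model: str, prompt: str, temperature: float = 0.2) -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

# POST this to e.g. https://openrouter.ai/api/v1/chat/completions
# with an "Authorization: Bearer <key>" header.
payload = build_chat_payload("deepseek/deepseek-r1:free", "Write FizzBuzz in Go.")
print(json.dumps(payload, indent=2))
```

Swapping providers usually means changing only the base URL, the key, and the model slug — which is exactly what makes BYOK setups in Cursor or Cline practical.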

Free Vibe Coding IDEs & Platforms

Tool Type Key Features Best For
Cursor AI IDE Agent Mode, Composer 2, multi-agent workspace Professional development
Cline VS Code Extension Open-source, BYOK/Ollama, MCP tools Self-hosted, unlimited local LLM
Windsurf AI IDE Cascade agent, live browser preview IDE with browser integration
OpenHands Docker Agent Self-hosted, local LLM support, full SDLC Unlimited local development
bolt.diy Browser IDE 19+ LLM providers, Ollama, full-stack apps Free web app building
Open Interpreter CLI Natural language → code, local LLM Simple local automation

Chrome DevTools MCP - Game Changer for Web Dev

Google's Chrome DevTools MCP connects AI agents directly to Chrome for debugging, profiling, and automation:

  • 29 tools across 6 categories (input, navigation, emulation, performance, network, debugging)
  • Run Lighthouse audits, capture performance traces, inspect network requests
  • Works with Claude Code, Cursor, Copilot via MCP
  • Supports BYOK/local LLMs through MCP clients
  • GitHub | 37,783+ stars
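Wiring it into an MCP client is typically a one-entry config. A hypothetical client configuration (the package name `chrome-devtools-mcp` and the exact config keys vary by client; check the repo's README for your editor):

```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["chrome-devtools-mcp@latest"]
    }
  }
}
```

After a restart, the client lists the DevTools tools and the agent can open pages, capture traces, and inspect network requests on demand.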

Cloudflare Browser Run

Managed browser infrastructure for AI agents:

  • Chrome DevTools Protocol (CDP) direct endpoint
  • MCP client support (Claude, Cursor, OpenCode)
  • Session recordings, Live View, WebMCP
  • Free Workers plan / $5+/mo paid
  • Browser Run

Recommendations by Use Case

Free + Local LLM: Cline + Ollama, OpenHands + Qwen3 Coder, bolt.diy + Ollama

Fast API Iteration: Groq (speed) + Cerebras (high limits)

Web Development: Chrome DevTools MCP + Cline (zero-cost debugging)

Many Models: OpenRouter (unified API)

Production Inference: NVIDIA NIM, Cerebras

💡 Pro Tip: Combine Chrome DevTools MCP with a local LLM (Ollama) via Cline for completely free, unlimited AI-powered web development and debugging.

Self-Hosting 🏠

A comprehensive guide to running AI models on your own hardware.

Benefits

Benefit Description
Privacy Data never leaves your infrastructure
Cost Control No per-token API costs for unlimited usage
Customization Fine-tune models for specific needs
No Rate Limits Process as much as hardware allows
Offline Access Work without internet

Quick Start with Ollama

For installation and usage instructions, refer to the official Ollama documentation.

Local GPU Quick Guide

Recommended apps (local-first):

  • Ollama - Simple local runtime with a local HTTP API
  • LM Studio - Desktop UI for downloading and running models locally
  • llama.cpp - Fast local inference (CPU/GPU), great for quantized models
  • Open WebUI - Optional local web UI (pairs well with local runtimes)

If you want “server-style” hosting (advanced):

  • vLLM - High-throughput serving for NVIDIA GPUs
  • SGLang - Structured generation and serving workflows

Practical setup tips:

  1. Install the latest NVIDIA drivers (enable GPU acceleration in your chosen app)
  2. Start with smaller quantized models (Q4 is a common “best default”)
  3. Keep context windows realistic for local hardware (lower context = faster, less memory)
  4. Watch VRAM first, then system RAM; reduce model size or quantization if either saturates
  5. Prefer running locally on localhost and only expose to LAN if you understand firewall rules
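A back-of-the-envelope way to apply tip 4: model weights take roughly (parameters × bits-per-weight) / 8 bytes, plus runtime overhead. A rough calculator (heuristic only; actual usage varies with KV cache size, context length, runtime, and quantization scheme — the 1.2 overhead factor is an assumption):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int,
                     overhead: float = 1.2) -> float:
    """Very rough VRAM estimate: weight bytes times an overhead factor, in GB."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return round(weight_bytes * overhead / 1e9, 1)

# A 14B model at 4-bit (Q4) quantization:
print(estimate_vram_gb(14, 4))   # 8.4 -> fits a 24 GB consumer GPU
# The same model unquantized at 16-bit:
print(estimate_vram_gb(14, 16))  # 33.6 -> needs a pro GPU
```

This matches the hardware table below: Q4-quantized 7B–14B models sit comfortably in 24 GB of VRAM, while 70B-class models need pro or multi-GPU setups even when quantized.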

Example hardware configurations:

| Hardware | Good starting point | Notes |
| --- | --- | --- |
| Consumer GPU (24 GB VRAM) | 7B–14B quantized | e.g., RTX 4090, RTX 3090 — great for chat/coding |
| Pro GPU (48–80 GB VRAM) | 14B–70B quantized | e.g., A6000, A100 — coding agents, longer contexts |
| Multi-GPU (160+ GB VRAM) | 70B+ quantized | e.g., 2×A100 — larger open-source models |
| CPU-only (32–64 GB RAM) | 7B–14B quantized | Slower but viable for offline chat; keep context moderate |

Deployment Options

| Option | Best For | Pros | Cons |
| --- | --- | --- | --- |
| Local Machine | Personal use | Simple, no latency | Limited hardware |
| Dedicated Server | Team use | Full control | Maintenance |
| Cloud GPU Rental | Experimentation | On-demand | Hourly costs |
| Kubernetes | Enterprise | Scalable | Complex |

Cost Analysis 💰

Comprehensive pricing comparisons and cost calculations.

Pricing Tiers

| Tier | Price Range | Models |
| --- | --- | --- |
| 🆓 Free | $0 | Self-hosted, free tiers |
| 💸 Budget | $0.025 - $0.50/1M | Gemini 3.1 Flash-Lite, GLM-4.7-FlashX, GPT-5.4 nano, Grok 4 Fast |
| 💰 Mid-range | $0.60 - $15.00/1M | GPT-5.4 mini, Claude Haiku 4.5, Kimi K2.5, Sonar, GLM-5, GPT-5.4, Claude Sonnet |
| 💎 Premium | $15.00 - $600.00/1M | GPT-5.4 Pro, Claude Opus, o1-Pro |

Subscription Pricing (Monthly, USD)

AI chat apps

| Product | Plans (USD) | Notes | Official Source |
| --- | --- | --- | --- |
| ChatGPT | Go $8, Plus $20, Pro $200, Business $25/seat (annual) or $30/seat (monthly), Enterprise (contact sales) | Consumer prices are US-listed; Go is localized in some markets | 🔗 |
| Claude | Pro $20, Max $100 (5×) or $200 (20×), Team/Enterprise (see pricing) | Prices shown exclude applicable taxes; availability varies by region | 🔗 |
| Google AI (Gemini) | Plus $7.99, Pro $19.99, Ultra $249.99 | US pricing; some regions/local pricing differ | 🔗 |

Coding assistants

| Tool | Plans (USD) | Notes | Official Source |
| --- | --- | --- | --- |
| GitHub Copilot | Free $0, Pro $10, Pro+ $39, Business $19/user, Enterprise $39/user | Annual options available for Pro/Pro+ | 🔗 |

Model Pricing Comparison

| Model | Input ($/1M) | Output ($/1M) | Cached Input | Best For |
| --- | --- | --- | --- | --- |
| GLM-4.7-FlashX | $0.07 | $0.40 | — | Fast budget tasks |
| Step-3.5-Flash | $0.10 | $0.30 | — | Ultra-fast reasoning (85–350 tok/s) |
| GLM-4-32B-0414-128K | $0.10 | $0.10 | — | Budget chat/coding |
| Llama 4 Maverick | $0.15 | $0.60 | — | Open multimodal (self-host: $0) |
| GPT-5.4 nano | $0.20 | $1.25 | $0.02 | Classification and lightweight subagents |
| Grok 4 Fast | $0.20 | $0.50 | $0.05 | Fast Grok reasoning |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | $0.025 | Budget multimodal, fastest Google model |
| DeepSeek-V3.1 | $0.27 | $0.41 | — | General-purpose value across tasks |
| DeepSeek-V3.2 | $0.28 | $0.42 | $0.028 | Budget workhorse, reasoning |
| DeepSeek-V4 | $0.30 | $0.50 | $0.03 | Engram memory, coding (off-peak 50% off) |
| Gemini 3 Flash | $0.30 | $2.50 | $0.05 + $1/hr | Long context |
| MiniMax-M2.5 | $0.30 | $1.20 | Auto (included) | Coding, long context |
| Mistral Large 3 | $0.50 | $1.50 | — | Strong open-source frontier model |
| Kimi K2.5 | $0.60 | $3.00 | Auto (included) | Multimodal + agent tasks |
| GPT-5.4 mini | $0.75 | $4.50 | $0.075 | Fast coding and multimodal tasks |
| Claude Haiku 4.5 | $1.00 | $5.00 | — | Low-latency coding and sub-agents |
| GLM-5 | $1.00 | $3.20 | $0.20 | Agentic engineering |
| Perplexity Sonar | $1.00 | $1.00 | — | Web-grounded chat (request fees apply) |
| GPT-5.3-Codex | $1.75 | $14.00 | $0.175 | Agentic coding, 7+ hour autonomy |
| Gemini 3.1 Pro | $2.00 | $12.00 | $0.20–$0.40 + $4.50/hr | Frontier reasoning |
| Perplexity Sonar Reasoning Pro | $2.00 | $8.00 | — | Reasoning + search (request fees apply) |
| GPT-5.4 | $2.50 | $15.00 | $0.25 | Frontier coding and professional work |
| Grok 4 | $3.00 | $15.00 | $0.75 | First-principles reasoning |
| Perplexity Sonar Pro | $3.00 | $15.00 | — | Higher quality + search (request fees apply) |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $0.30 (hit) | Best coding |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.30 (hit) | Near-Opus performance |
| Claude Opus 4.6 | $5.00 | $25.00 | $0.50 (hit) | Agentic coding |
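Per-1M-token prices translate into per-request cost as follows. The prices in the sketch are taken from the table above (GPT-5.4: $2.50 input / $15.00 output / $0.25 cached input); the token counts are made-up examples.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float,
                 cached_tokens: int = 0, cached_price: float = 0.0) -> float:
    """USD cost of one request, with prices quoted per 1M tokens.

    `cached_tokens` is the portion of the input billed at the cached rate.
    """
    fresh = input_tokens - cached_tokens
    usd = (fresh * in_price + cached_tokens * cached_price
           + output_tokens * out_price) / 1_000_000
    return round(usd, 6)

# GPT-5.4 prices from the table: $2.50 in, $15.00 out, $0.25 cached input
print(request_cost(10_000, 2_000, 2.50, 15.00))               # 0.055
print(request_cost(10_000, 2_000, 2.50, 15.00, 8_000, 0.25))  # 0.037
```

Note how an 80% cache-hit rate on the input cuts this example request's cost by about a third, which is why cached-input pricing matters for agentic loops that resend the same context.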

Self-Hosting vs API (Monthly)

| Usage Level | Self-Host (A100) | API (GPT-5) | Winner |
| --- | --- | --- | --- |
| Light (1M tokens) | $300 (rental) | $10 | API |
| Medium (100M tokens) | $300 | $1,000 | Self-host |
| Heavy (1B tokens) | $300 | $10,000 | Self-host |
| Enterprise (10B+ tokens) | $2,000 (owned) | $100,000+ | Self-host |
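The break-even point behind this table can be computed directly. Assuming a flat monthly hosting cost (the $300/month A100 rental above) and a blended API price per 1M tokens (the table's GPT-5-class figure of $10/1M), self-hosting wins once monthly volume passes `hosting_cost / price_per_million`:

```python
def breakeven_tokens_millions(monthly_hosting_usd: float,
                              api_price_per_million: float) -> float:
    """Monthly token volume (millions) at which self-hosting equals API cost."""
    return monthly_hosting_usd / api_price_per_million

# $300/month A100 rental vs. an API billed at a blended $10 per 1M tokens
print(breakeven_tokens_millions(300, 10))  # 30.0
```

So under these assumptions the crossover sits around 30M tokens/month, consistent with the table: API wins at 1M, self-hosting wins at 100M.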

Reference 📖

Reference materials including glossary, comparison tables, and data sources.

Glossary 📖

Definitions of common terms used throughout the documentation.

A-E

| Term | Definition |
| --- | --- |
| Agent | AI system that autonomously performs tasks and interacts with environments |
| API | Interface for programmatically accessing AI models |
| Attention Mechanism | Neural network component focusing on relevant input parts |
| Benchmark | Standardized test measuring model performance |
| Chain-of-Thought (CoT) | Prompting technique showing step-by-step reasoning |

F-L

| Term | Definition |
| --- | --- |
| Fine-Tuning | Adapting a pre-trained model to specific tasks |
| Frontier Model | State-of-the-art proprietary model |
| GPU | Hardware accelerator essential for ML |
| LLM | Large Language Model |
| LoRA | Low-Rank Adaptation, an efficient fine-tuning method |

M-R

| Term | Definition |
| --- | --- |
| MCP | Model Context Protocol for tool interaction |
| MMLU | Massive Multitask Language Understanding benchmark |
| MoE | Mixture of Experts architecture |
| Multimodal | Processing multiple input types |
| RAG | Retrieval-Augmented Generation |

S-Z

| Term | Definition |
| --- | --- |
| Self-Hosting | Running models on your own infrastructure |
| SLM | Small Language Model |
| SWE-bench | Benchmark for real GitHub issue resolution |
| Token | Basic unit of text processing |
| VRAM | GPU memory for model storage |

Comparison Tables 📊

Side-by-side comparisons of AI models sorted by various criteria.

Sort by Latest Update (Default)

| 🏢 Company | 🤖 Model | 📦 Version | 📅 Release Date | 🔄 Latest Updated | 💻 Coding | 📊 Benchmarks | 💰 Price | 🖥️ Self-Host | 🔗 Official Site |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 🤖 OpenAI | GPT-5 | 5.4 mini | 2026-03-17 00:00 UTC | 2026-03-17 00:00 UTC | ⭐ ✅ | GPQA 87.5% | $0.75 / $4.50 | ❌ | 🔗 |
| 🤖 OpenAI | GPT-5 | 5.4 | 2026-03-05 00:00 UTC | 2026-03-05 00:00 UTC | ⭐ ✅ | GPQA 92.0%, SWE-bench ~80% | $2.50 / $15.00 | ❌ | 🔗 |
| 🌐 Google DeepMind | Gemini | 3.1 Flash-Lite | 2026-03-03 00:00 UTC | 2026-03-03 00:00 UTC | ⭐ ✅ | — | $0.25 / $1.50 | ❌ | 🔗 |
| 🔬 DeepSeek | DeepSeek | V4 | 2026-02-17 00:00 UTC | 2026-02-17 00:00 UTC | ✅ | No public benchmarks | $0.30 / $0.50 | ✅ | 🔗 |
| 🌐 Google DeepMind | Gemini | 3 Deep Think | 2026-02-12 00:00 UTC | 2026-02-12 00:00 UTC | ⭐ ✅ | GPQA ~97%, ARC-AGI-2 84.6%, HLE 48.4% | Ultra subscription | ❌ | 🔗 |
| 🇨🇳 Zhipu AI | GLM | 5 | 2026-02-12 00:00 UTC | 2026-02-12 00:00 UTC | ⭐ ✅ | GPQA 82.0%, SWE-bench 77.8% | $1.00 / $3.20 | ✅ | 🔗 |
| 🤖 Anthropic | Claude | Opus 4.6 | 2026-02-05 00:00 UTC | 2026-02-05 00:00 UTC | ⭐ ✅ | GPQA 91.3%, SWE-bench 80.8% | $5 / $25 | ❌ | 🔗 |
| 🤖 OpenAI | GPT-5 | 5.3-Codex | 2026-02-05 00:00 UTC | 2026-02-05 00:00 UTC | ⭐ ✅ | GPQA 91.5%, SWE-bench Pro 56.8% | $1.75 / $14.00 | ❌ | 🔗 |
| 🌙 Moonshot AI | Kimi | K2.5 | 2026-01-29 00:00 UTC | 2026-02-02 00:00 UTC | ⭐ ✅ | GPQA 87.6%, SWE-bench 76.8% | $0.60 / $3.00 | ❌ | 🔗 |

Release Windows (Month-level)

| 🏢 Company | 🤖 Model | 📅 Release Window | Notes | 🔗 Official Site |
| --- | --- | --- | --- | --- |
| 🧠 MiniMax | MiniMax M2.5 | 2026-02 | $0.30 / $1.20 | 🔗 |
| 🇨🇳 Alibaba/Qwen | Qwen 3.5-Max | 2026-02 | Open-source release window | 🔗 |
| 🌐 Google DeepMind | Gemini 3.1 Flash-Lite | 2026-02 | Budget Gemini model | 🔗 |
| 🌐 Google DeepMind | Gemini 3 Pro | 2026-01 | Tiered pricing | 🔗 |
| 🤖 OpenAI | GPT-5.4 family | 2026-03 | GPT-5.4, GPT-5.4 mini, GPT-5.4 nano | 🔗 |
| 🇫🇷 Mistral AI | Mistral Large 3 | 2025-11 | Apache 2.0 open-source, 123B params | 🔗 |

Sort by Price (Cheapest)

| Rank | Model | Input | Output | License / Access |
| --- | --- | --- | --- | --- |
| 1 | Self-hosted | $0 | $0 | Various |
| 2 | GLM-4.7-Flash | $0 | $0 | Free |
| 3 | GLM-4.7-FlashX | $0.07 | $0.40 | API |
| 4 | GLM-4-32B-0414-128K | $0.10 | $0.10 | API |
| 5 | Yi-Lightning | $0.14 | $0.42 | Apache 2.0 |
| 6 | GPT-5.4 nano | $0.20 | $1.25 | Proprietary |
| 7 | Gemini 3.1 Flash-Lite | $0.25 | $1.50 | Proprietary |
| 8 | DeepSeek-V3.1 | $0.27 | $0.41 | MIT |
| 9 | Gemini 3 Flash | $0.30 | $2.50 | Proprietary |
| 10 | MiniMax-M2.5 | $0.30 | $1.20 | Proprietary |

Sort by Performance (Coding)

| Rank | Model | HumanEval | Self-Host |
| --- | --- | --- | --- |
| 1 | Claude Sonnet 4.5 | ~92% | ❌ |
| 2 | GPT-OSS-120B | ~89% | ✅ |
| 3 | DeepSeek-Coder-V2 | ~92% | ✅ |
| 4 | Qwen3-Coder | ~92% | ✅ |
| 5 | DeepSeek-V3.1 | 82%+ | ✅ |

Sort by Context Window

| Rank | Model | Context | Best For |
| --- | --- | --- | --- |
| 1 | Gemini 3 Flash | 10M | Entire libraries |
| 2 | Llama 4 Scout | 10M | Long-document RAG |
| 3 | Gemini 3 Pro | 1M+ | Research papers |
| 4 | Kimi K2.5 | 256K | Large codebases |

Data Sources 📚

Attribution, verification sources, and methodology.

Primary Sources

| Company | Source | URL |
| --- | --- | --- |
| OpenAI | Official Documentation | openai.com |
| OpenAI | ChatGPT agent release notes | help.openai.com |
| OpenAI | Model release notes | help.openai.com |
| OpenAI | API pricing | platform.openai.com |
| OpenAI | March 2026 model news | openai.com |
| OpenAI | ChatGPT subscriptions (Go/Plus/Pro) | openai.com |
| OpenAI | ChatGPT Business pricing | help.openai.com |
| Anthropic | Claude Documentation | anthropic.com |
| Anthropic | Claude Haiku 4.5 announcement | anthropic.com |
| Anthropic | Claude Pro pricing | anthropic.com |
| Anthropic | Max plan pricing | anthropic.com |
| Google | Gemini Documentation | deepmind.google |
| Google | Gemini API models (Flash-Lite pricing) | ai.google.dev |
| Google | Project Mariner | deepmind.google |
| Google | Google AI plans | one.google.com |
| Google | Google AI Plus pricing | blog.google |
| Google | Google AI Pro pricing | one.google.com |
| Google | Google AI Ultra pricing | blog.google |
| GitHub | Copilot plans & pricing | github.com |
| Zhipu AI (Z.ai) | Developer Documentation | docs.z.ai |
| MiniMax | Developer Documentation | platform.minimax.io |
| MiniMax | Pricing (Pay-as-you-go) | platform.minimax.io |
| Moonshot AI | Developer Documentation | platform.moonshot.ai |
| Moonshot AI | Models & Pricing | platform.moonshot.ai |
| Cohere | Developer Documentation | docs.cohere.com |
| AI21 Labs | Developer Documentation | docs.ai21.com |
| Perplexity | Developer Documentation | docs.perplexity.ai |
| ByteDance (Volcengine) | Developer Documentation | volcengine.com |
| Tencent (Hunyuan) | Cloud Documentation | cloud.tencent.com |
| Baidu (ERNIE) | AI Studio Documentation | ai.baidu.com |
| DeepSeek | Official Website | deepseek.com |
| Meta | Llama Documentation | llama.meta.com |

Benchmark Sources

| Benchmark | Source | Description |
| --- | --- | --- |
| GPQA Diamond | Rein et al. (NYU) | Graduate-level science questions (PhD difficulty) |
| MMLU-Pro | TIGER-Lab | Extended multi-task language understanding |
| Arena Elo | lmarena.ai | Crowdsourced human preference ranking |
| HLE | CAIS / Scale AI | Humanity's Last Exam — expert-level questions |
| SWE-bench Verified | Princeton | Real GitHub issue resolution (human-verified) |
| SWE-bench Pro | Scale AI | More challenging benchmark in the SWE-bench style |
| LiveCodeBench | LiveCodeBench | Live competitive programming problems |
| AIME 2025 | MAA | American Invitational Mathematics Examination |
| ARC-AGI-2 | ARC Prize | Abstract reasoning challenge (fluid intelligence) |
| MMMU / MMMU-Pro | MMMU | Multi-discipline multimodal understanding |
| IFEval | Google Research | Instruction-following evaluation |
| FrontierMath | Epoch AI | Expert-level research mathematics |
| HumanEval | OpenAI | 164 Python programming problems |

Verification Methodology

  1. Primary Source Review - Check official documentation
  2. Cross-Validation - Compare multiple sources
  3. Timestamp Verification - All data includes verification date
  4. Update Tracking - Monitor official channels

Last Updated: 2026-05-04 16:36 UTC
Maintained by: ReadyPixels LLC


Made with ❤️ by ReadyPixels LLC

Star on GitHub
