
The LLM Creation Journey

A Visual Knowledge Base & Resource Moodboard



👆 Click the button above to enter the interactive workspace 👆


📖 About

"A comprehensive journey to creating LLMs, featuring tutorials and resources primarily sourced from Zachary Huang (Researcher at Microsoft Research AI Frontiers), enhanced with my own supplementary materials for clarity."

This project is not a traditional repository of code files. Instead, it is an interactive spatial moodboard hosted on tldraw. It serves as a visual map that connects concepts, research papers, and implementation tutorials in one infinite canvas.


⚡ The Modern Toolkit

Beyond standard PyTorch, this roadmap emphasizes the modern tools that define state-of-the-art LLM development in 2025. We move away from clunky boilerplate code to efficient, readable implementations.

🧩 Core Libraries Covered:

  • Einops:
    • Why? Replaces confusing x.view() and x.transpose() calls with readable, semantic operations.
    • Use case: Multi-head attention implementation and tensor rearranging.
  • FlashAttention (Dao et al.):
    • Why? The gold standard for IO-aware exact attention.
    • Use case: Speeding up training and significantly reducing memory footprint.
  • Bitsandbytes:
    • Why? Accessible quantization.
    • Use case: Loading large models on consumer GPUs (8-bit optimizers, NF4).
  • WandB (Weights & Biases):
    • Why? Industry-standard experiment tracking.
    • Use case: Visualizing loss curves and debugging training runs.
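To make the Einops point concrete: the per-head split in multi-head attention is a single readable `rearrange(x, "b s (h d) -> b h s d", h=heads)` call, versus the reshape/transpose dance it replaces. Here is a minimal NumPy sketch of that equivalent dance (function names like `split_heads` are illustrative, not taken from the board's tutorials):

```python
import numpy as np

# With einops, splitting heads is one semantic call:
#   rearrange(x, "b s (h d) -> b h s d", h=heads)
# Below is the raw reshape/transpose equivalent it replaces.

def split_heads(x: np.ndarray, heads: int) -> np.ndarray:
    """(batch, seq, heads*dim) -> (batch, heads, seq, dim)."""
    b, s, hd = x.shape
    d = hd // heads
    return x.reshape(b, s, heads, d).transpose(0, 2, 1, 3)

def merge_heads(x: np.ndarray) -> np.ndarray:
    """(batch, heads, seq, dim) -> (batch, seq, heads*dim); inverse of split_heads."""
    b, h, s, d = x.shape
    return x.transpose(0, 2, 1, 3).reshape(b, s, h * d)

x = np.random.randn(2, 5, 8)      # batch=2, seq=5, model dim=8
y = split_heads(x, heads=4)       # -> shape (2, 4, 5, 2)
assert np.array_equal(merge_heads(y), x)
```

The einops version says *what* happens to each axis; the NumPy version makes you track axis positions by index, which is exactly the readability gap the toolkit above addresses.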

🧠 What's Inside the Board?

Everything is embedded directly into the canvas. You will find a structured path covering:

  • 📄 Annotated PDFs: Key research papers (Attention Is All You Need, LLaMA, etc.) uploaded directly to the board with highlights and notes.
  • 🧩 Architecture Breakdowns: Visual diagrams explaining Transformers, Tokenization, and Attention mechanisms.
  • 🛠️ Implementation Tutorials: Links to Colab notebooks and repositories for building from scratch.
  • 🛣️ Learning Pathways: A clear step-by-step flow from Pre-training to RLHF Alignment.

πŸ—ΊοΈ How to Use

  1. Open the Link: Click here to access the tldraw board.
  2. Navigation:
    • Zoom: Use your mouse wheel or trackpad to zoom in on specific papers or notes.
    • Pan: Click and drag to move around the timeline.
  3. Resources: Double-click on embedded PDFs or links to open them.

πŸ† Credits

This knowledge base is curated based on the work of Zachary Huang (Researcher at Microsoft Research AI Frontiers).
