SGLang Cookbook
A community-maintained repository of practical guides and recipes for deploying and using SGLang in production environments. Our mission is simple: answer the question "How do I use SGLang (and related models) on hardware Y for task Z?" with clear, actionable solutions.
🎯 What You'll Find Here
This cookbook aggregates battle-tested SGLang recipes covering:
- Models: Mainstream LLMs and Vision-Language Models (VLMs)
- Use Cases: Inference serving, deployment strategies, multimodal applications
- Hardware: GPU and CPU configurations, optimization for different accelerators
- Best Practices: Configuration templates, performance tuning, troubleshooting guides
Each recipe provides step-by-step instructions to help you quickly implement SGLang solutions for your specific requirements.
Guides
Autoregressive Models
Google
- Gemma 4 NEW
Qwen
DeepSeek
Llama
GLM
OpenAI
Moonshotai
MiniMax
NVIDIA
Ernie
InternVL
InternLM
Jina AI
Mistral
Xiaomi
FlashLabs
StepFun
InclusionAI
Diffusion Models
FLUX
Wan
Qwen-Image
Z-Image
MOVA
SGLang Omni
FishAudio
Benchmarks
Reference
- Installation (PyPI) - Install SGLang via pip or uv (stable and nightly)
- Server arguments - A reference for all server launch arguments
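As a quick sketch of the installation step, the commands below show a typical PyPI install; the exact package extras and pinned versions may differ, so treat this as illustrative and follow the Installation guide above for your platform:

```shell
# Stable release from PyPI
pip install sglang

# Or with uv, a faster pip-compatible installer
uv pip install sglang
```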
🚀 Quick Start
- Browse the recipe index above to find your model
- Follow the step-by-step instructions in each guide
- Adapt configurations to your specific hardware and requirements
- Join our community to share feedback and improvements
🤝 Contributing
We believe the best documentation comes from practitioners. Whether you've optimized SGLang for a specific model, solved a tricky deployment challenge, or discovered performance improvements, we encourage you to contribute your recipes!
Contribution templates — start here:
- Autoregressive Model Template — Full template for LLM recipes (deployment, API usage, benchmarks)
- Diffusion Model Template — Template for image/video generation models
Maintainers: We have a Claude Code skill that automates most of the contribution workflow, from scaffolding docs, config generators, and YAML configs to sidebar updates. Run `/add-model` in Claude Code to use it.
Ways to contribute:
- Add a new recipe for a model not yet covered
- Add AMD MI300X/MI325X/MI355X GPU support to existing models
- Improve existing recipes with benchmarks, tips, or configurations
- Report issues or suggest enhancements
Quick start:
```shell
# Fork the repo and clone locally
git clone https://github.com/YOUR_USERNAME/sgl-cookbook.git
cd sgl-cookbook

# Install dependencies and start dev server
npm install && npm start

# Create a new branch
git checkout -b add-my-recipe

# Add your recipe following the templates above, then submit a PR!
```
Each model recipe needs three files: a `.md` doc, a ConfigGenerator component, and a `sidebars.js` entry. Use DeepSeek-V3.2 as a reference. All deployment commands must use `sglang serve` (not the deprecated `python -m sglang.launch_server`).
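To illustrate the required command form, here is a hedged sketch of a deployment command. The model path, flag names, and port are illustrative assumptions, not part of any specific recipe; copy the exact invocation from the recipe you are following:

```shell
# Illustrative only: launch a server with `sglang serve`
# (model path and flags are assumptions; see each recipe for the real command)
sglang serve --model-path deepseek-ai/DeepSeek-V3.2 --port 30000
```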
🛠️ Local Development
Prerequisites
- Node.js >= 20.0
- npm or yarn
Setup and Run
Install dependencies and start the development server:
```shell
# Install dependencies
npm install

# Start development server (hot reload enabled)
npm start
```
The site will open automatically in your browser at http://localhost:3000.
📖 Resources
📄 License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Let's build this resource together! 🚀 Star the repo and contribute your recipes to help the SGLang community grow.