SGLang Cookbook
A community-maintained repository of practical guides and recipes for deploying and using SGLang in production environments. Our mission is simple: answer the question "How do I use SGLang (and related models) on hardware Y for task Z?" with clear, actionable solutions.
🎯 What You'll Find Here
This cookbook aggregates battle-tested SGLang recipes covering:
- Models: Mainstream LLMs and Vision-Language Models (VLMs)
- Use Cases: Inference serving, deployment strategies, multimodal applications
- Hardware: GPU and CPU configurations, optimization for different accelerators
- Best Practices: Configuration templates, performance tuning, troubleshooting guides
Each recipe provides step-by-step instructions to help you quickly implement SGLang solutions for your specific requirements.
Guides
Autoregressive Models
Google
- Gemma 4 NEW
Qwen
DeepSeek
Llama
GLM
OpenAI
Moonshotai
MiniMax
NVIDIA
Ernie
InternVL
InternLM
Jina AI
Mistral
Xiaomi
FlashLabs
StepFun
InclusionAI
Diffusion Models
FLUX
Wan
Qwen-Image
Z-Image
MOVA
SGLang Omni
FishAudio
Benchmarks
Reference
- Installation (PyPI) - Install SGLang via pip or uv (stable and nightly)
- Server arguments - A reference for all server launch arguments
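As a quick sketch of the installation step, the commands below show a typical PyPI install; the exact package extras and pinned versions may differ, so treat this as illustrative and follow the Installation guide above for your platform:

```shell
# Stable release from PyPI
pip install sglang

# Or with uv, a faster pip-compatible installer
uv pip install sglang
```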
🚀 Quick Start
- Browse the recipe index above to find your model
- Follow the step-by-step instructions in each guide
- Adapt configurations to your specific hardware and requirements
- Join our community to share feedback and improvements
🤝 Contributing
We believe the best documentation comes from practitioners. Whether you've optimized SGLang for a specific model, solved a tricky deployment challenge, or discovered performance improvements, we encourage you to contribute your recipes!
Contribution templates — start here:
- Autoregressive Model Template — Full template for LLM recipes (deployment, API usage, benchmarks)
- Diffusion Model Template — Template for image/video generation models
Maintainers: We have a Claude Code skill that automates most of the contribution workflow, from scaffolding docs, config generators, and YAML configs to sidebar updates. Run `/add-model` in Claude Code to use it.
Ways to contribute:
- Add a new recipe for a model not yet covered
- Add AMD MI300X/MI325X/MI355X GPU support to existing models
- Improve existing recipes with benchmarks, tips, or configurations
- Report issues or suggest enhancements
Quick start:
```shell
# Fork the repo and clone locally
git clone https://github.com/YOUR_USERNAME/sgl-cookbook.git
cd sgl-cookbook

# Install dependencies and start dev server
npm install && npm start

# Create a new branch
git checkout -b add-my-recipe

# Add your recipe following the templates above, then submit a PR!
```
Each model recipe needs three files: a `.md` doc, a ConfigGenerator component, and a `sidebars.js` entry. Use DeepSeek-V3.2 as a reference. All deployment commands must use `sglang serve` (not the deprecated `python -m sglang.launch_server`).
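To illustrate the required command form, here is a hedged sketch of a deployment command. The model path, flag names, and port are illustrative assumptions, not part of any specific recipe; copy the exact invocation from the recipe you are following:

```shell
# Illustrative only: launch a server with `sglang serve`
# (model path and flags are assumptions; see each recipe for the real command)
sglang serve --model-path deepseek-ai/DeepSeek-V3.2 --port 30000
```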
🛠️ Local Development
Prerequisites
- Node.js >= 20.0
- npm or yarn
Setup and Run
Install dependencies and start the development server:
```shell
# Install dependencies
npm install

# Start development server (hot reload enabled)
npm start
```
The site will open automatically in your browser at http://localhost:3000.
📖 Resources
📄 License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Let's build this resource together! 🚀 Star the repo and contribute your recipes to help the SGLang community grow.