Registration
Welcome Speech
Getting started with vLLM: The leading open-source LLM inference engine for Private AI
Multi-modal inference and deployment using vLLM
LLM Inference on AMD GPUs: A Technical Deep Dive
Group Photo
AMD Hands-on Workshop: MiniMax M2 Agent Tutorial
Lunch & Networking
From Offline to Online Inference: Why Serving Is Hard—and How vLLM Helps
Deep Adaptation and Engineering Practice of vLLM on MetaX GPUs
KVCache Practices at MiniMax for Agentic Workloads: From Traffic Characteristics to Architectural Insights
vLLM-Omni: Easy, Fast, and Cheap Omni-Modality Model Serving
Networking
Event End
Subject to change.