Joined 10/5/2022, 7:51:59 PM · 31 karma
A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM
Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE
New Recipe: Serving Llama-2 with vLLM's OpenAI-Compatible API Server
Train Your Own Vicuna on Llama-2
A guide to fine-tuning your own Vicuna on Llama-2
Serving LLM 24x Faster on the Cloud with vLLM and SkyPilot
Biologists are moving to the clouds with SkyPilot from UC Berkeley
Vicuna releases its secret to finding available A100s on the cloud for training