Joined 10/5/2022, 7:51:59 PM · 31 karma
A collection of reproducible LLM inference engine benchmarks: SGLang vs. vLLM
Efficient GPU Resource Management for ML Workloads Using SkyPilot, Kueue on GKE
New Recipe: Serving Llama-2 with vLLM's OpenAI-Compatible API Server
Train Your Own Vicuna on Llama-2
A guide to fine-tuning your own Vicuna on Llama-2
Serving LLM 24x Faster on the Cloud with vLLM and SkyPilot
Biologists are moving to the clouds with SkyPilot from UC Berkeley
Vicuna releases its secret to finding available A100s on the cloud for training