Joined 8/20/2018, 11:10:25 PM; has 1013 karma
DevRel @ https://baseten.co
Email me: username at baseten.co
How to build function calling and JSON mode for open-source and fine-tuned LLMs
How to double tokens per second for Llama 3 with Medusa
FP8: Efficient model inference with 8-bit floating point numbers
Three techniques to adapt LLMs for any use case
Serving four million Riffusion requests in two days
Show HN: Free Stable Diffusion 2.0 hosted interface
Try it yourself: Speech to text with Whisper
Deploying Stable Diffusion in Production Using Truss
Hosted Stable Diffusion Demo
Show HN: Truss – Serve any ML model without boilerplate code
Code generation interactive demo (Salesforce Codegen mono 2B)
Show HN: Baseten – Build ML-powered applications