DeepSeek-R1 at 3,872 tokens / second on a single Nvidia HGX H200
by billconan on 1/31/2025, 3:59:14 AM
https://news.ycombinator.com/item?id=42879864
For comparison, Cerebras' 70B number is 1,600 tokens / sec; not sure about the costs.