Hacker News Clone

New Gemini model significantly outperforms others on Chatbot Arena (LMSYS)

by zopper on 12/6/2024, 5:18:25 PM with 18 comments

by impulser_ on 12/6/2024, 5:38:20 PM
Based on my testing, this model is significantly better than other Gemini models, especially with programming/math related tasks. The current Gemini models are pretty useless for anything related to programming/math, but this experimental model puts Gemini ahead of GPT-4o, and pretty close to Claude 3.5.
The major problem with Claude 3.5 is you can't have a conversation with a large amount of text because you will constantly hit rate limits, and it's very annoying.
This model with a 2 million context window is probably the best model right now for programming.
by chenxi9649 on 12/6/2024, 9:51:36 PM
I feel like it's at the point where I'm not too sure how these rankings impact my choice of LLM. Every time a new model tops the charts, I'll try it for a bit and go back to claude-3.5-sonnet, both for coding and day-to-day questions.
I don't know if I'm just getting used to the claude style of response, or the orangy UI that I kind of find cozy, but I think we need better ways to convey the difference between models.
by Alifatisk on 12/7/2024, 3:26:41 PM
Claude has been my go-to, mainly because of the huge context window. But today, that doesn't seem to be the case, or you hit the rate limit pretty quickly and have to wait a whole day.
Google Studio with its 2M context window + this experimental version could be a good replacement.
by leobg on 12/6/2024, 9:26:10 PM
Google has one moat that is often overlooked: Googlebot. They get to scrape content that is invisible to pretty much every other crawler, thanks to Cloudflare and paywalls.
by jug on 12/8/2024, 2:16:36 AM
I feel like these are test versions of Gemini Pro 2.0. The changes are too foundational to be mere iterations/break date updates for 1.5 Pro.
by ralfd on 12/6/2024, 10:21:26 PM
What is the new Gemini model? 1.5-pro-002?