Muon Is Scalable for LLM Training

by renonce on 2/25/2025, 4:50:57 AM with 1 comment
  • by yorwba on 2/25/2025, 5:40:29 AM

    For people who want to know more about the Muon optimizer: https://kellerjordan.github.io/posts/muon/