Joined 1/14/2021, 4:58:53 PM; has 952 karma
hi. ;P
Visualizing 6D Mesh Parallelism
Muon: An optimizer for hidden layers in neural networks