• by kuprel on 2/22/2024, 9:56:41 PM

    This adds PyTorch/CUDA training support to Andrej Karpathy's minbpe. It takes 2min 28sec (148 seconds) on an RTX 4090 to train the BasicTokenizer with a vocab_size of 512 on 307MB of Enron emails. The original code takes about 2hrs 15min (8076 seconds) on an M2 Air with Python 3.11 to do the same job. That is roughly a 55x speedup (8076 / 148 ≈ 54.6).
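
    For anyone unfamiliar with what is being timed: below is a minimal pure-Python sketch of byte-level BPE training, assuming the tokenizer starts from the 256 raw byte values and performs vocab_size - 256 merges (so 256 merges for vocab_size=512). The function names are hypothetical and chosen for illustration; this is not the PyTorch/CUDA code and not necessarily how minbpe structures it.

```python
from collections import Counter


def get_pair_counts(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, vocab_size):
    """Illustrative trainer: returns (merges, final token ids)."""
    assert vocab_size >= 256, "the first 256 ids are the raw bytes"
    ids = list(text.encode("utf-8"))
    merges = {}
    for new_id in range(256, vocab_size):
        counts = get_pair_counts(ids)
        if not counts:
            break  # nothing left to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges, ids


if __name__ == "__main__":
    merges, ids = train_bpe("aaabdaaabac", 256 + 3)
    print(len(merges), ids)
```

    Each merge rescans the whole token sequence, which is why a plain Python loop over a few hundred MB of text is slow and why moving the pair counting and merging onto a GPU can help so much.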

  • by Havoc on 2/23/2024, 1:18:27 AM

    > 307MB of Enron emails

    Wait what?

    Is that some sort of inside joke?

  • by erichocean on 2/23/2024, 1:41:33 PM

    Now someone needs to do a Mojo version, and write up the blog post.