• by mikewarot on 11/23/2023, 1:15:04 AM

    Most neural networks are just directed graphs, with a ton of matrix multiplies and a nonlinear function at the end of each layer. The libraries for gradient descent, training, etc. are all there to use. It is amazing how small the actual code is compared to the amount of compute needed for training; a toy sketch is below.
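
    For instance, here is a minimal NumPy sketch of that "matmul + nonlinearity per layer" view: a two-layer network trained on XOR with hand-derived gradient descent. The hidden size, learning rate, and loss are arbitrary toy choices, not anything from a real library.

        import numpy as np

        rng = np.random.default_rng(0)

        # Toy data: learn XOR with a 2-layer network.
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
        y = np.array([[0], [1], [1], [0]], dtype=float)

        # Weights for two layers (hidden size 8 is an arbitrary choice).
        W1 = rng.normal(0, 1, (2, 8))
        b1 = np.zeros(8)
        W2 = rng.normal(0, 1, (8, 1))
        b2 = np.zeros(1)

        def sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        lr = 0.5
        for step in range(5000):
            # Forward pass: matrix multiply -> nonlinearity, twice.
            h = np.tanh(X @ W1 + b1)
            out = sigmoid(h @ W2 + b2)

            # Backward pass: hand-derived gradients of squared error.
            d_out = (out - y) * out * (1 - out)   # through the sigmoid
            dW2 = h.T @ d_out
            db2 = d_out.sum(axis=0)
            d_h = (d_out @ W2.T) * (1 - h ** 2)   # through the tanh
            dW1 = X.T @ d_h
            db1 = d_h.sum(axis=0)

            # Gradient descent step.
            W1 -= lr * dW1; b1 -= lr * db1
            W2 -= lr * dW2; b2 -= lr * db2

        print(out.round(3))  # should approach [[0], [1], [1], [0]]

    The whole training loop is a few dozen lines; scaling it up is mostly a matter of more layers, more data, and vastly more compute.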

  • by arthurcolle on 11/22/2023, 10:32:24 PM

    These are just the repos that provide the inference code to run the model; you also need the weights, which are available via HuggingFace or, in Llama 2's case, from here: https://ai.meta.com/resources/models-and-libraries/llama-dow...
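
    As a rough sketch of what pairing the code with the weights looks like using the Hugging Face `transformers` library (the "meta-llama/Llama-2-7b-hf" repo is gated, so this assumes you've accepted Meta's license and authenticated with `huggingface-cli login` first):

        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_id = "meta-llama/Llama-2-7b-hf"

        # from_pretrained downloads (and caches) the weights on first use;
        # the inference code alone is useless without them.
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(model_id)

        prompt = "The capital of France is"
        inputs = tokenizer(prompt, return_tensors="pt")
        outputs = model.generate(**inputs, max_new_tokens=10)
        print(tokenizer.decode(outputs[0], skip_special_tokens=True))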