• by talldayo on 12/28/2024, 4:48:12 PM

    Your computer can run it, but you have to split the model's layers across CPU and GPU memory. Your bottleneck will be PCIe speed, which probably won't be a huge issue for a 4080 on smaller quants.
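
    A minimal sketch of that kind of CPU/GPU split, assuming a GGUF quant of the model and the llama-cpp-python bindings (the file name and layer count below are placeholders, not a setup from the comment):

    ```python
    # Partial GPU offload with llama-cpp-python (pip install llama-cpp-python).
    # n_gpu_layers controls how many layers live in VRAM; the rest stay in system
    # RAM and stream over PCIe, which is the bottleneck the comment mentions.
    from llama_cpp import Llama

    llm = Llama(
        model_path="deepseek-v3-q2_k.gguf",  # hypothetical small-quant GGUF file
        n_gpu_layers=20,                     # tune to what fits in the 4080's 16 GB VRAM
        n_ctx=2048,                          # modest context keeps the KV cache small
    )

    out = llm("Explain mixture-of-experts in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])
    ```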

  • by dtnewman on 12/29/2024, 3:20:18 PM

    Check out the exo labs blog: https://blog.exolabs.net/day-2/

  • by dcreater on 12/29/2024, 4:33:18 AM

    Short answer: You can't run it locally. It's 670B parameters.

    Long(er) answer: check the Reddit thread on r/LocalLLaMA.
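
    A rough back-of-the-envelope sketch of why that parameter count rules out a single consumer machine, counting only the raw weights (KV cache, activations, and overhead excluded, so real requirements are higher):

    ```python
    # Weight-only memory estimate for a ~670B-parameter model at common
    # quantization widths.
    PARAMS = 670e9

    for name, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4), ("2-bit", 2)]:
        gib = PARAMS * bits / 8 / 2**30
        print(f"{name:>5}: ~{gib:,.0f} GiB")

    # fp16: ~1,248 GiB
    # int8:   ~624 GiB
    # 4-bit:  ~312 GiB
    # 2-bit:  ~156 GiB
    ```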

  • by lavren1974 on 12/28/2024, 4:18:38 PM

    Ollama doesn't have this model yet.