• by bambax on 10/16/2024, 8:59:23 AM

    > I decided to explore self-hosting some of my non-critical applications

    Self-hosting static or almost-static websites is now really easy with a Cloudflare front. I just closed my account on SmugMug and published my images locally using my NAS; this costs no extra money (is basically free) since the photos were already on the NAS, and the NAS is already powered on 24-7.

    The NAS I use is an Asustor, so it's not really Linux and you can't install whatever you want on it, but it has Apache, Python, and PHP with the SQLite extension, which is more than enough for basic websites.

    Cloudflare's free tier is like magic. Response times are near-instantaneous and setup is minimal. You don't even have to configure an SSL certificate locally; it's all handled for you, and it works for wildcard subdomains.
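
    For anyone wanting to replicate this, one way to wire it up is a Cloudflare Tunnel (the hostname below is a placeholder; a proxied DNS record pointing at your origin works too):

        # Sketch of a Cloudflare Tunnel setup; hostname is a placeholder.
        cloudflared tunnel login
        cloudflared tunnel create nas
        cloudflared tunnel route dns nas photos.example.com

        # ~/.cloudflared/config.yml:
        #   tunnel: nas
        #   credentials-file: /home/user/.cloudflared/<tunnel-id>.json
        #   ingress:
        #     - hostname: photos.example.com
        #       service: http://localhost:80
        #     - service: http_status:404

        cloudflared tunnel run nas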

    And of course if one puts a real server behind it, like in the post, anything's possible.

  • by taosx on 10/16/2024, 7:41:50 AM

    For the people who self-host LLMs at home: what use cases do you have?

    Personally, I have some notes and bookmarks that I'd like to scrape, then have an LLM summarize, generate hierarchical tags, and store in a database. For the notes part at least, I wouldn't want to give them to another provider; even for the bookmarks, I wouldn't be comfortable passing my reading profile to anyone.
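
    A rough sketch of that pipeline against a local Ollama instance (model name, paths, and schema are just placeholders):

        # Summarize each note with a local model and store the result in SQLite.
        sqlite3 notes.db 'CREATE TABLE IF NOT EXISTS summaries(file TEXT, summary TEXT);'
        for f in ~/notes/*.md; do
          summary=$(jq -Rs '{model: "llama3.1",
                             prompt: ("Summarize this note and suggest hierarchical tags:\n\n" + .),
                             stream: false}' "$f" \
            | curl -s -d @- http://localhost:11434/api/generate \
            | jq -r '.response')
          esc=$(printf '%s' "$summary" | sed "s/'/''/g")   # escape quotes for SQL
          sqlite3 notes.db "INSERT INTO summaries VALUES ('$f', '$esc');"
        done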

  • by netdevnet on 10/16/2024, 8:11:07 AM

    Am I right in thinking that a self-hosted Llama wouldn't have the kinds of restrictions ChatGPT has, since it has no initial system prompt?
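
    For example, with Ollama the system prompt seems to be entirely user-defined (names below are mine):

        # Modelfile: the only system prompt is the one you write yourself.
        #   FROM llama3.1
        #   SYSTEM "Answer plainly."
        ollama create myllama -f Modelfile
        ollama run myllama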

  • by seungwoolee518 on 10/16/2024, 7:14:33 AM

    Great post!

    However, do I need to install the CUDA toolkit on the host?

    I haven't had to install the CUDA toolkit when using a containerized platform (like Docker).
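
    For what it's worth, a quick check of that setup (image tag is just an example; the host only needs the NVIDIA driver and nvidia-container-toolkit):

        # The CUDA userspace libraries come from the container image, not the host:
        docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi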

  • by ossusermivami on 10/16/2024, 10:30:41 AM

    AI-generated blog posts (or AI-reworded, whatever) are getting very irritating; like playing chess against a computer, it feels soulless.

  • by varun_ch on 10/16/2024, 7:27:17 AM

    I’m curious how well local LLMs perform on ‘outdated’ hardware like the author’s 2060. I have a desktop with a 2070 Super that could be fun to turn into an “AI server” if I had the time…

  • by _blk on 10/16/2024, 7:50:52 AM

    Why disable LVM for a smoother reboot experience? For encryption I get it since you need a key to mount, but all my setups have LVM or ZFS and I'd say my reboots are smooth enough.

  • by satvikpendem on 10/16/2024, 7:56:35 AM

    I love Coolify; I used to use v3. Does anyone know how their v4 is going? From what I saw on GitHub, it was still a beta release.

  • by sorenjan on 10/16/2024, 12:42:46 PM

    Can you use a self-hosted LLM that fits in 12 GB of VRAM as a reasonable substitute for Copilot in VS Code? And if so, can you give it documentation and other code repositories to make it better at a particular language and platform?

  • by vincentclee on 10/16/2024, 12:38:57 PM

    Instead of `watch -n 0.5 nvidia-smi` to track GPU usage, one can use `nvtop`:

    https://github.com/Syllo/nvtop
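
    It's in most distro repos (package name may vary):

        sudo apt install nvtop   # Debian/Ubuntu; other distros package it too
        nvtop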

  • by cranberryturkey on 10/16/2024, 11:06:54 AM

    How is Coolify different from Ollama? Is it better? Worse? I like Ollama because I can pull models, and it exposes a REST API, which is great for development.
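
    e.g., pulling a model and hitting the API (model name is just an example):

        ollama pull mistral
        curl -s http://localhost:11434/api/tags        # list local models
        curl -s http://localhost:11434/api/generate \
          -d '{"model": "mistral", "prompt": "hello", "stream": false}'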

  • by eloycoto on 10/16/2024, 10:19:20 AM

    I have something like this, and I'm super happy with AnythingLLM, which lets me add a custom board with my workspaces, RAG, etc. I love it!

  • by ragebol on 10/16/2024, 8:13:08 AM

    Probably saves a bit on the gas bill for heating too

  • by keriati1 on 10/16/2024, 7:29:13 AM

    What model size is used here? How much memory does the GPU have?