Wednesday, September 24, 2025

My journey with homelab inferencing part 1 - objectives, planning and cost

Building a machine that can handle a local LLM isn't too difficult, but it can be cost-prohibitive. My journey to build a machine capable of running distilled, decent-sized models was motivated by wanting a completely offline voice assistant for Home Assistant - an open-source platform for smart home management. My requirements for this project were as follows:

  • No subscriptions
  • Full ownership
  • Minimal cloud service connectivity
  • Keep cost down as much as possible
Since I already had most of the components to build the machine, the main thing to consider was picking a graphics card. The sweet spot in terms of performance per dollar is the NVIDIA RTX 3090 (Ti). Both the Ti and non-Ti versions of the card come with 24GB of vRAM, which is enough to fit most ~32B models once quantized. There are GPUs with more vRAM onboard, as well as multi-GPU capable systems, but the cost ramps up significantly when those are in play. Because this is a very specific use case, I don't need a lot of context - I just need the model to be "smart" enough to carry out basic Home Assistant actions when asked.
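
As a rough sanity check on that claim, here's a back-of-the-envelope sketch of my own (assuming 4-bit quantization and ~20% headroom for the KV cache and runtime overhead - ballpark figures, not exact math) showing how much vRAM a model of a given size needs:

    # Rough vRAM estimate: quantized weights plus ~20% headroom for the
    # KV cache and runtime overhead. Ballpark numbers, not exact.
    def estimate_vram_gb(params_billion, bits_per_weight=4, overhead=1.2):
        weights_gb = params_billion * bits_per_weight / 8  # 8 bits = 1 byte
        return weights_gb * overhead

    for size in (7, 14, 32, 70):
        print(f"{size}B @ 4-bit: ~{estimate_vram_gb(size):.0f} GB")
    # 7B ~4 GB, 14B ~8 GB, 32B ~19 GB, 70B ~42 GB - so a quantized 32B model
    # just fits in 24GB, while 70B needs more than a single 3090.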

As of this writing, RTX 3090s can be obtained for around $800 used on eBay. Let's draw up some pros and cons compared to subscribing to LLM-as-a-Service (LLMaaS) offerings.

The pros are:
  • Full ownership and customization of hardware
    • No subscriptions, pay once and it's yours
  • Use whichever model you wish... 
    • ...as long as it fits in vRAM
    • Can add more hardware to run larger models
  • No internet connection required
  • Hardware can be repurposed for other projects (gaming, Folding@home, etc.)
The cons are:
  • Steep upfront cost (LLMaaS such as ChatGPT Plus is $20/month at the time of this writing)
  • Need to decide on OS and deployment model
    • LLMaaS is turn-key and ready at a moment's notice
  • vRAM limits the size of models that can be used 
    • LLMaaS gives you access to full-size models with RAG customization
    • More GPUs can be added, but power draw and cost increase significantly
  • Risk of hardware obsolescence (the RTX 3090 is two generations old and may not be able to run future models)
    • LLMaaS provides the latest models and additional features, such as advanced image generation
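To put the upfront cost in perspective, here's a quick sketch of the break-even math against those numbers (a simple estimate of my own; electricity is left as an adjustable knob, since the box's average draw and local rates vary widely):

    # Break-even estimate: one-time GPU cost vs. a monthly LLMaaS subscription.
    # Electricity is optional here since average draw and local rates vary.
    GPU_COST = 800            # USD, used RTX 3090
    SUBSCRIPTION = 20.0       # USD per month, e.g. ChatGPT Plus
    POWER_COST = 0.0          # USD per month of electricity; set for your setup

    breakeven_months = GPU_COST / (SUBSCRIPTION - POWER_COST)
    print(f"Break-even after ~{breakeven_months:.0f} months")  # ~40 months, GPU only
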
At the end of the day, you need to determine where you want your flexibility: in LLM capabilities, or in the hardware side of the stack? At $800 for the GPU alone, you're looking at roughly a 40-month break-even point compared to LLMaaS - longer once the rest of the hardware is counted - and with less context and fewer capabilities. Again, for my use case this is fine, but for a general-purpose chatbot you might prefer the monthly subscription. In my next post, I plan to cover the next stage of the local LLM build: OS install and containerized deployment.

