The GTX 1650 has only 4 GB of VRAM, which significantly restricts the size and speed of the large language models (LLMs) you can run locally.
Small models in the 1–3B parameter range (e.g., TinyLlama 1.1B, Phi-2 2.7B) run with 4-bit/8-bit quantization, but context length and speed are limited.
7B parameter models (e.g., Llama 2 7B, Mistral 7B) are generally not feasible to run fully on a 4GB VRAM GPU: even at 4-bit quantization the weights alone occupy roughly 3.5–4 GB, leaving no headroom for the KV cache and activations, so layers must be offloaded to the CPU and inference becomes slow.
Larger models (13B, 70B, etc.) are not usable on this hardware at all; a 13B model needs roughly 6.5 GB for 4-bit weights alone, already above total VRAM.
For practical LLM/AI workloads, a GPU with 12GB+ VRAM is recommended.
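The sizing claims above follow from simple arithmetic: quantized weight size is parameter count times bits per weight, plus working memory for the KV cache and CUDA overhead. A rough sketch (the 1 GB overhead allowance is an assumption for illustration, not a measured figure):

```python
def estimate_vram_gb(params_billions: float, bits: int, overhead_gb: float = 1.0) -> float:
    """Rough VRAM estimate: quantized weights plus a fixed allowance
    for KV cache, activations, and CUDA context overhead."""
    weights_gb = params_billions * bits / 8  # 1B params at 8-bit ~ 1 GB
    return weights_gb + overhead_gb

for name, params in [("TinyLlama 1.1B", 1.1), ("Phi-2 2.7B", 2.7), ("Mistral 7B", 7.0)]:
    need = estimate_vram_gb(params, bits=4)
    verdict = "fits" if need <= 4.0 else "does not fit"
    print(f"{name}: ~{need:.1f} GB at 4-bit -> {verdict} in 4 GB")
```

By this estimate a 7B model needs about 4.5 GB even at 4-bit, which is why it spills out of a 4 GB card, while 1–3B models fit with room to spare.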
GPU Sharing: The GPU is currently not shared between VMs (consumer cards like the GTX 1650 lack vGPU support), but it can be shared between containers via the NVIDIA container runtime (e.g., Docker with the NVIDIA Container Toolkit).
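As a sketch of container-level sharing, assuming the NVIDIA Container Toolkit is installed on the host (Docker 19.03+ provides the `--gpus` flag; the image names here are illustrative):

```shell
# Verify the NVIDIA runtime is wired up: the container should list the GTX 1650.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Two containers can use the same physical GPU concurrently; the driver
# time-slices compute between them, but VRAM is NOT partitioned or isolated,
# so both workloads draw from the same 4 GB pool.
docker run -d --gpus all --name llm-a ollama/ollama
docker run -d --gpus all --name llm-b ollama/ollama
```

Note that with only 4 GB of VRAM, running two model-serving containers at once will almost certainly exhaust memory; sharing is practical mainly for alternating light workloads.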