SMRTR AI • Jun 8, 2026 • Hacker News

Avoiding wasteful electricity use while self hosting LLMs

SMRTR summary

When an AI model running on a home server sat stuck for 20 hours, drawing 85 watts while doing no real work, it prompted a search for a permanent fix. The culprit was a known Ollama bug that leaves the GPU pegged at ~89% busy. The solution: a lightweight watchdog script that checks GPU load every 5 minutes and sends a phone alert after 15 minutes of suspicious activity. It deliberately does not auto-restart anything, to avoid killing real work.
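The original script is not reproduced in this summary, so the following is only a minimal Python sketch of the watchdog idea under stated assumptions. The 5-minute interval and 15-minute alert window come from the summary. The 80% busy threshold, the `BusyTracker` class, the `watch` function and the `nvidia-smi`-based reader are hypothetical choices made for illustration (the summary does not say which GPU vendor or alert channel was used). The `notify` callback defaults to `print` and stands in for whatever phone-alert service you prefer.

```python
import subprocess
import time
from typing import Callable, Optional

CHECK_INTERVAL_S = 5 * 60    # from the summary: check every 5 minutes
ALERT_AFTER_S = 15 * 60      # from the summary: alert after 15 minutes
BUSY_THRESHOLD_PCT = 80      # assumption: a cutoff somewhat below the ~89% stuck level


def read_gpu_busy_percent() -> int:
    """Assumed reader for NVIDIA GPUs; replace it for other hardware."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    # nvidia-smi prints one number per GPU; report the busiest one.
    return max(int(value) for value in out.split())


class BusyTracker:
    """Tracks one continuous busy streak and decides when to alert."""

    def __init__(self, threshold_pct: int = BUSY_THRESHOLD_PCT,
                 alert_after_s: float = ALERT_AFTER_S) -> None:
        self.threshold_pct = threshold_pct
        self.alert_after_s = alert_after_s
        self.busy_since: Optional[float] = None
        self.alerted = False

    def update(self, busy_pct: int, now: float) -> bool:
        """Return True once per streak, when it reaches alert_after_s."""
        if busy_pct < self.threshold_pct:
            # GPU went quiet: forget the streak and re-arm the alert.
            self.busy_since = None
            self.alerted = False
            return False
        if self.busy_since is None:
            self.busy_since = now
        if not self.alerted and now - self.busy_since >= self.alert_after_s:
            self.alerted = True
            return True
        return False


def watch(read_busy: Callable[[], int] = read_gpu_busy_percent,
          notify: Callable[[str], None] = print) -> None:
    """Poll forever and alert only; nothing is ever restarted or killed."""
    tracker = BusyTracker()
    while True:
        if tracker.update(read_busy(), time.monotonic()):
            notify("GPU busy for 15+ minutes: check whether Ollama is stuck")
        time.sleep(CHECK_INTERVAL_S)
```

Calling `watch()` starts the polling loop. Keeping the streak logic in `BusyTracker`, separate from the GPU reader and the notifier, means the timing rule can be checked without a GPU. The sketch only notifies, matching the summary's point that an automatic restart could kill legitimate long-running work.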

SMRTR provides this summary for quick context. The original article belongs to Hacker News.

Related Stories

Run LLMs Locally Using Ollama

Ollama has emerged as a lightweight framework that dramatically simplifies running large language models like Llama 3.1, Mistral, and DeepSeek R1 directly on local machines,...

AI • Hacker Noon • Mar 22, 2026

Optimizing Local LLM Inference for 8GB VRAM GPUs

Developers can run powerful Large Language Models on consumer GPUs with just 8GB of VRAM, despite the belief that expensive 24GB+ hardware is required. Using optimization...

AI • Daily.dev • Mar 3, 2026

How to Run and Customize LLMs Locally with Ollama

Ollama is a free, open-source tool that simplifies running Large Language Models like Meta's Llama 3.3 or Google's Gemma 3 directly on personal computers through a process called...

AI • Daily.dev • Feb 25, 2026

Running LLMs on Raspberry Pi and Edge Devices

Raspberry Pi 5 with 8GB RAM can now run small language models locally at 10-18 tokens per second using Llama.cpp and GGUF-quantized models, eliminating cloud costs and network...

AI • Daily.dev • Nov 25, 2024

Running AI Models Without GPUs on Serverless Platforms

The Llama 3.2 1B model was successfully deployed on AWS Lambda and Google Cloud Run using serverless computing. The experiment revealed that with proper configuration (6GB memory,...

AI • Hacker News • Mar 15, 2026

CostClaw – free local dashboard to track and reduce OpenClaw agent costs

CostClaw is a free plugin that tracks LLM API costs in real-time via localhost:3333 dashboard. It identifies expensive usage patterns and recommends optimizations, potentially...
