AI Stack — Local AI Lab

Ollama Setup
Models
ROCm GPU
Open WebUI
Useful Prompts
School vs Personal

Ollama Setup

Ollama is the engine: it downloads, serves, and runs LLMs locally. An older release, 0.24.0, is installed as a user systemd service and works well with ROCm on this machine.

Installation

# User-level systemd service
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/ollama.service << 'EOF'
[Unit]
Description=Ollama LLM Server (Local AI Stack)
After=network.target

[Service]
Type=simple
Environment="OLLAMA_HOST=0.0.0.0"
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
ExecStartPre=/bin/sh -c 'sg render -c "echo render group ok"'
ExecStart=/bin/sh -c 'sg render -c "/usr/local/bin/ollama serve"'
Restart=always
RestartSec=10

[Install]
WantedBy=default.target
EOF

systemctl --user daemon-reload
systemctl --user enable ollama
systemctl --user start ollama
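
After starting the service, a quick reachability check can confirm that the API answers. This is a minimal sketch: ollama_is_up is a hypothetical helper name, and it assumes curl is installed and that the server listens on the default port 11434 used elsewhere in this page.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an Ollama server answers at the base URL.
# Default URL assumes the standard port 11434.
ollama_is_up() {
    base_url="${1:-http://localhost:11434}"
    # -f makes curl fail on HTTP errors; --max-time avoids hanging forever.
    curl -fsS --max-time 5 "$base_url/api/version" >/dev/null 2>&1
}

# Usage (run manually):
#   systemctl --user status ollama --no-pager
#   ollama_is_up && echo "API reachable" || echo "API not reachable"
```

If the service should keep running while wm is not logged in, `loginctl enable-linger wm` is the usual systemd mechanism; whether this lab needs it is an assumption.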

💡 Why a user service? The apt-installed system service (user=ollama) crash-loops with port conflicts. The user service runs as wm, has the correct GPU group access, and "just works."

Models moved to 4TB SSD

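# Stop Ollama so no model files change during the copy
systemctl --user stop ollama
# Copy models to the data disk to free the root partition
rsync -aP ~/.ollama/models/ /media/wm/2TB-Data/ollama-models/
# Keep the old directory as a backup
mv ~/.ollama/models ~/.ollama/models.bak
# Point the old path at the new location
ln -s /media/wm/2TB-Data/ollama-models ~/.ollama/models
# Verify the symlink, then restart and confirm the models are listed
ls -la ~/.ollama/models
systemctl --user start ollama
ollama list
# Remove the backup only after verification succeeds
rm -rf ~/.ollama/models.bak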
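
Before deleting the backup, it may be worth confirming that the copy is complete. The following is a sketch only: trees_match is a hypothetical helper (not part of Ollama), and it assumes diff is available. It reads every file, so expect it to take a while on a model store of this size.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if both directory trees contain
# the same files with identical contents.
trees_match() {
    # diff -rq exits 0 when no differences are found, non-zero otherwise.
    diff -rq "$1" "$2" >/dev/null 2>&1
}

# Usage (run manually, before the rm -rf step):
#   trees_match ~/.ollama/models.bak /media/wm/2TB-Data/ollama-models \
#       && echo "copy verified" || echo "MISMATCH: keep the backup"
```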

Models

Five models installed, totaling ~72 GB. All stored on the 4TB SSD.

Model	Size	Parameters	Purpose	Allowed use
llama3.1:8b	4.9 GB	8B	Fast general chat	School + Personal
qwen3:30b	18 GB	30B	Deep reasoning, coding	School + Personal
qwen2.5:14b	9.0 GB	14B	Balanced speed/quality	School + Personal
dolphin-llama3:8b	4.9 GB	8B	Uncensored personal use	Personal only
dolphin-llama3:70b	35 GB	70B	Uncensored personal use	Personal only
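
The ~72 GB total can be checked by summing the Size column. A small sketch, with total_gb as a hypothetical helper name and the sizes copied from the table:

```shell
#!/bin/sh
# Hypothetical helper: sum sizes (in GB) passed as arguments,
# printed with one decimal place.
total_gb() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    printf '%s\n' "$@" | LC_ALL=C awk '{ s += $1 } END { printf "%.1f\n", s }'
}

total_gb 4.9 18 9.0 4.9 35   # prints 71.8
```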

Download commands

ollama pull llama3.1:8b
ollama pull qwen3:30b
ollama pull qwen2.5:14b
ollama pull dolphin-llama3:8b
ollama pull dolphin-llama3:70b

⚠️ China download speeds: Ollama pulls from international servers. Expect 2-5 MB/s, so an 18 GB model takes roughly 1 to 2.5 hours. Set long timeouts and let Ollama resume partial downloads automatically.
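
The download-time estimate is simple arithmetic: size divided by throughput. A sketch with a hypothetical eta_minutes helper, taking 1 GB as 1000 MB (an assumption about how the sizes are reported):

```shell
#!/bin/sh
# Hypothetical helper: estimated download time in whole minutes.
#   $1 = model size in GB, $2 = sustained speed in MB/s
eta_minutes() {
    LC_ALL=C awk -v gb="$1" -v mbs="$2" \
        'BEGIN { printf "%d\n", gb * 1000 / mbs / 60 }'
}

eta_minutes 18 2   # prints 150 (2.5 hours at 2 MB/s)
eta_minutes 18 5   # prints 60  (1 hour at 5 MB/s)
```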

ROCm GPU Setup

AMD RX 7900 XTX runs inference via ROCm. Key environment variable:

export HSA_OVERRIDE_GFX_VERSION=11.0.0

This tells the ROCm runtime to treat the card as gfx1100 (GFX version 11.0.0), which is the RX 7900 XTX's architecture. Without it, Ollama on this setup falls back to CPU, which is painfully slow on these models.

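💡 Verify GPU is working: run ollama ps while a model is loaded. If the PROCESSOR column reports GPU (for example 100% GPU) rather than CPU, GPU acceleration is active. Alternatively, check the service log with journalctl --user -u ollama, or /var/log/syslog, for "ROCm" entries.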
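
The GPU check can also be scripted. This sketch assumes that ollama ps prints a header row followed by one row per loaded model, with a PROCESSOR column containing text such as "100% GPU" or "100% CPU"; the exact layout may differ between Ollama versions. uses_gpu is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `ollama ps` output on stdin and succeeds if
# any model row (everything after the header) mentions GPU.
# A partially offloaded model (a CPU/GPU split) also counts as a match.
uses_gpu() {
    awk 'NR > 1 && /GPU/ { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage (run manually, with a model loaded):
#   ollama ps | uses_gpu && echo "GPU in use" || echo "CPU only or nothing loaded"
```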

Open WebUI

ChatGPT-like interface for Ollama. Students never need to touch the terminal.

Docker Compose

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
      - WEBUI_AUTH=False
    volumes:
      - open-webui:/app/backend/data
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"

volumes:
  open-webui:
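
Bringing the stack up and waiting for the web interface might look like the sketch below. wait_for_http is a hypothetical helper; it assumes curl is installed and that the compose file above is saved in the current directory, so the UI is published on port 3000.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers or the attempts run out.
#   $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 2)
wait_for_http() {
    wait_url="$1"; wait_tries="${2:-30}"; wait_delay="${3:-2}"
    wait_i=0
    while [ "$wait_i" -lt "$wait_tries" ]; do
        if curl -fsS -o /dev/null --max-time 5 "$wait_url"; then
            return 0
        fi
        wait_i=$((wait_i + 1))
        sleep "$wait_delay"
    done
    return 1
}

# Usage (run manually):
#   docker compose up -d
#   wait_for_http http://localhost:3000 && echo "Open WebUI is up"
```

The first start can take a while because the image has to be pulled, so a generous attempt count is a reasonable default.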

💡 Why ghcr.io? Docker Hub is blocked in China. ghcr.io (GitHub Container Registry) works reliably. The image is ~6.7 GB.

Access

Students open http://10.39.26.217:3000 (or whatever the machine's LAN IP is). No login required. Select a model from the dropdown and chat.

Useful Prompts

These work well for educational use with llama3.1:8b and qwen3:30b:

For students

"Explain photosynthesis like I'm 12 years old. Use simple words and analogies."

"Help me write a 5-paragraph essay about [topic]. Give me an outline first."

"Translate this Chinese text to English, then explain any cultural references I might miss."

"I'm stuck on this math problem: [paste problem]. Walk me through it step by step without giving me the answer."

"Quiz me on World War 2. Ask 5 questions, then check my answers."

For teachers

"Generate 10 discussion questions about [book] for 10th graders."

"Create a rubric for grading creative writing assignments. 4 levels: Exceeds, Meets, Approaching, Below."

"Suggest 5 differentiation strategies for teaching [topic] to mixed-ability classes."

"Write a parent email template explaining why we're using AI tools in class."

School vs Personal Use

Two of the models are uncensored (Dolphin series). They can generate content unsuitable for students. Our setup keeps them available but not default:

School default: llama3.1:8b, qwen3:30b, qwen2.5:14b
Personal use: dolphin-llama3:8b, dolphin-llama3:70b

⚠️ Model inventory matters: Maintain MODEL_INVENTORY.md, tracking each model's size, purpose, and school-appropriateness. Review it before exposing any model to students.
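
A first draft of the school/personal split could be generated from the installed model names. This is a sketch under one assumption taken from the list above: models whose names start with dolphin- are personal-only, and everything else is school-appropriate. classify_models is a hypothetical helper, and its output still needs a human review.

```shell
#!/bin/sh
# Hypothetical helper: reads model names on stdin (one per line) and prints
# "name<TAB>school" or "name<TAB>personal".
# Assumption: the "dolphin-" prefix identifies the uncensored models.
classify_models() {
    while IFS= read -r model; do
        [ -n "$model" ] || continue   # skip blank lines
        case "$model" in
            dolphin-*) printf '%s\tpersonal\n' "$model" ;;
            *)         printf '%s\tschool\n' "$model" ;;
        esac
    done
}

# Usage (run manually; skips the header row of `ollama list`):
#   ollama list | awk 'NR > 1 { print $1 }' | classify_models
```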
