Ollama Setup
Ollama is the engine: it downloads, serves, and runs LLMs locally. An older release, 0.24.0, is installed as a user-level systemd service and works well with ROCm.
Installation
# User-level systemd service
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/ollama.service << 'EOF'
[Unit]
Description=Ollama LLM Server (Local AI Stack)
After=network.target
[Service]
Type=simple
Environment="OLLAMA_HOST=0.0.0.0"
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
ExecStartPre=/bin/sh -c 'sg render -c "echo render group ok"'
ExecStart=/bin/sh -c 'sg render -c "/usr/local/bin/ollama serve"'
Restart=always
RestartSec=10
[Install]
WantedBy=default.target
EOF
systemctl --user daemon-reload
systemctl --user enable ollama
systemctl --user start ollama
The service runs as user wm, has the correct GPU group access, and "just works."
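One way to confirm the service actually came up is to ask systemd for the unit state and then query the HTTP API. The sketch below assumes Ollama's default port 11434 and its /api/version endpoint; the helper name ollama_version is illustrative and not part of the original setup.

```shell
# Print the version reported by a running Ollama server.
# Assumes the default port 11434; pass another host:port as $1 if needed.
ollama_version() {
    addr="${1:-localhost:11434}"
    # /api/version returns a small JSON object with a "version" field;
    # sed extracts the value without needing jq.
    curl -fsS "http://${addr}/api/version" |
        sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (on the machine running the service):
#   systemctl --user is-active ollama   # prints "active" when the unit is running
#   ollama_version                      # prints the server version
```

If the unit is active but the request fails, journalctl --user -u ollama is usually the first place to look.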
Models moved to 4TB SSD
# Move models to free root partition
rsync -aP ~/.ollama/models/ /media/wm/2TB-Data/ollama-models/
# Backup old location
mv ~/.ollama/models ~/.ollama/models.bak
# Create symlink
ln -s /media/wm/2TB-Data/ollama-models ~/.ollama/models
# Verify
ls -la ~/.ollama/models
# Remove backup after verification
rm -rf ~/.ollama/models.bak
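Before running the final rm above, it may be worth checking that the copy is byte-for-byte complete rather than relying on a directory listing. A minimal sketch, assuming GNU or BSD diff is available; same_tree is a hypothetical helper name.

```shell
# Return 0 only if both directory trees contain the same files
# with identical content. diff -r walks the trees recursively and
# -q reports only whether files differ, not how.
same_tree() {
    diff -rq "$1" "$2" >/dev/null
}

# Usage with the paths from the steps above:
#   same_tree ~/.ollama/models.bak /media/wm/2TB-Data/ollama-models \
#       && echo "copy verified, safe to remove the backup"
```

Comparing tens of gigabytes this way reads every file twice, so expect it to take a few minutes.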
Models
Five models installed, totaling ~72 GB. All stored on the 4TB SSD.
| Model | Size | Parameters | Purpose | Use |
|---|---|---|---|---|
| llama3.1:8b | 4.9 GB | 8B | Fast general chat | School + Personal |
| qwen3:30b | 18 GB | 30B | Deep reasoning, coding | School + Personal |
| qwen2.5:14b | 9.0 GB | 14B | Balanced speed/quality | School + Personal |
| dolphin-llama3:8b | 4.9 GB | 8B | Uncensored personal use | Personal only |
| dolphin-llama3:70b | 35 GB | 70B | Uncensored personal use | Personal only |
Download commands
ollama pull llama3.1:8b
ollama pull qwen3:30b
ollama pull qwen2.5:14b
ollama pull dolphin-llama3:8b
ollama pull dolphin-llama3:70b
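The five pulls can also be run as one loop that keeps going when a single download fails and reports what is missing at the end. This is a sketch rather than part of the original setup; pull_models is a hypothetical helper.

```shell
# Pull each model named on the command line and report any that fail.
pull_models() {
    failed=""
    for model in "$@"; do
        # Keep going on failure so one bad download does not block the rest.
        if ! ollama pull "$model"; then
            failed="$failed $model"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed" >&2
        return 1
    fi
}

# Usage:
#   pull_models llama3.1:8b qwen3:30b qwen2.5:14b dolphin-llama3:8b dolphin-llama3:70b
```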
ROCm GPU Setup
AMD RX 7900 XTX runs inference via ROCm. Key environment variable:
export HSA_OVERRIDE_GFX_VERSION=11.0.0
This tells the ROCm runtime to treat the card as a gfx1100 target, which is the RX 7900 XTX's architecture. Without it, Ollama falls back to CPU (painfully slow on these models).
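ollama ps lists the loaded models. If the PROCESSOR column shows GPU (for example, 100% GPU) rather than CPU, GPU acceleration is active. Alternatively, check the service log for "ROCm" entries with journalctl --user -u ollama, or look in /var/log/syslog.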
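The GPU check can be scripted. The sketch below assumes ollama ps reports where each loaded model runs using the word GPU (for example "100% GPU"); the exact layout may differ between Ollama versions, and gpu_in_use is a hypothetical helper name.

```shell
# Succeed if any loaded model is running at least partly on the GPU.
# Matches any line of `ollama ps` output that mentions GPU.
gpu_in_use() {
    ollama ps | grep -q 'GPU'
}

# Usage: load a model first, then check.
#   ollama run llama3.1:8b "hello" >/dev/null
#   gpu_in_use && echo "GPU acceleration active" || echo "running on CPU"
```

The check only means something while a model is loaded; with nothing loaded it reports CPU.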
Open WebUI
ChatGPT-like interface for Ollama. Students never need to touch the terminal.
Docker Compose
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
      - WEBUI_AUTH=False
    volumes:
      - open-webui:/app/backend/data
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"
volumes:
  open-webui:
ghcr.io (GitHub Container Registry) works reliably. The image is ~6.7 GB.
Access
Students open http://10.39.26.217:3000 (or whatever the machine's LAN IP is). No login required. Select a model from the dropdown and chat.
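Because the LAN IP can change, the address to hand out can be derived on the host instead of hard-coded. A minimal sketch, assuming a Linux host where hostname -I lists the machine's addresses with the LAN address first, and port 3000 as published in the Compose file; webui_url is a hypothetical helper.

```shell
# Print the URL students should open, built from the host's first
# reported IP address and the published Open WebUI port.
webui_url() {
    ip=$(hostname -I | awk '{print $1}')
    echo "http://${ip}:3000"
}

# Usage:
#   webui_url
#   curl -fsS -o /dev/null "$(webui_url)" && echo "Open WebUI is reachable"
```

On hosts with several interfaces the first address may not be the LAN one, so check the printed URL before sharing it.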
Useful Prompts
These work well for educational use with llama3.1:8b and qwen3:30b:
For students
"Explain photosynthesis like I'm 12 years old. Use simple words and analogies."
"Help me write a 5-paragraph essay about [topic]. Give me an outline first."
"Translate this Chinese text to English, then explain any cultural references I might miss."
"I'm stuck on this math problem: [paste problem]. Walk me through it step by step without giving me the answer."
"Quiz me on World War 2. Ask 5 questions, then check my answers."
For teachers
"Generate 10 discussion questions about [book] for 10th graders."
"Create a rubric for grading creative writing assignments. 4 levels: Exceeds, Meets, Approaching, Below."
"Suggest 5 differentiation strategies for teaching [topic] to mixed-ability classes."
"Write a parent email template explaining why we're using AI tools in class."
School vs Personal Use
Two of the models are uncensored (Dolphin series). They can generate content unsuitable for students. Our setup keeps them available but not default:
- School default: llama3.1:8b, qwen3:30b, qwen2.5:14b
- Personal use: dolphin-llama3:8b, dolphin-llama3:70b
MODEL_INVENTORY.md tracks each model's size, purpose, and school-appropriateness. Review it before exposing models to students.
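The review can be backed by a quick scripted check that flags any installed model outside the school-default list. This is a sketch: is_school_model is a hypothetical helper, its allowlist simply mirrors the three school-default models named above, and the usage assumes ollama list prints a header row followed by one model name per line in the first column.

```shell
# Succeed for models on the school-default list, fail for anything else.
is_school_model() {
    case "$1" in
        llama3.1:8b|qwen3:30b|qwen2.5:14b) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: list installed models and flag the ones needing review.
#   ollama list | awk 'NR > 1 {print $1}' | while read -r m; do
#       is_school_model "$m" || echo "review before exposing: $m"
#   done
```

An allowlist is used rather than a blocklist so that a newly pulled model is flagged until someone explicitly approves it.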