By the end of these four weeks, every student has run a local AI model on the machine they built, written a script that talks to it, and completed a final project of their own design. The final session is a demo day — each student presents what they built and what they learned across the entire six-month course.
The arc of Phase 6 is deliberate. Week 21 demystifies AI honestly — not as magic, not as threat, but as a tool with knowable properties. Week 22 installs and runs it. Week 23 connects it to scripts. Week 24 is the final project and the course close.
One hardware note before starting: Ollama models require RAM. The smallest useful models (llama3.2:1b, gemma2:2b) need 2–4GB free RAM. The recommended models (llama3.2:3b, gemma2:9b) need 4–8GB. Students who built with 8GB RAM can run 3B models comfortably. Students with 16GB can run 7B and 9B models. Check available RAM with free -h before session 82 and advise students which model to pull based on their build.
The final project has no single correct answer. A student who builds a script that summarises their class notes using the local model has built something genuinely useful. A student who builds a menu tool for their own workflow has too. The criterion is: does it solve a real problem you actually have?
This session has one goal: replace vague beliefs about AI with a concrete, accurate mental model. Students who finish this session understand what a language model is, what it can and cannot do, and why running one locally on their own machine is both possible and meaningful. No hype. No catastrophising. Just the actual mechanism.
Ask the class: "What do you think an AI model actually is? How does it work?" Write every answer on the board without correcting. Collect the full range — from "it thinks like a human" to "it searches the internet" to "nobody knows." Then: "By the end of this session you will be able to give a mechanistically correct answer to that question. Let's find out which of these are right."
What a language model actually is. A large language model (LLM) is a mathematical function that takes a sequence of tokens as input and outputs a probability distribution over the next token. That is the complete description. It does not think, it does not understand, it does not have beliefs or goals. It is a very large function that has been trained on a very large amount of text to be good at predicting what comes next.
Tokens. Text is broken into tokens before the model sees it. Tokens are roughly 3–4 characters each — sometimes a whole word, sometimes part of a word. "Hello world" might be two tokens. "antidisestablishmentarianism" might be several. The model processes tokens, not characters or words. This is why models sometimes make strange errors at the character level — they don't see characters directly.
Parameters. The "size" of a model is measured in parameters — the numbers that define the function. A 7 billion parameter model has 7,000,000,000 numbers, typically stored as 4-byte floats. 7B × 4 bytes = ~28GB. Quantized to 4-bit (the Q4 format Ollama uses by default), this is ~4GB. This is why RAM matters: the entire model must be loaded into RAM to run inference efficiently.
Training vs inference. Training is the process that determines the parameters — showing the model vast amounts of text and adjusting parameters to make the model better at predicting the next token. Training requires enormous compute: thousands of GPUs for weeks. Inference is running the trained model to generate a response — much cheaper. What we are doing is inference only. We are not training anything.
Context window. The model can only "see" a fixed amount of text at once — the context window. Older models: 4,096 tokens. Modern models: 128,000+ tokens. Text beyond the context window is simply not available to the model. This is why "it forgot what we discussed earlier" is a real limitation, and why long conversations eventually lose coherence.
What the model can and cannot do. It can: generate fluent, coherent text that appears knowledgeable. It cannot: access the internet, access your files (unless you give them to it), verify facts against ground truth, or know what happened after its training cutoff. It can appear confident while being wrong. It has no persistent memory across conversations. It does not learn from your interactions with it (inference does not change parameters).
Why local? Running a model locally means: your prompts and the model's responses never leave your machine. No API key needed. No cost per token. No internet required. The model runs on your CPU (or GPU if you have one). It is slower than cloud APIs but completely private and completely free. This is what makes it meaningful to run on the machine you built.
Return to the board. For each belief written during the warm-up, evaluate it against the mechanism just described — which were accurate, which were partly right, and which were myths?
The honest capability table is important to spend time on. Students who believe the model "knows everything" will be confused and disappointed when it hallucinates confidently. Students who have been told AI will replace all human work will find the limitations reassuring. The goal is neither enthusiasm nor dismissal — it is accurate expectations. A tool you understand correctly is more useful than one you have mythologised.
The local vs cloud distinction matters for these students specifically: they built a computer. They control it. A model running on that computer processes their data on their hardware. No corporation has their conversation logs. For 15–18 year olds, the privacy dimension of running AI locally is often more compelling than the cost argument. Lead with whatever resonates with the class.
Run free -h on the projector. Connect the numbers back to session 81 — available RAM determines which model the student can run. Calculate: if available RAM is 6GB, a Q4 4B model (~2.5GB) fits comfortably. An 8B model (~5GB) is tight. Write the recommended model for each RAM tier on the board before class starts.
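If it helps to make the tiers concrete, a short sketch like the following (not part of the course materials — a hedged example using the thresholds from the hardware note above) reads the "available" column from free and prints a suggested tier:

#!/bin/bash
# Suggest a model tier from available RAM — a minimal sketch, assuming the
# thresholds from the session notes (8GB builds -> 3B models, 16GB -> 7B/9B).
set -euo pipefail

avail_mb=$(free -m | awk '/^Mem:/ {print $7}')   # the "available" column, in MB

if [ "$avail_mb" -ge 8000 ]; then
    echo "~${avail_mb}MB available: 7B-9B models fit (e.g. mistral:7b, gemma2:9b)"
elif [ "$avail_mb" -ge 4000 ]; then
    echo "~${avail_mb}MB available: 3B-4B models fit (e.g. llama3.2:3b)"
else
    echo "~${avail_mb}MB available: stay with 1B-2B models (e.g. llama3.2:1b, gemma2:2b)"
fi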
Ollama is a tool that runs LLMs locally. It handles: downloading model files, managing model storage, running a local HTTP server that provides an API, and a CLI for interactive conversations. It abstracts the complexity of model inference — you do not need to know about CUDA, PyTorch, or model formats. You install it, pull a model, and run it.
Installation: one command from the Ollama website:
curl -fsSL https://ollama.com/install.sh | sh
This downloads and installs Ollama. After installation, Ollama runs as a service on port 11434. Verify: ollama --version and systemctl status ollama.
Choosing a model. Model names follow the format name:tag, where the tag encodes the parameter count (and sometimes the quantization): llama3.2:3b, gemma2:2b, mistral:7b. Recommended for this course based on RAM: llama3.2:1b or gemma2:2b for tight builds with only 2–4GB free, llama3.2:3b for 8GB builds, mistral:7b or gemma2:9b for 16GB builds.
Model files are stored in ~/.ollama/models/. They can be several gigabytes. Disk space matters: a 3B model is ~2GB, a 7B model is ~4GB. Check with du -sh ~/.ollama/ after pulling.
Tasks:
- ollama --version. What version is installed?
- systemctl status ollama. Is it running? On which port?
- ollama run modelname. You are now in an interactive conversation. Ask it: "What are you?" Read the answer critically — does it match what we learned in session 81?
- /bye to exit. Run du -sh ~/.ollama/. How much disk does the model use?
- htop while the model is running (open a second terminal). What does CPU and RAM usage look like during inference?
- Test the model's limitations: ask it about something that happened recently, after its training cutoff, and watch how it responds.

The model download in task 3 takes 3–10 minutes depending on network speed and model size. Use this time productively — ask students to observe htop and watch memory climb as the model loads. When the model first responds, the first token is slow (the model is loading into RAM) and subsequent tokens are faster. This is a visible demonstration of the RAM loading process from session 81.
Task 6 — testing the model's limitations — is important for maintaining calibrated expectations. A model that gets a recent news question wrong is not broken; it is behaving correctly given that its training data has a cutoff. Students who understand this will use the tool more effectively than those who lose trust when it makes an error.
The interactive ollama run mode is useful for exploration, but it cannot be automated. To use a model in a script, you need to send a prompt and receive a response programmatically. Ollama exposes a REST API on localhost:11434 that accepts JSON and returns JSON.
Non-interactive single-prompt mode:
echo "What is the Linux kernel?" | ollama run llama3.2:3b
Pipes a prompt directly to the model and prints the response. Useful for quick one-off queries, but limited for scripts: the output is plain text with no metadata, and there is no clean way to set options or handle errors programmatically. The REST API provides both.
The Ollama REST API:
curl http://localhost:11434/api/generate \
-d '{
"model": "llama3.2:3b",
"prompt": "Explain what a process is in Linux",
"stream": false
}'
Returns JSON with a "response" field containing the model's answer. "stream": false waits for the complete response before returning. Without it, the API streams tokens as they are generated. The API works with the model already loaded — subsequent requests are fast.
Parsing the response with jq:
curl -s http://localhost:11434/api/generate \
-d '{"model":"llama3.2:3b","prompt":"Say hello","stream":false}' \
| jq -r '.response'
jq extracts the response field. -r outputs raw text without JSON quotes. Install jq if needed: sudo apt install jq.
echo "What is bash?" | ollama run llama3.2:3b. Compare the speed to the interactive mode — what is different?RESPONSE=$(curl -s ... | jq -r '.response'). Echo it.PROMPT=$(cat prompt.txt). What challenge does this create with JSON quoting?Task 4 — multi-line prompts with JSON quoting — is a real engineering challenge. Special characters in the prompt (quotes, newlines, backslashes) break the JSON if embedded directly. The clean solution is to use jq to construct the JSON: jq -n --arg p "$PROMPT" '{"model":"llama3.2:3b","prompt":$p,"stream":false}'. Students who discover this independently have learned a genuinely useful JSON handling pattern. Students who get stuck on this are learning an important lesson: data going into JSON must be properly escaped, which is why purpose-built tools exist.
The chat endpoint. The /api/generate endpoint handles single prompts. The /api/chat endpoint handles multi-turn conversations with proper message history:
curl http://localhost:11434/api/chat -d '{
"model": "llama3.2:3b",
"stream": false,
"messages": [
{"role": "system", "content": "You are a Linux expert."},
{"role": "user", "content": "What is a process?"},
{"role": "assistant", "content": "A process is..."},
{"role": "user", "content": "How do I list them?"}
]
}'
The messages array contains the full conversation history. Each subsequent request must include all previous messages to maintain context. This is why chat interfaces store history — the model has no memory between API calls.
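To make the resend-the-history point concrete, here is a minimal sketch — assuming llama3.2:3b is pulled and jq is installed — that keeps the message history in a shell variable so a follow-up question arrives with full context:

#!/bin/bash
set -euo pipefail

MESSAGES='[{"role":"system","content":"You are a Linux expert."}]'

ask() {
    local question="$1" reply
    # append the user turn to the history
    MESSAGES=$(jq --arg q "$question" '. + [{"role":"user","content":$q}]' <<<"$MESSAGES")
    # send the full history — the API itself remembers nothing between calls
    reply=$(curl -s http://localhost:11434/api/chat \
        -d "$(jq -n --argjson m "$MESSAGES" '{model:"llama3.2:3b", messages:$m, stream:false}')" \
        | jq -r '.message.content')
    # append the assistant turn so the next question keeps its context
    MESSAGES=$(jq --arg r "$reply" '. + [{"role":"assistant","content":$r}]' <<<"$MESSAGES")
    echo "$reply"
}

ask "What is a process?"
ask "How do I list them?"   # only works because the first exchange is resent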
Useful API endpoints:
- /api/generate — single-prompt completion (used above)
- /api/chat — multi-turn conversation with message history
- /api/tags — list the models installed locally
- /api/ps — list the models currently loaded in memory
- /api/show — modelfile, parameters, and metadata for a model
Context management. The generate endpoint returns a "context" field — a compact encoding of the conversation state. You can pass this back in the next request to continue from where you left off without sending full message history. This is more efficient for long conversations but less readable than the chat format.
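A minimal sketch of the context round-trip (same assumptions as above): capture the context field from one response and pass it back with the next prompt:

#!/bin/bash
set -euo pipefail

FIRST=$(curl -s http://localhost:11434/api/generate \
    -d '{"model":"llama3.2:3b","prompt":"What is a process?","stream":false}')
echo "$FIRST" | jq -r '.response'

CTX=$(echo "$FIRST" | jq -c '.context')   # compact encoding of the conversation state

curl -s http://localhost:11434/api/generate \
    -d "$(jq -n --argjson c "$CTX" '{model:"llama3.2:3b", prompt:"How do I list them?", context:$c, stream:false}')" \
    | jq -r '.response'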
Tasks:
- curl http://localhost:11434/api/tags | jq '.'. What metadata is available for each model?
- curl http://localhost:11434/api/ps | jq '.'. What information does this return about the running model?
- Ask the model a question via /api/generate, then send a follow-up ("How do I list them?") as a brand-new request with no history. Does the answer acknowledge the first question?
- curl http://localhost:11434/api/show -d '{"name":"llama3.2:3b"}' | jq '.'. Read the modelfile and parameters — what do they configure?

Task 3 — asking a follow-up without context — is the session's most important demonstration. The model gives an answer that makes no reference to the previous question, often answering as if the follow-up were a standalone query. This makes the stateless nature of the API concrete: the model genuinely has no memory between API calls. Every call starts fresh. Chat applications create the illusion of memory by maintaining and resending conversation history. This is not a limitation or a bug — it is how stateless APIs work.
Ask: what would you actually want to automate with a local AI? Brainstorm quickly. Examples: summarise a file, explain an error message, generate a commit message, check if a script looks correct, translate something. Pick one of these or your own idea and that is what you build today.
The pattern for any script that uses the Ollama API:
#!/bin/bash
set -euo pipefail
MODEL="${OLLAMA_MODEL:-llama3.2:3b}"
API="http://localhost:11434/api/generate"
query_model() {
    local prompt="$1"
    local system="${2:-}"
    local payload
    payload=$(jq -n \
        --arg model "$MODEL" \
        --arg prompt "$prompt" \
        --arg system "$system" \
        '{model: $model, prompt: $prompt, system: $system, stream: false}')
    curl -s "$API" -d "$payload" | jq -r '.response'
}
# Use it:
RESPONSE=$(query_model "Explain: $1" "You are a concise Linux expert.")
echo "$RESPONSE"
This pattern uses jq to safely construct JSON (handles special characters), curl to send it, and jq to extract the response. The OLLAMA_MODEL environment variable allows overriding the model without editing the script.
Build one of these scripts. Each uses the query_model pattern above as a foundation.
Option A — explain.sh: takes a file as argument, reads its contents, sends them to the model with "explain what this file does in plain language." Handles: missing file, file too large (truncate with head -n 100), non-text files.
Option B — error-explain.sh: reads the last N lines of a log file or stdin, sends to the model asking for an explanation of any errors found. Useful format: "These are system log lines. Identify any errors and explain what they mean."
Option C — commit-message.sh: runs git diff --staged, sends the diff to the model, asks for a commit message in the format "type: description." Handles the case where nothing is staged.
Option D — your own idea: if you have a specific use case in mind, build it using the query_model function. Must use the same error handling and jq JSON construction pattern.
All scripts must: use set -euo pipefail, use the query_model function, handle missing arguments with a usage message, and work correctly on your actual files.
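For scope calibration, a minimal sketch of Option C follows — an assumption about how it might look, not a reference solution. It duplicates the query_model pattern above so it runs on its own (Ollama running, llama3.2:3b pulled, jq installed):

#!/bin/bash
# commit-message.sh — suggest a commit message for the staged diff (sketch)
set -euo pipefail

MODEL="${OLLAMA_MODEL:-llama3.2:3b}"
API="http://localhost:11434/api/generate"

query_model() {   # same pattern as above
    local prompt="$1" system="${2:-}"
    local payload
    payload=$(jq -n --arg model "$MODEL" --arg prompt "$prompt" --arg system "$system" \
        '{model: $model, prompt: $prompt, system: $system, stream: false}')
    curl -s "$API" -d "$payload" | jq -r '.response'
}

DIFF=$(git diff --staged)
if [ -z "$DIFF" ]; then
    echo "Nothing is staged. Run git add <files> first, then $0." >&2
    exit 1
fi

query_model "Write a one-line commit message in the format 'type: description' for this diff:
$DIFF" "You write concise git commit messages."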
Option C — the git commit message generator — is particularly useful and motivating for students who have been using git since session 75. A student who runs git add . && commit-message.sh and gets a suggested commit message is experiencing the integration of skills across the course: git (session 75), bash scripting (Phase 3 and 5), curl (session 68), jq (session 83), and the Ollama API (sessions 83–84). Highlight this integration explicitly when students share their work.
Ollama installs itself as a systemd service. This means it starts automatically at boot and runs in the background. Your scripts can query it any time without manually starting a model first. This is the same systemd knowledge from session 70, applied to a real tool.
Verifying and managing the Ollama service:
systemctl status ollama          # is it running?
sudo systemctl restart ollama    # restart after config change
sudo systemctl stop ollama       # stop it
journalctl -u ollama -f          # watch its logs
Model preloading. The first request to a model is slow because it loads the model into RAM. Subsequent requests are faster. For scripts that need fast responses, pre-load the model by sending a dummy request at startup, or use the keep_alive parameter to keep the model in memory between requests:
curl http://localhost:11434/api/generate -d '{
"model": "llama3.2:3b",
"prompt": "",
"keep_alive": "10m"
}'
This keeps the model in RAM for 10 minutes after the last request.
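To make the cold/warm difference visible, a minimal sketch (assuming llama3.2:3b is pulled) times the same request twice, once straight after a service restart and once immediately afterwards:

sudo systemctl restart ollama
sleep 2   # give the service a moment to come back up
time curl -s http://localhost:11434/api/generate \
    -d '{"model":"llama3.2:3b","prompt":"Say hello","stream":false}' > /dev/null   # cold: includes the model load
time curl -s http://localhost:11434/api/generate \
    -d '{"model":"llama3.2:3b","prompt":"Say hello","stream":false}' > /dev/null   # warm: model already in RAM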
Ollama environment configuration. /etc/systemd/system/ollama.service.d/override.conf or environment variables control Ollama behavior. OLLAMA_HOST can be set to 0.0.0.0:11434 to allow connections from other machines on the network. Do this only on trusted networks — it exposes the API without authentication.
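A minimal sketch of that override, using the drop-in path named above — again, only on a trusted network, since the API has no authentication:

sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf > /dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama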
Tasks:
- Run journalctl -u ollama -f in one terminal, send an API request in another. What does the log show?
- Time a request immediately after sudo systemctl restart ollama (cold start). Then time a second request immediately after (warm). How different are they?
- Add the query_model function to your ~/.bashrc, reload the shell, and run query_model "What is a symlink?". Now you can ask the model from any terminal session.

Task 4 — adding query_model to .bashrc — is one of the most practically useful things in Phase 6. A student who can type query_model "explain this error: $(tail -5 /var/log/syslog)" from any terminal has integrated local AI into their daily workflow. The bash function in .bashrc makes it accessible everywhere, which is the point of all that dotfile work in Phase 3. Connect this explicitly: the ~/.bashrc discipline from week 12 is paying off here.
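One way to wire that up — a sketch with hypothetical paths: save the query_model function to a file and source it from ~/.bashrc.

mkdir -p ~/bin
# save the query_model function from the pattern above into ~/bin/query_model.sh first
echo 'source ~/bin/query_model.sh' >> ~/.bashrc
source ~/.bashrc
query_model "What is a symlink?"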
The final project is a tool or system you build entirely yourself. It runs on the machine you built. It must solve a real problem you actually have. It must incorporate skills from at least three different phases of the course. It does not need to use the local AI model — though it can.
The project will be: built across sessions 88–91, documented in session 92, and presented at demo day in sessions 94–95.
Scope guidance. Four sessions of build time is approximately 4 hours. A well-scoped project fits in that time. Too large: a full web application, a complete operating system tool, anything requiring external APIs you have not set up. About right: an automated workflow that solves a real problem using bash, cron, git, and/or Ollama. A personal tool you will run every day. A system monitoring and alerting setup. A local AI assistant customised for a specific use case.
Each student completes a project design document covering: the problem it solves, what will be built, which course skills (from at least three phases) it uses, and one technical risk they have identified.
Write this in ~/projects/linux-course/phase6/PROJECT.md. This document is part of the final grade.
Each student gives a 90-second pitch: what the problem is, what they are building, and one technical risk they identified. The class asks one question each. You give feedback on scope — is this achievable in 4 sessions? Does it genuinely use course skills? Is the problem real?
The most common scoping mistakes come in two forms: projects that are conceptually interesting but technically shallow ("an AI that chats with you"), and projects that are too technically ambitious ("a full web dashboard"). Push back on both. The first has no real problem to solve. The second cannot be finished. The best projects are personal tools with clear value: "a script that monitors my machine's health, emails me a summary, and if there is an error explains it using the local AI." That is specific, valuable, achievable, and uses cron, bash, Ollama, and email — all course content.
By the end of this session: the minimum viable version works. It may be rough, it may lack features, but the core mechanism functions. A student who cannot demo a working minimum by the end of session 88 has a scope problem or a blocking technical issue — both need to be identified now, not in session 91.
Students build independently. Circulate and check on each student at least twice. Key questions to ask during circulation: does the minimum viable version work yet? What is the single next step? Is anything blocking you? Is the scope still realistic for the remaining sessions?
Each student: one sentence on what works so far, one sentence on what is next. Identify anyone who is blocked and address it at the start of session 89.
Students who are blocked at the end of session 88 usually have one of three problems: a scope that is too large for their current skill level (break it down, descope features), a technical dependency they cannot resolve (help them find a simpler equivalent), or decision paralysis (the problem is real but they cannot commit to a specific technical approach — help them choose and start). All three are solvable but need to be caught now. A student who is still designing at the end of session 89 will not finish.
All core features work. The project handles its primary use case correctly. Error handling is in place. By the end of session 89, the project should be demonstrable — not polished, but functional.
Address any issues identified at the session 88 standup directly before starting the timer. Students who were blocked get targeted help in the first 5 minutes.
Sprint 2 is where most students encounter their first real technical problem — something that works in isolation but fails when integrated. This is normal and valuable. The debugging skills from session 72 apply here: describe the problem precisely, gather information (bash -x, journalctl, echo statements), hypothesise, test one change. Students who apply the method will unblock themselves. Students who thrash randomly need coaching to slow down and be systematic.
The project is complete and tested. Edge cases are handled. The code is committed to git. By the end of session 90, building is done — session 91 is for testing and refinement only, not new features.
Ask: who has something working that they can explain? Ask: who is stuck on a specific problem? Pair the helpers with the stuck students for the first 10 minutes of build time. Peer debugging is often faster than teacher debugging, and teaching is one of the best ways to consolidate understanding.
Students who finish early in sprint 3 should: (1) test their project with unexpected input — empty arguments, missing files, wrong types; (2) add logging if not already present; (3) commit everything to git with meaningful commit messages; (4) start the PROJECT.md documentation. Do not allow early finishers to add new features they cannot finish — "done and solid" is better than "expanded and broken."
No new features. Today is testing, edge case handling, code cleanup, and preparing for documentation. The project should be in its final state by the end of this session.
Work through this checklist for your project:
- Run it with no arguments, with a missing file, and with the wrong kind of input. Does it fail cleanly with a useful message?
- Run bash -x on it — are there any unexpected behaviors?

Code cleanup: remove any debugging echo statements left in accidentally. Ensure all variables are quoted. Add any missing comments. Verify the comment header at the top is accurate and complete.
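A minimal sketch of what the testing pass can look like, using a hypothetical script name (myproject.sh) — substitute the actual project:

./myproject.sh                              # no arguments: usage message and non-zero exit?
./myproject.sh /path/that/does/not/exist    # missing file: clean failure with a useful message?
bash -x ./myproject.sh real-input           # trace execution: any unexpected behavior?
echo "exit code: $?"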
Swap projects with a partner. Each reviews the other's code against the testing checklist. Write specific, constructive feedback in the partner's PROJECT.md. Focus: does it do what it claims? Are the edge cases handled? Is the code readable?
Good documentation is what separates a script you use once from a tool you use for years. Today each student writes the complete documentation for their project — not just a comment block, but a README that would allow someone unfamiliar with the project to install, configure, and use it.
Write ~/projects/linux-course/phase6/README.md with the following sections: what the project does and the problem it solves, how to install it, how to configure it, how to use it (with examples), and the course skills it draws on.
Commit the README and all project files to git. The git history should show the progression of the project across sprints.
The "course skills used" section is valuable both as documentation and as a reflection exercise. Students who list skills from Phases 1–6 are demonstrating integration. Students who can only list Phase 6 skills have either a narrow project or have not noticed the connections. Help them find the connections they missed — filesystem knowledge from Phase 1 is in every path operation in every script they wrote.
Show the screenshots students took in session 43 of their terminal setup at week 11. Then show their current terminal. Ask: what changed? What did you not know you could do when you took that screenshot?
Pull up their Phase 3 synthesis script and their Phase 5 synthesis script side by side. Read both. Ask the class: what is different? What does the Phase 5 script do that the Phase 3 script could not?
Write individually first, then share. For each question, spend 5 minutes writing a genuine answer — not what you think sounds good, but what you actually think.
Each student shares their answer to question 6 — the most transferable insight. These answers often reveal what the course actually taught beyond its technical content: problem decomposition, persistence with ambiguous problems, documentation discipline, reading errors before panicking. Name these explicitly. They are the course's real outcomes.
Question 6 is the most important question of the retrospective. A student who says "I learned that when something doesn't work, reading the error message carefully usually tells you what is wrong" has internalized something that applies to chemistry, to car repair, to relationship conflicts, to any domain where systematic diagnosis outperforms emotional reaction. This is the course's deepest goal — not Linux fluency, but epistemic confidence. Name it when it appears.
Each student: 4 minutes presentation + 2 minutes questions. Total: 6 minutes per student. For a class of 8 students: 48 minutes covers all 8. For larger classes, split across sessions 94 and 95.
Presentation must cover: the problem the project solves, a live demonstration, one thing that broke during development and how it was fixed, and which course skills it draws on.
Audience during each presentation: write one genuine observation or question. Not a compliment — an observation or question that shows you were listening. Questions are asked in the 2 minutes after each presentation.
The "one thing that broke and how you fixed it" requirement is deliberately included. It normalises debugging as part of building, not as evidence of failure. Students who present their debugging process with confidence are demonstrating exactly the mindset the course was designed to build. Students who try to pretend nothing broke should be gently pushed: "everything breaks during development — what was the most interesting thing you had to fix?"
After all presentations: each student submits written peer reviews for 2 other students' projects. Format: one paragraph per review covering what the project does well technically, one specific improvement that would make it more robust or useful, and one question they still have about how it works. Reviews are written respectfully and specifically — no vague praise, no unactionable criticism.
Collect all peer reviews. Share them with the relevant student after the session.
Peer review is a professional skill that most students have rarely practiced in a technical context. The format requirement — specific improvement and specific question — prevents the reviews from being meaningless. A student who writes "one improvement would be adding -v flag for verbose output because right now you cannot tell what the script is doing when it runs silently" has understood the project and thought about it seriously. That quality of engagement is what you are looking for.
Walk through the course's progression on the board — not a list of topics but a narrative of capability. Where students were in session 1: the computer was a black box. Where they are now: they built the box, installed the OS, configured the environment, automated workflows, connected to remote machines, and ran a local AI model. Name each phase and its contribution to this capability.
Show their Phase 3 script and their Phase 6 project side by side. The difference in complexity, robustness, and ambition is the visible measure of growth.
Point at specific paths from here:
Deeper Linux: Linux From Scratch (build a Linux system from source), Arch Linux (manual installation with full control), advanced systemd, network administration, firewall configuration with ufw/iptables.
Programming: Python is the natural next step — everything they have done in bash maps to Python with better data structures and a larger ecosystem. The terminal fluency from this course makes Python development much faster. Recommended: "Automate the Boring Stuff with Python" by Al Sweigart — freely available online, directly continues where this course leaves off.
Systems administration: Proxmox for virtualisation (you run Proxmox — show it to them). Docker for containerisation. Ansible for configuration management. The skills from Phases 4–5 are direct prerequisites.
Security: Kali Linux for ethical hacking, TryHackMe and HackTheBox for structured learning. The permission model, network understanding, and log reading from this course are the foundation.
AI and machine learning: Python + PyTorch or TensorFlow. Fine-tuning local models with Ollama's Modelfile. Building applications on top of LLM APIs. The local AI experience from Phase 6 is a practical foundation.
Each student writes a personal roadmap — not an assignment, but a genuine plan: which of the paths above interests them most, what they want to learn or build next, and the first concrete step they will take.
These do not get collected. They are for the student.
The machine on your desk is yours. Everything on it is yours — your configurations, your scripts, your dotfiles, your git history, your local AI model. You understand what it is made of, how it works, how to fix it when it breaks, and how to make it do new things.
That is not a Linux skill. That is a way of approaching systems. It applies to every piece of technology you will ever use. The specific commands will change. The kernel will be updated. New tools will exist. The habit of reading error messages before panicking, of looking for the root cause not the symptom, of writing things down so future-you can understand present-you — those do not change.
You have built something. You have learned something. Whatever comes next: you know you can figure it out.
96 sessions. 6 months. 1 machine built from parts. 1 OS installed from scratch. Filesystem, terminal, permissions, scripting, customisation, hardware, networking, SSH, processes, services, logs, archives, version control, local AI. Final project designed, built, tested, documented, and presented.
The close speech above is a script if you want one. Adjust it to be yours. The important thing it conveys: the skills transfer. Linux fluency is valuable but not the point. The point is that they now have evidence — across 6 months and a real machine — that they can approach an unknown technical system, break it down, build understanding incrementally, make mistakes, fix them, and produce something that works. That evidence is what makes the next challenge less frightening.
Keep the Phase 3 scripts. Keep the demo day recordings if you made them. In a year or two, if you teach this course again, the contrast between a first-day student and a session-96 student is your best recruiting material. The journey is the argument for the course.
Phase 6 of 6 · Linux seminar · Kubuntu LTS · Ages 15–18
Course complete — 96 sessions across 24 weeks