01 · Fundamentals
No CS degree required. Just the principles, so you work with AI instead of against it.
Reading time about 8 minutes · Level: Start · As of May 2026
Section 1 · What is an LLM
LLM stands for Large Language Model. It doesn't know anything. It calculates which word is most likely to come next, based on what it has seen across billions of texts. That's why it sounds so confident even when it's wrong. You stay the reviewer.
An LLM doesn't store facts like an encyclopaedia. It learned patterns of how language works and predicts the next word from probabilities, with a bit of randomness in the pick. That's why the same question can produce slightly different answers twice in a row.
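A minimal sketch of why answers vary. The probabilities below are made up for illustration and don't come from a real model; real LLMs work on tokens and far larger vocabularies. The point is only that a weighted random pick gives different results across runs, while always taking the top word does not.

```python
import random

# Hypothetical next-word probabilities after the prompt "The capital of France is".
# Made-up numbers for illustration, not taken from any real model.
NEXT_WORD_PROBS = {"Paris": 0.90, "located": 0.06, "a": 0.03, "Lyon": 0.01}


def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


def most_likely_word(probs):
    """Always pick the single most probable word (no randomness)."""
    return max(probs, key=probs.get)


rng = random.Random()
answers = [sample_next_word(NEXT_WORD_PROBS, rng) for _ in range(10)]
print(answers)                             # varies from run to run
print(most_likely_word(NEXT_WORD_PROBS))   # always the same word
```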
When the model doesn't know something, it guesses anyway. With full conviction. Names, numbers, sources and dates are the most common failure modes. Always cross-check.
The LLM is the engine. The chatbot (e.g. ChatGPT, Claude.ai) is the surface you use to operate the engine. A search engine searches real sources. Mixing them up means expecting things from AI it doesn't deliver.
Every model can only hold a limited amount of text in mind at once, measured in tokens (see Section 2). Claude Opus 4.7 holds a million tokens; smaller models hold less. Dump a whole book series in and quality drops off at the end.
Section 2 · What is a token
Tokens are the units AI thinks and computes in. A token is usually a chunk of a word, often roughly three-quarters of a word, sometimes just a letter or a punctuation mark. You pay per token. Reading and writing. Keep this in mind and you build cheaper.
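If you want a quick feel for token counts, the three-quarters rule can be turned into a rough estimator. This is a sketch under that one assumption; `estimate_tokens` is a hypothetical helper, real tokenizers count differently, and the rule fits English better than other languages.

```python
def estimate_tokens(text, words_per_token=0.75):
    """Rough token estimate from the word count. Rule of thumb only."""
    word_count = len(text.split())
    return round(word_count / words_per_token)


briefing = "Say what you want, why you want it, and what the result should look like."
print(estimate_tokens(briefing))
```

Good enough for budgeting. For exact numbers, use the tokenizer or the usage figures your provider reports.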
The German word “Wahrscheinlichkeitsverteilung” (probability distribution) breaks into multiple tokens. A short “ja” (yes) is one token. German texts are less token-efficient on average than English, because German compounds get split apart.
What you send into the model (input) is usually cheaper than what comes back (output). Claude Sonnet 4.6 charges 3 dollars per million input tokens and 15 dollars per million output. Factor 5. Source: Anthropic Pricing Docs, May 2026.
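The arithmetic as a small sketch, using the two prices quoted above. The function name and the example token counts are assumptions for illustration.

```python
# Prices quoted above for Claude Sonnet 4.6, in USD per million tokens.
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def request_cost_usd(input_tokens, output_tokens):
    """Cost of one request: input and output are billed at different rates."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Example: a 2,000-token prompt that produces a 500-token answer.
print(f"${request_cost_usd(2_000, 500):.4f}")
```

In this example the short answer costs more than the longer prompt. That's the factor 5 at work.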
If your model has a million-token window, that's the maximum for everything combined: your input, the model's replies and the conversation so far. Once the window is full, the start drops out or quality breaks down.
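What "the start drops out" can look like, as a sketch. It assumes the simplest strategy, dropping the oldest messages first, and uses a rough word-based token estimate. Real chat products may summarise or handle this differently; both function names are hypothetical.

```python
def estimate_tokens(text):
    """Rough estimate: about three-quarters of a word per token."""
    return round(len(text.split()) / 0.75)


def trim_to_window(messages, max_tokens):
    """Keep the most recent messages that fit into max_tokens; drop the oldest."""
    kept = []
    used = 0
    for message in reversed(messages):      # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break                           # everything older is dropped too
        kept.append(message)
        used += cost
    return list(reversed(kept))             # restore chronological order


history = ["first briefing text", "second message here", "latest question now"]
print(trim_to_window(history, max_tokens=8))
```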
Instead of restarting every conversation from scratch, put your standard briefings into a folder. Cowork and ChatGPT Projects load them automatically — no more repeating yourself or re-explaining context.
Section 3 · Prompting for tasks
Most people talk to AI like a search engine. Three words, hit enter, done. You get mediocrity. Say instead what you want, why you want it, and what the result should look like. Give an example if you have one. The effort up front saves you three rounds of corrections later.
Tell the AI who it is (e.g. “You are a copywriter”), what to do (e.g. “Write a headline”), in which format (e.g. “Maximum 8 words, Plus Jakarta Sans”), and give an example. These four pieces are the scaffold for 90 % of prompts.
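If you reuse the scaffold often, it can live in a tiny helper. A sketch only: `build_prompt` and its parameter names are made up here, and the labels are one possible layout, not a required format.

```python
def build_prompt(role, task, output_format, example=None):
    """Assemble the four pieces: role, task, format, and an optional example."""
    parts = [
        f"You are {role}.",
        f"Task: {task}",
        f"Format: {output_format}",
    ]
    if example:
        parts.append(f"Example: {example}")
    return "\n".join(parts)


prompt = build_prompt(
    role="a copywriter",
    task="Write a headline",
    output_format="Maximum 8 words",
    example="[your example headline]",
)
print(prompt)
```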
Describe what the finished result should look like before giving detailed steps. Otherwise the model optimises for the wrong intermediate result. Example: “At the end I want a LinkedIn caption in 5 lines, in my voice.” Then the content.
For knowledge questions, add: “Clearly mark where you are unsure, and tell me which parts I should double-check.” It works surprisingly well.
Break big tasks into 3–5 steps. First plan, then draft, then polish. Check briefly after each step. Try to wrap it all into one prompt and you get mush back.
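The plan, draft, polish rhythm as a sketch. `ask_model` stands for whatever sends a prompt to your AI and returns text, and `approve` is your quick check after each step. Both are hypothetical stand-ins, not a real API.

```python
def run_in_steps(task, ask_model, approve, steps=("plan", "draft", "polish")):
    """Run a task step by step and stop as soon as a check fails."""
    results = []
    current = task
    for step in steps:
        prompt = f"Step '{step}' for this task. Build on the material below.\n\n{current}"
        current = ask_model(prompt)
        results.append((step, current))
        if not approve(step, current):
            break  # stop here and fix the prompt before moving on
    return results


def fake_model(prompt):
    """Stand-in for a real model call, so the sketch runs on its own."""
    return f"[model output for: {prompt.splitlines()[0]}]"


results = run_in_steps(
    "Write a newsletter intro about tokens",
    fake_model,
    approve=lambda step, text: True,
)
for step, text in results:
    print(step, "->", text)
```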
Three prompt templates to get you started. Drop them straight into the AI of your choice. Square brackets are where you fill in your own content.
You are an experienced LinkedIn writer in the style of [name or example text].
Task: write a LinkedIn post on the following topic:
[topic]
Format:
- Hook in a single line
- Body 5–8 lines, short and long sentences mixed
- Closer with a real question to the community
- Hashtag set at the end, maximum 5 hashtags
Important: no hedging template phrases, no em-dashes, address the reader as "you".

Question: [your question]
Answer in three parts:
1. The direct answer.
2. Sources or evidence you have for it.
3. Spots where you are uncertain and that I should double-check.

We do this in three steps.
Step 1: propose a plan for how we tackle [task].
Ask back if anything is missing.
Only after my confirmation do you move to step 2.

Section 4 · Reviewing output
An AI can tell you nonsense with full conviction. Your job is not to believe it. Your job is to review. Three minutes of double-checking save hours of correction. And sometimes your reputation.
Sounds obvious. Gets skipped daily anyway. Read the output through once before you use it. Questions to ask yourself: Does this make sense? Does it match the brief? Are the names and numbers right?
Names, numbers and sources are where the AI hallucinates most often. If a date, a price or a source shows up in the text, give it a quick Google. 30 seconds of effort.
Hand the AI its own output back and say: “Find the three biggest weaknesses in this text.” It works surprisingly well, because the model now reviews instead of producing.
For important stuff, run the text through another model (e.g. Claude reviews GPT output, or the other way around). Different models have different blind spots.
For anything public, legal or interpersonal, the text passes through your own head once before going out. The AI isn't liable. You are.
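The second-model check from above, sketched in code. The reviewers are plain functions that take a prompt and return text. The two fake reviewers are stand-ins so the example runs without any account or API key; in practice each one would wrap a call to a different provider.

```python
REVIEW_PROMPT = "Find the three biggest weaknesses in this text."


def cross_review(text, reviewers):
    """Send the same text to several reviewer models and collect their critiques."""
    prompt = f"{REVIEW_PROMPT}\n\n{text}"
    return {name: ask(prompt) for name, ask in reviewers.items()}


# Hypothetical stand-ins for two different models.
def fake_reviewer_a(prompt):
    return "A: the second claim has no source."


def fake_reviewer_b(prompt):
    return "B: the date looks wrong."


critiques = cross_review(
    "[your draft text]",
    {"model_a": fake_reviewer_a, "model_b": fake_reviewer_b},
)
for name, critique in critiques.items():
    print(name, "->", critique)
```

The critiques are input for your own review, not a replacement for it.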
Section 5 · Providers compared
There isn't one best provider. There's the right one for your task and your budget. Here are the seven you can seriously work with in 2026, including three providers out of China (DeepSeek, Alibaba Cloud, Moonshot). Prices per million tokens, as of May 2026. Sources linked per card.
Strengths · Reasoning, code, writing in German voice
Price · $1–5 input · $5–25 output per million tokens. Up to 90 % cheaper with caching.
What for · When you build, write or think. My default for code and newsletters.
Source · Anthropic Pricing
Strengths · All-rounder, Computer-Use, tool use
Price · $0.20–5 input · $1.25–30 output. Cached input with 90 % discount.
What for · When you need tools, plug-ins or function calling. Also when you want to let the AI drive a UI.
Source · OpenAI Pricing
Strengths · Fast, multimodal, very cheap
Price · $0.10–2 input · $0.40–18 output. Very cheap lite tier.
What for · When you process images or video, or when latency matters. Also when your workflow already lives in Google Workspace.
Source · Google AI Pricing
Strengths · Real-time web and X data
Price · $1–1.25 input · $2–2.50 output. Flat tariff.
What for · When you need trends, news or social sentiment. For pure writing or coding there are better options.
Source · xAI Pricing
Strengths · Very cheap, good for JSON and code
Price · $0.14–1.74 input · $0.28–3.48 output. Cache hits at one hundredth.
What for · When you process high volume and cost is the bottleneck. Often weaker than Claude or GPT for German voice.
Source · DeepSeek Pricing
Strengths · Reasoning, multilingual incl. Chinese, strong coding
Price · $0.80 input · $3.20 output. Open-weights variants are freely self-hostable.
What for · When you serve Chinese markets or need an open reasoning model. Also when you want to try open-source self-hosting.
Source · Alibaba Cloud Pricing
Strengths · 256 K context, very cheap, strong tool use
Price · $0.15 input · $2.50 output. Open weights, permissively licensed.
What for · When you process long documents or want to scale agent workflows cheaply. Open source means: self-hostable too.
Source · Moonshot Pricing
Note
The market moves monthly. Prices and models change. This table is from May 2026. Before any booking decision, quickly check the provider's official pricing page — links per card above.
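To compare the cards for your own workload, you can plug the price ranges into a small calculation. A sketch only: the numbers are copied from the cards above (May 2026), the provider labels come from the source links, and the example workload of 10 million input and 2 million output tokens is an assumption. Caching discounts are left out.

```python
# (input_low, input_high, output_low, output_high) in USD per million tokens,
# copied from the cards above. Check the official pricing pages before deciding.
PRICES = {
    "Anthropic": (1.00, 5.00, 5.00, 25.00),
    "OpenAI": (0.20, 5.00, 1.25, 30.00),
    "Google AI": (0.10, 2.00, 0.40, 18.00),
    "xAI": (1.00, 1.25, 2.00, 2.50),
    "DeepSeek": (0.14, 1.74, 0.28, 3.48),
    "Alibaba Cloud": (0.80, 0.80, 3.20, 3.20),
    "Moonshot": (0.15, 0.15, 2.50, 2.50),
}


def cost_range_usd(provider, input_tokens, output_tokens):
    """Cheapest and most expensive cost for a workload, without caching."""
    in_lo, in_hi, out_lo, out_hi = PRICES[provider]
    low = (input_tokens * in_lo + output_tokens * out_lo) / 1_000_000
    high = (input_tokens * in_hi + output_tokens * out_hi) / 1_000_000
    return low, high


# Hypothetical monthly workload: 10 million input tokens, 2 million output tokens.
for provider in PRICES:
    low, high = cost_range_usd(provider, 10_000_000, 2_000_000)
    print(f"{provider:14s} ${low:7.2f} to ${high:7.2f}")
```

The low end is the cheapest model tier on each card, the high end the most expensive one. Which tier you actually need depends on the task.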