01 · Fundamentals

Fundamentals.

No CS degree required. Just the principles, so you work with AI instead of against it.


Reading time about 8 minutes · Level: Start · As of May 2026

Section 1 · What is an LLM

The machine that guesses the next word.

LLM stands for Large Language Model. It doesn't know anything. It calculates which word is most likely to come next, based on what it has seen across billions of texts. That's why it sounds so confident even when it's wrong. You stay the reviewer.

Not knowledge, statistics

An LLM doesn't store facts like an encyclopaedia. It learned patterns of how language works and predicts the next word. That's why the same question can produce slightly different answers twice in a row.
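If you want to see why the same question can come back with different answers, here is a toy sketch. The probability table is invented for illustration; a real model scores tens of thousands of candidate tokens, and the names `NEXT_WORD_PROBS`, `most_likely` and `sample` are made up for this example.

```python
import random

# Toy probabilities for the word after "The capital of France is".
# These numbers are invented for illustration only.
NEXT_WORD_PROBS = {"Paris": 0.90, "Lyon": 0.06, "London": 0.04}


def most_likely(probs: dict[str, float]) -> str:
    """Return the single most probable next word."""
    return max(probs, key=probs.get)


def sample(probs: dict[str, float], rng: random.Random) -> str:
    """Draw one next word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random()
print(most_likely(NEXT_WORD_PROBS))                       # always "Paris"
print([sample(NEXT_WORD_PROBS, rng) for _ in range(10)])  # mostly "Paris", not always
```

The sampled list changes from run to run. Now and then a wrong word wins the draw, and the model says it just as fluently as the right one.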

Hallucinations are built in

When the model doesn't know something, it guesses anyway. With full conviction. Names, numbers, sources and dates are the most common failure modes. Always cross-check.

LLM, chatbot and search are three things

The LLM is the engine. The chatbot (e.g. ChatGPT, Claude.ai) is the surface you use to operate the engine. A search engine searches real sources. Mixing them up means expecting things from AI it doesn't deliver.

The context window is the attention span

Every model can only hold a limited amount of text in mind at once, measured in tokens (see Section 2). Claude Opus 4.7 holds a million tokens; smaller models hold less. Dump a whole book series in and quality drops off at the end.

Section 2 · What is a token

The building blocks the AI thinks and pays in.

Tokens are the units AI thinks and computes in. A token is usually a chunk of a word, often roughly three-quarters of a word, sometimes just a letter or a punctuation mark. You pay per token. Reading and writing. Keep this in mind and you build cheaper.

A token isn't a word

The German word „Wahrscheinlichkeitsverteilung“ breaks into multiple tokens. A short „ja“ is one token. German texts are less token-efficient on average than English, because German compounds get split apart.

Input and output tokens cost differently

What you send into the model (input) is usually cheaper than what comes back (output). Claude Sonnet 4.6 charges 3 dollars per million input tokens and 15 dollars per million output. Factor 5. Source: Anthropic Pricing Docs, May 2026.
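A quick sketch of the arithmetic, using the Sonnet prices quoted above. The request size (2,000 tokens in, 500 out) is a made-up example, and `request_cost` is a name chosen for this sketch.

```python
# Prices in US dollars per million tokens, as quoted in the text for Claude Sonnet 4.6.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in dollars."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 2,000 tokens in, 500 tokens out.
print(f"${request_cost(2_000, 500):.4f}")  # $0.0135
```

Note that the 500 output tokens cost more here than the 2,000 input tokens. Asking for shorter answers is usually the cheapest saving.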

The context window is a token budget

If your model has a million-token window, that's the ceiling for everything combined: your input, the model's output and the conversation so far. Once the window is full, the earliest parts drop out or quality breaks down.
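To make the budget idea concrete, here is a minimal sketch of how a chat tool might decide what drops out. It assumes the rough rule from Section 2 (one token is about three-quarters of a word) instead of a real tokeniser, and `estimate_tokens` and `trim_history` are hypothetical helpers, not any provider's API.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 0.75 words per token, so tokens = words / 0.75."""
    words = len(text.split())
    return max(1, round(words / 0.75))


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break                        # everything older drops out
        kept.append(message)
        used += cost
    return list(reversed(kept))          # restore chronological order


history = ["one two three", "four five six", "seven eight nine"]
print(trim_history(history, budget=8))   # ['four five six', 'seven eight nine']
```

Real products differ in the details (some summarise old messages instead of dropping them), but the effect is the same: the start of a long conversation is the first thing to go.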

Briefing folder saves tokens

Instead of restarting every conversation from scratch, put your standard briefings into a folder. Cowork and ChatGPT Projects load them automatically — no more repeating yourself or re-explaining context.

Token splitter · Preview

How an AI breaks your sentence apart.

Example sentence, broken into tokens. Each block is a token, as an LLM's tokeniser would see it. Note the German compound words: they often split into 2–3 tokens.

Input

„Künstliche Intelligenz versteht Sprache, indem sie Wahrscheinlichkeiten rechnet.“

Tokenisation

Künstliche·Intelligenz·versteht·Sprache,·indem·sie·Wahrscheinlichkeiten·rechnet.

Characters per token: 5.3

Model	Input	Output (same length)	Source
Claude Sonnet 4.6	≈ $0.00005	≈ $0.00023	Anthropic Pricing
Claude Haiku 4.5	≈ $0.00000125	≈ $0.000005	Anthropic Pricing
GPT-5.4	≈ $0.00006	≈ $0.00027	OpenAI Pricing
GPT-5.4 nano	≈ $0.0000015	≈ $0.000006	OpenAI Pricing
Gemini 3.5 Flash	≈ $0.00002	≈ $0.00014	Google AI Pricing
DeepSeek V4-Flash	≈ $0.00000014	≈ $0.00000028	DeepSeek Pricing

Note · Simplified approximation for this example sentence. Real tokenisers can differ by 5–15 %, especially with German compounds. A live version that tokenises your own typing is planned.
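If you want to redo the estimate by hand, this sketch works backwards from the 5.3 characters per token shown in the preview. It is an approximation under that one assumption, not a real tokeniser, and it only recalculates the Sonnet input figure from the table.

```python
SENTENCE = "Künstliche Intelligenz versteht Sprache, indem sie Wahrscheinlichkeiten rechnet."
CHARS_PER_TOKEN = 5.3          # ratio shown in the preview; real tokenisers differ
SONNET_INPUT_PRICE_PER_M = 3.00  # dollars per million input tokens, from Section 2

characters = len(SENTENCE)
estimated_tokens = round(characters / CHARS_PER_TOKEN)
sonnet_input_cost = estimated_tokens * SONNET_INPUT_PRICE_PER_M / 1_000_000

print(characters)                  # 80
print(estimated_tokens)            # 15
print(f"{sonnet_input_cost:.6f}")  # 0.000045
```

About 15 tokens at 3 dollars per million comes to roughly $0.00005, which is the Sonnet input figure in the table.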

Section 3 · Prompting for tasks

Say what you want. Why. And what the result should look like.

Most people talk to AI like a search engine. Three words, hit enter, done. You get mediocrity. Say instead what you want, why you want it, and what the result should look like. Give an example if you have one. The effort up front saves you three rounds of corrections later.

Role, task, format, example

Tell the AI who it is (e.g. "You are a copywriter"), what to do (e.g. "Write a headline"), in which format (e.g. "Maximum 8 words, Plus Jakarta Sans"), and give an example. These four pieces are the scaffold for 90 % of prompts.
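If you build prompts in a script instead of typing them, the four pieces map neatly onto a small function. This is one possible sketch; `build_prompt` and its parameter names are invented for this example, and the headline example text is made up.

```python
def build_prompt(role: str, task: str, output_format: str, example: str = "") -> str:
    """Assemble a prompt from the four scaffold pieces; the example is optional."""
    parts = [
        f"You are {role}.",
        f"Task: {task}",
        f"Format: {output_format}",
    ]
    if example:
        parts.append(f"Example: {example}")
    return "\n".join(parts)


prompt = build_prompt(
    role="a copywriter",
    task="Write a headline for our spring newsletter.",
    output_format="Maximum 8 words.",
    example="Five prompts that save you an hour a day",
)
print(prompt)
```

The point of the function is less the code than the checklist: if you can't fill one of the four arguments, the prompt isn't ready yet.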

End state before single step

Describe what the finished result should look like before you give detailed steps. Otherwise the model optimises for the wrong intermediate state. Example: "At the end I want a LinkedIn caption in 5 lines, in my voice." Then the content.

Force the uncertainty

For knowledge questions, add: "Clearly mark where you are unsure, and tell me which parts I should double-check." It works surprisingly well.

In steps, not in one shot

Break big tasks into 3–5 steps. First plan, then draft, then polish. Check briefly after each step. Try to wrap it all into one prompt and you get mush back.
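For anyone scripting this, the plan, draft, polish rhythm can be sketched as a loop with a check after every step. `ask_model` and `approve` are placeholders you would supply yourself (an API call and your own review); the `fake_model` below only exists so the sketch runs without any provider.

```python
from typing import Callable

STEPS = ["plan", "draft", "polish"]


def run_in_steps(
    task: str,
    ask_model: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> dict[str, str]:
    """Run plan, draft and polish one at a time, stopping when a check fails."""
    results: dict[str, str] = {}
    previous = ""
    for step in STEPS:
        prompt = f"Task: {task}\nStep: {step}\nPrevious result:\n{previous}"
        output = ask_model(prompt)
        if not approve(step, output):
            break                # stop and fix this step before moving on
        results[step] = output
        previous = output        # the next step builds on the approved result
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; it only echoes the step name."""
    step = prompt.split("Step: ")[1].splitlines()[0]
    return f"[{step} output]"


print(run_in_steps("Write a newsletter intro", fake_model, lambda step, out: True))
```

In real use, `approve` is you reading the output. The loop only makes sure nothing moves forward without that look.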

Three copy-paste prompt templates

Drop them straight into the AI of your choice. Square brackets are where you fill in your own content.

Template · Clean task

You are an experienced LinkedIn writer in the style of [name or example text].

Task: write a LinkedIn post on the following topic:
[topic]

Format:
- Hook in a single line
- Body 5–8 lines, short and long sentences mixed
- Closer with a real question to the community
- Hashtag set at the end, maximum 5 hashtags

Important: no hedging, no stock phrases, no em-dashes, address the reader as "you".

Template · Knowledge question with uncertainty check

Question: [your question]

Answer in three parts:
1. The direct answer.
2. Sources or evidence you have for it.
3. Spots where you are uncertain and that I should double-check.

Template · Task in steps

We do this in three steps.

Step 1: propose a plan for how we tackle [task].
Ask back if anything is missing.

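Only after my confirmation do you move to step 2.

Step 2: write the draft based on the confirmed plan.

Step 3: polish the draft once I have given feedback.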
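If you reuse the templates from a script, a small helper can fill the square brackets and complain when you forget one. This is a sketch, not part of any tool mentioned here; `fill_template` is a made-up name, and the value "the Q3 newsletter" is just an example.

```python
import re

# Matches [placeholder] and captures the text between the brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each [placeholder] with its value; raise if one is missing."""
    missing = [name for name in PLACEHOLDER.findall(template) if name not in values]
    if missing:
        raise ValueError(f"Unfilled placeholders: {missing}")
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


template = "Step 1: propose a plan for how we tackle [task].\nAsk back if anything is missing."
print(fill_template(template, {"task": "the Q3 newsletter"}))
```

Failing loudly on a missing value is deliberate: a prompt that still contains "[topic]" tends to produce confident nonsense about brackets.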

Section 4 · Reviewing output

Three minutes of double-checking save hours of correction.

An AI can tell you nonsense with full conviction. Your job is not to believe it. Your job is to review. Three minutes of double-checking save hours of correction. And sometimes your reputation.

Read it before you use it

Sounds obvious. Gets skipped daily anyway. Read the output through once before you use it. Questions to ask yourself: Does this make sense? Does it match the brief? Are the names and numbers right?

Re-check numbers, names, dates

This is where the AI hallucinates most often. If a date, a price or a source shows up in the text, give it a quick Google. 30 seconds of effort.
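For longer texts, a rough script can pull out the spots worth looking up. The patterns below are deliberately simple assumptions and will miss or over-match some cases; names in particular can't be caught this way, so they stay a manual check. `items_to_check` is a hypothetical helper and the draft sentence is invented.

```python
import re

# Rough patterns for things that are worth a manual look-up.
CHECK_PATTERNS = {
    "year": r"\b(?:19|20)\d{2}\b",
    "price": r"[$€]\s?\d+(?:[.,]\d+)?",
    "percent": r"\d+(?:[.,]\d+)?\s?%",
}


def items_to_check(text: str) -> dict[str, list[str]]:
    """Collect years, prices and percentages found in the text."""
    return {label: re.findall(pattern, text) for label, pattern in CHECK_PATTERNS.items()}


draft = "Founded in 2019, the firm grew 40 % and now charges $15 per seat."
print(items_to_check(draft))
# {'year': ['2019'], 'price': ['$15'], 'percent': ['40 %']}
```

The script only finds candidates. Whether 2019 is actually the right year is still your 30 seconds of searching.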

Pit the AI against itself

Hand the AI its own output back and say: "Find the three biggest weaknesses in this text." It works surprisingly well, because the model now reviews instead of producing.

A second model as a counter-check

For important stuff, run the text through another model (e.g. Claude reviews GPT output, or the other way around). Different models have different blind spots.
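Wired together, the write-then-review pattern might look like this. `writer` and `reviewer` stand for calls to two different models; they are plain functions here so the sketch stays provider-neutral, and the lambdas below are dummies, not real model output.

```python
from typing import Callable

# Review instruction taken from the card above.
REVIEW_PROMPT = "Find the three biggest weaknesses in this text:\n\n{text}"


def cross_check(
    task: str,
    writer: Callable[[str], str],
    reviewer: Callable[[str], str],
) -> tuple[str, str]:
    """Let one model write and a second one critique the result."""
    draft = writer(task)
    critique = reviewer(REVIEW_PROMPT.format(text=draft))
    return draft, critique


# Dummy stand-ins so the sketch runs without any API.
draft, critique = cross_check(
    "Write a two-line product teaser.",
    writer=lambda task: "Our tool saves you hours. Try it today.",
    reviewer=lambda prompt: f"Reviewed {len(prompt)} characters.",
)
print(draft)
print(critique)
```

Pass the same function for both roles and you get the self-review variant; pass two different models and you get the counter-check.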

Keep a human in the loop when it matters

For anything public, legal or interpersonal, the text passes through your own head once before going out. The AI isn't liable. You are.

Section 5 · Providers compared

Seven providers worth taking seriously in 2026.

There isn't one best provider. There's the right one for your task and your budget. Here are the seven you can seriously work with in 2026 — including two strong models out of China. Prices per million tokens, as of May 2026. Sources linked per card.

Provider	Top model	Input / Output (per 1 M tokens)	Context	Strengths
Anthropic (Claude)	Claude Opus 4.7	$5 / $25	1 M tokens	Reasoning, code, writing in German voice
OpenAI (GPT)	GPT-5.5	$5 / $30	1 M tokens	All-rounder, computer use, tool use
Google (Gemini)	Gemini 3.5 Flash	$1.50 / $9	1 M tokens	Fast, multimodal, very cheap
xAI (Grok)	grok-4.3	$1.25 / $2.50	1 M tokens	Real-time web and X data
DeepSeek	DeepSeek V4-Flash	$0.14 / $0.28	1 M tokens	Very cheap, good for JSON and code
Alibaba (Qwen)	Qwen 3 Max	$0.80 / $3.20	1 M tokens	Reasoning, multilingual incl. Chinese, strong coding
Moonshot AI (Kimi)	Kimi K2	$0.15 / $2.50	256 K tokens	256 K context, very cheap, strong tool use
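To see what the price gap means for your own volume, you can run the table through a few lines of arithmetic. The prices are copied from the table above; the workload (10 million input and 2 million output tokens per month) is a made-up example, and the sketch ignores caching discounts and quality differences.

```python
# (input, output) prices in US dollars per million tokens, copied from the table above.
PRICES = {
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "grok-4.3": (1.25, 2.50),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "Qwen 3 Max": (0.80, 3.20),
    "Kimi K2": (0.15, 2.50),
}


def monthly_cost(input_tokens: int, output_tokens: int) -> dict[str, float]:
    """Dollar cost per model for a given token volume, cheapest first."""
    costs = {
        model: (input_tokens * inp + output_tokens * out) / 1_000_000
        for model, (inp, out) in PRICES.items()
    }
    return dict(sorted(costs.items(), key=lambda item: item[1]))


# Hypothetical workload: 10 M input tokens and 2 M output tokens per month.
for model, cost in monthly_cost(10_000_000, 2_000_000).items():
    print(f"{model:<20} ${cost:>7.2f}")
```

For this workload the spread runs from about 2 dollars (DeepSeek V4-Flash) to 110 dollars (GPT-5.5). Price alone doesn't pick the provider, but it tells you how much the better answer has to be worth.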

Claude Opus 4.7

Anthropic (Claude)

Strengths · Reasoning, code, writing in German voice

Price · $1–5 input · $5–25 output per million tokens. Up to 90 % cheaper with caching.

What for · When you build, write or think. My default for code and newsletters.

Anthropic Pricing

GPT-5.5

OpenAI (GPT)

Strengths · All-rounder, computer use, tool use

Price · $0.20–5 input · $1.25–30 output. Cached input with 90 % discount.

What for · When you need tools, plug-ins or function calling. Also when you want to let the AI drive a UI.

OpenAI Pricing

Gemini 3.5 Flash

Google (Gemini)

Strengths · Fast, multimodal, very cheap

Price · $0.10–2 input · $0.40–18 output. Very cheap lite tier.

What for · When you process images or video, or when latency matters. Also when your workflow already lives in Google Workspace.

Google AI Pricing

grok-4.3

xAI (Grok)

Strengths · Real-time web and X data

Price · $1–1.25 input · $2–2.50 output. Flat rate.

What for · When you need trends, news or social sentiment. For pure writing or coding there are better options.

xAI Pricing

DeepSeek V4-Flash

DeepSeek

Strengths · Very cheap, good for JSON and code

Price · $0.14–1.74 input · $0.28–3.48 output. Cache hits at one hundredth.

What for · When you process high volume and cost is the bottleneck. Often weaker than Claude or GPT for German voice.

DeepSeek Pricing

Qwen 3 Max

Alibaba (Qwen)

Strengths · Reasoning, multilingual incl. Chinese, strong coding

Price · $0.80 input · $3.20 output. Open-weights variants are freely self-hostable.

What for · When you serve Chinese markets or need an open reasoning model. Also when you want to try open-source self-hosting.

Alibaba Cloud Pricing

Kimi K2

Moonshot AI (Kimi)

Strengths · 256 K context, very cheap, strong tool use

Price · $0.15 input · $2.50 output. Open weights, permissively licensed.

What for · When you process long documents or want to scale agent workflows cheaply. Open source means: self-hostable too.

Moonshot Pricing

Note

The market moves monthly. Prices and models change. This table is from May 2026. Before you commit to a provider, quickly check its official pricing page (links on each card above).

On to the next station

Daily tasks

Sparring, not prompting. With Cowork, ChatGPT Projects or Perplexity Comet — your call.

Take a look