LLM Adventure

Explore the Magic of Large Language Models


Tokenization & Attention

Tokenize Your Text

LLMs operate on tokens, not raw words. Try your own sentence; click a token to highlight its attention row/column.
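To make the idea concrete, here is a minimal sketch of greedy longest-match subword tokenization. The vocabulary and the splits it produces are invented for illustration; real tokenizers (BPE, WordPiece, SentencePiece) learn their vocabularies from data and will split your sentence differently.

```python
# Toy greedy longest-match subword tokenizer. The vocabulary below is
# made up for illustration; real tokenizers learn theirs from data.
VOCAB = {"token", "tok", "en", "ization", "iza", "tion", "un", "believ", "able"}

def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, scanning left to right."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:                               # nothing matched: fall back to one character
            pieces.append(word[i])
            i += 1
    return pieces

print(tokenize("tokenization"))   # ['token', 'ization']
print(tokenize("unbelievable"))   # ['un', 'believ', 'able']
```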

Attention Visualization

Hover over cells or tokens; brighter cells indicate stronger attention.
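The heatmap corresponds to the attention weight matrix. Here is a small NumPy sketch of scaled dot-product attention, using random toy query/key/value vectors rather than a trained model:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V -- the weights are what the heatmap shows."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # (tokens, tokens) attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
n_tokens, d = 5, 8                                    # e.g. 5 tokens, one 8-dim head
Q, K, V = (rng.normal(size=(n_tokens, d)) for _ in range(3))
output, attn = scaled_dot_product_attention(Q, K, V)
print(attn.round(2))   # row i = how much token i attends to every token
```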

Word Embeddings

Words in Space

Embeddings place words as points in a high‑dimensional space. Here we project to 2D; similar words appear nearby. Hover over a point for details; click it to reveal its 3 nearest neighbors.

Tip: use the slider to change cluster separation.
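A small sketch of how "nearby" is computed: cosine similarity between embedding vectors. The 2‑D coordinates below are invented stand-ins for projected embeddings, not values from a real model.

```python
import numpy as np

# Made-up 2-D coordinates standing in for projected word embeddings.
words = {
    "happy":   np.array([0.90, 0.80]),
    "joyful":  np.array([0.85, 0.75]),
    "excited": np.array([0.80, 0.90]),
    "sad":     np.array([-0.90, -0.70]),
    "angry":   np.array([-0.80, -0.85]),
}

def nearest(query: str, k: int = 3):
    """Rank the other words by cosine similarity to the query word."""
    q = words[query]
    sims = {
        w: float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q)))
        for w, v in words.items() if w != query
    }
    return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:k]

print(nearest("happy"))   # joyful and excited rank above sad and angry
```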

Semantic Clusters

Words with similar meanings form clusters in the embedding space.

Positive Cluster

happy joyful excited wonderful amazing

Negative Cluster

sad angry frustrated disappointed worried

Positional Encoding

Transformers use sinusoidal or learned signals that encode token positions. Adjust settings to see how frequency and max length affect the waves.
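The standard sinusoidal formulation assigns each position a set of sine and cosine waves of different frequencies. A NumPy sketch; the `base` constant plays the role of the frequency knob in the demo:

```python
import numpy as np

def sinusoidal_positional_encoding(max_len: int, d_model: int, base: float = 10000.0):
    """PE[pos, 2i] = sin(pos / base^(2i/d)), PE[pos, 2i+1] = cos(pos / base^(2i/d))."""
    positions = np.arange(max_len)[:, None]              # (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]              # (1, d_model/2)
    angles = positions / np.power(base, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_positional_encoding(max_len=128, d_model=64)
print(pe.shape)   # (128, 64): one wave pattern per dimension pair
```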

Model Training Journey

Pretraining Phase

Cosmic Knowledge
Absorbs billions of tokens from corpora
Pattern Recognition
Learns syntax & semantics
Memory Formation
Internalizes broad knowledge

Fine‑tuning Phase

Specialization
Focuses on target tasks
Parameter Adjustment
Refines weights
Performance Boost
Improves accuracy
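In code terms, both phases above minimize the same kind of objective, cross-entropy on next tokens; what changes is the data (web-scale text during pretraining, curated task examples during fine-tuning). A NumPy sketch with made-up logits:

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting the true next token at each position.
    logits: (seq_len, vocab) raw scores; targets: (seq_len,) true next-token ids."""
    logits = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
vocab, seq_len = 1000, 16
logits = rng.normal(size=(seq_len, vocab))       # stand-in for model outputs
targets = rng.integers(0, vocab, size=seq_len)   # stand-in for the true next tokens
print(next_token_loss(logits, targets))          # both phases push this loss down,
                                                 # just on different data
```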

    Making Models Helpful & Safe

    Reinforcement Learning from Human Feedback (RLHF) aligns model behavior with human preferences. Try the mini preference comparison below; a reward-model sketch follows it.

    Model Response

    Generates an answer

    Human Feedback

    Raters compare answers and pick the better one

    Aligned Model

    Trained against a reward model learned from those comparisons

    Prompt: "Explain overfitting in one sentence."

    Candidate A: Overfitting is when a model memorizes the training data so well that it fails to generalize to unseen data.

    Prompt: "Explain overfitting in one sentence."

    Candidate B: Overfitting is when a model is over and fits the training data and then is worse on test data because it is over.
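Comparisons like the one above are what train the reward model: it should score the preferred answer higher than the rejected one. A minimal sketch of the pairwise (Bradley-Terry) loss with made-up reward values; the aligned model is then optimized against this learned reward, typically with an RL algorithm such as PPO.

```python
import numpy as np

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss: -log sigmoid(r_chosen - r_rejected).
    Small when the chosen answer already scores higher, large otherwise."""
    return float(-np.log(1.0 / (1.0 + np.exp(-(reward_chosen - reward_rejected)))))

# Made-up reward-model scores for Candidate A (preferred) and Candidate B.
print(preference_loss(reward_chosen=1.2, reward_rejected=0.3))   # small: ranking is right
print(preference_loss(reward_chosen=0.3, reward_rejected=1.2))   # large: ranking is wrong
```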

    Magic Prompting

    Try Prompting!

    Craft your prompt. Use the tools to estimate tokens and copy it quickly; a prompt-building sketch follows the tips below.

    Model Response

    Your response will appear here...

    Set a Role

    Provide Examples

    Be Clear
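The three tips map directly onto prompt construction. A small sketch that sets a role, adds few-shot examples, and ends with a clear question; the token estimate uses the rough rule of thumb of about 4 characters per token, and actual counts depend on the tokenizer.

```python
def build_prompt(role: str, examples: list[tuple[str, str]], question: str) -> str:
    """Assemble a prompt using the three tips: set a role, give examples, ask clearly."""
    lines = [f"You are {role}."]
    for q, a in examples:                      # few-shot examples
        lines += [f"Q: {q}", f"A: {a}"]
    lines += [f"Q: {question}", "A:"]
    return "\n".join(lines)

def estimate_tokens(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

prompt = build_prompt(
    role="a patient machine-learning tutor",
    examples=[("What is a token?", "A small chunk of text the model reads, often a word piece.")],
    question="What is attention?",
)
print(prompt)
print("estimated tokens:", estimate_tokens(prompt))
```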

    Retrieval‑Augmented Generation (RAG)

    RAG retrieves relevant passages and attaches them to the prompt before generation. Use the demo to search a tiny in‑page knowledge base; a code sketch of the three steps follows the list below.

    1. Retrieve: find relevant chunks
    2. Augment: attach them to the prompt
    3. Generate: answer with citations
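A compact sketch of the three steps over a tiny made-up knowledge base. The retriever here is a crude word-overlap score purely for illustration; real systems use embedding similarity or BM25.

```python
# Tiny in-page knowledge base (stand-in passages).
PASSAGES = [
    "The Eiffel Tower is in Paris, France, and was completed in 1889.",
    "The Great Wall of China is thousands of kilometres long.",
    "Transformers use attention to relate tokens to one another.",
]

def word_overlap_score(query: str, passage: str) -> float:
    """Crude retrieval score: fraction of query words that appear in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(1, len(q))

def rag_prompt(question: str, top_k: int = 1) -> str:
    # Step 1: Retrieve the best-matching passages.
    ranked = sorted(PASSAGES, key=lambda p: word_overlap_score(question, p), reverse=True)
    # Step 2: Augment the prompt with the retrieved context.
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(ranked[:top_k]))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer with citations:"

print(rag_prompt("When was the Eiffel Tower completed?"))
# Step 3 (Generate) would send this augmented prompt to the model.
```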

    Fighting Hallucinations

    Hallucination

    The model makes up plausible‑sounding but incorrect information

    “The Eiffel Tower is located on Mars and was built by aliens in 1985...”
    Unverified Information

    Verified Response

    Model provides accurate, checked information

    “The Eiffel Tower is in Paris, France, completed in 1889 for the Exposition Universelle.”
    Fact‑Verified Information

    Verify a Claim
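One simple way to flag unsupported claims is to check how much of a claim is backed by trusted passages. The sketch below uses word overlap as a stand-in; production fact-checking pairs retrieval with an entailment or verification model.

```python
# Stand-in "trusted source" for the verification demo.
TRUSTED_FACTS = [
    "The Eiffel Tower is in Paris, France, completed in 1889 for the Exposition Universelle.",
]

def support_score(claim: str, facts=TRUSTED_FACTS) -> float:
    """Fraction of the claim's words found in the best-matching trusted passage."""
    claim_words = set(claim.lower().replace(".", "").split())
    best = 0.0
    for fact in facts:
        fact_words = set(fact.lower().replace(",", "").replace(".", "").split())
        best = max(best, len(claim_words & fact_words) / max(1, len(claim_words)))
    return best

print(support_score("The Eiffel Tower is in Paris"))         # high overlap: plausible
print(support_score("The Eiffel Tower is located on Mars"))  # lower: 'Mars' is unsupported
```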

    Model Optimization

    Knowledge Distillation

    Transfer knowledge from a large teacher model to a smaller, faster student; a distillation‑loss sketch follows the diagram below.

    Teacher Model

    175B parameters • Large & slow
    Knowledge Transfer

    Student Model

    7B parameters • Small & fast
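The standard recipe trains the student to match the teacher's softened output distribution. A NumPy sketch of the distillation (KL) loss with made-up logits; in practice it is combined with the ordinary hard-label loss.

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T produces a softer distribution."""
    z = z / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence between softened teacher and student distributions,
    scaled by T^2 as in the usual distillation recipe."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)))) * T * T

teacher = np.array([4.0, 1.5, 0.2])   # made-up logits for one example
student = np.array([3.0, 1.0, 0.5])
print(distillation_loss(teacher, student))   # the student trains to shrink this gap
```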

    Model Quantization

    Reduce numeric precision to shrink model size and speed up inference; see the sketch below.

    Model Compression
    Compression: 100%
    Precision: 16‑bit (full precision)
    Size estimate: ~350 GB
    Throughput: ~1× tokens/s
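A sketch of the simplest form, symmetric int8 quantization: store one floating-point scale per tensor plus 1-byte integers, halving the size of 16-bit weights at the cost of a small rounding error.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric int8 quantization: one float scale plus 1-byte integers."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate floating-point weights."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=1000).astype(np.float16)  # toy 16-bit weights
q, scale = quantize_int8(w)
print("bytes before:", w.nbytes, "after:", q.nbytes)   # 2000 -> 1000 (half the size)
print("max rounding error:", np.abs(w - dequantize(q, scale)).max())
```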

    Model Evaluation

    Perplexity

    How surprised the model is by held‑out text

    42.5
    Lower is better
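Perplexity is the exponential of the average negative log-likelihood the model assigns to the true tokens. A tiny sketch with made-up token probabilities:

```python
import numpy as np

def perplexity(token_probs):
    """exp of the average negative log-likelihood of the true tokens."""
    return float(np.exp(-np.mean(np.log(token_probs))))

# Made-up probabilities the model gave to each actual next token in a held-out text.
confident = [0.6, 0.5, 0.7, 0.4, 0.8]
uncertain = [0.05, 0.02, 0.1, 0.03, 0.04]
print(perplexity(confident))   # low: the model is rarely surprised
print(perplexity(uncertain))   # high: the model is surprised often
```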

    BLEU

    Translation quality via n‑gram overlap with references

    0.85
    Higher is better

    Task Accuracy

    Accuracy on downstream benchmark tasks

    --
    Run evaluation to see score

    Run Comprehensive Evaluation