🧪 The AI Lab
Class 7 · AI Domains

πŸ—ƒοΈ Datasets β€” the food an AI learns from

A dataset is a big collection of data we use to teach and test an AI model. Each tiny piece (a record, also called a data point) has attributes, like an animal photo's colour, size and shape. Datasets come in many flavours (numbers, text, pictures, sound, time, places), and we usually split them into three parts. Play below to see how!
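
Here is a tiny sketch of what records and attributes can look like in Python. The three animals and their attribute values are made up just for this example; a real dataset would have many more records.

```python
# A pretend dataset: each record (data point) describes one animal photo.
# The attribute values below are invented for illustration only.
dataset = [
    {"colour": "orange", "size": "large", "shape": "long"},
    {"colour": "grey", "size": "huge", "shape": "round"},
    {"colour": "green", "size": "small", "shape": "thin"},
]


def attribute_names(records):
    """Return the attribute names shared by the records, in A-Z order."""
    return sorted(records[0].keys())


print(len(dataset), "records")
print("Attributes:", attribute_names(dataset))
```

Each dictionary is one record, and each key (colour, size, shape) is one attribute.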

Playground 1 · Split the dataset 🍰

Here are 100 animal photos. Slide the controls to share them between Training, Validation and Test. The three must always add up to 100!

Example split:
Training 60% (60 photos)
Validation 20% (20 photos)
Test 20% (20 photos)

🤖 Why three parts? Think like a student. Training = learning the chapter. Validation = practising sums to get better while you still study. Test = the final exam on questions you've never seen. The computer never peeks at the test photos while training. That's the only way to know if it really learned, instead of just memorising!
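
If you are curious how a computer might do this split, here is a minimal sketch in Python. It assumes the 100 photos are just names like photo_1, and the function name split_dataset and the seed value are made up for this example. Real AI tools have their own ways of splitting data.

```python
import random


def split_dataset(items, train_pct, val_pct, test_pct, seed=7):
    """Shuffle the items, then cut them into training, validation and test."""
    if train_pct + val_pct + test_pct != 100:
        raise ValueError("The three parts must add up to 100")

    shuffled = list(items)                  # copy, so the original stays unchanged
    random.Random(seed).shuffle(shuffled)   # mix the photos in a repeatable way

    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # everything that is left over
    return train, validation, test


# Hypothetical photo names standing in for the 100 animal photos.
photos = [f"photo_{i}" for i in range(1, 101)]
train, validation, test = split_dataset(photos, 60, 20, 20)
print(len(train), len(validation), len(test))  # 60 20 20
```

Because each photo lands in exactly one part, the test photos are never used for training, which is what keeps the final exam fair.
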
Playground 2 · Sort the data 🧺

Datasets are named by what's inside them. Tap an item, then tap the bucket it belongs to.

1️⃣ Pick an item
2️⃣ Drop it in the right bucket

📦 Structured vs Unstructured: Structured data sits in neat rows and columns, like a spreadsheet, so it is super easy to search. Unstructured data has no fixed format, like a messy pile of photos, songs or videos. Some datasets are hybrid: a bit of both!
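
To see why structured data is easy to search, here is a small sketch in Python. The animal table and the pretend photo bytes are invented for this example.

```python
# Structured: every row has the same columns (name, legs, can_fly).
animals = [
    {"name": "tiger", "legs": 4, "can_fly": False},
    {"name": "parrot", "legs": 2, "can_fly": True},
    {"name": "eagle", "legs": 2, "can_fly": True},
]

# Searching is simple: just check one column in every row.
flyers = [row["name"] for row in animals if row["can_fly"]]
print(flyers)  # ['parrot', 'eagle']

# Unstructured: a photo is just a long run of raw numbers (bytes).
# These five numbers are a made-up stand-in, not a real picture.
photo_bytes = bytes([12, 200, 45, 7, 99])

# There is no "can_fly" column to look up here. An AI model has to
# learn what is in the picture from the raw numbers themselves.
print(len(photo_bytes), "bytes, no columns")
```

A hybrid dataset could pair the two, for example a table row that also points to a photo file.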

Practice 🎯
