🧪 The AI Lab
Reinforcement learning · learn by reward

πŸ•ΉοΈ Reinforcement Learning β€” learning by trial & reward

No labelled examples, no neat piles of dots: just a robot 🤖 dropped into a world. It tries things, bumps around, and gets rewards (🏆 good!) or penalties (⚡ ouch!). Over many tries it remembers which moves paid off, much like training a pet with treats. Press “Auto-train” and watch it get smarter.

Pick a world for the robot

Tries trained: 0 · Last reward: – · Best path: –

Reward on each try (right = most recent): watch it climb 📈
arrow = best move it has learned for that square
greener square = “a good place to be”
redder square = “a bad place to be”
🤖 In a real AI: this is the third way to learn

There's no teacher giving answers (that's supervised learning, such as classification) and we're not just finding groups (that's unsupervised learning, such as clustering). Here the robot learns from rewards by trial and error: this is Reinforcement Learning. The same idea teaches computers to play chess and video games, helps robots walk, and even helps train chatbots to give better answers (it's the “RL” in RLHF, reinforcement learning from human feedback).

The robot keeps a little scorebook (a Q-table): for every square, how good is each move? Moves that lead to a reward get a higher score, so the robot is more likely to pick them next time.
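For the curious, here is a minimal sketch of how such a scorebook could be filled in, using the standard tabular Q-learning update. It is not the lab's actual code. The tiny one-row world, the reward values (+10 trophy, -10 zap, -1 per step), and the settings `alpha`, `gamma` and `epsilon` are all assumptions made up for illustration.

```python
import random

# Hypothetical world: 5 squares in a row. Square 4 holds the trophy (+10),
# square 0 is a zap (-10), and every other step costs -1.
N_SQUARES = 5
MOVES = ("left", "right")
START = 2

def step(square, move):
    """Apply a move and return (next_square, reward, done)."""
    nxt = square - 1 if move == "left" else square + 1
    if nxt == N_SQUARES - 1:
        return nxt, 10.0, True
    if nxt == 0:
        return nxt, -10.0, True
    return nxt, -1.0, False

def train(tries=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run many tries and return the scorebook (Q-table)."""
    rng = random.Random(seed)
    # The scorebook: one score per (square, move), all starting at 0.
    q = {(s, m): 0.0 for s in range(N_SQUARES) for m in MOVES}
    for _ in range(tries):
        square, done = START, False
        while not done:
            # Mostly take the best-known move, but sometimes explore at random.
            if rng.random() < epsilon:
                move = rng.choice(MOVES)
            else:
                move = max(MOVES, key=lambda m: q[(square, m)])
            nxt, reward, done = step(square, move)
            # Q-learning update: nudge the score toward
            # reward + discounted best score of the next square.
            best_next = 0.0 if done else max(q[(nxt, m)] for m in MOVES)
            q[(square, move)] += alpha * (reward + gamma * best_next - q[(square, move)])
            square = nxt
    return q

def best_move(q, square):
    """The 'arrow' for a square: the move with the highest score."""
    return max(MOVES, key=lambda m: q[(square, m)])

q = train()
arrows = {s: best_move(q, s) for s in range(1, N_SQUARES - 1)}
print(arrows)
```

In this sketch the arrows for the three middle squares should all end up pointing right, toward the trophy. The `epsilon` setting is the "tries things, bumps around" part: without some random exploration the robot could settle on the first move that looked acceptable and never discover a better one.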

Practice 🎯
