"When adults and 4- to 5-year-old children played a game where certain choices earned them rewards, both adults and children quickly learned what choices would give them the biggest returns. But while adults then used that knowledge to maximize their prizes, children continued exploring the other options, just to see if their value may have changed."
Ah, this is known as the "explore-exploit" problem in computer science. I've read that mathematicians have analyzed simple models of it and found that the optimal solution depends on the time remaining. To use a "restaurant" analogy, suppose you are visiting a city and going out to eat every night. Should you try a new restaurant or go back to the best one you have visited so far? If it's your last night, you should go to the best you've been to so far, exploiting the knowledge you have for maximum reward. (We are assuming maximum reward, defined as the best-tasting food, is the goal here.) But if you have weeks or months ahead of you, you should try a new restaurant. At some point the balance tips, and you reap the best rewards by going to the best restaurant over and over until your time runs out. I don't recall the math for working out the exact optimal solution, but I remember it's really complicated. Nowadays people build AI agents, in particular reinforcement learning agents, that learn an effective explore-exploit strategy for a given environment without the problem being solved mathematically.
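To make the "depends on the time remaining" point concrete, here is a minimal sketch of a toy version of the restaurant problem. The modeling choices are my own assumptions, not something from the article: restaurant quality is spread uniformly between 0 and 1, one visit tells you a restaurant's quality exactly, and you can always go back to the best one found so far. The function name `exploit_thresholds` is made up for this example. It works backwards from the last night and reports, for each number of nights left, the lowest "best so far" quality at which going back is at least as good as trying somewhere new.

```python
def exploit_thresholds(max_nights, grid=100):
    """Toy restaurant model (assumed, not from the article).

    Returns a list whose entry n-1 is the lowest best-so-far quality
    at which exploiting is at least as good as exploring when n nights
    are left.
    """
    # Discretize quality into `grid` equally likely levels in (0, 1).
    q = [(i + 0.5) / grid for i in range(grid)]
    value = [0.0] * grid  # future reward with 0 nights left
    thresholds = []
    for nights_left in range(1, max_nights + 1):
        new_value = [0.0] * grid
        threshold = None
        for b in range(grid):  # b = index of the best quality seen so far
            # Go back to the best known restaurant tonight.
            exploit = q[b] + value[b]
            # Try a new one: average over every quality it might have.
            explore = sum(q[x] + value[max(b, x)] for x in range(grid)) / grid
            new_value[b] = max(exploit, explore)
            if threshold is None and exploit >= explore:
                threshold = q[b]
        thresholds.append(threshold)
        value = new_value
    return thresholds


if __name__ == "__main__":
    t = exploit_thresholds(30)
    for nights in (1, 2, 5, 10, 30):
        print(f"{nights:2d} nights left: exploit once best-so-far >= {t[nights - 1]:.3f}")
```

In this toy model, on the last night anything better than average is worth returning to, while with many nights left the bar for "good enough to stop exploring" is much higher. Real bandit problems, where a single visit gives only a noisy impression, are harder than this.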
Anyway, what's interesting is that children seem oriented towards exploration, while adults seem oriented towards exploitation, as if humans have some innate sense of "time left" and "knowledge gained so far" and of which strategy is optimal given both.
"And despite what adults may think, kids' search for new discoveries is anything but random. Results showed children approached exploration systematically, to make sure they didn't miss anything."
"The researchers conducted two studies. One study involved 32 4-year-olds and 34 adults. On a computer screen, participants were shown four alien creatures. When participants clicked on each creature, they were given a set number of virtual candies. One creature was clearly the best, giving 10 candies, while the others gave 1, 2 and 3 candies, respectively. Those amounts never changed for each creature over the course of the experiment."
In the computer science models I've read about, they simulated multi-armed bandits (slot machines with several arms, each paying out differently). This is the same thing, except the arms are replaced with "aliens" and the money with "virtual candies."
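Here is a rough sketch of that bandit setup, using the candy values quoted above (10, 1, 2 and 3) and a 100-trial game, which is the length the study used. The two strategies are hypothetical illustrations of my own, not a model of how the participants actually behaved: one tries every creature once and then always picks the best, and the other keeps revisiting all the creatures in a fixed rotation on every `explore_every`-th trial. The function name `play` and its parameters are made up for this example.

```python
REWARDS = [10, 1, 2, 3]  # candies per creature, as quoted from the article
TRIALS = 100             # game length used in the study


def play(rewards, trials, explore_every=None):
    """Return (total candies, fraction of trials spent on the best creature).

    explore_every=None: try each creature once, then always exploit.
    explore_every=k:    additionally, on every k-th trial, pick the next
                        creature in a fixed rotation (systematic exploration).
    """
    n = len(rewards)
    best = max(range(n), key=lambda i: rewards[i])
    known = {}       # creature index -> candies observed
    total = 0
    best_picks = 0
    rotation = 0
    for t in range(trials):
        if len(known) < n:
            choice = len(known)              # first pass: try each once
        elif explore_every and t % explore_every == 0:
            choice = rotation % n            # systematic re-check
            rotation += 1
        else:
            choice = max(known, key=known.get)  # exploit best known
        known[choice] = rewards[choice]
        total += rewards[choice]
        best_picks += int(choice == best)
    return total, best_picks / trials


if __name__ == "__main__":
    print(play(REWARDS, TRIALS))                   # (976, 0.97)
    print(play(REWARDS, TRIALS, explore_every=2))  # (688, 0.61)
```

Since the payouts never change, every re-check in this sketch costs candy and teaches nothing new, which is why the pure exploiter comes out ahead. Systematic re-checking would only pay off if the values could drift over time.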
"The goal was to earn as much candy as possible over 100 trials. (The children could turn their virtual candies into real stickers at the end of the experiment.)"
"As expected, the adults learned quickly which creature gave the most candies and selected that creature 86 percent of the time. But children selected the highest-reward creature only 43 percent of the time."
"And it wasn't because the children didn't realize which choice would reap them the largest reward. In a memory test after the study, 20 of 22 children correctly identified which creature delivered the most candy."
"When they didn't click on the option with the highest reward, they were most likely to go through the other choices systematically, to ensure they never went too long without testing each individual choice."
"In a second study, the game was similar but the value of three of the four choices was visible -- only one was hidden. The option that was hidden was randomly determined in each trial, so it changed nearly every time. But the values of all four choices never changed, even when it was the hidden one."
"Like in the first experiment, the 37 adults chose the best option on almost every trial, 94 percent of the time. That was much more than the 36 4- and 5-year-old children, who selected the highest-value option only 40 percent of the time."
"When the hidden option was the highest-value option, adults chose it 84 percent of the time, but otherwise they almost never selected it (2 percent of the time)."
"Children chose the hidden option about 40 percent of the time -- and it didn't matter if it was the highest value one or not."Young children would rather explore than get rewards
Young children will pass up rewards they know they can collect to explore other options, a new study suggests. Researchers found that when adults and 4- to 5-year-old children played a game where certain choices earned them rewards, both adults and children quickly learned what choices would give them the biggest returns. But while adults ...news.osu.edu