Published on January 16th, 2014 | by Emily Corbett
Bored Robots Learning to Look Around
By Jake Fountain, University of Newcastle
This student took part in the 2012/13 AMSI Vacation Research Scholarship program. For more information on this year's program, please click here.
Without looking away from this page, visualise your surroundings. If you are human, you will probably find this easy to do. As you arrived at wherever you may be right now, you likely observed your surroundings without even thinking about it. You developed an image in your head about where things are around you, allowing you to navigate with confidence.
This ability to remember where we are in our environment is actually quite sophisticated, and, as it turns out, robots have a lot of trouble with it. Robots which play soccer must be able to figure out where they are on the field, where the ball is and which end of the field is their end. In order to keep track of this information they must use a camera to measure the positions of objects. But robots which have human-like vision cannot see all objects at once due to a limited field of view. So which objects should the robot look at to get the most information?
To solve this problem we used a technique called reinforcement learning. In reinforcement learning, the robot learns which action is best from the rewards that humans give it. Think of it like training a dog. If a dog follows the command ‘sit’, we give the dog a treat so that the dog knows that sitting down when it hears the associated sound yields a reward. Similarly, if the robot can figure out where it is on the soccer field, we reward it, and the robot learns to choose which objects to look at so as to maximise this reward in the long term.
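To make the idea of learning from rewards concrete, here is a minimal sketch in Python. It is not the method used in the project, which the article does not describe in detail. It is a much simpler relative of reinforcement learning in which the robot keeps one running value per gaze target and mostly picks the highest-valued one. The action names, the learning rate, the exploration rate and the `reward_fn` callback are all assumptions made for illustration.

```python
import random

# Hypothetical gaze targets; the article does not list the robot's actual actions.
ACTIONS = ["ball", "own_goal", "opponent_goal"]


def update_value(values, action, reward, learning_rate=0.1):
    """Move the stored value of `action` a small step toward the observed reward."""
    values[action] += learning_rate * (reward - values[action])


def choose_action(values, epsilon=0.1, rng=random):
    """Usually pick the highest-valued action; with probability epsilon pick at random."""
    if rng.random() < epsilon:
        return rng.choice(list(values))
    return max(values, key=values.get)


def train(reward_fn, steps=2000, seed=0):
    """Learn one value per action from the rewards returned by `reward_fn(action)`."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in ACTIONS}
    for _ in range(steps):
        action = choose_action(values, rng=rng)
        update_value(values, action, reward_fn(action))
    return values
```

Calling `train` with a reward function that pays more for looking at the ball than at the goals leaves the ball with the highest learned value. The small chance of a random choice in `choose_action` is one common way to keep a learner from settling on the first rewarding action it finds.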
We found, however, that after we rewarded the robot for looking at, say, a goal post, it would stare at it, reaping the rewards without considering that other actions might give more reward. This would cause it to lose track of the ball. To fix this problem, we implemented a motivation reward system. This involves generating an additional reward for the robot based on how ‘interesting’ an action was. The amount of interest is calculated based on how new an action is; actions which are similar to, but still different from, the last action are the most interesting. So the robot would become ‘bored’ of looking at the goal post after a while and try looking at the ball, or maybe a different goal post.
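One way to picture an interest reward of this kind is a curve that is low for a repeated action, highest for a moderately new one, and low again for something completely unfamiliar. The sketch below is an illustration only: the article does not give the formula that was used, so the bell-shaped curve, the representation of a gaze action as a single angle, and the parameters `peak`, `width` and `interest_weight` are all assumptions.

```python
import math


def novelty(action, last_action):
    """How different two gaze angles are, clipped to the range 0 to 1 (assumed scale)."""
    return min(abs(action - last_action), 1.0)


def interest(novelty_value, peak=0.5, width=0.2):
    """Hypothetical interest curve: largest at moderate novelty, small at both extremes."""
    return math.exp(-((novelty_value - peak) ** 2) / (2 * width ** 2))


def total_reward(task_reward, action, last_action, interest_weight=0.5):
    """Add the interest bonus to the reward earned for the task itself."""
    bonus = interest(novelty(action, last_action))
    return task_reward + interest_weight * bonus
```

With these assumed settings, repeating the last gaze angle earns almost no bonus, so a robot that keeps staring at the same goal post sees its total reward fall behind what it could earn by shifting its gaze a moderate amount.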
Looking not just for the most useful objects, but also for the most interesting ones, allowed the robot to know its position more accurately while playing soccer. This gives us a little insight as to why humans may look at interesting things even though they may not be useful. Boredom drives us to search for new facts about our environment while we hold on to what we already know.