

For robots to learn from people with no machine learning expertise, they should learn from natural human instruction. Most machine learning techniques that incorporate explanations require people to use a limited vocabulary and to provide state information, even when doing so is not intuitive. This paper discusses a software agent that learned to play the Mario Bros. game using explanations. We had two goals for improving learning from explanations: to filter explanations into advice and warnings, and to learn policies from sentences without state information. We used sentiment analysis to filter explanations into advice about what to do and warnings about what to avoid. We developed Object-focused advice to represent the actions the agent should take when dealing with objects. A reinforcement learning (RL) agent used Object-focused advice to learn policies that maximized its reward. After mitigating false negatives, using sentiment as a filter was approximately 85% accurate. The agent performed better with Object-focused advice than with no advice, learned where to apply the advice, and could recover from adversarial advice. We also found that the method of interaction should be designed to ease the cognitive load of the human teacher; otherwise the advice may be of poor quality.
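As a rough illustration of the filtering step, the sketch below splits sentences into advice and warnings using a tiny hand-written sentiment lexicon. The word lists, the scoring rule, and the function names are assumptions made for illustration only; they are not the sentiment analyzer or the data used in this work.

```python
import re

# Hypothetical toy lexicon; a real system would use a trained sentiment model.
POSITIVE = {"good", "collect", "grab", "get", "safe", "helpful"}
NEGATIVE = {"avoid", "bad", "dangerous", "die", "hurt", "never", "don't"}


def sentiment_score(sentence):
    """Return (# positive words) - (# negative words) for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def filter_explanations(sentences):
    """Split sentences into advice (score > 0) and warnings (score < 0).

    Sentences with a score of exactly 0 are dropped; this is an arbitrary
    choice for the sketch, not a rule taken from the paper.
    """
    advice, warnings = [], []
    for sentence in sentences:
        score = sentiment_score(sentence)
        if score > 0:
            advice.append(sentence)
        elif score < 0:
            warnings.append(sentence)
    return advice, warnings
```

Under these assumptions, a sentence such as "Collect the coins, they are good." would be kept as advice, while "Avoid the turtles, they are dangerous." would be kept as a warning.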

