A Blog by Jonathan Low

 

Dec 29, 2019

AI That Improves When You Smile?

It's called positive affectivity. Hey, it's just personal data, nothing really valuable, so you should be smiling. Right? JL

Kyle Wiggers reports in Venture Beat:

Positive affectivity, or the characteristic that describes how people experience affects (e.g., sensations, emotions, and sentiments) and interact with others as a consequence, has been linked to increased interest and curiosity as well as satisfaction in learning. Inspired by this, a team of Microsoft researchers propose imbuing reinforcement learning, an AI training technique that employs rewards to spur systems toward goals, with positive affect, which they assert might drive exploration useful in gathering experiences critical to learning.
Positive affectivity, or the characteristic that describes how people experience affects (e.g., sensations, emotions, and sentiments) and interact with others as a consequence, has been linked to increased interest and curiosity as well as satisfaction in learning. Inspired by this, a team of Microsoft researchers propose imbuing reinforcement learning, an AI training technique that employs rewards to spur systems toward goals, with positive affect, which they assert might drive exploration useful in gathering experiences critical to learning.
As the researchers explain, reinforcement learning is commonly implemented via policy-specific rewards designed for a predefined goal. Problematically, these extrinsic rewards are narrow in scope and can be difficult to define, as opposed to intrinsic rewards that are task-independent and quickly indicate success or failure.
In pursuit of an intrinsic policy, the researchers developed a framework comprising mechanisms motivated by human affect — one that motivates agents by drives like delight. Using a computer vision system that models the reward and another system that uses data to solve multiple tasks, it measures human smiles as positive affect.
The framework encourages agents to explore virtual or real-world environments without getting into perilous situations, and it has the advantage of being agnostic to any specific machine intelligence application. A positive intrinsic reward mechanism predicts human smile responses as the exploration evolves, while a sequential decision-making framework learns a generalizable policy. As for the positive intrinsic affect model, it changes the action selection such that it biases actions providing better intrinsic rewards, and a final component uses data collected during the agent’s exploration to build representations for visual recognition and understanding tasks.
Microsoft smile AI
To test the framework, the researchers collected data from five subjects tasked with exploring a digital three-dimensional maze with a vehicle, as well as synchronized footage of each of their faces. (Every person drove for 11 minutes each, providing a total of 64,000 frames.) Participants were told to explore the environment but were given no additional instruction about other objectives, and their smile responses were calculated and recorded by an open source algorithm.
The affect-based intrinsic motivation model was trained using the subjects’ data, with image frames from the vehicle’s dashboard serving as the input and the smile probability serving as the output. The results of further experiments show that the framework improved safe exploration while at the same time enabling efficient learning; compared with baselines, the researchers’ intrinsic reward policy covered 46% more space in the maze and collided with obstacles 29% less of the time.
“Here we were not attempting to mimic affective processes, but rather to show that functions trained on affect like signals can lead to improved performance,” wrote the coauthors of the paper detailing the work. “In summary, we argue that such an intrinsically motivated learning framework inspired by affective mechanisms can be effective in increasing the coverage during exploration, decreasing the number catastrophic failures, and that the garnered experiences can help us learn general representations for solving tasks including depth estimation, scene segmentation, and sketch-to-image translation.”

2 comments:

ishhu said...

thank you for sharing this using the subjects data, with image frames from the vehicle’s dashboard serving because the input and therefore the smile probability serving because the output. The results of further experiments show that the framework improved safe exploration while at an equivalent time enabling efficient learning; compared with baselines, the researchers’ intrinsic reward policy covered. wrote the coauthors of the paper detailing the work. “In summary, we argue that such an intrinsically motivated learning framework inspired by effective mechanisms are often effective in increasing the coverage during exploration, decreasing the amount catastrophic failures, i'm working Cheap Essay Writing Service company which the garnered experiences can help us learn general representations.

Mason Dillon said...

Hello, I would like to thank you for sharing this interesting material. The information you mentioned about smiling is rather interesting. I was actually looking for essay pros sample task.

Post a Comment