A Blog by Jonathan Low

 

Apr 4, 2019

How Tracking Eye Movements Helps Computers Learn

Researchers are finding that, in an increasingly visually oriented technological world, eye-tracking data may provide better and faster guidance for helping AI systems learn than vast arrays of data do.

The implications for improving human education and training are equally profound. JL


Gregory Barber reports in Wired:

Researchers are looking for ways to make artificial neural networks more brainlike. A dataset that combines eye tracking with brain signals gathered from EEG scans revealed patterns that can improve how neural networks understand language. Adding data about how long eyes linger on a word helped the neural networks focus on critical parts of a sentence as a human would. Gaze was useful for identifying hate speech, analyzing sentiment, and detecting grammatical errors. Adding more information about gaze, such as when eyes flit between words to confirm a relationship, helped a neural network better identify places and people.
For our eyes, reading is hardly a smooth ride. They stutter across the page, lingering over words that surprise or confuse, hopping over those that seem obvious in context (you can blame that for your typos), pupils widening when a word sparks a potent emotion. All this commotion is barely noticeable, occurring in milliseconds. But for psychologists who study how our minds process language, our unsteady eyes are a window into the black box of our brains.
Nora Hollenstein, a graduate student at ETH Zurich, thinks a reader’s gaze could be useful for another task: helping computers learn to read. Researchers are constantly looking for ways to make artificial neural networks more brainlike, but brain waves are noisy and poorly understood. So Hollenstein looked to gaze as a proxy. Last year she developed a dataset that combines eye tracking and brain signals gathered from EEG scans, hoping to discover patterns that can improve how neural networks understand language. “We wondered if giving it a bit more humanness would give us better results,” Hollenstein says.
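To make the idea concrete, here is a minimal sketch (not the dataset's actual schema) of how per-word gaze features from an eye-tracking-plus-EEG reading corpus might be organized; the field names first_fixation_ms, total_reading_ms, and n_fixations are illustrative assumptions:

```python
# Illustrative only: a hypothetical layout for per-word gaze features paired
# with optional EEG features, loosely modeled on the kind of reading corpus
# described in the article. Field names are assumptions, not a real schema.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TokenGaze:
    token: str                # the word as read
    first_fixation_ms: float  # duration of the first fixation on the word
    total_reading_ms: float   # summed fixation time (longer = more processing effort)
    n_fixations: int          # how many times the eyes returned to the word

@dataclass
class Sentence:
    tokens: List[TokenGaze]
    eeg: Optional[List[List[float]]] = None  # per-word EEG features, if recorded

# Words that surprise or confuse a reader tend to show longer total reading
# times; words that are obvious in context are often skipped entirely.
example = Sentence(tokens=[
    TokenGaze("The", 80.0, 80.0, 1),
    TokenGaze("defendant", 210.0, 340.0, 2),
    TokenGaze("was", 0.0, 0.0, 0),          # skipped word
    TokenGaze("acquitted", 260.0, 430.0, 2),
])
```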
Neural networks have produced immense improvements in how machines understand language, but to do so they rely on large amounts of meticulously labeled data. That requires time and human labor; it also produces machines that are black boxes, and often seem to lack common sense. So researchers look for ways to give neural networks a nudge in the right direction by encoding rules and intuitions. In this case, Hollenstein tested whether data gleaned from the physical act of reading could help a neural network work better.
Last fall, Hollenstein and collaborators at the University of Copenhagen used her dataset to guide a neural network to the most important parts of a sentence it was trying to understand. In deep learning, researchers typically rely on so-called attention mechanisms to do this, but they require large amounts of data to work well. By adding data around how long our eyes linger on a word, the researchers helped the neural networks focus on critical parts of a sentence as a human would. Gaze, the researchers found, was useful for a range of tasks, including identifying hate speech, analyzing sentiment, and detecting grammatical errors. In subsequent work Hollenstein found that adding more information about gaze, such as when eyes flit between words to confirm a relationship, helped a neural network better identify entities, like places and people.
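One way to picture how gaze could guide an attention mechanism, as a sketch of the general idea rather than the authors' published method, is to treat normalized fixation durations as a soft target that the model's attention weights are encouraged to match during training:

```python
# A minimal PyTorch sketch of "gaze-supervised attention": normalized fixation
# durations nudge the model's attention weights toward the words a human
# reader dwelt on. Illustrative only; not the published architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GazeGuidedAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.attn_scorer = nn.Linear(embed_dim, 1)   # one attention score per token
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, token_ids):
        h = self.embed(token_ids)                                   # (batch, seq, dim)
        attn = F.softmax(self.attn_scorer(h).squeeze(-1), dim=-1)   # (batch, seq)
        pooled = torch.bmm(attn.unsqueeze(1), h).squeeze(1)         # attention-weighted sum
        return self.classifier(pooled), attn

def loss_with_gaze(logits, attn, labels, fixation_ms, gaze_weight=0.1):
    """Task loss plus a penalty for diverging from human reading times."""
    task_loss = F.cross_entropy(logits, labels)
    # Turn fixation durations into a probability distribution over the sentence.
    gaze_target = fixation_ms / fixation_ms.sum(dim=-1, keepdim=True).clamp_min(1e-9)
    gaze_loss = F.kl_div(attn.clamp_min(1e-9).log(), gaze_target, reduction="batchmean")
    return task_loss + gaze_weight * gaze_loss
```

In a setup like this, the gaze signal is only needed during training to shape the attention weights; at prediction time the model runs on text alone, which is part of what makes the approach appealing when labeled data is scarce.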
The hope, Hollenstein says, is that gaze data could help reduce the manual labeling required to use machine learning in rare languages, and in reading tasks where labeled data is especially limited, like text summaries. Ideally, she adds, gaze would be just the starting point, eventually complemented by the EEG data she gathered as researchers find more relevant signals in the noise of brain activity.
“The fact that the signals are there is I think clear to everyone,” says Dan Roth, a professor of computer science at the University of Pennsylvania. The trend in AI of using ever-increasing quantities of labeled data isn’t sustainable, he argues, and using human signals like gaze, he says, is an intriguing way to make machines a little more intuitive.
Still, eye tracking is unlikely to change how computer scientists build their algorithms, says Jacob Andreas, a researcher at Microsoft-owned Semantic Machines. Gaze data is difficult to gather, requiring specialized lab equipment that needs constant recalibration, and EEGs are messier still, involving sticky probes that need to be wet every 30 minutes. (Even with all that effort, the signal is still fuzzy; it’s much better to place the probes under the skull.) Most of the manual text labeling that researchers depend on can be done fast and cheaply, via crowdsourcing platforms like Amazon’s Mechanical Turk. But Hollenstein sees improvements on the horizon, with better webcams and smartphone cameras, for example, that could passively collect eye-tracking data as participants read in the leisure of their homes.
In any case, some of what they learn by improving machines might help us understand that other black box, our brains. As Andreas notes, researchers are constantly scouring neural networks for signs that they make use of humanlike intuitions—rather than relying on pattern matching based on reams of data. Perhaps by observing what aspects of eye tracking and EEG signals improve the performance of a neural network, researchers might begin to shed light on what our brain signals mean. A neural network might become a kind of model organism for the human mind.
