A Blog by Jonathan Low

 

Feb 6, 2025

Can "Automated Reasoning" Reduce, If Not Stop, AI Hallucinations?

One of the reasons corporate uptake of AI has been slower than expected is concern about AI's tendency to make stuff up, some of it wildly inaccurate. While that might be amusing or inconsequential in some circumstances, for businesses dealing with financial issues, new pharmaceuticals and other sensitive uses for which accuracy is essential, that is a huge disincentive. 

In the search for solutions, mathematically driven 'automated reasoning' may help. But can it reduce hallucinations somewhat? Maybe. Eliminate them entirely? "Never." JL

Belle Lin reports in the Wall Street Journal:

One of AI’s most intractable problems is its tendency to make up answers, and to repeat them. Automated reasoning uses mathematical proof to assure that a system will behave a certain way. The idea is that AI models can “reason” through problems; in this case, it is used to check that the models are providing accurate answers, using mathematical logic to encode knowledge in AI systems in a structured way and rule-based decision-making to reach conclusions. To more fully reduce hallucinations, companies also use retrieval-augmented generation, which connects AI models with external data sources, and fine-tuning, which customizes a model with company data. But can hallucinations be eliminated altogether? Automated reasoning says it is “undecidable. We will never 100% solve this.”

Amazon is using math to help solve one of artificial intelligence’s most intractable problems: its tendency to make up answers, and to repeat them back to us with confidence.

The issue, known as hallucinations, has been a problem for users since AI chatbots hit the mainstream over two years ago. They’ve caused people and businesses to hesitate before trusting AI chatbots with important questions. And they occur with any AI model—from those developed by OpenAI and Meta Platforms to those from the Chinese firm DeepSeek.

Now, Amazon.com’s cloud-computing unit is looking to “automated reasoning” to provide hard, mathematical proof that AI models’ hallucinations can be stopped, at least in certain areas. By doing so, Amazon Web Services could unlock millions of dollars worth of AI deals with businesses, some analysts say.

Simply put, automated reasoning aims to use mathematical proof to assure that a system will or will not behave a certain way. It’s somewhat similar to the idea that AI models can “reason” through problems, but in this case, it’s used to check that the models themselves are providing accurate answers.
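As a rough illustration of the concept (a minimal sketch, not AWS’s implementation), a solver such as the open-source Z3 can encode a policy as logic and test whether a model’s claimed answer is even consistent with it; the benefits rule and the numbers below are invented for the example.

```python
# Minimal sketch of an automated-reasoning-style check using the open-source
# Z3 SMT solver (pip install z3-solver). Illustrative only, not AWS's tool.
from z3 import Int, Solver, Implies, unsat

tenure_years = Int("tenure_years")    # fact about the employee
vacation_days = Int("vacation_days")  # value the AI model claimed

solver = Solver()
# Policy encoded as logic: employees with 5+ years of tenure get 25 vacation days.
solver.add(Implies(tenure_years >= 5, vacation_days == 25))
# The scenario plus the model's answer: 6 years of tenure, "20 vacation days."
solver.add(tenure_years == 6, vacation_days == 20)

# unsat means the claim cannot be true under the policy, so flag it as a hallucination.
if solver.check() == unsat:
    print("Claim contradicts the policy; block or correct the answer.")
else:
    print("Claim is consistent with the policy.")
```

Because the check is a proof over the encoded rules rather than a statistical guess, a contradiction here is a guarantee, not a probability.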

A 2,000-year-old discipline

Automated reasoning is a branch of AI that’s part of “symbolic AI,” a field rooted in the 2,000-year-old mathematical logic work of Socrates and Plato, according to AWS Vice President and Distinguished Scientist Byron Cook, a leader in the field. It’s a relatively obscure discipline that counts only about 3,000 practitioners in total around the world, Cook said. 

Symbolic AI is the use of mathematical logic to encode knowledge in AI systems in a structured way, and it uses rule-based decision-making to reach conclusions, Cook said. That’s different from machine learning, which involves teaching a machine to deduce patterns from large amounts of data. As a sign of its faith in automated reasoning, Amazon has hired most of the experts specializing in the field over the past decade, including 97 interns with doctorates last year, and hundreds in total, Cook said.

“We’ve hired so many people that we’re in danger of the university system not being able to hire enough people to teach,” he said.

Why does Amazon find math and AI so useful? Because it can provide a sort of guarantee, or mathematical proof, that a program or software is working the way it’s intended.

The first application of automated reasoning for AWS was in cybersecurity, Cook said, where the technology helped “prove the correctness” of the company’s cryptography for business customers. That kind of proof led to more and more of customers’ data and applications being moved to Amazon’s cloud, Cook added.

AWS’s tool for AI hallucinations, called Automated Reasoning Checks, aims to provide assurances of truth to customers, especially when they’re using it in “circumstances where it really, really matters,” Cook said.

Selling accuracy

For AWS and its peers, selling AI to businesses has been a challenge. The technology has remained relatively unreliable. That has caused many chief information officers to say they can’t cede control of key business decisions to the technology, which could spit back completely wrong, made-up answers. And it’s important for AI to be right when businesses are expected to show that their AI investments are generating returns.

 

To use the AWS tool, customers first need to define a set of policies that serves as the absolute truth. That could be a company’s internal guidebook on employee benefits, or its product information for customer service staff. Automated Reasoning Checks also work with Bedrock Guardrails, AWS’s product with enterprise-friendly safeguards such as filters that block inappropriate content.

PricewaterhouseCoopers is using Automated Reasoning Checks to help stop hallucinations for customers in regulated industries like pharmaceuticals and life sciences, according to Matt Wood, the firm’s commercial technology and innovation officer.

When marketing new drugs, for instance, customers need to make sure they’re not running afoul of regulations governing what can be advertised. Without Automated Reasoning Checks, AI might be prone to help customers accomplish the goal of generating advertising—not meeting regulatory requirements, Wood said. Beyond areas where policies or rules can be clearly defined, however, AWS’s automated reasoning tool is more limited, according to Rowan Curran, an analyst focused on AI at market research and IT consulting firm Forrester.

For businesses, it’s best to consider automated reasoning as one component of a multi-pronged approach to helping eliminate hallucinations, he said, and not the singular “silver bullet.”

Microsoft CEO Satya Nadella speaks at the Microsoft Ignite conference. Microsoft and Google also offer tools to reduce the likelihood of hallucinations for business customers. Photo: Charles Rex Arbogast/Associated Press

“It’s more work upfront, for less work when something terrible happens and the application exposes company data, or it gives an offer to a customer that costs you millions of dollars,” he said.

To more fully reduce hallucinations, companies should also use tools like retrieval-augmented generation, or RAG, which is a method of connecting AI models with external data sources, and fine-tuning, a method of customizing a large language model with private or company data.
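As a minimal sketch of the RAG idea (a toy keyword retriever stands in for a real vector database, the documents are invented, and the final LLM call is left as a placeholder):

```python
# Toy retrieval-augmented generation (RAG): fetch the most relevant policy text,
# then ground the model's answer in it. Real systems use embeddings and a vector store.
documents = [
    "Employees with five or more years of tenure receive 25 vacation days.",
    "Customer refunds over $500 require manager approval.",
]

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    # Score documents by word overlap with the question (embedding-free stand-in).
    q_words = set(question.lower().split())
    return sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)[:k]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question, documents))
    # In practice this prompt would be sent to the LLM of your choice.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How many vacation days do long-tenured employees get?"))
```

Fine-tuning complements this by baking company-specific language and facts into the model's weights, rather than supplying them at query time.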

Amazon’s competitors, Microsoft and Google, also offer tools that aim to reduce the likelihood of hallucinations for business customers. Future versions of AWS’s hallucination-reduction tools will include a combination of techniques like RAG and automated reasoning, Cook said.

Jason Gelman, a director of product management for Google’s Vertex AI, said the company doesn’t use automated reasoning. But model accuracy—and some form of hallucination mitigation—is an imperative for AI agents. “If the foundation is shaky, then you can’t build on top of it,” he said.

But can hallucinations be eliminated altogether? Automated reasoning says that problem is “undecidable,” according to Cook. “We will never one hundred percent solve this,” he said, but tools can be built that provide correct answers.
