A Blog by Jonathan Low


Aug 3, 2023

96 Percent of AI Professionals Say Human Expertise Crucial To AI Success

AI professionals increasingly say that human expertise will be crucial to generative AI success in improving the quality of data sets, without which generative AI will be considered too failure-prone and risky. JL

VentureBeat reports:

96% of AI professionals say human labeling is important to the success of ML/AI data models and 86% call it essential. The significance of data quality has become central to industry conversations. Human expertise is especially crucial in the next era of customer-ready AI that will change the way customers interact with the technology. "You need human intelligence behind use cases both from the developer side and the company side. It's accepted that human annotation is a big expense at the beginning, structuring data sets, and that structuring needs to be vetted and quantified." Humans in the loop are crucial to success, but they're also critical as ways to limit failures, or even liability.

The coming explosion of new businesses and technologies, all of which will be built upon the promise of generative AI, will transform how we work. But is it ready for prime time? The process of building and developing these intelligent technologies the right way will rely upon MLOps fundamentals, strong data management, and human expertise for success, says Jeff Mills, CRO at iMerit.

“As Andrew Ng said, companies need to move from a model-centric approach to a data-centric approach to make AI work, and that especially holds true for generative AI,” Mills says. “A model is only as good as its data. Without good data going into these models, the models themselves simply won’t work — and one of the fundamentals of MLOps is good data.”

MLOps streamlines the way AI applications are developed, deployed and optimized for ongoing value, while MLDataOps, a subset of MLOps, unlocks the ability to source and create good data at scale, and then build models that are precise and stand up to rigorous train-and-test cycles. From the start, it ensures the sustainability of an AI project, and ultimately its likelihood of going into production.

The significance of data quality has become central to industry conversations. iMerit’s recent study on the state of MLOps surveyed AI, ML and data practitioners across industries and found that 3 in 5 consider higher-quality training data to be more important than higher volumes of training data for achieving the best outcomes from AI investments — and about half said lack of data quality or precision is the number-one reason their ML projects fail.

It all comes down to human expertise — which is especially crucial as we enter the next era of customer-ready AI that will change the way customers interact with the technology, Mills says.

“As the masses adopt this technology at a whole new level, you need human intelligence behind the use cases both from the developer side and the company side,” he says. “It’s already widely accepted that human annotation is a big chunk of the expense at the beginning, structuring data sets, and that structuring needs to be vetted and quantified. But as that moves into deeper parts of production, you start to have different layers of human-in-the-loop because of the edge cases that will inevitably be happening.”

Not just human-in-the-loop, but expert-in-the-loop for the next wave of AI

The vast majority of AI professionals agreed on the need for human integration, with 96% saying human labeling is important to the success of their ML/AI data models and 86% calling it essential. From beginning to end, as models progress through the production pipeline, it’s become clear that human expertise is crucial for ensuring that the data used to refine and enhance models is of the highest quality.

It’s even more important as we hit a new cycle of transformative technology, in which a core technology unleashes a wave of creative, innovative solutions that re-invent industries.

“I think we’re about to have an explosion of startups coming in and building a thin UX and UI layer on top of these generative models,” Mills explains. “But as you start to think about what application you’re building on top of a general model, then you’re going to have to start tuning your data.”

Mills points to the example of using SAM (Segment Anything Model), an open-source semantic segmentation tool from Meta AI, to build a medical AI application that looks for tumors in lung tissue scans. This gives a developer a head start in building that application, but it also requires a great deal of finessing from there, and that all comes down to data.

“It’s already trained on some base data pretty well. Trees, it probably can figure out pretty quick. Stop signs, it probably has a good idea. Tissue scanning tumors in lungs? Probably not yet,” he explains. “So, it’s not just humans-in-the-loop, it’s experts-in-the-loop, maybe even a medical doctor.”

Mitigating risks and liability with generative AI

Humans in the loop are crucial to success, but they're also critical for limiting failures, or even liability – which, Mills says, is going to be a big factor with generative AI, as it is for any technology that makes a big leap.

Much of it stems simply from the level of experience the practitioners bring to working on the data that help build these systems. Some comes from ensuring systems are secure, and sensitive PII is protected.

And a lot of it goes back to how good the data quality is — particularly as you consider the difference between recommending a television show or a restaurant and autonomous driving or identifying tumorous tissue.

“The impact AI will have on the people who use it, or are affected by it, will start to grow exponentially, and so will the potential liability,” Mills explains. “As systems scale and automation becomes essential, human oversight will only grow more important. And as models get more complex, humans who are experts will start to have to come in earlier and earlier, all the way to the conception and design stage of a model.”

It's especially critical as judgment calls become more and more subjective. At what point should quality control kick in? In content moderation, is a model's 80% confidence enough to decide it's okay for a child to come across a post? Even for something as relatively low-stakes as a pizza recommendation engine, who decides what constitutes good pizza?
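The confidence-threshold question above can be sketched in a few lines of Python. The labels, scores, and 0.80 cutoff here are illustrative assumptions, not a real moderation policy: the point is that predictions below the threshold get escalated to a human (or expert) reviewer rather than acted on automatically.

```python
def route_prediction(label, confidence, threshold=0.80):
    """Auto-accept a model prediction above the threshold;
    otherwise escalate it to a human reviewer."""
    if confidence >= threshold:
        return ("auto", label)
    return ("human_review", label)

# Hypothetical moderation outputs: (predicted label, model confidence).
predictions = [("safe", 0.97), ("safe", 0.62), ("unsafe", 0.88)]

routed = [route_prediction(lbl, conf) for lbl, conf in predictions]

# Low-confidence items land in the human-review queue.
queue = [p for p in routed if p[0] == "human_review"]
```

Where the threshold sits is exactly the subjective judgment call Mills describes: it encodes how much risk the team is willing to automate away.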

“Quality control in production comes in at the guideline creation phase, which not many people realize, in part because it might not be the sexiest stage of development,” Mills says. “It’s taking it a step beyond the need for good data, and diving into the need for subjective data — what could even be called biased data.”

He returns to the pizza recommendation engine analogy. You want to eliminate wild data points for a New York City audience, such as a pizza restaurant in Italy. Beyond that, the model should draw on data from experts across all five NYC boroughs, and specifically exclude data from, for instance, Chicago, a city with very different ideas of what "pizza" actually entails.
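That kind of intentional curation can be sketched as a simple data filter. The restaurants, locations, and ratings below are invented for illustration; the point is that "bias" here is an explicit, documented inclusion rule, not an accident of the data:

```python
# Hypothetical training records: (restaurant, location, rating).
records = [
    ("Joe's Pizza", "Manhattan", 4.8),
    ("L&B Spumoni Gardens", "Brooklyn", 4.7),
    ("Lou Malnati's", "Chicago", 4.9),   # deep dish: wrong style for this audience
    ("Da Michele", "Naples", 4.9),       # out-of-region outlier
]

# The intentional "bias": only the target audience's region is in scope.
NYC_BOROUGHS = {"Manhattan", "Brooklyn", "Queens", "Bronx", "Staten Island"}

curated = [r for r in records if r[1] in NYC_BOROUGHS]
```

The Chicago and Naples rows are dropped not because the data is bad, but because it answers a different question than the one the model is being built for.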

“Bias, or subjectivity, is bad when it’s unintentional — and it’s critical when it’s intentional,” he says. “If I’m looking for thin-crust New York pizza, that’s a bias. I want my algorithm super-biased. If I’m getting operated on, I want my surgeon to be very biased in his understanding of the theory and practice behind the procedure that’s going to save my life. I want him to be very authoritative.”

The ethics of generative AI

Ethical AI isn’t just about avoiding liability — it’s about prioritizing the problems that need to be solved, ahead of, or at least alongside, the potential for revenue.

“Developers need to nail down the real problem they’re solving for, and whether it actually needs to be solved, and why,” Mills says. “It’s not just about whether a technology is an ethical pursuit, it’s about how that algorithm is going to be used. All good things can be used with bad intentions.”

It's not just the end product that matters, but how that final solution is built: the developer's end goal, the way the technology is assembled, and the resources being used. When those resources are human, are you working human-in-the-loop to make sure what you're building is good and safe?

This extends right down to how those humans involved are compensated for their work and their expertise. Is their value being recognized and rewarded — or are they being asked to deal with difficult material without the right guidance or support? And while treatment of resources is an ethical issue in and of itself, it has an ultimate impact on the AI you’re building.

“Just as a hospital offers support to its healthcare workers, you have to make sure you take care of those people on the front lines of a new technology,” Mills says. “This is a whole new arena in which we need to ensure that workers are being treated fairly, from wages and hours to mental health support and more. It’s as fundamental to its success as the data.”

