A Blog by Jonathan Low


Oct 11, 2017

How Artificial Intelligence Makes Data Science More Productive

The progression from big data to analytics to machine learning to artificial intelligence is designed, in the operational and financial context, to make information more productive more quickly, optimizing the knowledge gleaned by enhancing its accuracy and applicability.

That, in turn, makes the enterprises using such tools more productive as well. JL

Walter Frick interviews Hilary Mason in the Harvard Business Review:

AI uses machine learning and deep learning to enable applications. Deep learning makes data accessible that was previously inaccessible. Software engineering processes fail when they encounter uncertainty, but data science is an experimental process that allows for uncertainty, and companies are investing in a portfolio of initiatives, some of which will generate more value than incremental product improvements. Product data science improves the product: this is where recommendation systems, search algorithms, and data visualization come in. R&D uses data to open new product, new business, and new revenue opportunities.
But how will we put these technologies into practice? Where in the organization will these new capabilities sit, and how will companies take advantage of them?
To get a practical, on-the-ground view, HBR senior editor Walter Frick spoke with Hilary Mason, the founder of Fast Forward Labs, a machine intelligence research firm. Here are excerpts from their conversation.
HBR: AI is a hot topic right now. As a data scientist and a researcher, how do you think about the recent progress in your field?
Mason: If we were having this conversation eight or 10 years ago, it would have been about big data — about whether we could even build the infrastructure to get all the data into one place and to query it. Once you can do that, you can do analytics, which is essentially counting things to answer questions that have business value or product value. People could always count things in data, but the change we saw about eight years ago was that new software made doing it affordable and accessible for a wide variety of people who never could do it before.
And that led to the rise of data science, which is about counting things cleverly, predicting things, and building models on data. Because that modeling was now so much cheaper, it was applied not just to very high-value problems, like actuarial science, but to things that may seem fairly trivial, like recommendations, search results, and that kind of stuff.
Then we had machine learning, which is a set of tools inside data science that let you count things cleverly and incorporate feedback loops. We began using the models to get more data from the world, and then fed the data back into those models so that they improved over time.
Now, today, we talk about AI. The term itself is a little bit loose — it has both a technical meaning and a marketing meaning — but it’s essentially about using machine learning, and specifically deep learning, to enable applications that are built on top of this stack. That means that you can’t do AI without machine learning. You also can’t do machine learning without analytics, and you can’t do analytics without data infrastructure. And so that’s how I see them all being related.
HBR: How do machine learning and AI fit into companies’ existing data capabilities?
Data science is used in multiple ways inside an organization, and a really common mistake I see people make in managing it is assuming that because it runs on one tech stack, it’s just one thing. But I’d break it down into three capabilities, all of which rely on the same technology. The first capability is understanding the business. That’s analytics, or business intelligence — being able to ask questions and analyze information to make better decisions. It’s usually run out of the CFO or COO’s office. It’s not necessarily a technical domain.
The second capability is product data science: building algorithms and systems — which may use machine learning and AI — that actually improve the product. This is where things like spam filters, recommendation systems, search algorithms, and data visualization come in. This capability usually sits under a line of business and is run out of product development or engineering.
The last data capability is one that tends to get neglected or lumped in with product data science. It’s an R&D capability — using data to open up new product, new business, and new revenue opportunities.
HBR: Are all three capabilities changed by machine learning and AI?
Let’s take a moment and look more closely at what deep learning offers, since it’s central to a lot of what people now call AI and is a big part of the progress in machine learning in recent years. First, deep learning makes data accessible that was previously inaccessible to any kind of analysis — you can actually find value in video and audio data, for example. The number of companies that have a large amount of that kind of data is still fairly small, but I do think it’s likely to increase over time. Even analytics is impacted by the ability to use image data rather than just text or structured data. Second, deep learning enables new approaches to solving very difficult data science problems — text summarization, for example. Deep learning allows you to create predictive models at a level of quality and sophistication that was previously out of reach. And so deep learning also enhances the product function of data science because it can generate new product opportunities. For example, several companies are using deep learning very successfully in e-commerce recommendation systems. Then, of course, deep learning affects the R&D function by pushing the frontier of what is technically possible.
HBR: So data science is about analytics, product development, and R&D. Is this a walk-before-you-run situation? Or should companies attempt all three at once?
It’s a little bit of both. You’ll leave opportunities on the table if you pursue only one of these use cases. However, it really helps to get your infrastructure and analytics piece to be fairly solid before jumping into R&D. And in practice we see that people are much more comfortable investing in cost-saving initiatives before they invest in new revenue opportunities. It’s just more culturally acceptable.
HBR: What other mistakes do you see companies making in their data science efforts?
A big one involves process. We’ve noticed that people shoehorn this kind of stuff into the software engineering process, and that doesn’t work. Developing data science systems is fundamentally different in several ways. At the outset of a data science project, you don’t know if it’s going to work. At the outset of a software engineering project, you know it’s going to work.
This means that software engineering processes fail when they encounter uncertainty. By contrast, data science requires an experimental process that allows for uncertainty.
Also, every company has its own cultural hurdle to get over. A lot of companies aren’t places where you can work on something that doesn’t succeed, so the poor data scientists who do the risky research projects end up getting penalized in their annual reviews because they worked on something for two months that didn’t pay off, even though they did great work. Data science requires having that cultural space to experiment and work on things that might fail. Companies need to understand that they’re investing in a portfolio of initiatives, some of which will eventually pay off, generating dramatically more value than incremental product improvements do.
HBR: How do you navigate all the buzz around this topic, and how do you recommend executives do so?
I remain a relentless optimist about the potential of what we’re now calling AI, but I’m also a pragmatist in the sense that I need to deliver systems that work to our clients, and that is quite a constraint. There are some folks running around making claims that are clearly exaggerated and ridiculous. In other cases, things that a few years ago we would have called regression analysis are now being called AI, just to enhance their value from a marketing perspective. So my advice is to keep in mind that there is no magic. At a conceptual level nothing here is out of reach of any executive’s understanding. And if someone is pitching you on an idea and says, “I don’t want to explain how it works, but it’s AI,” it’s really important to keep asking: How does it work? What data goes in? What patterns might be in the data that the system could be learning? And what comes out? Because what comes out of a deep learning system is generally just a previously unlabeled data point that now has a label, along with some confidence in that label, and that’s it. It’s not intelligent in the sense that you and I are — and we’re still a long, long way away from anything that looks like the kind of intelligence that a human has.
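Mason's description of what comes out of a deep learning system, a label plus some confidence in that label, can be made concrete with a short sketch. This is illustrative Python and not from the interview: the function names, the two labels, and the raw scores are assumptions, and softmax is just one common way a classifier's raw scores are turned into a confidence.

```python
import math


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(scores, labels):
    """Return the most likely label and the probability assigned to it."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical raw scores a trained classifier might emit for one email.
# Both the labels and the numbers are made up for illustration.
labels = ["spam", "not spam"]
scores = [2.0, 0.5]

label, confidence = label_with_confidence(scores, labels)
print(label, round(confidence, 3))
```

Running it prints the winning label and a probability between 0 and 1. Nothing else comes out, which is why the questions "What data goes in?" and "What comes out?" are worth asking of any system described as AI.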

