A Blog by Jonathan Low


Sep 20, 2016

The Not-So-Secret Life Of A Data Scientist

Ultimately, the most important tasks for data scientists may be to make their craft more accessible to the quantitatively-challenged in order to grow the market. JL

Lutz Finger reports in Forbes:

Data means nothing by itself: data has to bring insights one can act on; otherwise, data is just pain. A big job of anyone in data is to abstract from the detailed discussion of data so that non-data-savvy folks can participate.
Do you want to work in data? What does this mean? For the last three years I was at one of the coolest companies in the data space: LinkedIn! LinkedIn has amazing data assets and has been at the forefront of data science. I want to share what I see as the three key tasks of a Director of Data Science: insights, tools and innovation.
Insights, Insights, Insights
Data means nothing by itself: data has to bring insights one can act on; otherwise, data is just pain. Data has to answer questions such as: Why do we have more sign-ups today? Why is usage up or down? Think of traditional business intelligence, but with the agility and quick turnarounds enabled by today’s big data tooling. One example of how to create insights is to measure public reactions to our products: how, and how often, do customers talk about them? Very early on, my team and I built a tool called “Voices” that collected public reactions to our products and analyzed them against set metrics in order to improve those products. (See a talk on what is ‘under the hood’ of our text mining platform.)
But insights are often nimble and very detail-oriented. A big job of anyone in data is to abstract from the detailed discussion of data so that non-data-savvy folks can participate. You may want to call this “storytelling”, even though the “story” in this case is not fictional. Data should form the foundation, but our focus as data scientists is to arrive at a simple action, no matter how much data was used in the process. (See my book “Ask, Measure, Learn”.)
As an example, take our vision at LinkedIn, which is to create economic opportunity. (Read about the value of social networks.) I was fortunate to spearhead a partnership between LinkedIn and the World Economic Forum, where my team helped them find insights into the skills marketplace. Using our data, we could show that a job is not well described by its title, but rather by the collection of skills of the people who work in it. Thus, a marketing director means something completely different at Procter & Gamble than in the oil and gas industry. Once you have broken jobs down into skills, those skills become tradable goods: you have a skill and you offer it to an employer. The picture below shows the top 100 talent flows worldwide. These kinds of insights can help governments and universities understand what kind of education they need to invest in. That is data insight at its best. (Read more about LinkedIn’s data and the World Economic Forum.)
Top 100 Talent Flows, by LinkedIn & World Economic Forum Human Capital Report 2016
But finding actionable insights is not always easy. There is the well-known example of AltaVista, which lost all of its market share to Google, not because it did not have the data, but because it did not find the right actionable insight. Many may find it surprising that data scientists struggle with this, but the reason is actually simple: managers who understand the needs of the business might not understand data, while data scientists might not understand business needs. Crossing this chasm has been a challenge for many companies.
One way to do this is to educate managers more broadly about data. This was one of the reasons that I started to teach data science to MBA students at Harvard Business School & Cornell. Another option is to make data so easy that even business people can use it without coding or complex models. Just as you do not wonder about the underlying data or algorithms that suggest your next movie on Netflix, you as a business manager should not have to worry about the technology behind your data. At LinkedIn, my team standardized the way we show insights and built a platform that now serves thousands of business users weekly. The insights are typically a chart or a table created by a dashboarding tool and output to a PowerPoint file, ready to be used by the business owner. The key to its success was to decouple the content, and thus the insights, from the engineering platform in order to scale it up.
The tools space is surely bigger than just an insights platform. The data tooling space has evolved rapidly (see my review of Gartner’s Magic Quadrant for data tools), but there is still room for improvement. Thus, many tech companies are building their own tools, and as someone in data you will need to push the boundary in terms of speed, accuracy and simplicity. One project worth calling out is LinkedIn’s unified messaging platform, built to facilitate the creation of streamlined and consistent metrics. I see many companies now starting to evolve in the same direction in order to control and audit the data trail.
Often when we talk data, we talk about recommendation systems (read about the reasons we love recommender systems). Systems that recommend have become a competitive advantage in healthcare, telecommunications, retail, media, energy, and many other industries. But often it is not the actual algorithm, the most ‘data science’-like task, that makes or breaks a new product; it is either the missing business application or the missing data (read why so many data products fail).
Thus, how should you embrace data and innovation? Believe it or not, for many companies the answer is actually quite easy: go down the beaten path. There are many known use cases for data, such as churn prediction, territory planning, customer segmentation, and others. Use the services and expertise of commercial companies in these areas and, voilà, you are data driven (read about predictive analytics as a platform). For example, if you want to hire the right talent, use LinkedIn. Becoming data driven can be that simple.
The harder part comes once you have already been down the beaten path: you need to foster innovation using your own data assets. At LinkedIn we took several approaches to foster innovation, from hack days to agile iterations (see our Strata talk). One way to foster innovation is to seek collaborations with universities; thus we kicked off the Economic Graph Challenge, where we asked universities to propose how LinkedIn’s data could create economic opportunity. Over the last 1.5 years my team did stunning work here, from predicting the growth of a city to understanding how much the LinkedIn social graph of companies describes their company culture and thus their stock market share price.
This is what you should do if you work in data: bring business insights, technical expertise and innovation together to create a competitive advantage for your business. And, well, don’t forget the fun part (such as this pop-up beer garden party with my team).
Lutz Finger leads a Data Team at LinkedIn and is Data Scientist in Residence at Cornell. He teaches "Critical Thinking" at Harvard and has co-authored "Ask, Measure, Learn".

