After 6 years in Data Science, you’d think I’d be able to confidently answer the question “Data Scientist huh… so what do you do?” I usually respond with the pithy phrase “I help people make better, faster decisions.” Ironically, I can’t really prove that I’ve done this with enough consistency to justify “Data Science” as my career. Maybe in the context of analyzing an A/B test I can say that I helped someone make a decision… but maybe it would have been faster if they had just used Optimizely, or not run the test at all!
So naturally, I lament to my friends about the seeming worthlessness of my work. They tell me to become an engineer, or maybe I’d like to try my hand at product management? I start clicking on articles from ex-data scientists at top tech companies exclaiming that you should not become a data scientist.
But changing careers is hard, and I can be lazy. So the armchair economist in me started to think about markets. If data science really isn’t valuable, why does the market seem to assign a value to us close to that of an engineer?
Perhaps it isn’t that my work isn’t valuable, but I’m just falling into a trap of not assigning value correctly.
Imagine you’re a barista at the local coffee shop and you tell yourself that the purpose of your job is to help everyone in your neighborhood be more productive. You might see caffeine-fueled customers typing away furiously, but really, will you ever be able to know if you’re succeeding or not? You’re likely to become disenchanted with your barista job pretty quickly. But what if you instead focused on a more tangible and direct goal: delighting customers with delicious coffee and warm conversation?
So I looked through the historical data of my career to see if I could identify patterns that would point me to where I’ve really been creating value, in hopes of reframing how I think about my work.
A mental catalog of all the work I’ve done in my career, for everyone from behemoth companies to small tech startups, led me to settle on 3 reasons why data science really is valuable in today’s and tomorrow’s organizations.
1. Creating and maintaining metrics is hard
Data Scientists have ownership over the dissemination of data in their organizations. We have the responsibility of understanding the data generation process, crafting metrics with business logic, and presenting this in a digestible format. And because a business is an ever-evolving organism, so is the data it generates, and consequently its metrics.
Take this seemingly simple question for my new subscription business, Barkflix (Netflix for dogs, of course).
How much revenue from new subscribers did we receive yesterday?
Here’s an internal dialogue my data scientist might have.
Okay, simple enough, let’s just get the revenue from new subscribers from our payment processor.
Hmm, oh wait, our payment processor doesn’t know who is a new customer vs. existing. I need to check our application database. Oh, that’s right, we have three different payment processors. Looks like I need to combine all these somehow. Wait, what, we added a fourth processor? How do I even get that data…?
Oof. Nate wants this when again?
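To make that internal dialogue concrete, here is a minimal Python sketch of what answering the question might involve. Everything in it is an assumption for illustration: the `Payment` record, the processor names, the made-up customers, and above all the definition of "new subscriber" (here, a customer whose first subscription date in the application database is the day in question). A real Barkflix would have to settle each of those choices as business logic.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Payment:
    processor: str    # hypothetical processor name
    customer_id: str
    amount: float     # dollars
    paid_on: date


def new_subscriber_revenue(payments, first_subscribed, day):
    """Sum revenue received on `day` from customers who first subscribed on `day`.

    payments: iterable of Payment, already combined across all processors
    first_subscribed: dict of customer_id -> first subscription date,
                      assumed to come from the application database
    """
    total = 0.0
    for p in payments:
        if p.paid_on != day:
            continue
        # The processors don't know who is new; only the app database does.
        if first_subscribed.get(p.customer_id) == day:
            total += p.amount
    return total


yesterday = date(2024, 1, 15)

# Made-up records from three hypothetical processors
payments = [
    Payment("processor_a", "rex", 10.0, yesterday),
    Payment("processor_b", "fido", 10.0, yesterday),
    Payment("processor_c", "luna", 10.0, yesterday),
    Payment("processor_a", "bella", 10.0, date(2024, 1, 14)),
]
first_subscribed = {
    "rex": yesterday,
    "fido": date(2023, 11, 15),  # existing subscriber renewing
    "luna": yesterday,
    "bella": date(2024, 1, 14),
}

print(new_subscriber_revenue(payments, first_subscribed, yesterday))  # 20.0
```

Even this toy version hides the hard parts: each processor exports data in its own shape, and a fourth processor means another adapter before the numbers can be trusted.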
I think that I, along with a slew of other data scientists, consider this time-consuming work beneath us. It’s work that gets in the way of that clustering analysis that I just know will revolutionize the business.
The truth of the matter is that this work is both valuable and challenging, and the rise of the analytics engineer is further evidence of that. These roles have descriptions centered around “designing, building and maintaining scalable data models to power self-service business intelligence tools and promote data-driven decision making.” Or, put simply: making sense of data and disseminating it to the people.
2. Small optimizations matter, or more businesses have more scale
With the rise of the digital economy, we’ve seen more businesses be able to reach more people. This rise in companies with scale means that smaller improvements matter more.
Revisiting my hot new unicorn, Barkflix, let’s say I charge $10/month for a subscription to this absolutely critical piece of the content ecosystem. A rockstar data scientist on my team identifies a high-friction drop-off point in our signup flow. We design a fix and improve our conversion rate from 10% to 11%.
Now imagine this is still early days at Barkflix. We’re a small mom & pop with 500 visitors per month to our signup page, which means this improvement leads to 5 more customers and an extra $50 in my pocket per month. I can treat a couple of friends to some $19 cocktails that they tell me “are so worth it.”
Now let’s scale this up a bit. I now get 500,000 monthly visitors, so this improvement nets me 5,000 new customers and $50,000 more per month. That same change will more than cover the rent of my way-too-expensive Manhattan apartment.
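The arithmetic behind those two scenarios can be sketched in a few lines of Python. The function name `monthly_uplift` and its parameters are my own, chosen for illustration; the rates are passed as whole percentages so the numbers come out exact.

```python
def monthly_uplift(visitors, old_rate_pct, new_rate_pct, price):
    """Return (extra customers, extra monthly revenue) from a conversion lift.

    Rates are whole percentages (10 means 10%); price is dollars per month.
    """
    extra_customers = visitors * (new_rate_pct - old_rate_pct) / 100
    return extra_customers, extra_customers * price


for visitors in (500, 500_000):
    customers, revenue = monthly_uplift(visitors, 10, 11, 10)
    print(f"{visitors:,} visitors: +{customers:,.0f} customers, +${revenue:,.0f}/month")

# 500 visitors: +5 customers, +$50/month
# 500,000 visitors: +5,000 customers, +$50,000/month
```

The one-point lift is identical in both cases; only the number of visitors it applies to changes.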
Finding opportunities for incremental improvements, and testing them, is part of a data scientist’s job. Each small improvement to a revenue-driving metric that a data scientist enables generates more and more value as scale increases.
3. We have more wealth, more time, so we can have more things. Namely more questions, and a higher degree of certainty in our answers
This is much akin to the phenomenon of lifestyle creep, whereby increases in income lead to increased spending on lifestyle goods (nice restaurants, shoes, cars, books, etc.). This higher level of consumption becomes normalized and even perceived as a necessity. However, more money isn’t the only piece of the equation. You also need the time to actually consume these shiny new objects.
Imagine going from a 2-hour commute each way for a 10-hour shift at $15/hour to a 7-hour workday that pays $200k/year and can be done from the comfort of your bedroom. This combination of more time and more money will naturally shift your attention from your day job to leisure activities.
In the workplace, we’re seeing a parallel phenomenon. Over the past twenty years, we’ve seen massive improvements to workplace productivity. For evidence, take the rise of new and improved SaaS tooling that frees us from pushing paper all day.
Salesforce revolutionized the CRM space. Asana, Airtable and Jira have helped teams organize their work. HubSpot, Marketo and Google Analytics automate previously time-consuming tasks for marketers. The result is that more people are now able to focus their attention on problems that haven’t been solved by SaaS tools.
These problems tend to be too nuanced and bespoke for any tool to offer a blanket solution. They are the types of problems that tend to fall into the hands of an analytics team. Was our product launch successful? What’s the next feature we should build for our product? What type of content should we invest more in producing over the next year?
I don’t think we’d even be asking these kinds of questions (or at least not really expecting someone to answer them) if we hadn’t optimized product development, web development, content creation and the slew of other workflows necessary to run a business.
I think this last piece is the hardest pill for me to swallow. Often the questions asked here aren’t best answered by a data scientist, or we just don’t have a reasonable data set to work with. Sometimes these questions can truly be a waste of time to answer, and can drive an analyst down a rabbit hole from which they never return.
So am I eternally happy now?
I’ve resolved my skepticism of the value of data science… in part. There are still times when I struggle to feel like I’m having a meaningful impact on the teams I work with and on the business as a whole. But when I’m floating in the existential ether while my Slack gets blown up with ad-hoc requests, I find that I’m able to find meaning in most of my work by falling back on these 3 pillars of data science value.