This one skill will make you a data science rockstar

Want to be a data science rockstar? Of course you do! Sorry for the clickbait headline, but I wanted to reach as many people as I can with this important piece of information.

Want to know what the ‘one skill’ is?

It isn’t Python or R or Spark or some other new technology or platform. It isn’t the latest machine learning methods or algorithms. It isn’t being able to write AI algorithms from scratch or analyze terabytes of data in minutes.

While those are important – very important – they aren’t THE skill. In fact, it isn’t a technical skill at all.

The one skill that will make you a data science rockstar is a so-called ‘soft skill’. The ability to communicate is what will set you apart from your peers and make you stand out in an increasingly crowded field of data scientists.

Why do I need to communicate to be a data science rockstar?

You can be the smartest person in the world when it comes to creating some wild machine learning systems to build recommendation engines, but if you can’t communicate the ‘strategy’ behind the system, you’re going to have a hard time.

If you’re able to find some phenomenal patterns in data that have the potential to deliver a multiple X increase in revenue but can’t communicate the ‘strategy’ behind your approach, your potential is going to be unrealized.

What do I mean by ‘strategy’? In addition to the standard information (error rates, metrics, etc.), you need to be able to hit the key ‘W’ points (‘what, why, when, where and who’) when you communicate your output and results. You need to be able to clearly define what you did, why you did it, when your approach works (and doesn’t work), where your data came from and who will be affected by what you’ve done. If you can’t answer these questions succinctly and in a manner that a layperson can understand, you’re failing as a data scientist.

Two real world examples – one rockstar, one not-rockstar

I have two recent examples to help highlight the difference between a data science rockstar (i.e., someone who communicates well) and someone who is not so much of a rockstar. I’ll give you the background on both and let you make up your own mind on which person you’d hire as your next data scientist. Both of these people work at the same organization.

Person 1:

She’s been a data scientist for 4 years. She’s got a wide swath of experience in data exploration, feature engineering, machine learning and data management. She’s had multiple projects over her career that required a deep dive into large datasets, and she’s had to use different systems, platforms and languages during her analysis. For each project she works on, she keeps a running notebook with commentary, ideas, changes and reasons for doing what she’s doing – she’s a scientist after all. When she provides updates to team members and management, she provides multiple layers of detail that can be read or skipped depending on the reader’s level of interest. She provides a thorough writeup of all her work with detailed notes about why things are being done the way they are done and how potential changes might affect the outcome of her work. For project ‘wrap-up’ documentation, she delivers an executive summary with many visualizations that succinctly describes the project, the work she did, why she did what she did and what she thinks could be done to improve on it. In addition to the executive summary, she provides a thorough write-up that describes the entire process, with multiple appendices and explanatory statements for those people who want to dive deeply into the project. When people are selecting team members for their projects, her name is the first to come out of their mouths.

Person 2:

He’s been a data scientist for 4 years (about 1 month longer than Person 1). His background is very technical and he is the ‘go-to’ person for algorithms and programming languages within the team. He’s well thought of and can do just about anything that is thrown over the wall at him. He’s quite successful and is sought after for advice by people all over the company. When he works on projects he sort of ‘wings it’ (his words) and keeps few notes about what he’s done and why he’s chosen the things he has chosen. For example, if you ask him why he chose Random Forests instead of Support Vector Machines on a project, he’ll tell you ‘because it worked better’ but he can’t explain what ‘better’ means. Now, there aren’t many people who would argue against his choices on projects, and his work is rarely questioned. He’s good at what he does and nobody at the company questions his technical skills, but they always ask ‘what is he doing?’ and ‘what did he do?’ during and after projects. For documentation and presentation of results, he puts together the basic report that is expected with the appropriate information, but people always have questions and are always ‘bothering him’ (again…his words). When new projects are being considered, he’s usually last in line for inclusion because there’s ‘just something about working with him’ (actual words from his co-workers).

Who would you choose?

I’m assuming you know which of the two is the data science rockstar. While Person 2 is technically more advanced than Person 1, his communication skills are a bit behind hers. Person 1 is the one that everyone goes to for delivering the ‘best’ outcomes from data science in the company they work at. Communication is the difference. Person 1 is not only able to do the technical work but also able to share the outcomes in a way that the organization can easily understand.

If you want to be a data science rockstar, you need to learn to communicate. It can be that ‘one skill’ that helps move you into the realm of ‘top data scientists’ and away from the average data scientists who are focusing all of their personal development efforts on learning another algorithm or another language.

By the way, I’ve written about this before here and here so jump over and read a few more thoughts on the topic if you have time.

Photo by Ben Sweet on Unsplash

To have a great analytics culture, you need a great communications culture

When you read about big data and/or data analytics projects and systems, it is rare that you also read about communicating the outcome of those projects. Without the ability to communicate the results of any analysis to the broader business, most big data / analytics projects are doomed to mediocrity…or even failure.

The quantitative mind is a great one. It is one that I’m very familiar with and one that I wholeheartedly support.  The ability to take a data set, analyze that data and create new information and knowledge from that data is an extremely important skill for people and organizations to have.

Just as important is the skill to be able to convert the outcome of any quantitative analysis into something that is easily digestible by people throughout an organization.

Take, for example, the world of academia.  There are many really smart people performing research within universities and research facilities. These people conduct research and then publish the outcomes of that research in academic journals to share their new-found knowledge with others.

Have you ever picked up an academic journal or article? These articles are generally well-written and delivered in formal academic styles but they aren’t exactly ‘easy reading’. They are meant to be used for academic reporting within academic circles. They are also used within industry, but most practitioners who read these journals and articles have education and experience similar to those of the people who write and publish them.

What happens when a finance manager picks up the Journal of Finance paper titled “Determinants of Corporate Borrowing”? Will they easily understand what the paper is trying to communicate? Let’s take a look at a portion of the abstract of the paper:

Many corporate assets, particularly growth opportunities, can be viewed as call options. The value of such ‘real options’ depends on discretionary future investment by the firm. Issuing risky debt reduces the present market value of a firm holding real options by inducing a suboptimal investment strategy or by forcing the firm and its creditors to bear the costs of avoiding the suboptimal strategy. The paper predicts that corporate borrowing is inversely related to the proportion of market value accounted for by real options. It also rationalizes other aspects of corporate borrowing behavior, for example the practice of matching maturities of assets and debt liabilities.

I would argue that anyone – given enough time – could understand what that paragraph is trying to communicate, but in the fast-paced world of business, does anyone really have time to sit down and study this paper? I doubt it. Most will call up a consultant and ask for help understanding the optimal approach to corporate debt. What is that consultant going to do? She will take her experience as a consultant (and in finance/banking), study the business, literature and best practices and then make a recommendation to the business on what they should do. If the consultant is any good, these recommendations will be provided in an easy-to-understand document that can be implemented effectively within the organization.

The same approach needs to be taken with data analytics. We can’t just throw a spreadsheet or chart over the wall at the business and expect them to understand what the data is telling them or what they should do with that data. I see a lot of this these days though. A company will implement a new big data project, perform some analysis of the data and then provide the output of the analysis in pretty charts and tables, but very rarely are there deep, meaningful discussions about what that data is really telling the business and/or what the business should do based on the analysis.

Now, you may say that good data scientists / analysts already do this…and you’d be right. But not everyone is a great analyst, nor is communication a skill set that most organizations are hiring for these days. When I talk to clients about big data, they talk about the need to get the best hardware, software and analytical skills…but they rarely talk about the need to find great communicators.

Companies regularly spend millions of dollars on the ‘hard’ costs for big data and data analytics. They’ve even begun spending a good deal of money on the ‘soft’ costs of getting their people the best training available so they can be the best data analysts possible, but it is rare that they spend much money on communications training.

The funny thing about this particular topic is that most data scientists consider themselves to be good communicators.   In my experience, the really good ones are…but the majority of the ‘new’ data scientists struggle with this aspect of their job.

If you want to be a great data scientist, become a great communicator and storyteller. As a data scientist, if you can’t communicate in a way that is informative and useful to the business, the work you do in the ‘quant’ world isn’t that valuable to the company. The same can be said of the business in general – if you want a great data analytics culture, build a great communications culture. You can’t have one without the other.
