Want to be a data science rockstar? Of course you do! Sorry for the clickbait headline, but I wanted to reach as many people as I can with this important piece of information.
Want to know what the ‘one skill’ is?
It isn’t Python or R or Spark or some other new technology or platform. It isn’t the latest machine learning methods or algorithms. It isn’t being able to write AI algorithms from scratch or analyze terabytes of data in minutes.
While those are important – very important – they aren’t THE skill. In fact, it isn’t a technical skill at all.
The one skill that will make you a data science rockstar is a so-called ‘soft skill’. The ability to communicate is what will set you apart from your peers and make you stand out in an increasingly crowded field of data scientists.
Why do I need to communicate to be a data science rockstar?
You can be the smartest person in the world when it comes to creating some wild machine learning systems to build recommendation engines, but if you can’t communicate the ‘strategy’ behind the system, you’re going to have a hard time.
If you’re able to find some phenomenal patterns in data that have the potential to deliver a multiple X increase in revenue but can’t communicate the ‘strategy’ behind your approach, your potential is going to be unrealized.
What do I mean by ‘strategy’? In addition to the standard information (error rates, metrics, etc.), you need to be able to hit the key ‘W’ points (‘what, why, when, where and who’) when you communicate your output and results. You need to be able to clearly define what you did, why you did it, when your approach works (and doesn’t work), where your data came from and who will be affected by what you’ve done. If you can’t answer these questions succinctly and in a way that a layperson can understand, you’re failing as a data scientist.
Two real world examples – one rockstar, one not-rockstar
I have two recent examples to help highlight the difference between a data science rockstar (i.e., someone who communicates well) and someone who is not so much of a rockstar. I’ll give you the background on both and let you make up your own mind about which person you’d hire as your next data scientist. Both of these people work at the same organization.
Person 1 has been a data scientist for 4 years. She’s got a wide swath of experience in data exploration, feature engineering, machine learning and data management. She’s had multiple projects over her career that required a deep dive into large datasets, and she’s had to use different systems, platforms and languages during her analysis. For each project she works on, she keeps a running notebook with commentary, ideas, changes and reasons for doing what she’s doing – she’s a scientist after all. When she provides updates to team members and management, she provides multiple layers of detail that can be read or skipped depending on the reader’s level of interest. She provides a thorough write-up of all her work, with detailed notes about why things are being done the way they are done and how potential changes might affect the outcome of her work. For project ‘wrap-up’ documentation, she delivers an executive summary with many visualizations that succinctly describes the project, the work she did, why she did what she did and what she thinks could be done to improve on it. In addition to the executive summary, she provides a thorough write-up that describes the entire process, with multiple appendices and explanatory statements for those who want to dive deeply into the project. When people are choosing team members for their projects, her name is the first one they mention.
Person 2 has been a data scientist for 4 years (about 1 month longer than Person 1). His background is very technical and he is the ‘go-to’ person for algorithms and programming languages within the team. He’s well thought of and can do just about anything that is thrown over the wall at him. He’s quite successful and is sought after for advice by people all over the company. When he works on projects he sort of ‘wings it’ (his words) and keeps few notes about what he’s done and why he’s chosen the things he has chosen. For example, if you ask him why he chose Random Forests instead of Support Vector Machines on a project, he’ll tell you ‘because it worked better’ but he can’t explain what ‘better’ means. Now, there aren’t many people who would argue against his choices on projects, and his work is rarely questioned. He’s good at what he does and nobody at the company questions his technical skills, but they always ask ‘what is he doing?’ and ‘what did he do?’ during and after projects. For documentation and presentation of results, he puts together the basic report that is expected with the appropriate information, but people always have questions and are always ‘bothering him’ (again…his words). When new projects are being considered, he’s usually last in line for inclusion because there’s ‘just something about working with him’ (actual words from his co-workers).
Who would you choose?
I’m assuming you know which of the two is the data science rockstar. While Person 2 is technically more advanced than Person 1, his communication skills are a bit behind hers. Person 1 is the one everyone goes to for the ‘best’ data science outcomes in the company they work at. Communication is the difference. Person 1 is able not only to do the technical work but also to share the outcomes in a way that the organization can easily understand.
If you want to be a data science rockstar, you need to learn to communicate. It can be the ‘one skill’ that helps move you into the realm of ‘top data scientists’ and away from the average data scientists who are focusing all of their personal development efforts on learning another algorithm or another language.