Eric D. Brown, D.Sc.

Data Science | Entrepreneurship | ..and sometimes Photography


You Need a Chief Data Officer. Here’s Why.

Big data has moved from buzzword to being a part of everyday life within enterprise organizations. An IDG survey reports that 75% of enterprise organizations have deployed or plan to deploy big data projects. The challenge now is capturing strategic value from that data and delivering high-impact business outcomes. That’s where a Chief Data Officer (CDO) enters the picture. While CDOs have been hired in the past to manage data governance and data management, their role is transitioning into one focused on how to best organize and use data as a strategic asset within organizations.

Gartner estimates that 90% of large global organizations will have a CDO by 2019. Given that estimate, it’s important for CIOs and the rest of the C-suite to understand how a CDO can deliver maximum impact for data-driven transformation. CDOs often don’t have the resources, budget, or authority to drive digital transformation on their own, so the CDO needs to help the CIO drive transformation via collaboration and evangelism.

“The CDO should not just be part of the org chart, but also have an active hand in launching new data initiatives,” Patricia Skarulis, SVP & CIO of Memorial Sloan Kettering Cancer Center, said at the recent CIO Perspectives conference in New York.

Chief Data Officer – What, when, how

A few months ago, I was involved in a conversation with the leadership team of a large organization. This conversation revolved around whether they needed to hire a Chief Data Officer and, if they did, what that individual’s role should be. It’s always difficult creating a new role, especially one like the CDO whose oversight spans multiple departments. In order to create this role (and have the person succeed), the leadership team felt they needed to clearly articulate the specific responsibilities and understand the “what, when, and how” aspects of the position.

The “when” was an easy answer: Now.

The “what” and the “how” are a bit more complex, but we can provide some generalizations of what the CDO should be focused on and how they should go about their role.

First, as I’ve said, the CDO needs to be a collaborator and communicator to help align the business and technology teams in a common vision for their data strategies and platforms, to drive digital transformation and meet business objectives.

In addition to the strategic vision, the CDO needs to work closely with the CIO to create and maintain a data-driven culture throughout the organization. This data-driven culture is an absolute requirement in order to support the changes brought on by digital transformation today and into the future.

“My role as Chief Data Officer has evolved to govern data, curate data, and convince subject matter experts that the data belongs to the business and not [individual] departments,” Stu Gardos, CDO at Memorial Sloan Kettering Cancer Center, said at the CIO Perspectives conference.

Lastly, the CDO needs to work with the CIO and the IT team to implement proper data management and data governance systems and processes to ensure data is trustworthy, reliable, and available for analysis across the organization. That said, the CDO can’t get bogged down in technology and systems but should keep their focus on the people and processes as it is their role to understand and drive the business value with the use of data.

In the meeting I mentioned earlier, I was asked what a successful Chief Data Officer looks like. It’s clear that a successful CDO crosses the divide between business and technology and institutes data as trusted currency that is used to drive revenue and transform the business.


Customer Engagement: A Data-Driven Team Sport

What would you do if you had so much data about your customers that you could know (almost) everything about them when they contacted you? Better yet, what if you could instantly know the exact service or product offer, and the right sales approach, that would make your customer immediately sit up, take notice, and spend money?

Most of you would jump at the chance to have this information about your clients.  You may be willing to open up the checkbook for a huge amount of money to make this happen.  What if I told you that you don’t need to do much more than get a better grasp on your data and understand how to use that data to build a better overall view of your customer?

Granted, you may need to collect a bit more data (and perhaps find new types of data) and you may need to implement some new data management processes and/or systems, but you shouldn’t have to start from scratch unless you have no data skills, people or processes. For those companies that already have a data strategy and a team of data geeks, building a customer-centric view with data can be extremely rewarding.

This customer-centric, data-driven approach is what most organizations are driving toward with their digital transformation initiatives.  Graeme Thompson, Informatica CIO, has argued for the importance of a customer-centric approach for some time. According to Graeme:

“You have to think about [digital transformation] in a connected way across the entire company.  It’s no longer about executing brilliantly within one functional silo. CIOs see the end-to-end connection [of different functions] across the entire company – how all these different processes need to work together to optimize the outcome for the enterprise, and, most importantly, for customers.”

Many companies consider themselves ‘customer-centric’ and have built programs and processes in order to ‘focus on the customer.’ They may have done a very good job in this regard, but there’s more that can be done. Most organizations have focused on Customer Relationship Management (CRM) as a way to help drive interactions with clients.  While a CRM platform is important and necessary, most of these platforms are nothing more than data repositories that provide very little value to an organization beyond the basics of ‘we talked to this person’ or ‘we sold widget X to that customer.’

These ‘customer-centric’ companies can become even more customer-centric by becoming data-driven organizations. They have taken a small subset of customer data and built their entire customer engagement process around that data set.  That approach has worked OK for years, but with the data available to companies today, there’s no need to rely solely on that small data set.

Utilizing proper data management and the data lake concept, companies can begin to build much broader viewpoints into their customer base. Using data lakes filled with CRM data along with customer information, social media data, demographics, web activity, wearable data, and any other data you can gather about your customers, you (with the help of your data science team) can begin to build long-term relationships based on more than just some basic data.
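As a rough sketch of what that broader viewpoint can mean in practice, here is a toy example of joining aggregated web activity onto CRM records with pandas. All table and column names here are hypothetical, invented for illustration:

```python
import pandas as pd

# Hypothetical slices of a data lake: CRM records plus raw web activity.
crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ann", "Ben", "Cho"],
    "last_purchase": ["widget-X", "widget-Y", "widget-X"],
})
web = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "page_views": [5, 3, 7, 2, 4, 1],
})

# Aggregate the behavioral data, then join it onto the CRM records to get
# a single, wider view of each customer.
activity = web.groupby("customer_id", as_index=False)["page_views"].sum()
customer_360 = crm.merge(activity, on="customer_id", how="left")
print(customer_360)
```

The same pattern extends to social, demographic, or wearable data: aggregate each source to one row per customer, then join on a shared customer key.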

In the white paper titled ‘Game Changers: Meet the Experts Behind Customer 360 Initiatives,’ there are some very good examples of how companies have become much more customer-centric and data-driven.  A few examples from the paper are:

  • FASTWEB uses Salesforce as much more than just a CRM. Their Salesforce instance includes a view into the customer by providing lists of latest invoices, the status of those invoices, payments and other key customer relationship data.
  • PostNL, a mail, parcel and e-commerce company, has changed their focus from simple ‘addresses’ to one centered on the customer by focusing first on data, then on the customer. Their focus is no longer on getting a package from point A to point B; it is on using data to ensure the customer’s needs are met.
  • Bradley Corporation, a 95-year-old manufacturer of plumbing fixtures, implemented a Product Information Management system to ensure that data is up to date and accessible for their more than 200,000 products. This system makes it simple for their customers to find the right parts quickly and easily.

In addition to better relationships with your customers, a data-centric approach can help you better predict the activities of your customers, thereby helping you better position your marketing and messaging. Rather than hope your messaging is good enough to reach a small percentage of your customer base, the data-centric approach can allow you to take advantage of the knowledge, skills and systems available to you and your data team to create personal and individual programs and messaging to help drive marketing and customer service.


Marketers – You have too many choices

I have a little secret for everyone in the world of marketing: You have too many choices.

There are way too many technology platforms in existence today. Too many ‘tools’ and too many products.  You have too many choices when it comes to getting your work done. Let’s take a quick second to glance at Scott Brinker’s MarTech 5000 landscape:

Marketing Technology Landscape Supergraphic (2018)

I’m sorry, but that’s just too many choices; especially when put in the hands of people that don’t really understand the long-term implications of multiple technology platforms.

Sure, there may be a formal selection process (in my experience, there’s not… or at least it isn’t followed), but rarely is there a strategic vision when it comes to MarTech. There are a bunch of tactical ‘needs’ for why a particular type of platform is needed/wanted, and maybe even a hand-wave toward ‘strategy,’ but rarely is there an in-depth review of how a new platform will make things better for the marketing team and the organization as a whole and (ahem… most importantly) help reach the strategic objectives of the organization.

Too many choices can be a real problem.  Need an ‘optimization’ platform for A/B testing (or other optimization issues)?  I’m sure you can find 30 or 40 vendors out there selling some version of a platform that will do what you need it to do.  Do you take the time to run a thorough selection process or do you find the first one that fits your ‘right now’ need and your budget and push ‘buy’?  Based on my experience, people do the latter and pick the first one they find that does what they need to do.  They find a solution to the problem they have today with very little to no thought put into how that platform will integrate into their broader organization’s ecosystem and/or whether the solution will solve their problem tomorrow.

Don’t get me wrong. Personally, I love the possibilities that these choices offer an organization, but only if proper governance is used when selecting and implementing them.  Based on my conversations with clients and marketing / IT professionals over the last few years, there’s very little of this happening.

Over the last three years, about half the projects I’ve been asked to be a part of are projects to help simplify the MarTech ecosystem within an organization.  I’ve seen companies with over 100 platforms in use within the marketing team, with very few of those systems able to talk to each other. The lives of the marketing team had become a living hell because they had too many systems, too little control of their data, and too little insight into what they are able to do, how to do things, and who to go to for help.

What’s the solution?

There’s not an ‘easy’ answer.

It will take hard work, focus, and a real drive toward reducing the complexity within your marketing organization.  Think of it as putting your team on a diet – a MarTech diet.  When you ‘need’ (by the way, it’s rarely a ‘need’ and usually a ‘want’ in these cases) some new function that you just can’t live without, check your existing platforms before going out to buy some new tool. If you are absolutely sure you don’t have the functionality in your existing platforms, take a look at what you’re trying to do and think about whether it’s an absolute need and not just a ‘want.’  More importantly, think about the long-term vision and strategy of the organization: how does ‘MarTech Platform X’ get you there?  If you can’t easily answer that question, it might be best to find a way to do what you need to do with your existing ecosystem.


Machine Learning Is Transforming Data Security

Data is the lifeblood of any organization today, so it should be easy to understand that the security of that data is just as important (if not more important) than the data itself. It seems that data security (or rather the lack thereof) has been in the news regularly over the last few years. The inability of organizations to secure their data has caused millions (if not billions) of dollars in damages from lost revenue, in addition to the loss of trust. A machine learning approach will never fully replace a human in the security chain, but it can help IT professionals monitor IT systems and data security, as well as monitor who accesses data (and how it is used) throughout the organization.

Throughout the many different IT departments I’ve talked with over the years, I haven’t met an IT professional in an enterprise organization who wasn’t interested in ensuring enterprise security is intact. Organizations have spent a considerable amount of time, effort, and money to implement the proper security systems and protocols, but most IT professionals are still worried about data security.

That said, only a small percentage of these same security-conscious people have systems or processes in place that accurately and quickly monitor how secure their data is.  In my experience, sensitive data in most organizations is generally secure but isn’t regularly monitored or audited due to the costs and time commitment needed to analyze access patterns and ensure there have been no intrusions.  In fact, in many organizations, IT professionals would be unable to provide a clear picture of where sensitive data lives throughout their organization.

In a Ponemon report titled ‘The State of Data Centric Security,’ 57% of survey respondents say their biggest security risk is that they don’t understand where their sensitive data lives. According to that same report, most IT professionals (79% of respondents) believe that not knowing where their sensitive data lives is a big security concern, but only a small majority (51% of respondents) believe that protecting and securing that sensitive data should be a priority. This gap is problematic and will cause significant issues for organizations.

Data has been – and will continue to be – a large part of most organizations’ digital transformation strategy. That said, this data is also creating new vulnerabilities without the proper security systems and processes in place. Graeme Thompson, CIO of Informatica, argues this point very well in Data Security: Don’t Call an Ambulance for a Sore Throat, where he writes:

Just as businesses have evolved toward the cloud, they’re also evolving toward enterprise-wide data access. We recognize the valuable insights and innovations to be gleaned from trading siloed departmental data warehouses for the comprehensive enterprise data lake. Tearing down those silos can cost us a layer of security around specific data sets, but curling up in an information panic room is not the way forward.

Last year, I was speaking with the CISO for a large enterprise organization. The conversation was around how much time they’ve been spending on thinking about and securing their IT systems and their data. This particular CISO has done a very good job of implementing master data management systems and processes to ensure their data is safe, accurate and available. Though he has done an admirable job, he worries that he doesn’t have the manpower or budget to feel comfortable that the organization’s data is as secure as it can be.

With the large amounts of both structured and unstructured data in most organizations, some of the older IT security approaches may not work as well as they might have in the past.  My suggestion to this CISO was to spend some time investigating the use of machine learning approaches to data security. Machine learning can provide an organization with a ‘second set’ of eyes and ears focused on data security. Implementing machine learning systems can not only free up team members to focus on other things but – more importantly – these systems can monitor threats and issues at a scale that humans just can’t replicate.
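As an illustration of the kind of monitoring machine learning enables, here is a minimal sketch using scikit-learn’s IsolationForest to flag unusual data-access patterns. The features (requests per hour, distinct tables touched) and all the numbers are invented for the example, not drawn from any real system:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Simulated access log: each row is (requests per hour, tables touched).
rng = np.random.default_rng(42)
normal = rng.normal(loc=[50, 5], scale=[10, 2], size=(200, 2))
suspicious = np.array([[500, 40], [450, 35]])  # e.g. bulk-exfiltration pattern
access_log = np.vstack([normal, suspicious])

# Unsupervised anomaly detector: no labeled intrusions required.
model = IsolationForest(contamination=0.01, random_state=0)
labels = model.fit_predict(access_log)  # -1 = anomaly, 1 = normal

print("flagged rows:", np.where(labels == -1)[0])
```

The appeal of this approach is scale: a model like this can score every access event continuously, surfacing only the outliers for a human analyst to review.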

The CISO I mentioned earlier is currently trialing an approach that uses a machine learning security monitoring system for both his IT systems and his various data stores. Even though this system has been in place for only a few months, he’s already begun to see efficiency improvements in security monitoring across the enterprise.  As an example, after only a few days of their new machine-learning-enabled security platform being in place, they were seeing hundreds of issues through their monitoring systems that they hadn’t been able to capture before. Thanks to these efficiencies, he’s been able to re-assign one of his IT personnel from full-time security monitoring to a less-than-full-time role, because the monitoring has been capable of raising alerts in real time without any manual intervention.

In addition to monitoring for intrusions and security issues, these machine learning systems can help IT professionals locate and manage their sensitive data, recommend remediation efforts and actions when issues are found, and gain a better understanding of who is accessing and using data across the organization.

Like many other areas within the modern organization, machine learning is changing how companies approach data security and changing data security itself. Machine learning isn’t a panacea for security, but it is a very good tool to have in your security toolbox.


Data Maturity before Digital Maturity

I recently wrote about Digital Maturity vs Digital Transformation, where I proclaimed that it’s more important to set your goal as digital maturity rather than just push your organization toward digital transformation initiatives.  In this post, I want to talk about one of the most important aspects of digital maturity: data maturity. Before you can even hope to be digitally mature, you must reach data maturity.

What is Data Maturity?

Data maturity is the point at which you’ve been able to thoroughly and explicitly answer the ‘who, what, where, when and how’ of your data.  You’ve got to understand the following:

  • Where did the data come from?
  • Where is it stored (and where has it been stored)?
  • How was it collected?
  • How will it be accessed?
  • Who will access it?
  • Who has had access to it over its lifetime?
  • What type of data is it?
  • If it’s personal data, what types of permissions do you have to use it?
  • When was the data collected?
  • When was the data last reviewed?
  • When was the data last accessed?
  • How do you know the data is accurate?

There are many more questions to ask / answer in the ‘who, what, where, when and how’ universe, but hopefully you get the point. If you can’t answer these questions to build up your data’s “metadata”, then you haven’t reached maturity.
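One lightweight way to make those answers concrete is to capture them as a metadata record per dataset. Here is a minimal sketch; the field names are illustrative, not any standard catalog schema:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DatasetMetadata:
    """Per-dataset record answering the who/what/where/when/how questions."""
    name: str
    source: str                    # where the data came from
    storage_location: str          # where it is stored
    collection_method: str         # how it was collected
    data_type: str                 # what type of data it is
    collected_on: date             # when it was collected
    last_reviewed: date            # when it was last reviewed
    authorized_users: list = field(default_factory=list)  # who may access it
    permissions: str = "none"      # permissions, if personal data

    def is_stale(self, as_of: date, max_age_days: int = 365) -> bool:
        """Flag data that hasn't been reviewed within the review window."""
        return (as_of - self.last_reviewed).days > max_age_days

# Hypothetical entry for a CRM export.
meta = DatasetMetadata(
    name="crm_contacts",
    source="CRM export",
    storage_location="object store (illustrative)",
    collection_method="nightly batch",
    data_type="personal",
    collected_on=date(2017, 1, 15),
    last_reviewed=date(2017, 6, 1),
    permissions="opt-in marketing",
)
print(meta.is_stale(as_of=date(2018, 9, 1)))
```

Even a simple record like this, kept for every dataset, gives you something auditable, and a query like `is_stale` turns a governance question into a check you can automate.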

Data maturity requires proper data governance, data management, and proper data processes (see previous writings here on those topics).   Like I’ve said before, I’m not an expert in these areas, but I do know good data management when I see it – and most organizations don’t have good data management practices/processes.

Data maturity is more than just technology initiatives, though. It’s more than having the right systems in place. Data maturity requires organizational readiness as well as technology readiness, and organizational readiness is generally the harder of the two data maturity paths to complete.

I’m not going to get into organizational readiness vs technology readiness in this post (I’ll save it for a later post) but just know that there are a lot of parallel paths (and sometimes perpendicular paths) that you need to take to get to digital maturity – and data maturity is one of the important aspects to focus on while working toward that digital maturity goal.

Are you working towards data maturity along the path to digital maturity?

Accuracy and Trust in Machine Learning

A few weeks ago, I wrote about machine learning risks, where I described four ‘buckets’ of risk that need to be understood and mitigated when you have machine learning initiatives.  One major risk that I *should* have mentioned explicitly is the risk of accuracy and trust in machine learning.  While I tend to throw this risk into the “Lack of model variability” bucket, it probably deserves a bucket all its own or, at the very least, it needs to be discussed.

Accuracy in any type of modeling process is a very nebulous term. You can only build a model to be as accurate as the training data that the model sees.  I can over-optimize a model and generate an MAE (Mean Absolute Error) that is outstanding for the model/data. I can then use that outstanding MAE to communicate the impressive accuracy of my model. Based on my impressively accurate model, my company then changes processes to make this model the cornerstone of its strategy… and then everyone realizes the model is almost worthless when ‘real-world’ data is used.
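That over-optimization trap is easy to demonstrate. A minimal sketch on toy data: an unconstrained model posts a near-perfect training MAE that evaporates on held-out data (the data and model choice here are invented purely for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# Noisy toy signal: the model cannot legitimately beat the noise floor.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.5, size=200)

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

model = DecisionTreeRegressor()  # no depth limit: free to memorize the data
model.fit(X_train, y_train)

train_mae = mean_absolute_error(y_train, model.predict(X_train))
test_mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"train MAE: {train_mae:.3f}, test MAE: {test_mae:.3f}")
```

The training MAE is essentially zero because the unconstrained tree memorizes every point, which is exactly the “outstanding MAE” that looks impressive in a slide deck but says nothing about performance on new data.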

New (and experienced) data scientists need to truly understand what it means to have an accurate model. If you surf around the web, you’ll see a lot of people who are new to the machine learning / deep learning world, have taken a few courses, thrown a few projects up on their GitHub repository, and call themselves ‘data scientists.’  Nothing wrong with that – everyone has to start somewhere – but the people who tend to do well as data scientists understand the theory, process, and mathematics of modeling just as much as (or more than) how to code up a few machine learning models.

Modeling (which is really what you are doing with machine learning / deep learning) is much more difficult than many people realize.  Sometimes a model that delivers 55% accuracy can deliver much more value to a person or organization than one that has been over-optimized to deliver 90% accuracy.

As an example, look at the world of investing.  There are very famous traders and investors whose models are ‘accurate’ less than half the time, yet they make millions (and billions) off those models, largely because risk management is a big part of their approach to the markets. This may not be a perfect analogy for a manufacturing company trying to use machine learning to forecast demand over the next quarter, but the process these investors follow in building their models is absolutely the same as the steps needed to build accurate and trustworthy models.

Accuracy and Trust in Machine Learning

If you’ve built models in the past, do you absolutely trust that they will perform in the future as well as they’ve performed when trained using your historical data?

Accuracy and trust in machine learning should go hand in hand. If you tell me your model has a ‘good’ MAE (or RMSE or MAPE or whatever measure you use), then I need you to also tell me why you chose that measure and what variances you’ve seen in the errors. Additionally, I’d want you to tell me how you built that model.  How big was your training dataset? Did you do any type of walk-forward testing (in the case of time series modeling)?  What have you done about bias in your data?
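Here is a sketch of what walk-forward testing and error-variance reporting might look like in practice, using scikit-learn’s TimeSeriesSplit on an invented series (the data and model are illustrative only):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Invented trending series with noise, standing in for e.g. quarterly demand.
rng = np.random.default_rng(1)
t = np.arange(120)
y = 0.5 * t + rng.normal(scale=3.0, size=t.size)
X = t.reshape(-1, 1)

# Walk-forward: each fold trains only on the past, tests on the next slice.
fold_errors = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    fold_errors.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

# Report the spread of errors across folds, not just a single flattering number.
print(f"MAE per fold: {[round(e, 2) for e in fold_errors]}")
print(f"mean {np.mean(fold_errors):.2f} +/- {np.std(fold_errors):.2f}")
```

Reporting the per-fold errors and their spread answers the “what variances have you seen” question directly, and the expanding-window split guarantees the model never peeks at the future.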

The real issue in the accuracy and trust debate isn’t the technical skill of the data scientist, to be honest.  A good data scientist will know this stuff inside and out from a technical standpoint. The real issue is the communication between the data scientist and the people she is talking to.  An MAE of 3.5 might be good or it might be bad, and non-technical people / non-data scientists would have no clue how to interpret that value.  The data scientist needs to be very specific about what that value means from an accuracy standpoint and what it might mean when the model is put into production.
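One simple way to give that context is to report the error relative to the scale of the target. A toy illustration with invented numbers:

```python
import numpy as np

# Hypothetical: the same MAE of 3.5 reads very differently depending on
# the scale of what is being predicted.
mae = 3.5

daily_revenue = np.array([10_000.0, 12_500.0, 9_800.0])   # dollars
units_sold = np.array([10.0, 12.0, 9.0])                  # units

# Expressing the error as a share of the average target value gives a
# non-technical stakeholder something interpretable.
for name, target in [("revenue", daily_revenue), ("units", units_sold)]:
    pct = 100 * mae / target.mean()
    print(f"MAE 3.5 on {name}: {pct:.1f}% of the average value")
```

An MAE of 3.5 is negligible against revenue in the tens of thousands but is roughly a third of the average when predicting around ten units sold, which is the kind of framing a business owner can actually act on.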

Accuracy and trust in machine learning / modeling has been – without question – the biggest challenge that I’ve run across in my career.  I can find really good data scientists and coders to build really cool machine learning models. I can find a lot of data to throw at those models. But what I’ve found hardest is helping non-data folks understand the outputs and what those outputs mean (which touches on the Output Interpretation risk I mentioned when I wrote about machine learning risks).

I’ve found a good portion of my time spent while working with companies on modeling / machine learning is spent on analyzing model outputs and helping the business owners understand the accuracy / trust issues.

How do you (or your company) deal with the accuracy vs trust issue in machine learning / modeling?



