More feedback from clients can help you improve your service. More money can help you build better products and teams. More data can help you make better decisions. More resolution can make your photos better.
More is always better, isn’t it?
Well. No. More isn’t always better.
Seth Godin recently said that “Too much resolution stops giving you information and becomes merely noise, which actually gets in the way of the accuracy you seek.”
This is very true. Anyone who’s ever worked with data will tell you that more data just means more work. Sure, you may find a great nugget in that additional data, but extra data doesn’t always equate to more knowledge, while it always equates to more work.
To Seth Godin’s point, more ‘resolution’ isn’t always the answer either. I can go buy a $50,000 camera with the highest resolution possible and still make terrible photos. Just because I have the resolution available to me doesn’t mean I have the lenses available to take advantage of that resolution nor does it mean I have the talent to utilize the high resolution.
Rather than go spend $50K on the ‘best’ camera, spend $500 on an OK camera and learn the skills and methods needed to make the most of what you have. When you’ve mastered your ‘art’, then move up to something more expensive with more functions.
Rather than focus on gathering more data, you need to be focused on using the data you have in the most optimal way possible. Make sure you have the tools and skills in place to analyze / use what you have before you go and add ‘more’ to the mix.
In his 1982 book Megatrends, John Naisbitt wrote “We are drowning in information but starved for knowledge.” While written over 30 years ago, that line is as true today as ever…but I might change it a bit to match the current state of affairs. Today, we are drowning in data and starved for information.
Every organization has a great deal of data and more data is being collected every day. In addition to the already large data-sets that exist today, many organizations are looking for ways to collect exponentially more data with the Internet of Things (IoT). They want sensors to collect data from all aspects of the business including how their clients interact and use their products and services.
Anyone can collect data. It’s easy. All you need to do is turn on a collection system and store the data somewhere. IDC reported in 2012 (pdf) that by 2015, we’d see data stores grow to roughly 8 zettabytes (ZB) within organizations worldwide.
That’s a lot of data…but how much of that data will actually be useful? They have a lot of data…but do they have any actionable intelligence?
Data is useless unless you can convert it into information and ultimately into knowledge. In recent years, ‘big data’ has been the term organizations use to describe their attempts to convert all of their data into useful information.
I’m a fan of big data. I really am. I’ve said for a while that big data is more than a buzzword. Done right, big data can bring a great deal of value to a business but done poorly, big data is nothing more than bits and bytes flying around an organization. Done poorly, big data is just adding more layers of data to make it easier to drown.
When I speak to organizations about their big data initiatives, I find many that understand how important it is to convert their data into useful information. These companies understand that the work they are undertaking is much more than data analysis. They know that data is worthless unless it can be analyzed in a way that produces useful and actionable information. They understand that their big data initiatives are actually big information initiatives.
But…there are still many who don’t understand the importance of the output of data analysis. Sure, most people and organizations understand that data needs to be analyzed but many don’t understand how best to analyze that data. They implement systems and processes for data analysis but never stop to think about how best to use those systems to get the most from their data.
Big data initiatives are worthless unless their end-goal is to deliver information to an organization. That information must then be converted into knowledge to ultimately be worthwhile to the business. Maybe it’s time we stop talking about big data and start talking about big information…or even better…big knowledge.
Following up my Data Disconnect and Shadow IT post from yesterday, I wanted to talk about the 2nd area that is often overlooked when people undertake their own Shadow IT initiatives.
In my previous post, I talked about the Data Disconnect. That space where the data in your Shadow IT applications is disconnected from the rest of the organization.
This disconnect is something that requires the IT group to educate the rest of the organization, as highlighted by Christian Verstraete in his Enterprise CIO Forum post titled Shadow-IT, it’s forbidden to forbid. In some instances, the Data Disconnect isn’t a big issue…but many times, the disconnect is a huge risk for the organization.
Today, I want to talk about another aspect of Shadow IT related to Data and Information…the optimization of data. The world today is ruled by data. That data is turned into information and sometimes that information is converted into knowledge.
When data lives outside the enterprise in the cloud or within a local ‘shadow’ database, it’s disconnected. To be able to use the data within your organization’s applications, they need to be connected. Therefore…the first step is solving the Data Disconnect problem.
Once you know how you’ll solve that disconnect problem…whether by using internal systems, APIs to access cloud app data, or simple scripts to dump/convert data…then you need to think about the Optimization and Conversion problem.
The optimization problem is a big one.
There is a ton of useful (and useless!) data in every organization living as structured and unstructured data.
Structured data is quite easy to access and use, and is fairly easy to connect when you’re faced with a data disconnect. Find the data. Access/dump the data. Use the data. Repeat.
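That “find it, dump it, use it” loop for structured data can be sketched in a few lines. This is a hypothetical illustration using Python’s built-in sqlite3 module; the database, table and column names are invented for the example:

```python
import sqlite3

# Hypothetical sketch of the structured-data loop: find the data,
# access/dump the data, use the data. Table and columns are invented.
def dump_table(conn, table):
    """Dump every row of a table as a list of plain dicts for reuse elsewhere."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(f"SELECT * FROM {table}")]

# Build a throwaway in-memory 'shadow' database to demonstrate the pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO clients VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

rows = dump_table(conn, "clients")
print(rows)  # [{'id': 1, 'name': 'Acme'}, {'id': 2, 'name': 'Globex'}]
```

Once the rows are plain dictionaries, they can be pushed into whatever internal system needs them, which is exactly why the structured side of the disconnect is the easy half.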
Unstructured data is different. This is the data that is growing exponentially these days. It’s your email, text messages, Twitter messages, blog posts, images, videos, etc. The data stored within these mediums is unstructured in that it is text-based and/or audio/video. Optimizing and using this data is difficult when it’s stored inside enterprise applications, and it’s even more difficult when this type of data is stored in applications that aren’t managed by the IT group.
This unstructured data is what you find in collaboration tools. It’s the information that your teams share and the knowledge that your teams create. If it’s stored in a third-party system with little to no access to retrieve the data, it’s not only disconnected, but useless.
Imagine that you work with a virtual team that is ‘in the cloud’. You use something like Basecamp or some other web based project management and collaboration tool to manage your projects. In addition, your team uses email and an instant messaging platform like Skype to keep in touch throughout the day.
A great deal of knowledge flows through your collaboration platforms…but what happens to that knowledge after it is first created and shared? Does it sit out in ‘the cloud’, never to be revisited…or do you somehow capture that knowledge to ‘share’ with the rest of your organization?
You can’t optimize the information and/or share the knowledge if it isn’t held within the organization’s systems in a manner that is usable and accessible. This is the challenge of information optimization in the world of Shadow IT. There’s a lot of data / information / knowledge created that might be lost ‘in the cloud’ when these things aren’t considered.
So…CIOs and IT groups…take the time to educate your organization on the pros and cons of Shadow IT. If people are adamant about using a cloud service that doesn’t fit into the IT Strategic roadmap, make sure you understand why they are so adamant about it and what they and you must do to make sure the Data Disconnect and Information Optimization problems are considered and addressed.
I don’t talk or write much about ‘data’…mostly because I’ve always taken it for granted as something that was always ‘there’. If the data I needed wasn’t available in an easy-to-consume format, I’ve always found a way to get what I needed through data collection, data manipulation or by hacking together data to get what I needed.
To me, data has always been something that I’ve used to do my job. Data is something that I’ve used to help inform myself, my teams, my organizations and my clients.
I’ve often heard people and companies talk about being ‘data driven’ and have always felt like I was missing something as I never really understood what they meant by being ‘data driven’.
In my world, data has always been the building block of services and platforms, but data isn’t driving me, my business or my teams. Data is the base level of the business. Data is the business in its rawest form…but it’s also meaningless without context and meaning.
Most of my thinking about ‘data’ comes from my systems thinking and knowledge management education and training, in the form of the Russell Ackoff model. The Ackoff model claims that the content of the human mind can be classified into five ‘buckets’: data, information, knowledge, understanding and wisdom.
In the systems thinking and knowledge management world, the “Data -> Information -> Knowledge” model is quite prevalent…or maybe more accurately, it’s been the prevalent filter that I’ve used in my work.
So…from my filter, data is the rawest level of ‘stuff’. It’s the baseline that you build from. Data leads to information, which leads to knowledge…but data is nothing until you build something on top of it…until you add some form of context or meaning.
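As a toy illustration of that point, here’s a sketch (in Python, with invented numbers) of how the same raw values only become information once context is attached, and only become knowledge once an interpretation is built on top:

```python
# Toy illustration of Data -> Information -> Knowledge.
# The readings and labels below are invented for the example.
raw_data = [38.2, 39.1, 40.5]  # data: bare numbers, meaningless on their own

# Attaching context (what was measured, in what units, when) yields information.
information = [
    {"metric": "server CPU temp (C)", "hour": hour, "value": value}
    for hour, value in zip([9, 10, 11], raw_data)
]

# Knowledge is the interpretation built on top of the information.
rising = all(a["value"] < b["value"] for a, b in zip(information, information[1:]))
if rising:
    print("Temperature is trending up; worth investigating.")
```

The numbers never changed; only the context and the interpretation layered on top of them did.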
Therefore, it was always hard for me to understand the ‘data driven’ people who’ve been popping up everywhere over the last few years. I’ve never really given much credence to the ‘data driven’ mantra.
In the article, Peterson makes a fairly convincing plea to stop using the term ‘data driven’…rather, he says, use something more like ‘data informed’.
My concern arises from the idea that any business of even moderate size and complexity can be truly “driven” by data. I think the right word is “informed” and what we are collectively trying to create is “increasingly data-informed and data-aware businesses and business people” who integrate the wide array of knowledge we can generate about digital consumers into the traditional decisioning process. The end-goal of this integration is more agile, responsive, and intelligent businesses that are better able to compete in a rapidly changing business environment.
I can get behind ‘data informed’.
I can get behind using data to make better decisions. At the end of the day, that’s why you collect data…to make better decisions. But…you’ve got to put meaning, context and definition around that data to make it useful.
I’m keeping an eye on Eric’s post to see what discussions come out of it but I’d love to hear your thoughts on how you view ‘data driven’ vs ‘data informed’.
Last week I published a post titled Mining for Knowledge where I discussed some of the research that I’ve been doing in my doctorate program.
One of my favorite lines from the article, and one that resonated with a few others as well, was:
…converting tacit (i.e., internal) knowledge to explicit (i.e., external) knowledge is one of the most difficult things to do.
I’ve been thinking about this (and reading A LOT of articles, papers and books on the subject) and have come to the conclusion that trying to force someone to convert tacit knowledge to explicit knowledge is a wasted effort.
Can I truly convert 100% of my knowledge into written form? Will the context of my knowledge be converted? Perhaps a good portion of my knowledge can be converted, but can the experiences, thoughts and beliefs that shaped that knowledge be converted? Can I ‘write down’ the knowledge that I have and truly make it meaningful to others? I don’t think so (feel free to disagree here).
Does that mean that an organization should stop trying to gather an individual’s internal knowledge to add to the overall organizational knowledge base? Nope…definitely not.
Rather than forcing a conversion from tacit to explicit (which is darn near impossible), are there ways to manage the internal knowledge of people? Managing that knowledge is a much easier process than converting it.
Knowledge is best internalized when wrapped in context
Basically, they’re saying that in order to share internal knowledge, you’ve got to start a dialogue with others. That’s why activities like storytelling, mentoring and other forms of social interaction can play a huge role in knowledge management…they help to start and maintain dialogue and discussion on various topics. These activities help to provide context around knowledge, which helps a person internalize that knowledge and make it their own.
In my previous article I talked about ‘mining for knowledge’. I talked about using web 2.0 platforms to capture knowledge and to share knowledge. All good stuff (and still interesting to me) but I’m looking at other methods to make these platforms more social. Make dialog and discussion a more active portion of these tools.
If we can find ways to create dialogue and discussion within the enterprise, knowledge sharing will happen much more naturally. This is why I like the idea of Enterprise 2.0. While some people hate E2.0, I think there’s real value there. E2.0 won’t solve world hunger and will probably never win over its detractors, but there are many aspects of the idea that make sense.
What would it mean for an organization’s knowledge management capabilities if a system could be implemented that found and indexed the many disparate repositories of structured and unstructured data found throughout the enterprise, and then provided that information in a socially aware platform that could wrap context around the indexed knowledge and provide a mechanism for dialogue, discussion and reflection? You’d have a platform that could capture and share both explicit and tacit knowledge.
Anyone know of any companies with products in this space? I know SocialText is out there but I don’t think they have a platform as robust as the one above. SharePoint also has some aspects to this but not everything.
In my doctoral research, I’ve been researching ways to improve knowledge capture and sharing methods, specifically within project teams, but the ideas can be disseminated around the organization.
One of the biggest issues I’ve found while working as a consultant is the amount of knowledge that I walk away with after a project is complete. Sure, I try to share this knowledge in every way possible but converting tacit (i.e., internal) knowledge to explicit (i.e., external) knowledge is one of the most difficult things to do.
Let’s assume though, that some portion of the knowledge that I hold in my head is converted into some form of writing at various periods throughout a consulting project. Where does that explicit knowledge live? In an email? In some document stored on a server? In a knowledge repository somewhere?
In the past, this problem has been attacked using centralized knowledge repository platforms. These systems require users to log in and ‘enter’ their knowledge into the system. Many of these platforms have been well built and some have been successfully used in organizations, but the success stories are far outweighed by the stories of KM repositories sitting idle and unused.
So…how can we get that tidbit of knowledge from my brain into some form of knowledge repository without me logging in and ‘entering’ it into the system?
Web 2.0 as knowledge repository
The use of Web 2.0 tools (blogs, IM, wikis, etc.) has become ubiquitous. If incorporated into a project environment, these tools might allow an easy and efficient method for capturing and sharing knowledge throughout project teams and project organizations.
The key to retrieving knowledge from these tools is to make the user experience as seamless as possible. For example, an employee creates a blog on an organization’s intranet and then uses this blog to write about different topics, some that pertain to her project and some that don’t.
Perhaps this employee is participating in two projects within the organization and she writes about topics that might be of interest to a portion of the organization and project team members. While she writes about interesting topics and at times writes about her experiences on the projects that she’s worked on, perhaps her blog posts aren’t widely read. This employee has attempted to convert a portion of her tacit knowledge to explicit knowledge, but few people on the project team or within the organization find this knowledge because it’s tucked away in the intranet site (which is rarely used anyway).
In the above scenario, knowledge was converted from tacit to explicit but few people are able to absorb this knowledge and make it their own (i.e., perform the conversion from explicit to tacit knowledge). What would happen if this knowledge were indexed, searched and shared with the rest of the project team in something akin to a project knowledge ‘journal’?
Since Web 2.0 platforms are ubiquitous, why can’t we use these tools as our knowledge repository? Employees and project team members are already using them…so can we find a way to ‘mine’ these platforms for knowledge?
Could a system be built that ‘mines’ these web 2.0 platforms along with other unstructured data (documents, email, etc) to ‘build’ a knowledge repository available to the entire organization?
Mining for Knowledge
I’m currently looking at ways to use text mining methods and techniques to mine for knowledge. Text mining looks to be a good approach to solving this problem because it allows for knowledge to be gathered without additional work by project team members.
There are other approaches that could be used for gathering knowledge from project team members, but all require additional work to input information. For example, a project team using a manual approach could ask team members to regularly update their blog and to ‘tag’ their posts with a special project tag or keyword so that a non-intelligent aggregation system (RSS, etc) could simply pull these tagged posts into a central repository. While this is a good approach, it relies on the end-user to tag their content correctly, accurately and in a timely manner. Tagging, and other categorization and taxonomic approaches, require the user to do something to allow their knowledge contribution to be categorized, indexed and found by aggregation systems and other users.
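The manual, tag-based approach described above can be sketched in a few lines. Everything here (the post structure, the ‘proj-alpha’ tag) is an invented assumption for illustration, not a real system:

```python
# Minimal sketch of a 'non-intelligent' aggregator: pull only the posts
# that authors remembered to tag with the project keyword.
# Post fields and the 'proj-alpha' tag are invented for this example.
posts = [
    {"author": "alice", "title": "Lessons from the migration", "tags": ["proj-alpha", "databases"]},
    {"author": "bob",   "title": "My vacation photos",         "tags": ["personal"]},
    {"author": "alice", "title": "API design notes",           "tags": ["proj-alpha"]},
]

def aggregate(posts, project_tag):
    """Non-intelligent aggregation: trust the author's tags completely."""
    return [p for p in posts if project_tag in p["tags"]]

repository = aggregate(posts, "proj-alpha")
print([p["title"] for p in repository])
```

The failure mode is visible right in the filter: if an author forgets the tag, or mistypes it, a relevant post simply never reaches the central repository, which is exactly the human-fallibility problem described above.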
Using text-mining methods against pre-existing tools and platforms takes away the human fallibility issues found in current knowledge management repository platforms or by requiring a user to ‘tag’ a piece of content correctly as described above.
Using text-mining and other data mining approaches, I’m looking at ways to build semi-autonomous systems to index and organize both structured data and unstructured data pulled from blogs, email, IM, social networks, documents, spreadsheets and any other location / data sources. This system could aggregate knowledge found via text mining and social network analysis and build a project knowledge ‘repository’ that will contain all knowledge for any specific project. This repository will be searchable and will contain both manually curated content (e.g., content uploaded by project team members) and automatically curated / generated content based on text-mining and indexing techniques.
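As a rough illustration of the text-mining side of this, a simple TF-IDF scoring pass over a handful of invented ‘documents’ (stand-ins for blog posts, emails, etc.) can surface candidate index terms without any manual tagging. This is a toy sketch of the general technique, not the research system itself:

```python
import math
import re
from collections import Counter

# Toy corpus standing in for mined blog posts / emails. Text is invented.
docs = [
    "the client migration project hit a database locking issue",
    "resolved the database locking issue by batching writes",
    "team lunch is scheduled for friday",
]

def tokenize(text):
    """Lowercase and split on non-letters; a real system would do far more."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf(docs):
    """Return one {term: tf-idf score} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for doc in tokenized for term in set(doc))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        scores.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return scores

# The top-scoring terms per document become automatic index entries.
for i, s in enumerate(tfidf(docs)):
    top = sorted(s, key=s.get, reverse=True)[:3]
    print(f"doc {i}: {top}")
```

Distinctive terms like “batching” outscore filler like “the” because they appear in fewer documents, which is the property that lets an indexer pick out what each contribution is actually about with no work from the author.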
There are some major privacy issues here, of course. How can you mine a user’s email and find the relevant knowledge without truly invading their privacy? I’m not sure you can, but I’m looking at it.
Which of these two sources of knowledge would you trust to be more accurate?
The same can be said of knowledge captured and shared within an organization. How do you know that the white paper on your new API is true? Is it because it was released? Is it because of the author(s) of the paper? What if you had a knowledge base generated by an autonomous agent using text-mining techniques…how would you know to trust the information contained in it? Who wrote the content? Where did it come from?
This is where trust comes into play. If you could ‘see’ the qualifications of the author or authors of the knowledge base articles, would you trust the content more? If I knew that the world’s leading authority on organizational behavior wrote the Wikipedia article on the subject, I’d tend to trust that article more.
This is another aspect of my research…building trust into the mined knowledge using social network analysis (SNA) methods and techniques. Using SNA techniques, can the background, profiles, connections and knowledge of the users within an organization be automatically (or semi-automatically) analyzed to provide some form of initial trust metric to show that mined knowledge can be trusted?
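One very crude way such an initial trust metric might be sketched is normalized degree centrality over the organization’s social graph: authors who are well connected within the community get a higher starting trust score for their mined content. The names, edges and scoring rule below are all invented assumptions for illustration, not a validated metric:

```python
from collections import Counter

# Invented social graph: each edge is a connection between two employees.
edges = [
    ("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
    ("bob", "carol"), ("eve", "dave"),
]

def trust_scores(edges):
    """Normalized degree centrality as a crude first-pass trust signal."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    max_degree = max(degree.values())
    return {person: d / max_degree for person, d in degree.items()}

scores = trust_scores(edges)
print(scores["alice"])  # best-connected author in this toy graph -> 1.0
```

A real SNA approach would weigh far more than raw connectedness (profiles, expertise, interaction history), but even this toy version shows how a graph can yield a number a repository could attach to each piece of mined knowledge.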
I don’t know if it can…but I’m looking into it 🙂
So what are the next steps for me and this research?
I’m working on a research paper now that I hope will outline the research in more detail.
Eric D. Brown, D.Sc. is a technology consultant, investor and entrepreneur with an interest in using technology and data to solve real-world business problems. He currently runs his own consulting practice focused on helping organizations use their data more efficiently. Additionally, he is the Chief Information Officer of Sundial Capital Research, publisher of sentimenTrader.
Eric received his Doctor of Science (D.Sc.) in Information Systems in 2014 with a dissertation titled “Analysis of Twitter Messages for Sentiment and Insight for use in Stock Market Decision Making”. His research interests are currently in the areas of decision support, data science, big data, natural language processing, sentiment analysis and social media analysis. In recent years, he has combined sentiment analysis, natural language processing and big data approaches to build innovative systems and strategies to solve interesting problems. You can read some of his research here: Eric D. Brown on ResearchGate
In addition, he is an entrepreneur who has launched a few companies, the most recent being a company focused on providing data analytics and visualization services to the financial markets.