<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Eric D. Brown &#187; Data mining</title>
	<atom:link href="http://ericbrown.com/tag/data-mining/feed/" rel="self" type="application/rss+xml" />
	<link>http://ericbrown.com</link>
	<description>Technology, Strategy, People and Projects</description>
	<lastBuildDate>Tue, 22 May 2012 18:46:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Mining for Knowledge</title>
		<link>http://ericbrown.com/mining-for-knowledge.htm?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=mining-for-knowledge</link>
		<comments>http://ericbrown.com/mining-for-knowledge.htm#comments</comments>
		<pubDate>Thu, 15 Jul 2010 14:45:01 +0000</pubDate>
		<dc:creator>Eric D. Brown</dc:creator>
				<category><![CDATA[Doctorate]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Knowledge Management]]></category>
		<category><![CDATA[Organization]]></category>
		<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Knowledge]]></category>
		<category><![CDATA[knowledge base]]></category>
		<category><![CDATA[knowledge capture]]></category>
		<category><![CDATA[Knowledge Flow]]></category>
		<category><![CDATA[knowledge repository]]></category>
		<category><![CDATA[Knowledge sharing]]></category>
		<category><![CDATA[mining]]></category>
		<category><![CDATA[sharing knowledge]]></category>
		<category><![CDATA[social information processing]]></category>
		<category><![CDATA[Social network]]></category>
		<category><![CDATA[Tacit knowledge]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[unstructured data]]></category>
		<category><![CDATA[Web 2.0]]></category>

		<guid isPermaLink="false">http://ericbrown.com/?p=3434</guid>
		<description><![CDATA[In my doctoral research, I&#8217;ve been researching ways to improve knowledge capture and sharing methods, specifically within project teams, but the ideas can be disseminated around the organization. One of the biggest issues I&#8217;ve found while working as a consultant is the amount of knowledge that I walk away with after a project is complete. [...]]]></description>
			<content:encoded><![CDATA[<p><a target="_blank" href="http://dev.ericbrown.com/wp-content/uploads/2010/07/Mining-for-Knowledge.jpeg"><img class="alignleft size-full wp-image-3885" title="Mining for Knowledge" src="http://dev.ericbrown.com/wp-content/uploads/2010/07/Mining-for-Knowledge.jpeg" alt="Mining for Knowledge" width="200" height="200" /></a>In my doctoral research, I&#8217;ve been researching ways to improve knowledge capture and sharing methods, specifically within project teams, but the ideas can be disseminated around the organization.</p>
<p>One of the biggest issues I&#8217;ve found while working as a consultant is the amount of knowledge that I walk away with after a project is complete.  Sure, I try to share this knowledge in every way possible but converting <a target="_blank" class="zem_slink" title="Tacit knowledge" rel="wikipedia" href="http://en.wikipedia.org/wiki/Tacit_knowledge">tacit</a> (i.e., internal) knowledge to explicit (i.e., external) knowledge is <a target="_blank" href="http://books.google.com/books?id=K1N-wNI2Gt8C&amp;lpg=PA292&amp;ots=pB0fZWEqCa&amp;dq=Converting%20tacit%20knowledge%20into%20explicit%20knowledge%20means%20finding%20a%20way%20to%20express%20the%20inexpressible&amp;pg=PA292#v=onepage&amp;q=Converting%20tacit%20knowledge%20into%20explicit%20knowledge%20means%20finding%20a%20way%20to%20express%20the%20inexpressible&amp;f=false" target="_blank">one of the most difficult things to do</a>.</p>
<p>Let&#8217;s assume though, that some portion of the knowledge that I hold in my head is converted into some form of writing at various periods throughout a consulting project.  Where does that explicit knowledge live?  In an email?  In some document stored on a server?  In a knowledge repository somewhere?</p>
<p>In the past, this problem has been attacked using centralized knowledge repository platforms.  These systems require users to log in and &#8216;enter&#8217; their knowledge into the system.  Many of these platforms have been well built and some have been successfully used in organizations, but the success stories are far outweighed by the stories of KM repositories sitting idle and unused.</p>
<p>So&#8230;how can we get that tidbit of knowledge from my brain into some form of knowledge repository without me logging in and &#8216;entering&#8217; it into the system?</p>
<h3>Web 2.0 as knowledge repository</h3>
<p>The use of Web 2.0 tools (blogs, IM, wikis, etc.) has become ubiquitous.  If incorporated into a project environment, these tools might allow an easy and efficient method for capturing and sharing knowledge throughout project teams and project organizations.</p>
<p>The key to retrieving knowledge from tools is to make the user experience as seamless as possible. For example, an employee creates a blog on an organization&#8217;s intranet and then uses this blog to write about different topics, some that pertain to her project and some that don&#8217;t.</p>
<p>Perhaps this employee is participating in two projects within the organization and she writes about topics that might be of interest to a portion of the organization and project team members.  While she writes about interesting topics and, at times, about her experiences on the projects that she&#8217;s worked on, perhaps her blog posts aren&#8217;t widely read.  This employee has attempted to convert a portion of her tacit knowledge to explicit knowledge but few people on the project team or within the organization find this knowledge because it&#8217;s tucked away in the intranet site (which is rarely used anyway).</p>
<p>In the above scenario, knowledge was converted from tacit to explicit but few people are able to absorb this knowledge and make it their own (i.e., perform the conversion from explicit to tacit knowledge).  What would happen if this knowledge were indexed, searched and shared with the rest of the project team in something akin to a project knowledge &#8216;journal&#8217;?</p>
<p>Since Web 2.0 platforms are ubiquitous, why can&#8217;t we use these tools as our knowledge repository?  Employees and project team members are already using them&#8230;so can we find a way to &#8216;mine&#8217; these platforms for knowledge?</p>
<p>Could a system be built that &#8216;mines&#8217; these Web 2.0 platforms along with other <a target="_blank" class="zem_slink" title="Unstructured data" rel="wikipedia" href="http://en.wikipedia.org/wiki/Unstructured_data">unstructured data</a> (documents, email, etc.) to &#8216;build&#8217; a knowledge repository available to the entire organization?</p>
<h3>Mining for Knowledge</h3>
<p>I&#8217;m currently looking at ways to use <a target="_blank" href="http://en.wikipedia.org/wiki/Text_mining" target="_blank">text mining</a> methods and techniques to mine for knowledge. Text mining looks to be a good approach to solving this problem because it allows for knowledge to be gathered without additional work by project team members.</p>
<p>There are other approaches that could be used for gathering knowledge from project team members, but all require additional work to input information.  For example, a project team using a manual approach could ask team members to regularly update their blog and to ‘tag’ their posts with a special project tag or keyword so that a non-intelligent aggregation system (<a target="_blank" class="zem_slink" title="RSS" rel="wikipedia" href="http://en.wikipedia.org/wiki/RSS">RSS</a>, etc) could simply pull these tagged posts into a central repository.  While this is a good approach, it relies on the end-user to tag their content correctly, accurately and in a timely manner.  Tagging, and other categorization and taxonomic approaches, require the user to do something to allow their knowledge contribution to be categorized, indexed and found by aggregation systems and other users.</p>
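<p>The manual, tag-based approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the posts, the author names and the tag "project-alpha" are hypothetical, and the posts are assumed to have already been fetched from each team member&#8217;s feed.</p>

```python
# Hypothetical posts, as they might look after being fetched from team blogs.
POSTS = [
    {"title": "Sprint notes", "author": "ana", "tags": ["Project-Alpha", "notes"]},
    {"title": "Lunch spots", "author": "ben", "tags": ["misc"]},
    {"title": "API lessons learned", "author": "ben", "tags": ["project-alpha"]},
]


def collect_tagged_posts(posts, project_tag):
    """Return the posts whose tags include project_tag (case-insensitive)."""
    wanted = project_tag.strip().lower()
    return [
        post for post in posts
        if wanted in {tag.strip().lower() for tag in post["tags"]}
    ]


# Pull every post carrying the (assumed) project tag into one list.
repository = collect_tagged_posts(POSTS, "project-alpha")
print([post["title"] for post in repository])
```

<p>Note that a post is only found if its author remembered to tag it, which is exactly the weakness of this approach.</p>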
<p>Using text-mining methods against pre-existing tools and platforms avoids the human fallibility issues that come with current <a target="_blank" class="zem_slink" title="Knowledge management" rel="wikipedia" href="http://en.wikipedia.org/wiki/Knowledge_management">knowledge management</a> repository platforms or with requiring a user to ‘tag’ a piece of content correctly, as described above.</p>
<p>Using text-mining and other <a target="_blank" class="zem_slink" title="Data mining" rel="wikipedia" href="http://en.wikipedia.org/wiki/Data_mining">data mining</a> approaches, I&#8217;m looking at ways to build semi-autonomous systems to index and organize both structured data and unstructured data pulled from blogs, email, IM, social networks, documents, spreadsheets and any other location / data sources. This system could aggregate knowledge found via text mining and <a target="_blank" class="zem_slink" title="Social network" rel="wikipedia" href="http://en.wikipedia.org/wiki/Social_network">social network analysis</a> and build a project knowledge ‘repository’ that will contain all knowledge for any specific project. This repository will be searchable and will contain both manually curated content (e.g., content uploaded by project team members) and automatically curated / generated content based on text-mining and indexing techniques.</p>
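<p>As a rough illustration of the indexing step only, here is a minimal sketch of a keyword index built over content from several sources without any tagging by the authors. The documents and their ids are hypothetical, and a plain inverted index is a stand-in chosen for brevity, not the text-mining method this research will necessarily use.</p>

```python
import re
from collections import defaultdict

# Hypothetical documents standing in for blog posts, emails and IM logs.
DOCUMENTS = {
    "blog-1": "Lessons learned from the database migration",
    "email-7": "Migration schedule moved to Friday",
    "im-3": "Who owns the database backups?",
}


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


index = build_index(DOCUMENTS)
print(search(index, "database"))
print(search(index, "database migration"))
```

<p>The authors never had to log in or categorize anything; the index is derived entirely from what they had already written.</p>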
<p>There are some major privacy issues here, of course. How can you mine a user&#8217;s email and find the relevant knowledge without truly invading their privacy?  Not sure you can but I&#8217;m looking at it.</p>
<h3>Trust &amp; Mined Knowledge</h3>
<p>One key element of this new inter-connected world that we live in is trust.   How can I trust that the information I read on a web page is worthwhile, honest and accurate?   If I want to know something about organizational behavior do I go read a <a target="_blank" href="http://en.wikipedia.org/wiki/Organizational_studies" target="_blank">Wikipedia article on the subject</a> or do I go look through the <a target="_blank" href="http://www.hbs.edu/units/ob/" target="_blank">Harvard Business School&#8217;s Organizational Behavior faculty pages</a> and find publications written by the faculty there?</p>
<p>Which of these two sources of knowledge would you trust to be more accurate?</p>
<p>The same can be said of knowledge captured and shared within an organization. How do you know that the white paper on your new <a target="_blank" class="zem_slink" title="Application programming interface" rel="wikipedia" href="http://en.wikipedia.org/wiki/Application_programming_interface">API</a> is true?  Is it because it was released? Is it because of the author(s) of the paper?   What if you had a knowledge-base generated by an autonomous agent using text-mining techniques&#8230;how would you know to trust the information contained in it?  Who wrote the content?  Where did it come from?</p>
<p>This is where trust comes into play. If you could &#8216;see&#8217; the qualifications of the author or authors of the knowledge base articles would you trust the content more?  If I knew that the world&#8217;s leading authority on organizational behavior wrote the Wikipedia article on the subject, I&#8217;d tend to trust that article more.</p>
<p>This is another aspect of my research&#8230;building trust into the mined knowledge using <a target="_blank" href="http://lrs.ed.uiuc.edu/tse-portal/analysis/social-network-analysis/" target="_blank">social network analysis</a> (SNA) methods &amp; techniques.  Using SNA techniques, can the background, profiles, connections and knowledge of the users within an organization be automatically (or semi-automatically) generated to provide some form of initial trust metric to show that mined knowledge can be trusted?</p>
<p>I don&#8217;t know if it can&#8230;but I&#8217;m looking into it <img src='http://files.ericbrown.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h3>Next Steps?</h3>
<p>So what are the next steps for me and this research?</p>
<p>I&#8217;m working on a research paper now that I hope will outline the research in more detail.</p>
<p>Lots of questions still exist and there is quite a bit of research left to do.  I do believe I&#8217;m headed in the right direction as evidenced by an HBR video &amp; blog titled <a target="_blank" href="http://blogs.hbr.org/video/2010/07/how-knowledge-management-is-mo.html" target="_blank">How Knowledge Management Is Moving Away From the Repository as Goal</a>, which discusses a similar topic.</p>
<p>Look for more on this topic from me in the coming months.</p>
<p><strong>Related articles by Zemanta</strong></p>
<ul class="zemanta-article-ul">
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.newscientist.com/article/mg20627624.900-tacit-knowledge-you-dont-know-how-much-you-know.html?DCMP=OTC-rss&amp;nsref=online-news">Tacit knowledge: you don&#8217;t know how much you know</a> (newscientist.com)</li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://hypergogue.posterous.com/conversation-matters-the-incentive-question-o">John Tropea: conversation matters: The Incentive Question or Why People Share Knowledge &#8211; hypergogue</a> (hypergogue.posterous.com)</li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.forbes.com/2010/04/23/randd-research-sharing-cooperation-leadership-managing-mitsloan.html">Why We Share Information</a> (forbes.com)</li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.baselinemag.com/c/a/Intelligence/Knowledge-Management-and-Collaboration-Create-Knowledge-Sharing-513230/">Knowledge Management and Collaboration Create Knowledge Sharing &#8211; Intelligence</a> (baselinemag.com)</li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://kmci.org/alllifeisproblemsolving/archives/problems-of-shifting-from-km-to-knowledge-sharing/">Problems of Shifting from KM to &#8220;Knowledge Sharing&#8221;</a> (kmci.org)</li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://blogs.hbr.org/video/2010/07/how-knowledge-management-is-mo.html">How Knowledge Management Is Moving Away From the Repository as Goal</a> (blogs.hbr.org)</li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.cmswire.com/cms/enterprise-20/how-social-tools-in-sharepoint-2010-encourage-engagement-and-innovation-007945.php">How Social Tools in SharePoint 2010 Encourage Engagement and Innovation</a> (cmswire.com)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://ericbrown.com/mining-for-knowledge.htm/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Super Crunchers</title>
		<link>http://ericbrown.com/book-review-super-crunchers.htm?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=book-review-super-crunchers</link>
		<comments>http://ericbrown.com/book-review-super-crunchers.htm#comments</comments>
		<pubDate>Fri, 19 Sep 2008 10:47:01 +0000</pubDate>
		<dc:creator>Eric D. Brown</dc:creator>
				<category><![CDATA[Book Reviews]]></category>
		<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Freakonomics]]></category>
		<category><![CDATA[Ian Ayres]]></category>
		<category><![CDATA[Levitt]]></category>
		<category><![CDATA[Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart]]></category>

		<guid isPermaLink="false">http://ericbrown.com/?p=732</guid>
		<description><![CDATA[I picked up Super Crunchers: Why Thinking-By-Numbers is the New Way to be Smart by Ian Ayres while on vacation&#8230;it looked like an interesting read&#8230;and it was. The entire book is based on showing the reader how organizations are using statistics, data mining and regression analysis to determine how to better run their businesses and/or [...]]]></description>
			<content:encoded><![CDATA[<p>I picked up <a target="_blank" href="http://www.amazon.com/Super-Crunchers-Thinking-Numbers-Smart/dp/0553805401" target="_blank">Super Crunchers: Why Thinking-By-Numbers is the New Way to be Smart</a> by <a target="_blank" href="http://islandia.law.yale.edu/ayers/indexhome.htm" target="_blank">Ian Ayres</a> while on vacation&#8230;it looked like an interesting read&#8230;and it was.</p>
<p>The entire book is based on showing the reader how organizations are using statistics, <a target="_blank" class="zem_slink" title="Data mining" rel="wikipedia" href="http://en.wikipedia.org/wiki/Data_mining">data mining</a> and <a target="_blank" class="zem_slink" title="Regression analysis" rel="wikipedia" href="http://en.wikipedia.org/wiki/Regression_analysis">regression analysis</a> to determine how to better run their businesses and/or get more money from you.   The book is not too technical nor full of numbers and the author writes the book for the non-technical/non-geeks out there.</p>
<p>What I found most interesting about this book was the &#8216;behind-the-scenes&#8217; details of how companies like <a target="_blank" class="zem_slink" title="Wal-Mart" rel="homepage" href="http://www.walmartstores.com/">Wal-Mart</a> are using data mining and other techniques to model and manage their logistical systems.</p>
<p>Ayres also provides some very interesting (and slightly disturbing) anecdotes about the use of these methods by casinos to ensure that gamblers don&#8217;t cross their &#8216;pain threshold&#8217; while gambling (this threshold is calculated based on various statistics about the gambler).  The casino will nonchalantly ask the gambler if they&#8217;d like to receive a free dinner&#8230;this isn&#8217;t really to &#8216;comp&#8217; the gambler&#8230;it&#8217;s just to make them forget about the money they&#8217;ve lost.</p>
<p>Another interesting/disturbing example shows credit card companies using data mining and modeling techniques to &#8216;get the most from&#8217; their customers.</p>
<p>This book is a fun read and one that I think everyone should pick up. It is a purely non-technical book on the subject of data mining, modeling and <a target="_blank" class="zem_slink" title="Statistics" rel="wikipedia" href="http://en.wikipedia.org/wiki/Statistics">statistical analysis</a> and is full of interesting nuggets of information.  If you read the book <a target="_blank" href="http://www.amazon.com/Freakonomics-Economist-Explores-Hidden-Everything/dp/006089637X" target="_blank">Freakonomics</a> by Levitt and Dubner, you&#8217;ll like this book.</p>
<p>PS &#8211; If you are wondering why 2 book reviews in one week, it&#8217;s because I got caught up on reading during vacation. <img src='http://files.ericbrown.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h6 class="zemanta-related-title" style="font-size: 1em;">Related articles by Zemanta</h6>
<ul class="zemanta-article-ul">
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.socialmediatoday.com/SMC/47867">Cross-Pollinating Analytics</a></li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://freakonomics.blogs.nytimes.com/2007/12/26/does-this-analysis-of-test-scores-make-any-sense-a-guest-post/">Does This Analysis of Test Scores Make Any Sense? A Guest Post</a></li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.marginalrevolution.com/marginalrevolution/2007/10/regression-head.html">Regression-heads</a></li>
<li class="zemanta-article-ul-li"><a target="_blank" href="http://www.computerworld.com/taxonomy/000/000/000//taxonomy_000000054_index.jsp">More Data Mining News&#8230;</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://ericbrown.com/book-review-super-crunchers.htm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

