Data, Information, and Knowledge

Data: a collection of facts
Information: data that has been processed into a format its intended audience can understand
Knowledge: the perception of fact or truth; clear and certain mental apprehension

Progression: data is gathered to provide information in order to gain knowledge.

We are exposed to more data today than at any other time in history. Back in 2006, IBM put out a white paper, “The toxic terabyte: How data-dumping threatens business efficiency,” stating that by 2010 the amount of data available in the world would double every 11 hours. Last year, a British news source reported that the world’s digital content was equal to a stack of books stretching from Earth to Pluto 10 times. (This is just the digital data, mind you!) If you want some really mind-blowing facts on data, check out Tim Berners-Lee’s talk from the TED conferences in 2009.

Back in the late ’90s, Michael Lesk, a renowned professor of Library and Information Sciences at Rutgers University, conducted a study to estimate the amount of information available in the world. He drew this conclusion, which reflects the world we now live in:

“Today the digital library community spends some effort on scanning, compression, and OCR; tomorrow it will have to focus almost exclusively on selection, searching, and quality assessment. Input will not matter as much as relevant choice. Missing information won’t be on the tip of your tongue; it will be somewhere in your files. Or, perhaps, it will be in somebody else’s files. With all of everyone’s work online, we will have the opportunity first glimpsed by H. G. Wells (and a bit later and more concretely by Vannevar Bush) to let everyone use everyone else’s intellectual effort. We could build a real ‘World Encyclopedia’ with a true ‘planetary memory for all mankind’ as Wells wrote in 1938. [Wells 1938]. He talked of ‘knitting all the intellectual workers of the world through a common interest;’ we could do it. The challenge for librarians and computer scientists is to let us find the information we want in other people’s work; and the challenge for the lawyers and economists is to arrange the payment structures so that we are encouraged to use the work of others rather than re-create it.”

As sourcers, the challenge we face is not finding data, but rather putting it into bite-sized, digestible portions for the recruiters we support. Data is out there, and we know how to find it. Making it comprehensible is the issue. Let’s start by taking a look at the three things mentioned in Lesk’s conclusion: selection, searching, and quality assessment.

Selection. What resources do you frequent? Consider the Pareto principle: 80% of the results come from 20% of the resources. When you are researching, do you rely on a small handful of tools? Try a search engine comparison tool, like Blind Search, or a metasearch engine such as Dogpile, ixquick, Yippy, or Mamma, to see how search results differ depending on which search engine you’re using.
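If you want to put a number on how much two search engines differ, here is a minimal sketch in Python. It assumes you have already copied the top results for the same query from each engine into two lists. The function name compare_results and the example URLs are made up for illustration; they are not part of any tool mentioned above.

```python
def compare_results(results_a, results_b):
    """Compare two lists of result URLs gathered from different search engines."""
    set_a, set_b = set(results_a), set(results_b)
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": sorted(shared),            # found by both engines
        "only_a": sorted(set_a - set_b),     # found only by the first engine
        "only_b": sorted(set_b - set_a),     # found only by the second engine
        # Jaccard overlap: 1.0 means identical result sets, 0.0 means nothing in common
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical top results for the same query on two engines
engine_a = ["example.com/a", "example.com/b", "example.com/c"]
engine_b = ["example.com/b", "example.com/d", "example.com/c"]

report = compare_results(engine_a, engine_b)
print(report)
```

A low overlap score is a hint that leaning on a single engine is leaving results on the table.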

Searching. Do you really understand Boolean? Does the resource you’re using support full Boolean or does it have some sort of proprietary search tool? Are you using semantic search?
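To make the Boolean question concrete, here is a rough Python sketch that assembles a search string from groups of synonyms. It assumes the resource accepts AND, OR, NOT, parentheses, and quoted phrases; many resources use their own variations (a minus sign instead of NOT, for example), so check the help page first. The helper names any_of and build_query, and the sample terms, are invented for this example.

```python
def quote(term):
    """Wrap multi-word phrases in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def any_of(terms):
    """Join synonyms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(required_groups, excluded=()):
    """AND together the synonym groups, then append NOT for each excluded term."""
    query = " AND ".join(any_of(group) for group in required_groups)
    for term in excluded:
        query += f" NOT {quote(term)}"
    return query


# Hypothetical search: two synonym groups, one excluded term
query = build_query(
    required_groups=[["java developer", "java engineer"], ["spring", "hibernate"]],
    excluded=["recruiter"],
)
print(query)
# ("java developer" OR "java engineer") AND (spring OR hibernate) NOT recruiter
```

The parentheses matter: without them, a resource may evaluate AND before OR and return something quite different from what you meant.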

Quality Assessment. Here’s where I think a lot of knowledge workers run into problems. Fact-checking and verification of data on the internet can be daunting, especially when it comes to information about people. Today, people seem to think that slapping the word “FACT” in front of a statement automatically turns it into truth, when in reality it is just someone’s opinion. As professional researchers, we need to cross our T’s and dot our I’s when it comes to verifying the data we present as information to our recruiting counterparts.

Many of us, myself included, get excited about the data and information we find. We chuck it at our colleagues with little to no explanation of why we are excited or why they should be excited too. As knowledge workers, we need to wrap our enthusiasm around that information so that our colleagues understand its importance. Think of data/information/knowledge like a gelcap: the medicine (data) on the inside might not taste good on its own, but wrapped in a gel capsule (information), it’s easier to swallow, and the patient reaps the benefits of ingestion (knowledge).

Keep in mind that more doesn’t always equal better. Just because we are exposed to more data than ever before doesn’t mean that it is quality data. It just means there is more stuff to sift through to find good, pertinent information. Knowledge workers like us are more important than ever: we convert data into understandable information so that our organizations can build rich knowledge bases.

Happy Friday to you all!