Why is Semantic Search Important to Sourcers?

There is just too much confusion about so-called “Boolean search” so I’d like to clarify a few basic things:

Booleans are simply three LOGICAL OPERATORS: AND, OR, and NOT. In search engines you don’t need the AND because it’s always assumed by default, so forget about it. That leaves you with two: OR and NOT. The OR is called a logical disjunction, sometimes an inclusive disjunction or alternation in mathematics. In English grammar “or” is a coordinating conjunction, and in ordinary language it sometimes has the meaning of an exclusive disjunction. The bottom line is that when you use OR the result is “true” whenever one or more of the words are matched. There, now you know Boolean search.
When recruiters talk about “Boolean search” what they are really talking about is creating search strings, sometimes using advanced commands or complex search syntax to query specific fields inside of databases.
Well, that’s the end of my list.
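To make the logic concrete, here is a minimal sketch in Python of how the three operators combine. It is only an illustration, not how any real search engine is built: the function name matches and its parameters all_of, any_of, and none_of are my own invention, and it assumes lowercase search terms and simple whitespace-separated words.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Toy Boolean matcher (hypothetical helper, not a real engine's API).

    all_of  -> the implied AND: every term must appear
    any_of  -> the OR group: at least one term must appear (inclusive)
    none_of -> the NOT terms: no term may appear
    """
    words = set(text.lower().split())
    has_all = all(term in words for term in all_of)
    # An empty OR group places no restriction on the document.
    has_any = any(term in words for term in any_of) if any_of else True
    has_none = not any(term in words for term in none_of)
    return has_all and has_any and has_none


doc = "java developer resume with spring experience"
print(matches(doc, all_of=["java"], any_of=["resume", "cv"]))  # True
print(matches(doc, all_of=["java"], none_of=["jobs"]))  # True
print(matches("java developer jobs", all_of=["java"], none_of=["jobs"]))  # False
```

Notice that the OR is inclusive: a document containing both "resume" and "cv" would still match.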

Search syntax has been around since the beginning of databases. Each “database” (lumping in Monster, Google, LinkedIn, and Facebook under that term) has its own set of field search commands. When searching a database what we search are “fields”; for example, “Company Name,” “Email Address,” and so on.

You use fields to search your Outlook when you ask for people with a specific name or when you query by “job title” on Monster or LinkedIn. The “big search engines” also use fields and they share some in common, such as intitle, inurl, site, and filetype. That’s what recruiters and sourcers often misconstrue as “Boolean search.” In fact, the only Booleans used are the assumed AND between every search term, the occasional OR as with (intitle:resume OR inurl:resume), and the rare use of the NOT or AND NOT when applying negative search terms such as -jobs or -submit.
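As a rough sketch of how such a search string is put together, the Python snippet below assembles one from its parts. The helper build_query and its parameters are hypothetical names chosen for illustration; the only operators it emits are the ones mentioned above (intitle, inurl, site, OR, and the minus sign), and exact syntax support varies from one search engine to the next.

```python
def build_query(keywords, any_of=(), exclude=(), site=None):
    """Assemble a search string (hypothetical helper for illustration).

    keywords -> plain terms, joined by spaces (the assumed AND)
    any_of   -> alternatives wrapped in one parenthesized OR group
    exclude  -> negative terms, each prefixed with a minus sign (NOT)
    site     -> optional value for the site: field command
    """
    parts = list(keywords)
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    if site:
        parts.append(f"site:{site}")
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


query = build_query(
    ["java", "developer"],
    any_of=["intitle:resume", "inurl:resume"],
    exclude=["jobs", "submit"],
)
print(query)  # java developer (intitle:resume OR inurl:resume) -jobs -submit
```

Everything in that string except the single OR and the two minus signs is either a plain keyword or a field command, which is the point: very little of it is actually Boolean.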

Search can be as simple or complex as we want it to be, but at the end of the day we are limited only to words in the fields of a database. That is, until the promise of Semantic Search.

The problem with search engines today

Most people agree that the biggest problem with online recruiting today is too much available information. Without a good search engine, you simply get lost in all the information. Unfortunately, today’s search engines are still inefficient, delivering mismatched information and requiring complex search string knowledge to use effectively. In an ideal world a search engine would function like a human, understanding the underlying meaning of the user’s search and then matching the search results accordingly. Many expert communities talk about Semantic Search applications being the most likely technology to deliver this kind of result, but few take the time to explain it in plain language. This white paper explains Semantic Search for candidate sourcing.

Major search engines constantly experiment with ways to simplify search, moving away from complex Boolean queries and the use of advanced field search syntax, in an effort to assist users in quickly finding what they seek. These efforts only begin to approach true Semantic Search capabilities and today are limited to providing only a basic understanding of associations and concepts related to a search. In contrast, Semantic Search technologies look for the concept being searched for rather than the occurrence of specific keywords or synonyms in a document. The application of semantic technology shifts search from browsing for relevant documents to discovering relationships between content and delivering actionable information with insights. In short, Semantic Search applies grammatical analysis, logical interpretation, and linguistic morphology in the identification of concepts from unstructured data. This is different from the concept of a Semantic Web, which consists of highly structured data such as XML.

What is Semantic Search?

Semantics is defined as the field of study that focuses on meaning. Meaning, that is, as it is inherent in symbols, words, phrases, sentences, and larger blocks of text. In simple terms, a perfect semantic search engine would instantly and automatically take into consideration the meaning behind your question or “search string.” For example, it could disambiguate results that lead to people’s profiles from, say, the text in employment advertising: it would serve up only the profiles to recruiters and the employment ads to job seekers. Perhaps an even simpler example would be to tell the difference between Apple the fruit, Apple the company (or its products), and Apple the record label. Search for just Apple in Google and most of the top results will be about the company, not the other two, because most Google users search and click on results about Apple Inc. (and/or its products). This is how antiquated, statistically based, popularity-driven ranking methods like Google’s PageRank work. This is not semantic search at all. Popular pages are not necessarily credible, nor are credible sources always incredibly popular.
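The difference between the two approaches can be sketched with a toy example in Python. Everything here is made up for illustration: the click counts, the context words, and the function names by_popularity and by_context are assumptions of mine, and no real search engine is this simple. The first function picks whichever sense of "Apple" is most popular; the second picks the sense whose context best overlaps the other words in the query.

```python
# Invented data: three senses of "Apple" with made-up click counts
# and a few words that tend to appear alongside each sense.
SENSES = {
    "Apple Inc.": {"clicks": 9000, "context": {"iphone", "mac", "company", "software"}},
    "apple (fruit)": {"clicks": 800, "context": {"fruit", "orchard", "pie", "recipe"}},
    "Apple Records": {"clicks": 200, "context": {"beatles", "record", "label", "album"}},
}


def by_popularity(senses):
    """Popularity-driven choice: the most-clicked sense always wins."""
    return max(senses, key=lambda name: senses[name]["clicks"])


def by_context(senses, query_words):
    """Context-driven choice: the sense sharing the most words with the query wins."""
    words = {w.lower() for w in query_words}
    return max(senses, key=lambda name: len(senses[name]["context"] & words))


print(by_popularity(SENSES))  # Apple Inc.
print(by_context(SENSES, ["apple", "pie", "recipe"]))  # apple (fruit)
```

Word overlap is still a long way from understanding meaning, but it shows why popularity alone gives the same answer to everyone, no matter what they were actually looking for.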

One of the largest problems with implementing Semantic Search has been that it is difficult for the computer to know who you are (or, in the example above, whether you are a job seeker or a recruiter). The search engine either has to learn from your behavior and your previous selections, or you have to manually indicate a category so it can narrow your results down by meaning. Google, Yahoo!, and Bing have been applying technology in an attempt to learn individual search behavior by looking at what people click and where they are located (based on their IP address). Again, however, this only approximates meaning; it does not truly understand it. Another way so-called Semantic Search engines have tried to solve this problem is to ask the user to tag, catalog, sort, and otherwise try to “train” the search engine, which is far too time-consuming for the average person.

The Semantic Search technology that exists today can be categorized into three main approaches:

Lexicon and Ontological Based Search
Statistical Analysis and/or Pattern Matching Search
Contextual, Vertical, and Faceted Search

The white paper linked above will help you to determine which search engines fit into the above categories and how each of those categories affects how you conduct your research.

So why is this important to sourcers?

Well, if a computer knows what you mean right away, without having to learn from you or be trained by you, it would give you only relevant results and not show you all that other junk you have to manually sift through in today’s search engines. We’re getting there, but we’re not close enough yet.

The power of Semantic Search becomes clear when you imagine trying to fill a position for a Programmer working on mobile phone games using the OpenGL graphics API. It is extremely unlikely that a sourcer would know every keyword needed to find all likely candidates. However, with Semantic Search, it is unnecessary for the sourcer to know which keywords to search on, because the search engine automatically finds the words needed, making the recruiter a subject area expert on every search s/he does.
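One simple way to picture this is lexicon-based query expansion, sketched below in Python. This is a toy under loud assumptions: the TOY_LEXICON entries are a handful of related terms I picked by hand, the expand function is a hypothetical helper, and a real semantic engine would draw on a far larger lexicon or ontology rather than a three-line dictionary.

```python
# Hand-picked, illustrative lexicon; a real one would be far larger.
TOY_LEXICON = {
    "opengl": ["opengl es", "glsl", "shaders"],
    "mobile": ["android", "ios"],
}


def expand(terms, lexicon=TOY_LEXICON):
    """Expand each term into an OR group of itself plus related lexicon terms."""
    groups = []
    for term in terms:
        options = [term] + lexicon.get(term.lower(), [])
        # Quote multi-word phrases so they are searched as exact phrases.
        quoted = [f'"{o}"' if " " in o else o for o in options]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " ".join(groups)


print(expand(["programmer", "OpenGL", "mobile"]))
# programmer (OpenGL OR "opengl es" OR glsl OR shaders) (mobile OR android OR ios)
```

The sourcer typed three words; the lexicon supplied the rest. That is the kind of keyword knowledge a semantic engine is supposed to bring to the search on the recruiter's behalf.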

Check out the white paper or look at some of the other articles that have been written about Semantic Search both here on SourceCon and by others in the community, and arrive at your own conclusions about how this affects our future. I’d love to “hear” your thoughts and opinions here in the comments!

image source: Mikael Altemark