Semantic search is one of the next big things. Actually, it has been one of the next big things for several years now, and it is only slowly making its way into the Web. But what is semantic search anyway? As opposed to the usual keyword search that most of us are familiar with after many years of experience with Google and Bing, semantic search actually tries to understand your intention and the results you want. The next image illustrates the difference between traditional keyword search and semantic search.
For example, when you search for "smartphones under $320 with an 8 mega pixel camera" on a standard search engine, you will most likely only retrieve sites that use exactly these terms, or the search engine will drop some terms because it considers them irrelevant. A semantic search engine, on the other hand, would understand that you are searching for "smartphones", which is very similar to "mobile phone", "cell phone", or "phone", and that the price must be under $320. While Google is also capable of using synonyms for search, it lacks an understanding of more specific phrases such as "under $320", because that requires the domain knowledge that mobile phones have prices, that the unit of the price is dollars, and that prices can be sorted.
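To make the idea a bit more concrete, here is a minimal Python sketch of how such a query could be turned into structured filters. This is not how any particular search engine works. The synonym table, the two regular expressions, and the filter names (category, max_price, min_camera_mp) are all assumptions made up for illustration, and a real system would need far more domain knowledge.

```python
import re

# Hypothetical synonym table mapping surface terms to one product category.
CATEGORY_SYNONYMS = {
    "smartphone": "phone",
    "smartphones": "phone",
    "mobile phone": "phone",
    "cell phone": "phone",
    "phone": "phone",
    "phones": "phone",
}

# Assumed phrasings: "under $320" and "8 mega pixel" / "8 megapixels".
PRICE_RE = re.compile(r"under \$(\d+(?:\.\d+)?)")
CAMERA_RE = re.compile(r"(\d+(?:\.\d+)?) ?mega ?pixels?")


def parse_query(query):
    """Turn a free-text product query into a dict of structured filters."""
    text = query.lower()
    filters = {}

    # Try longer synonyms first so "mobile phone" wins over plain "phone".
    for term in sorted(CATEGORY_SYNONYMS, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            filters["category"] = CATEGORY_SYNONYMS[term]
            break

    # The price is a number with a unit, so it can be compared and sorted.
    price = PRICE_RE.search(text)
    if price:
        filters["max_price"] = float(price.group(1))

    camera = CAMERA_RE.search(text)
    if camera:
        filters["min_camera_mp"] = float(camera.group(1))

    return filters


print(parse_query("smartphones under $320 with an 8 mega pixel camera"))
```

The point of the sketch is only that "under $320" ends up as a number attached to a price attribute instead of three keywords, which is exactly the kind of domain knowledge a plain keyword search is missing.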
If you think about it, it is ridiculous that to date (2013), a simple natural language search query such as "smartphones under $320 with an 8 mega pixel camera" cannot be answered satisfactorily. Universal search engines such as Google or Bing have to answer all kinds of queries, so it is much harder for them to build all the domain knowledge needed to answer such queries. But what about shopping sites? Shouldn't they be interested in delivering exactly the right products for the user's query? Many shops offer a faceted navigation (filters for properties of the products such as color and price) to help the user, but that is as if you asked an assistant at your local Best Buy a specific question and he just shrugged his shoulders and gave you a form to fill out before he could help you.
What search engines for natural language are there on the market?
While Walmart is huge offline, it now also wants to compete with Amazon online. And they seem to be serious about it. Walmart brought in the former Kosmix team and developed a better search called "Polaris". The system is able to classify user queries and additionally uses social signals, such as those from Pinterest, to compute engagement scores. If you search for "House", for example, the system would first offer you DVDs of the popular TV show instead of showing you home improvement items.
Walmart is able to understand simple queries such as "phones under $100" but fails at our longer natural language example query "smartphones under $320 with an 8 mega pixel camera".
FACT-Finder is a "conversion engine", meaning it tries to help shops get people to buy more. Among other modules, FACT-Finder offers an on-site search which can cope with spelling errors, synonyms, and different languages.
From their description that doesn't sound all too fancy, but they implemented the semantic product search for vacation booking on the German travel site suche.weg.de (the German word "weg" means "away"). This vacation search allows natural language queries such as "from Dresden, 2 adults, at most $400" and it really works. So far it only works in German, but that is another indicator that they really parse the query and try to understand its meaning, since this is a different process in each new language.
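Why parsing is a different process in each language can be sketched in a few lines of Python. This is purely an illustration and says nothing about how FACT-Finder or weg.de actually work. The slot names (origin, adults, max_price), the pattern table, and the German example phrasing are my own assumptions.

```python
import re

# Hypothetical per-language patterns; every new language needs its own set.
PATTERNS = {
    "en": {
        "origin": r"\bfrom ([A-Z][\w-]*)",
        "adults": r"\b(\d+) adults?\b",
        "max_price": r"\bat most \$?(\d+)",
    },
    "de": {
        "origin": r"\b(?:ab|von) ([A-ZÄÖÜ][\w-]*)",
        "adults": r"\b(\d+) Erwachsene",
        "max_price": r"\b(?:höchstens|maximal) (\d+)",
    },
}


def parse_travel_query(query, lang):
    """Fill the booking slots that the patterns for `lang` can find."""
    slots = {}
    for slot, pattern in PATTERNS[lang].items():
        match = re.search(pattern, query)
        if match:
            value = match.group(1)
            # The origin stays text; the other slots are numbers.
            slots[slot] = value if slot == "origin" else int(value)
    return slots


print(parse_travel_query("from Dresden, 2 adults, at most $400", "en"))
print(parse_travel_query("ab Dresden, 2 Erwachsene, höchstens 400 Euro", "de"))
```

Both calls should fill the same three slots, but only because somebody wrote the patterns for each language by hand. Feeding the English query to the German patterns finds nothing.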
Fredhopper calls itself the "ultimate marketing machine". Similar to FACT-Finder, it consists of several modules to boost the performance of online shops. One of the modules is an on-site search which works for over 40 languages. However, it only handles spelling mistakes and offers links to refine the search. The natural language query is not understood.
Their customer base includes Toys'R'Us, Clarks, Thomas Cook, Urban Outfitters, Philips, Otto, Conrad, Neckermann, bon prix, House of Fraser, Ted Baker, Debenhams, asos, halens, Waitrose, Northern, breuninger, AutoTrader, and some more. If you check one of their clients' sites, for example Urban Outfitters, you will not feel understood by the search. Take a very simple query such as "red shirt in size s", for instance, and you will not see the results you expect.
EasyAsk's eCommerce Edition offers a natural language search for online shops to deliver the products the user asked for instead of "no result" pages if the query was not understood.
This demo video explains EasyAsk's strength nicely. And apparently the system really works: you can test it on the site of one of their clients, Lands' End. Other clients are Gap, Coldwater Creek, Lillian Vernon, Aramark, JJill, JoAnn Fabrics, Boscovs, Schuler Shoes, Travers Tools, Broder Brothers, Rockler, Lamps Plus, Personalization Mall, True Value, and Harbor Freight Tools.
Endeca Systems (now owned by Oracle) calls itself "best of breed search" and says that the "Oracle Endeca Server provides best-of-breed search features that are designed to bridge structured, semi-structured, and unstructured content. Features include type-ahead, automatic alpha-numeric spell correction, positional search, Boolean and wildcard search, natural language processing, category search, and query clarification dialogs." (source)
Among others, eBags, an online seller of bags, uses Endeca and reported an increase in conversion rate. When you test the system with a simple natural language query such as "blue bags under $40", however, you do not get exactly what you asked for.
eelee's mission is to allow for natural language queries about electronic products. However, at the moment they only support answering queries regarding mobile phones. Simple queries such as "red nokia phones under $400" seem to be no problem for eelee. Our initial example of "smartphones under $320 with an 8 mega pixel camera" cannot be answered completely yet. In contrast to other natural language parsers, eelee tells you how it understood your query. This way users can rephrase their question to retrieve better results the second time.
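Telling the user how a query was understood is easy to picture with a small Python sketch. This is not eelee's implementation. The vocabularies and the explain function are hypothetical, and the parser is deliberately tiny, so that it fails on the longer example query in an instructive way.

```python
import re

# Hypothetical vocabularies; a real system would load these from product data.
KNOWN_COLORS = {"red", "blue", "black"}
KNOWN_BRANDS = {"nokia", "samsung"}
KNOWN_CATEGORIES = {"phone", "phones", "smartphone", "smartphones"}


def explain(query):
    """Return (understood, ignored): parsed filters plus the leftover words."""
    text = query.lower()
    understood = {}
    ignored = []

    # Handle the price phrase first and cut it out of the remaining text.
    price = re.search(r"under \$(\d+)", text)
    if price:
        understood["max_price"] = int(price.group(1))
        text = text.replace(price.group(0), " ")

    for word in text.split():
        if word in KNOWN_COLORS:
            understood["color"] = word
        elif word in KNOWN_BRANDS:
            understood["brand"] = word
        elif word in KNOWN_CATEGORIES:
            understood["category"] = "phone"
        else:
            ignored.append(word)  # shown to the user so they can rephrase

    return understood, ignored


for q in ("red nokia phones under $400",
          "smartphones under $320 with an 8 mega pixel camera"):
    understood, ignored = explain(q)
    print("Understood:", understood, "| Ignored:", " ".join(ignored) or "nothing")
```

For the short query nothing is ignored. For the long one the camera part lands in the ignored list, and showing that list is what gives the user a chance to rephrase.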
eelee also uses the WebKnox API. Check out more about eelee's functionality.
Other systems for natural language search over large document corpora are solutions by Autonomy (including features such as intent-based ranking, dynamic learning ability, understanding of contextual nuances, and language independence) and inbenta.
Interestingly, research in the field of natural language processing is very active, but there are rather few resources when it comes to parsing natural language queries for semantic product search. An early paper, "Do What I Mean: Online Shopping with a Natural Language Search Agent" from 2001, describes the problem and offers a solution and a comparison to traditional search. The authors concluded that natural language search increased the precision of answers by 50% without increasing the response time too much.
It is a mystery why the scientific community has paid so little attention to improving product search, especially since more and more people shop online and a better search has been shown to increase conversion rates and user satisfaction.
If you have interesting insights, questions, or suggestions, let me know.