Sphinx Search Engine – Overview

Sphinx is an open source full-text search server. Working with Sphinx is pretty much as with a database server . If you consider how to upgrade search engine in your application it can be an easiest way and number of relevance functions ensures you can tweak search quality as well.

Crew

Sphinx Technologies is US company, created in late 2007 by Andrew Aksyonoff, creator and primary developer of Sphinx, and Peter Zaitsev, former head of High-Performance group in MySQL AB, and a world-class expert in database technologies. Today, they are an office-less company with about 10 employees spanning across all the time zones, working online.

Sphinx API’s

  1. Binary API – not recommended, it can add an extra CPU load on the client.
  2. SphinxQL   – with MySQL protocol – I use it in my example.
  3. SphinxSE – plugin for MySQL – MySQL storage engine (SphinxSE)
  4. HTTP – since version 2.3.2 (October 2016) – currently in developing stage, information from official page.

SphinxQL   – example with WordPress DB

Installation on Ubuntu

Configuration with SphinxQL

Run It.

Check results in PHP.

To see results we must run 2 queries, on 2 databases to have one result. In my example I use MySQLi object to connect with Sphinx because I can’t connect with Mysql PDO without user and password. First query ($sphinxQl) is specific for sphinx but it’s based on MySQL and it’s very intuitive to use. Second query is lunched on WP MySQL. Last problem to solve is how to get our posts in correct order? MySQL answer – simply by order by FIELD operator. That’s it.

Rules

  • First thing with Sphinx is that we must use integer numbers as indexes  – there is no other way.
  • Second – we have only index in our results (maybe with extra calculations) without text values from our data.

Moreover, it’s impossible to reconstruct original text, because it’s not stored that way.  Sphinx is doing his job and it stores data in optimized way ( specific method for whitespace, capitalization, punctuation, etc.). But the sentence of sphinx is in it, we have a lot of operators to use with match() function.

Match operators

To tweak results we have huge amount of operators: OR, MAYBE, NOT, NEAR, SENTENCE, field search, field position limit modifier, multiple-field search, ignore field search, ignore multiple-field search, all-field search, phrase search, proximity search, quorum matching, strict order, exact form modifier, field-start and field-end modifier, and more…

The full meaning of this search is:

  • Find the words ‘John’ and ‘Doe’ adjacently in any field in a post;
  • Additionally, the same post must also contain the words ‘results’ in post_title field
  • Additionally, the same post must contain the word ‘List’ in the post_content field, but not contain exact phrase  ‘Johnatan’ or ‘Johnnie’;

I hope it helps to consider Sphinx as your search engine.

Have nice results

Leave a Reply

Your email address will not be published. Required fields are marked *