Using Sphinx as a search server
This post should get you started using Sphinx in your Rails application. It covers only two core Sphinx topics: indexing and searching.
Sphinx
Sphinx is an open source full text search server, designed from the ground up with performance, relevance (aka search quality), and integration simplicity in mind. It's written in C++ and works on Linux (RedHat, Ubuntu, etc), Windows, MacOS, Solaris, FreeBSD, and a few other systems.
Sphinx lets you either batch index and search data stored in an SQL database, NoSQL storage, or just files quickly and easily — or index and search data on the fly, working with Sphinx pretty much as with a database server.
Sphinx can directly access and index data stored in MySQL (all storage engines are supported), PostgreSQL, Oracle, Microsoft SQL Server, SQLite, Drizzle, and anything else that supports ODBC.
Thinking Sphinx
A Ruby connector between Sphinx and ActiveRecord.
The Rails application
Let's say I have a simple blog application and I would like to add search functionality to it.
Install Sphinx
Download the source code from the Sphinx website. The standard set of commands should install it with MySQL support.
- Extract the downloaded file
tar xvfz sphinx-2.0.3-release.tar.gz
- Do the usual make steps
./configure
make
sudo make install
Install Thinking Sphinx
Add the following to your Gemfile
gem 'thinking-sphinx'
then run bundle install.
Indexing
The first thing to do after installing thinking-sphinx is to set up the indexes for your model.
class Post < ActiveRecord::Base
  define_index do
    # Fields: text that search queries are matched against
    indexes title, :sortable => true
    indexes content
    # Attributes: values used for sorting, filtering and grouping
    has created_at, updated_at
  end
end
Now a short explanation. Why do we define indexes? The Sphinx daemon talks to a collection of indexes. Each index tracks a set of documents, and each document is made up of fields and attributes. In the snippet above, title and content are fields: their text is what your search queries are matched against. created_at and updated_at are attributes, which are used for sorting, filtering and grouping your search results.
Now let's run the rake task to get Sphinx to process the data
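To make the split between fields and attributes concrete, here is a toy, pure-Ruby sketch. It is only an illustration of the idea, not how Sphinx stores or searches data internally, and the names ToyIndex and Document are made up for this example (they are not part of Sphinx or Thinking Sphinx).

```ruby
# Toy illustration only: NOT how Sphinx works internally.
# Fields hold the text that queries are matched against;
# attributes hold values used to filter and sort the matches.
Document = Struct.new(:id, :fields, :attributes)

class ToyIndex
  def initialize
    @documents = []
  end

  def add(id, fields, attributes)
    @documents << Document.new(id, fields, attributes)
  end

  # Match the query against fields only, then filter and sort by attributes.
  def search(query, with: {}, order: nil)
    term = query.downcase
    hits = @documents.select do |doc|
      doc.fields.values.any? { |text| text.downcase.include?(term) }
    end
    hits = hits.select do |doc|
      with.all? { |name, allowed| allowed === doc.attributes[name] }
    end
    hits = hits.sort_by { |doc| doc.attributes[order] } if order
    hits.map(&:id)
  end
end

index = ToyIndex.new
index.add(1, { title: "Hello Sphinx", content: "First post" },    { created_at: 2012 })
index.add(2, { title: "Rails tips",   content: "Sphinx search" }, { created_at: 2010 })
index.add(3, { title: "Unrelated",    content: "Nothing here" },  { created_at: 2011 })

p index.search("sphinx")                                   # prints [1, 2]
p index.search("sphinx", order: :created_at)               # prints [2, 1]
p index.search("sphinx", with: { created_at: 2011..2012 }) # prints [1]
```

The query text is only ever compared with the fields, while created_at never takes part in matching and is used purely to narrow down and order the hits. That is the same division of labour the define_index block sets up with indexes and has.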
rake ts:index
You will see a message like the one below
... collected 3 docs ... ...
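The number after 'collected' is the row count of the table, in our example "posts". Each row is treated as one document.
Sphinx builds its indexes in batches, so rows added or changed after indexing only show up in search results once you run rake ts:index again. If you have made structural changes to the index definition (like adding or removing fields), the daemon also needs to be stopped, the data re-indexed, and the daemon started again. This can be done through a single rake task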
rake ts:rebuild
Searching
Before you can search using the Post model, the Sphinx daemon needs to be running.
rake ts:start
If everything is good to go, you will see a message much like this
Started successfully (pid 8379).
Now let's add a simple search form to the post index page.
<%= form_tag posts_path, :method => :get do %>
  <%= label_tag 'query', 'Search' %>
  <%= text_field_tag 'query' %>
  <%= submit_tag 'Go' %>
<% end %>
Then, in your controller's index method, use Post.search
@posts = params[:query].blank? ? Post.all : Post.search(params[:query])
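The branching in that line can be tried out without Rails or a running Sphinx daemon. The sketch below is only an approximation under stated assumptions: posts_for is a hypothetical helper (not part of Rails or Thinking Sphinx), FakePost is a stand-in for the real Post model, and to_s.strip.empty? roughly imitates what blank? does for nil and whitespace-only strings.

```ruby
# Hypothetical helper, not part of Rails or Thinking Sphinx:
# list everything when the query is missing or blank, otherwise search.
def posts_for(model, query)
  query = query.to_s.strip
  query.empty? ? model.all : model.search(query)
end

# Stand-in for the Post model so the sketch runs on its own.
# The real Post.search asks the Sphinx daemon instead of scanning an array.
class FakePost
  TITLES = ["Hello Sphinx", "Rails tips"].freeze

  def self.all
    TITLES
  end

  def self.search(query)
    TITLES.select { |title| title.downcase.include?(query.downcase) }
  end
end

p posts_for(FakePost, nil)      # no query parameter: every post
p posts_for(FakePost, "   ")    # whitespace only: every post
p posts_for(FakePost, "sphinx") # real query: matching posts only
```

Falling back to the full list for an empty query means the index page still works when the form is submitted empty or the page is opened without a query parameter.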
Now go to the blog application and try the new search form.
Summary
Sphinx enables you to set up a search engine for your application. It is faster than searching content stored in a plain database table.
The Thinking Sphinx gem makes it easier to integrate Sphinx with an existing Rails application. You can have complex search logic without having to write any complex SQL queries.
There are plenty more Sphinx features to explore. Once you are done with the basics, I urge you to try more advanced topics.
