Solr and acts_as_solr
Solr is a search server built on the Lucene Java search library, with an HTTP/XML interface. Using Solr, large collections of documents can be indexed based on strongly typed field definitions, thereby taking advantage of Lucene's powerful full-text search features.
acts_as_solr is a Ruby on Rails plugin that adds Solr capabilities to ActiveRecord models. It hides all the configuration and manual setup that Solr needs and provides you with simple find_by... methods. acts_as_solr can be used as a replacement for acts_as_ferret because of its inbuilt full-text search capabilities ;-) . The purpose of this article is to explain acts_as_solr with examples.
Getting Started
Installation: Installation is well explained on the acts_as_solr homepage and in "Getting started with acts_as_solr".
Note: acts_as_solr requires JRE 1.5 on the system. Before running any of the Solr methods, make sure you start the Solr server with the rake solr:start command.
Our example model for this tutorial will be DigitalCamera [classname: Camera] with the following fields:
- name (type:string)
- brand (type:string) [we want faceted browsing on this field]
- resolution (type:float) [we want faceted browsing on this field]
- other fields which we do not want to index
Basic Usage : for search
Let's start with basic search and then move on to faceted browsing. You need to specify which columns of your model you want indexed for search (if no :fields param is given, then all columns are indexed):

    class Camera < ActiveRecord::Base
      acts_as_solr :fields => [:name, :brand, :resolution]
    end

    @results = Camera.find_by_solr("canon powershot")

find_by_solr accepts the following options: [:limit, :offset, :scores, :order, :operator]
- limit: limits the number of search results
- offset: starting index of the search results
- scores: Solr scores for each returned result
- order: the field by which results should be ordered, and whether in asc or desc order
- operator: whether the words in the query are combined with "and" or "or", i.e. whether all of the words or any of the words must exist in matching results
- If you do not want Camera model objects as results, you can use find_id_by_solr, which will return just the ids from the Camera table (the scores option is not supported yet).
- The result of find_by_solr is not an array of Camera objects. It's an ActsAsSolr::SearchResults object. The records and the total number of hits can be read from it:

    @products = @results.docs
    @total_hits = @results.total_hits

Pagination
We will be using will_paginate for pagination (the Rails default pagination is buggy). One way of paginating the returned results is explained nicely in "Using will_paginate with acts_as_solr". But we don't need an additional module to get the count of results returned. That module adds overhead because of an extra Solr query. acts_as_solr returns the total number of results via find_by_solr, so you don't need to call count_by_solr separately to get the result count. In your .rhtml file, include this where you want pagination:

    <%= will_paginate WillPaginate::Collection.new((params[:page]||1),
          (params[:products_per_page]||10), @total_hits) -%>

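The view snippet above only renders the page links; the controller still has to ask Solr for the right slice of results. A minimal sketch of turning the page parameters into the :limit and :offset options of find_by_solr follows. The helper name solr_page_options is hypothetical and not part of acts_as_solr.

```ruby
# Hypothetical helper (not part of acts_as_solr): converts request
# parameters into the :limit and :offset options described earlier.
def solr_page_options(page_param, per_page_param, default_per_page = 10)
  # nil.to_i and "".to_i are both 0, so missing params fall back to page 1
  page = [page_param.to_i, 1].max
  per_page = per_page_param.to_i
  per_page = default_per_page if per_page <= 0
  { :limit => per_page, :offset => (page - 1) * per_page }
end

# Intended use in a controller action (requires the plugin, so shown as a comment):
#   options  = solr_page_options(params[:page], params[:products_per_page])
#   @results = Camera.find_by_solr("canon powershot", options)
```

The defaults mirror the ones used in the view (page 1, 10 products per page), so the page links and the fetched results stay in step.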
Faceting
The presentation by Keith Instone is an excellent read on faceted browsing, with examples.
Back to acts_as_solr ... open the model file. Using the :facets => ... option, add the columns on which you want to allow faceted browsing.

    class Camera < ActiveRecord::Base
      acts_as_solr :fields => [:name, :brand, :resolution],
                   :facets => [:brand, :resolution]
    end

We also want to run range queries on resolution (e.g. resolution between 5 and 7). Modify the model to look like:

    class Camera < ActiveRecord::Base
      acts_as_solr :fields => [:name, :brand, {:resolution => :range_float}],
                   :facets => [:brand, :resolution]
    end

To get facet counts for brand along with the search results:

    @results = Camera.find_by_solr("powershot",
                 {:facets => {:zeros => false, :fields => [:brand]}})

- Setting the :zeros parameter to false tells Solr not to return the brand values whose facet count is zero.
- @results.facets["facet_fields"]["brand_facet"] contains the brand names with their corresponding counts. A sample result: {"Canon USA"=>1, "Canon PowerShot"=>1, "Canon"=>91}
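To show these counts as navigation links, the hash usually needs to be ordered first. A small sketch in plain Ruby, using the sample result above as input; the helper name facet_labels is hypothetical and the label format is only an illustration.

```ruby
# Hypothetical helper: orders facet values by descending count
# (ties broken alphabetically) and builds a display label for each.
def facet_labels(counts)
  counts.sort_by { |value, count| [-count, value] }
        .map { |value, count| "#{value} (#{count})" }
end

# Sample data copied from the article; with the plugin loaded this would be
# @results.facets["facet_fields"]["brand_facet"]
sample = { "Canon USA" => 1, "Canon PowerShot" => 1, "Canon" => 91 }
puts facet_labels(sample)
```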
For resolution, however, simply adding the field to :fields does not work:

    # INCORRECT
    @results = Camera.find_by_solr("powershot",
                 {:facets => {:zeros => false, :fields => [:brand, :resolution]}})

Instead, use the facet query param of find_by_solr (:query inside :facets) with the ranges for resolution predefined. Solr cannot calculate the ranges for float/int fields itself, so we need to specify the ranges of values while querying Solr. For example, if the values in the resolution column range from 0 to 20 and we want 4 facet ranges, the query would be something like:

    @results = Camera.find_by_solr("powershot",
      {:facets => {:zeros => false,
                   :fields => [:brand],
                   :query => ["resolution:[0 TO 4]", "resolution:[5 TO 9]",
                              "resolution:[10 TO 14]", "resolution:[15 TO 20]"]
                  }
      })

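Since the ranges have to be spelled out by hand, it can help to generate them. The following is a minimal sketch that splits an integer interval into equal buckets and produces the range strings used above; the helper name range_facet_queries is hypothetical, and the last bucket is simply stretched to include the maximum.

```ruby
# Hypothetical helper: builds Solr range strings such as "resolution:[0 TO 4]".
# Works on integer bounds; the final bucket absorbs any remainder up to max.
def range_facet_queries(field, min, max, buckets)
  width = (max - min) / buckets   # integer division, e.g. (20 - 0) / 4 = 5
  (0...buckets).map do |i|
    low  = min + i * width
    high = (i == buckets - 1) ? max : low + width - 1
    "#{field}:[#{low} TO #{high}]"
  end
end

puts range_facet_queries("resolution", 0, 20, 4)
```

Note that integer bounds like these leave gaps for fractional values (a resolution of 4.5 falls into no bucket), so for a float field you may prefer to choose the boundaries yourself.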
The counts for the resolution ranges are available in @results.facets["facet_queries"]. If you want to display the results to the user as links along with counts (something like the example images at the top), where the user can make a further selection, you need to get the results from Solr using:

    @results = Camera.find_by_solr("powershot" + " AND resolution:[0 TO 4]",
                 {:facets => {:zeros => false, :fields => [:brand]}})
    ## The resolution queries have been removed since the user has already made a selection on resolution.

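The query string passed as the first argument can be assembled from whatever the user has selected so far. A minimal sketch, assuming the selections arrive as a hash from field name to a list of raw Solr terms; the helper name solr_query is hypothetical. Terms for the same field are joined with OR, and different fields with AND.

```ruby
# Hypothetical helper: combines search keywords with the user's facet selections.
def solr_query(keywords, selections = {})
  clauses = selections.reject { |_field, terms| terms.empty? }.map do |field, terms|
    alternatives = terms.map { |term| "#{field}:#{term}" }
    # Parentheses are only needed when there is more than one alternative
    alternatives.size == 1 ? alternatives.first : "(#{alternatives.join(' OR ')})"
  end
  ([keywords] + clauses).join(" AND ")
end

puts solr_query("powershot", "resolution" => ["[0 TO 4]"])
puts solr_query("powershot", "resolution" => ["[0 TO 4]", "[6 TO 7]"])
```

The terms are inserted as given, so values containing spaces or special characters would need quoting or escaping before being passed in.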
Now suppose the user selects more than one range, e.g. resolution between 0 and 4 and between 6 and 7.
We can define the query (the first argument to find_by_solr) as "powershot" + " AND (resolution:[0 TO 4] OR resolution:[6 TO 7])". Unfortunately, the same approach for brand, find_by_solr(query + " AND brand:Canon"), doesn't seem to work. By default all fields are of type string, and faceting for these fields is done using the :browse parameter, i.e.

    @results = Camera.find_by_solr(query,
      {:facets => {:zeros => false,
                   :query => ["resolution:[0 TO 4]", "resolution:[5 TO 9]",
                              "resolution:[10 TO 14]", "resolution:[15 TO 20]"],
                   :browse => ["brand:Canon"]}
      })

Note: I don't know how to select several brands at once with the :browse option, because multiple entries in :browse are joined by AND, and the :operator option of find_by_solr applies only to the words in the query. A good alternative is to declare the string-type fields of your model with the field type :facet. Our acts_as_solr declaration becomes:

    acts_as_solr :fields => [:name, {:brand => :facet}, {:resolution => :range_float}],
                 :facets => [:brand, :resolution]

The brands can then be selected in the query itself:

    "powershot" + " AND (brand:Canon OR brand:Sony)"

Boosting
Using the boost option, you can give one field priority over the others. A small change in the acts_as_solr declaration is enough. Note that the boost value has to be a float:

    acts_as_solr :fields => [{:name => {:boost => 2.0}}, :brand, :resolution]

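Because an integer boost is easy to write by accident, the field entry can be built through a tiny helper that coerces the value. This is only a sketch in plain Ruby; the helper name boosted is hypothetical and not part of acts_as_solr.

```ruby
# Hypothetical helper: builds a field entry for the :fields list and
# makes sure the boost is a positive Float.
def boosted(field, boost)
  value = Float(boost)   # raises ArgumentError for non-numeric input
  raise ArgumentError, "boost must be positive" unless value > 0
  { field => { :boost => value } }
end

# Intended use in the model (requires the plugin, so shown as a comment):
#   acts_as_solr :fields => [boosted(:name, 2), :brand, :resolution]
p boosted(:name, 2)
```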
Quick Tips
- How do you search Solr with no query or a nil query, i.e. product pages with navigation but without a search query? Use query = "[* TO *]":

    @results = Camera.find_by_solr("[* TO *]",
      {:facets => {:zeros => false,
                   :fields => [:brand],
                   :query => ["resolution:[0 TO 4]", "resolution:[5 TO 9]",
                              "resolution:[10 TO 14]", "resolution:[15 TO 20]"]
                  }
      })

- Already have a database and want to integrate Solr now? Go to ./script/console and run Camera.rebuild_solr_index
- Adding or updating rows in the table automatically adds or modifies them in the Solr index.
- If you have already created an index in Solr directly (not through acts_as_solr), it will not work with acts_as_solr. In a plain Solr configuration, schema.xml tells Solr exactly which columns you are going to index and what their data types are. With acts_as_solr you never modify schema.xml; instead, acts_as_solr uses dynamic fields to tell Solr about the fields and their data types. So acts_as_solr does not find anything in your existing index, because the field names are no longer the same.
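To make the last point concrete, here is a rough sketch of how a dynamic field name is derived from a model field. Only the two suffixes that actually appear in this article ("brand_facet" in the facet results and "type_t" in the test helper) are included; the helper name dynamic_field_name, the lookup table, and the assumption that the default type maps to "_t" are illustrative, not the plugin's real code.

```ruby
# Illustration only: suffixes taken from names that appear in this article.
# Other field types presumably follow the same pattern with other suffixes.
SUFFIXES = { :text => "t", :facet => "facet" }

# Hypothetical helper: the name a field would have inside the Solr index.
def dynamic_field_name(field, type = :text)
  suffix = SUFFIXES.fetch(type) { raise ArgumentError, "unknown type: #{type}" }
  "#{field}_#{suffix}"
end

puts dynamic_field_name(:brand, :facet)
puts dynamic_field_name(:name)
```

An index built by hand with a field called brand therefore has nothing under the suffixed name that acts_as_solr queries.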
Testing Solr
Copy vendor/plugins/acts_as_solr/test/test_helper.rb, modified as shown below.

    require 'net/http'

    class Test::Unit::TestCase
      # Fail early if the Solr test server is not running
      begin
        Net::HTTP.get_response(URI.parse('http://localhost:8981/solr/'))
      rescue Errno::ECONNREFUSED
        raise "You forgot to 'rake solr:start RAILS_ENV=test', foo!"
      end

      # Load the fixtures, then re-index every fixture row in Solr
      def self.fixtures(*table_names)
        if block_given?
          Fixtures.create_fixtures(Test::Unit::TestCase.fixture_path, table_names) { yield }
        else
          Fixtures.create_fixtures(Test::Unit::TestCase.fixture_path, table_names)
        end
        table_names.each do |table_name|
          clear_from_solr(table_name)
          klass = instance_eval table_name.to_s.capitalize.singularize.camelize
          klass.find(:all).each { |content| content.solr_save }
        end
      end

      private
      # Delete all documents of the given model from the Solr index
      def self.clear_from_solr(table_name)
        ActsAsSolr::Post.execute(Solr::Request::Delete.new(:query => "type_t:#{table_name.to_s.capitalize.singularize.camelize}"))
      end
    end

Modifications in the original file:
- removed top lines
- added camelize for removing "_" from model names
- added lines to check that the test server is running (from Google Groups)
After you have saved the above file as RAILS_APP/test/solr_test_helper.rb, do not forget to include solr_test_helper in all your test files, and start writing your tests. :)
Oops! Still not working
Did you make sure that config/solr.yml has been configured for the testing environment? (By default the Solr test server runs on port 8981.)
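A quick way to check this is to read the port for the test environment straight out of the configuration. The sketch below assumes that solr.yml holds one entry per environment with a url key; that layout, the sample contents, and the helper name solr_port_for are assumptions for illustration, so compare them with your own file.

```ruby
require 'yaml'
require 'uri'

# Assumed layout of config/solr.yml (verify against your own file)
SAMPLE_SOLR_YML = <<YAML
test:
  url: http://localhost:8981/solr
YAML

# Hypothetical helper: returns the port configured for an environment.
def solr_port_for(yaml_text, env)
  config = YAML.load(yaml_text)
  URI.parse(config[env]["url"]).port
end

puts solr_port_for(SAMPLE_SOLR_YML, "test")
# In a Rails app you would pass File.read("config/solr.yml") instead of the sample.
```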
References
- acts_as_solr benchmark results
- acts_as_solr performance on some live site
- http://www.webdesignpractices.com/navigation/facets.html
- http://cfis.savagexi.com/articles/2007/07/10/how-to-profile-your-rails-application
- Deploying acts_as_solr on tomcat

Glad you found my presentation on faceted browsing useful.
This is the best rails + solr article I’ve found. Great work!
@Keith & @Tait: thanks.
wow cool its a big help, have some few questions – in the acts_as_solr there are two schema.xml found (acts_as_solr/solr/schema.xml) and the other one is (acts_as_solr/solr/solr/conf/schema.xml) which one does the the plugin uses.. Another is there a way that we can overide how acts_as_solr do on the dynamic fields, so can explicitly define what field to use that i defined on the schema.xml? thank more power to you guys..
@marjun:
1. acts_as_solr uses acts_as_solr/solr/solr/conf/schema.xml to define input schema for solr.
2. There is no direct way to integrate your schema.xml with acts_as_solr but the plugin can be hacked easily, I am almost done with it … will post the required changes soon!
thanks Quarks
im excited to see those changes and looking forward for it… specially on using the spellchecker it would be cool and handy..
more power guys
Very thorough writeup – good job.
After a couple of months using acts_as_solr for a project, I’ve submitted several bug reports/patches that have not made its way into the plugin yet (development seems dormant) and I’ve also written some things on top of acts_as_solr, such as a rake task for (re)indexing data. Might be interesting to readers of this article.
You mentioned that my will_paginate/acts_as_solr hack performs extra DB queries. To my knowledge (running code in the console, checking for SQL log output), it does not. It does, however, perform an extra Solr query (the count).
I’d love to reuse the total_hits, but I can’t currently think of a good way to pass them out from the inner method to the outermost one in this case. Possibly I could temporarily override the Method.count method (which will_paginate normally uses)…
On a side note, it’d be nice if your comment system would honor paragraphs. :)
@Henrik
Great to see you here. Your work is really exciting.
Thanks for pointing out the mistake, it should be ‘extra Solr query’. I have modified the article. I think our way of pagination is simple which does not use an extra Solr query (may not be very clean).
Already working on Mephisto to change comment system.
thanks again :)
@marjun,
Checkout our latest article Advanced acts_as_solr, it contains patches and explains in depth how to integrate your schema.xml with acts_as_solr .
Thanks for your patience!
@Quarks ,
thanks .. and more power to you guys.. this post gives great help..
If you want to disable Solr in tests simply put this in your test_helper.rb
How can I make acts_as_solr able to search for different languages(ex: Arabic)?
thx.
@Ahmed: I am not sure about Arabic specifically, but there is support for non-english languages. Please have a look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters. You will need to modify the solr conf file accordingly.
How we can search the query term with and operator
eg: india and politics
here “and” is operator how we can search both the words..
@harish:
you can use ModelName.find_by_solr("india politics", :operator=>"and")
The boost value has to be a float. {:name=>{:boost=>2}} should change to {:name=>{:boost=>2.0}}