Full Text Search using pg_search

aag1091

June 21, 2014

Transcript

  1. About Me
    - I have been working with Ruby on Rails for more than two and a half years.
    - Currently working on yowoto.com for Freewave Tech-solutions.
    - I will be joining TAMUK in Fall 2014 to pursue an MS in CS.
  2. Bullet Points
    - Explain the use case.
    - Basic pg_search methods.
    - Explain pg_search_scope.
    - Using tsvector columns.
    - Working on tsvector columns.
  3. pg_search techniques
    - Multi-search: mostly used when you need a single consolidated search across all the models of your application.
    - Search scopes: this kind of search revolves around a single model and its associated models.
    - Today I am going to explain search scopes with the help of tsvector columns.
  4. pg_search_scope
        class BlogPost < ActiveRecord::Base
          include PgSearch
          pg_search_scope :search,
            :against => [:title, :short_description, :content],
            :associated_against => {
              :category => [:name],
              :tags => [:name],
              :author => [:first_name, :last_name]
            }
        end
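  5. Some pg_search features
    - Tsearch: full text search.
    - Tsearch options: weighting, prefix, dictionary, any_word. dmetaphone (Double Metaphone soundalike search) is a separate feature.
    - Trigram: trigram search works by counting how many three-letter substrings (trigrams) match between the query and the text.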
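To make the trigram idea concrete, here is a minimal pure-Ruby sketch. It is not what pg_search runs (that work is done inside PostgreSQL by the pg_trgm extension). The method names `trigrams` and `trigram_similarity` are hypothetical, and the padding rule (two spaces before each word, one after) is assumed from the pg_trgm documentation.

```ruby
require 'set'

# Collect the set of three-character substrings of every word in the text.
# Assumption: each word is padded with two leading spaces and one trailing
# space, as the pg_trgm documentation describes.
def trigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).each_with_object(Set.new) do |word, set|
    padded = "  #{word} "
    (0..padded.length - 3).each { |i| set << padded[i, 3] }
  end
end

# Similarity = shared trigrams / all distinct trigrams in either string.
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = ta | tb
  return 0.0 if union.empty?

  (ta & tb).size.to_f / union.size
end
```

For example, `trigram_similarity('word', 'two words')` shares 4 trigrams out of 11 distinct ones, so it returns 4/11 (about 0.36), while two identical strings score 1.0.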
  6. Tsearch: full text search
        class BlogPost < ActiveRecord::Base
          include PgSearch
          pg_search_scope :search,
            # Weights are given as a hash of column => weight ('A' is the highest).
            :against => { :title => 'A', :short_description => 'B', :content => 'C' },
            :using => {
              :tsearch => {
                :dictionary => "english",
                :any_word => true,
                :prefix => true
              }
            }
        end
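The effect of `any_word` and `prefix` is easiest to see in the text form of a tsquery. The sketch below is a simplified illustration only: `build_tsquery` is a hypothetical helper, and the SQL that pg_search really generates is more involved (quoting, dictionaries, normalization). It only assumes standard tsquery syntax, where `&` means AND, `|` means OR, and a `:*` suffix means prefix match.

```ruby
# Turn free-text user input into a tsquery-style expression string.
# Hypothetical helper for illustration; not part of pg_search.
def build_tsquery(query, any_word: false, prefix: false)
  # Keep only alphanumeric runs so no operator characters leak in.
  terms = query.downcase.scan(/[[:alnum:]]+/)
  # prefix: true matches words that merely start with each term.
  terms = terms.map { |term| "#{term}:*" } if prefix
  # any_word: true joins terms with OR, otherwise every term must match.
  terms.join(any_word ? ' | ' : ' & ')
end
```

With both options enabled, as in the scope above, `build_tsquery('Ruby Rails', any_word: true, prefix: true)` gives `ruby:* | rails:*`. With the defaults it gives `ruby & rails`.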
  7. Tsearch using tsvector column
        class BlogPost < ActiveRecord::Base
          include PgSearch
          pg_search_scope :search,
            against: :search_vector,
            using: {
              tsearch: {
                dictionary: 'english',
                any_word: true,
                prefix: true,
                tsvector_column: 'search_vector'
              }
            }
        end
  8. But for that to work we require a search vector column
        class AddTsvectorColumnToPosts < ActiveRecord::Migration
          def up
            add_column :posts, :search_vector, :tsvector
            execute <<-EOS
              CREATE INDEX posts_search_vector_idx ON posts USING gin(search_vector);
            EOS
          end

          def down
  9. So how does this work
    - The tsvector data type: a tsvector value represents pre-processed document data. That is to say, the document text has been parsed and its lexemes (normalized key words) are stored alongside the original document data.
    - Tsquery: a tsquery represents a processed query, with lexemes combined by boolean operators, that can be matched against tsvector data.
    - Special indexes: the GIN (Generalized Inverted Index) and GiST (Generalized Search Tree) index types dramatically improve query response times, because matching rows can be found without scanning every document.
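The three pieces above can be mimicked with a toy model in plain Ruby. This is only a sketch of the idea, not how PostgreSQL implements it: `to_terms` and `ToyInvertedIndex` are hypothetical names, the "tsvector" here is just a list of lowercased words (no stemming or stop-word removal), and the "GIN index" is a hash from each term to the ids of the documents that contain it.

```ruby
# Toy stand-in for to_tsvector: the distinct lowercased words of a text.
# Assumption: real PostgreSQL also stems words and drops stop words.
def to_terms(text)
  text.downcase.scan(/[[:alnum:]]+/).uniq
end

# Toy stand-in for a GIN index: term => list of document ids.
class ToyInvertedIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = [] }
  end

  # Pre-process the document once, at write time.
  def add(doc_id, text)
    to_terms(text).each { |term| @postings[term] << doc_id }
  end

  # Look up each query term directly instead of scanning all documents.
  # any_word: false requires every term (AND), true requires any (OR).
  def search(query, any_word: false)
    lists = to_terms(query).map { |term| @postings.fetch(term, []) }
    return [] if lists.empty?

    (any_word ? lists.reduce(:|) : lists.reduce(:&)).sort
  end
end
```

The point of the design is that the expensive parsing happens once when a document is added, and a search only touches the posting lists of its own terms.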