Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Multi-table Full Text Search in Postgres

Multi-table Full Text Search in Postgres

Caleb Hearth

October 14, 2014
Tweet

More Decks by Caleb Hearth

Other Decks in Programming

Transcript

  1. Scopes def self.search(query) joins(:user) .where(<<-SQL, query, query, query) articles.body ILIKE

    %?% OR articles.title ILIKE %?% OR users.name ILIKE %?% SQL end
  2. Query object class Search def self.for(query) [ Article.where("title ILIKE %?%",

    query), Article.where("body ILIKE %?%", query), Article.joins(:user) .where("users.name ILIKE %?%", query), ].flatten.uniq end end
  3. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body AS the_text FROM articles UNION SELECT articles.id AS id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id ) AS this_doesnt_matter WHERE to_tsvector(the_text) @@ to_tsquery('?');
  4. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body AS the_text FROM articles UNION SELECT articles.id AS id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  5. id AS id, title || ' ' || body AS

    the_text FROM articles UNION SELECT articles.id AS id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  6. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body A FROM articles UNION SELECT articles.id AS id, authors.name AS the_te
  7. title || ' ' || body A FROM articles UNION

    SELECT articles.id AS id, authors.name AS the_te FROM authors JOIN articles ON authors.id = articles ) AS this_doesnt_matter WHERE
  8. SELECT DISTINCT(id) FROM ( SELECT id AS id, title ||

    ' ' || body AS t FROM articles UNION SELECT
  9. We could throw that into our query object class Search

    def self.for(query) <<-SQL SQL end end
  10. CREATE VIEW users_with_recent_activity AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    AS active_at FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days';
  11. CREATE VIEW users_with_recent_activity AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    AS active_at FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days';
  12. CREATE VIEW users_with_recent_activity SELECT DISTINCT ON (users.id) users.*, activities.created_at AS

    active FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days
  13. CREATE VIEW users_with_recent AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    active_at FROM users JOIN activities ON activities.user_id = u
  14. VIEW users_with_recent_activity AS DISTINCT ON (users.id) users.*, activities.created_at AS active_at

    users activities activities.user_id = users.id activities.created_at >= CURRENT_DATE - interval '7 days';
  15. CREATE VIEW users_with_recent_activity AS SELECT DISTINCT ON (users.id) users.*, activities.created_at

    AS active_at FROM users JOIN activities ON activities.user_id = users.id WHERE activities.created_at >= CURRENT_DATE - interval '7 days';
  16. Querying a view looks just like querying a table SELECT

    * FROM users_with_recent_activity WHERE id IN (1,2,3…) ORDER BY active_at DESC
  17. SELECT id AS article_id, title || ' ' || body

    AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id;
  18. class CreateUsersWithRecentActivity < ActiveRecord::Migration def up ActiveRecord::Base.connection.execute( "--The CREATE VIEW

    sql from before" ) end def down ActiveRecord::Base.connection.execute( 'DROP VIEW users_with_recent_activity' ) end end
  19. SELECT id AS article_id, title || ' ' || body

    AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id UNION SELECT article_id, body AS the_text FROM comments
  20. authors.name AS the_tex FROM authors JOIN articles ON authors.id =

    article UNION SELECT article_id, body AS the_text FROM comments
  21. class CreateUsersWithRecentActivity < ActiveRecord::Migration def up ActiveRecord::Base.connection.execute( "CREATE OR REPLACE

    VIEW searches AS --The sql from the new view" ) end def down ActiveRecord::Base.connection.execute( "CREATE OR REPLACE VIEW searches AS --The sql from the old view" ) end end
  22. class CreateUsersWithRecentActivity < ActiveRecord::Migration def up ActiveRecord::Base.connection.execute( "DROP VIEW searches;

    CREATE VIEW searches AS --The sql from the new view" ) end def down ActiveRecord::Base.connection.execute( "DROP VIEW searches; CREATE VIEW searches AS --The sql from the old view" ) end end
  23. Model generator $ rails generate scenic:model search create app/models/search.rb create

    db/views/searches_v01.sql create db/migrate/..._create_searches.rb
  24. $ cat app/views/searches_v01.sql SELECT id AS article_id, title || '

    ' || body AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  25. $ cat app/views/searches_v02.sql SELECT id AS article_id, title || '

    ' || body AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id
  26. $ cat app/views/searches_v02.sql SELECT id AS article_id, title || '

    ' || body AS the_text FROM articles UNION SELECT articles.id AS article_id, authors.name AS the_text FROM authors JOIN articles ON authors.id = articles.author_id UNION SELECT article_id, body AS the_text FROM comments
  27. Use GIN • You don't have hundreds of thousands of

    rows • You're not concerned about longer writes blocking the db • You're adding the index late in the game • You don't care about disk space • You want fast lookups (read-heavy)
  28. Use GiST • Very large tables • You have performance

    concerns • For some reason disk space is important • Your table is write-heavy
  29. add_index :articles, :body, using: :gin add_index :articles, :title, using: :gin

    add_index :authors, :name, using: :gin add_index :comments, :body, using: :gin
  30. ElasticSearch via Tire or elasticsearch-rails or Chewy require 'elasticsearch/model' class

    Article < ActiveRecord::Base include Elasticsearch::Model include Elasticsearch::Model::Callbacks end
  31. Solr via Sunspot class Post < ActiveRecord::Base searchable do text

    :title, :body text :comments do comments.map { |comment| comment.body } end boolean :featured integer :blog_id integer :author_id integer :category_ids, :multiple => true double :average_rating time :published_at time :expired_at string :sort_title do title.downcase.gsub(/^(an?|the)/, '') end end end
  32. boolean :featured integer :blog_id integer :author_id integer :category_ids, :multiple =>

    true double :average_rating time :published_at time :expired_at
  33. sphinx with ThinkingSphinx ThinkingSphinx::Index.define :article, :with => :active_record do indexes

    subject, :sortable => true indexes content indexes author.name, :as => :author, :sortable => true has author_id, created_at, updated_at end
  34. :(

  35. :)

  36. Bibliography • http://shisaa.jp/postset/postgresql-full-text-search-part-1.html • http://blog.lostpropertyhq.com/postgres-full-text-search-is-good- enough/ • http://www.postgresql.org/docs/9.3/static/textsearch.html • http://linuxgazette.net/164/sephton.html

    • http://www.postgresql.org/docs/current/static/sql-createview.html • http://www.postgresql.org/docs/current/static/indexes-examine.html • http://www.postgresql.org/docs/current/static/textsearch-indexes.html • http://en.wikipedia.org/wiki/Gin_and_tonic