Jetpack Related Posts for Power Users

xyu
August 16, 2014

The related posts module in Jetpack looks simple at first glance; underneath, however, it’s powered by Elasticsearch, an advanced natural language search engine. Join us as we pull back the covers to explain how Jetpack uses Elasticsearch to determine what’s related and, more importantly, how to customize it to take full advantage of Elasticsearch’s text analysis abilities.

Transcript

  1. WordPress on NGINX + HHVM with Heroku Buildpacks

    WordPress on NGINX + HHVM

    It’s been a year since I last made any major changes to my WordPress on Heroku build, and in tech years that’s a lifetime. Since then Heroku has released a new PHP buildpack with nginx and HHVM built in. Much progress has also been made in both HHVM and WordPress to make them compatible with each other. So now seems as good a time as any to update the stack this site is running on. Without further ado, I’d like to introduce Heroku WP, a template for HHVM-powered WordPress served by nginx.

    The Goal

    There are numerous other templates out there for running WordPress on Heroku. My main goals for this template are:

    • It should be simple: use the default buildpack provided by Heroku so there’s no other third-party dependency to implicitly trust or maintain.
    • It should be fast: use the latest technologies available to squeeze every last ounce of performance out of each Heroku Dyno.
    • It should be secure: security is not an add-on; admin pages should be secure by default and database connections need to be encrypted.
    • It should scale: just because we can serve millions of page hits a day off a single Heroku Dyno does not mean we’ll stop there. The template should be made with cloud architecture in mind so that the number of Dynos can scale up and down without breaking.

    The Stack

    Standing on the shoulders of giants, I was able to use the latest Heroku buildpack and get WordPress running on:

    • NGINX: an event-driven web server engineered for the modern day to replace Apache. This high-performance web server is preferred by more of the top 1,000 sites than any other, and it’s what’s used by the largest WordPress install out there, WordPress.com.
    • HHVM: HipHop Virtual Machine, a JIT (just-in-time) compiler developed by Facebook to run PHP scripts, which when tested with WordPress showed up to a 2x improvement.

    I have yet to run any statistical analysis on performance; anecdotally, however, it feels a lot faster navigating WP admin, and page generation times look much better. I’m looking forward to running more tests and performance tuning this build in the coming weeks.

    Update: While still not a head-to-head test, the response times reported by StatusCake for this site running on Heroku-WP and for a mirror of this site running on the old Heroku LAMP stack, with no load other than StatusCake pings, show a dramatic improvement.
  2. SELECT COUNT(*)
    FROM wp_posts
    WHERE post_content LIKE "%WordPress%"

    SELECT COUNT(*)
    FROM wp_posts
    WHERE post_content LIKE "%on%"

    SELECT COUNT(*)
    FROM wp_posts
    WHERE post_content LIKE "%NGINX%"

    …

    SELECT COUNT(*)
    FROM wp_posts
    WHERE post_content LIKE "%improvement%"
  5. SELECT *
    FROM wp_posts
    WHERE
      post_content LIKE "%WordPress%" OR
      post_content LIKE "%NGINX%" OR
      post_content LIKE "%HHVM%" OR
      post_content LIKE "%Heroku%" OR
      post_content LIKE "%performance%"
  6. SELECT *
    FROM wp_posts
    WHERE
      post_content LIKE "%WordPress%" OR
      post_content LIKE "%NGINX%" OR
      post_content LIKE "%HHVM%" OR
      post_content LIKE "%Heroku%" OR
      post_content LIKE "%performance%"
    ORDER BY
      !?
  7. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    “The über-quick brown fox jumps over the lazy dogs.”
  8. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    <p>
    The &uuml;ber-quick brown fox
    jumps over the lazy dogs.
    </p>
  10. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    The über-quick brown fox
    jumps over the lazy dogs.
  11. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    The über—quick brown fox
    jumps over the lazy dogs.
  12. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    The über quick brown fox
    jumps over the lazy dogs
  13. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    The | über | quick | brown | fox | jumps | over | the | lazy | dogs
  15. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    the | über | quick | brown | fox | jumps | over | the | lazy | dogs
  17. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    the | uber | quick | brown | fox | jumps | over | the | lazy | dogs
  19. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    the | uber | quick | brown | fox | jump | over | the | lazy | dog
  21. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    uber | quick | brown | fox | jump | over | lazy | dog
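
The stages walked through on the slides above can be approximated in a few lines of plain PHP. This is only a toy sketch, not how Elasticsearch implements analysis: the function names are made up for illustration, the stop list holds a single word, the folding table covers only "ü", and the stemmer just drops a trailing "s".

```php
<?php
// Toy analyzer chain: illustrative only, not Elasticsearch's implementation.

// Character filter: strip HTML tags, then decode entities such as &uuml;.
function toy_char_filter( $raw ) {
    return html_entity_decode( strip_tags( $raw ), ENT_QUOTES | ENT_HTML5, 'UTF-8' );
}

// Tokenizer: split on every run of characters that are not letters or digits.
function toy_tokenize( $text ) {
    return preg_split( '/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY );
}

// Token filters, applied in the same order as on the slides.
function toy_token_filters( array $tokens ) {
    $stop_words = array( 'the' );      // assumption: a one-word stop list
    $folding    = array( 'ü' => 'u' ); // assumption: only the character this example needs
    $terms      = array();

    foreach ( $tokens as $token ) {
        // 1. Lowercase (falls back to strtolower() when mbstring is missing).
        $token = function_exists( 'mb_strtolower' )
            ? mb_strtolower( $token, 'UTF-8' )
            : strtolower( $token );
        // 2. ASCII folding.
        $token = strtr( $token, $folding );
        // 3. Naive stemming: drop one trailing "s" from longer words.
        if ( strlen( $token ) > 3 && substr( $token, -1 ) === 's' && substr( $token, -2 ) !== 'ss' ) {
            $token = substr( $token, 0, -1 );
        }
        // 4. Stop word removal.
        if ( ! in_array( $token, $stop_words, true ) ) {
            $terms[] = $token;
        }
    }
    return $terms;
}

function toy_analyze( $raw ) {
    return toy_token_filters( toy_tokenize( toy_char_filter( $raw ) ) );
}

$terms = toy_analyze( '<p>The &uuml;ber-quick brown fox jumps over the lazy dogs.</p>' );
echo implode( ' ', $terms ), "\n";
```

Run against the example sentence, this prints `uber quick brown fox jump over lazy dog`, matching the terms on the last slide of the sequence.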

  22. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    Terms   Doc IDs
    brown   1
    dog     1
    fox     1
    jump    1
    lazy    1
    …
    over    1
    quick   1
    uber    1
  23. Elasticsearch Analyzer Chain
    Raw Text → Character Filters → Tokenizer → Token Filters → Terms

    Terms   Doc IDs
    brown   1, 3, 6, …
    dog     1, 2, 12…
    fox     1, 5, 7, …
    jump    1, 6, …
    lazy    1, 7, …
    …       3, 6, 7, …
    over    1, 3, 5, 6, …
    quick   1, 4, …
    uber    1, …
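
The term-to-document table above is an inverted index. As a rough sketch of the idea (not Elasticsearch's actual data structure), the following builds one from documents that have already been analyzed into terms. Document 1 uses the terms from the slides; documents 2 and 3 are invented for illustration.

```php
<?php
// Build a term => list-of-doc-IDs map from already analyzed documents.
function toy_build_inverted_index( array $docs ) {
    $index = array();
    foreach ( $docs as $doc_id => $terms ) {
        // array_unique() keeps a doc ID from being listed twice for one term.
        foreach ( array_unique( $terms ) as $term ) {
            $index[ $term ][] = $doc_id;
        }
    }
    ksort( $index ); // alphabetical, like the table on the slide
    return $index;
}

$docs = array(
    1 => array( 'uber', 'quick', 'brown', 'fox', 'jump', 'over', 'lazy', 'dog' ),
    2 => array( 'dog', 'park', 'dog' ),   // hypothetical document
    3 => array( 'brown', 'bear', 'over' ), // hypothetical document
);

$index = toy_build_inverted_index( $docs );

foreach ( $index as $term => $doc_ids ) {
    printf( "%-6s %s\n", $term, implode( ', ', $doc_ids ) );
}
```

Finding every document that contains a term is then a single array lookup such as `$index['brown']`, rather than a `LIKE` scan over every post.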
  26. Related Posts for Power Users
    • Customize placement with the related posts shortcode
    • Change results or look and feel with various filters
    • Go completely wild with the related posts raw object
    IT'S OVER 9000!
  28. jetpack_relatedposts_filter_args
    array(
        'size'             => 3,
        'post_type'        => get_post_type(),
        'has_terms'        => array(),
        'date_range'       => array(),
        'exclude_post_ids' => array(),
    )
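
A filter like this is normally attached with WordPress's `add_filter()`. The sketch below is an assumption-laden example, not Jetpack's documented usage: the callback name is made up, and it assumes the hook passes the arguments array first and the current post ID second, and that `size` controls how many related posts are requested.

```php
<?php
// Hypothetical callback; only the hook name comes from the slide.
function my_rp_widen_args( $args, $post_id = 0 ) {
    // Assumption: 'size' is the number of related posts requested.
    $args['size'] = 6;
    return $args;
}

// Guarded so this file also loads outside WordPress (for example in a test).
if ( function_exists( 'add_filter' ) ) {
    // Priority 10, up to two arguments passed to the callback.
    add_filter( 'jetpack_relatedposts_filter_args', 'my_rp_widen_args', 10, 2 );
}
```

Keeping the callback a plain function that returns the modified array makes it easy to exercise without a running WordPress install.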
  30. jetpack_relatedposts_filter_post_type
    jetpack_relatedposts_filter_args
    array(
        'size'             => 3,
        'post_type'        => array(
            'post',
            'awesome_sauce',
        ),
        'has_terms'        => array(),
        'date_range'       => array(),
        'exclude_post_ids' => array(),
    )
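
As a hedged sketch of how the `post_type` value above might be produced: the callback name is hypothetical, and it assumes the hook hands over the current post type (a string or an array) followed by the post ID.

```php
<?php
// Hypothetical callback: also search a custom post type.
function my_rp_post_types( $post_type, $post_id = 0 ) {
    // (array) turns a single type such as 'post' into array( 'post' ).
    $types   = (array) $post_type;
    $types[] = 'awesome_sauce'; // custom post type name taken from the slide
    return array_values( array_unique( $types ) );
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_filter_post_type', 'my_rp_post_types', 10, 2 );
}
```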
  31. jetpack_relatedposts_filter_has_terms
    jetpack_relatedposts_filter_args
    array(
        'size'             => 3,
        'post_type'        => get_post_type(),
        'has_terms'        => array(
            get_term_by( 'slug', 'devops', 'category' ),
            get_term_by( 'slug', 'hhvm', 'post_tag' ),
        ),
        'date_range'       => array(),
        'exclude_post_ids' => array(),
    )
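
One way the `has_terms` value above could be supplied, sketched under assumptions: the callback name is hypothetical, and the hook is assumed to pass the current list of term objects followed by the post ID. `get_term_by()` is the WordPress core function shown on the slide; it returns `false` for a term that does not exist, so misses are dropped.

```php
<?php
// Hypothetical callback: only relate posts that carry these terms.
function my_rp_has_terms( $has_terms, $post_id = 0 ) {
    $wanted = array(
        get_term_by( 'slug', 'devops', 'category' ),
        get_term_by( 'slug', 'hhvm', 'post_tag' ),
    );
    // array_filter() removes the false entries returned for missing terms;
    // array_merge() then renumbers the remaining keys from 0.
    return array_merge( (array) $has_terms, array_filter( $wanted ) );
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_filter_has_terms', 'my_rp_has_terms', 10, 2 );
}
```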
  32. jetpack_relatedposts_filter_date_range
    jetpack_relatedposts_filter_args
    array(
        'size'             => 3,
        'post_type'        => get_post_type(),
        'has_terms'        => array(),
        'date_range'       => array(
            'from' => strtotime( '-18 month' ),
            'to'   => time(),
        ),
        'exclude_post_ids' => array(),
    )
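
A minimal sketch of a callback that would yield the `date_range` shown above. The callback name is hypothetical, and the `from`/`to` keys with Unix timestamps are taken from the slide rather than verified against Jetpack's source.

```php
<?php
// Hypothetical callback: only relate posts from the last 18 months.
function my_rp_date_range( $date_range, $post_id = 0 ) {
    return array(
        'from' => strtotime( '-18 month' ), // Unix timestamp 18 months ago
        'to'   => time(),                   // now
    );
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_filter_date_range', 'my_rp_date_range', 10, 2 );
}
```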
  33. jetpack_relatedposts_filter_exclude_post_ids
    jetpack_relatedposts_filter_args
    array(
        'size'             => 3,
        'post_type'        => get_post_type(),
        'has_terms'        => array(),
        'date_range'       => array(),
        'exclude_post_ids' => array(
            1,
            1337,
        ),
    )
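
The exclusion list above could be maintained with a callback along these lines. This is a sketch: the callback name is made up, and merging with the incoming list (instead of replacing it) is a design choice so that other plugins' exclusions survive.

```php
<?php
// Hypothetical callback: never suggest posts 1 and 1337.
function my_rp_exclude_ids( $exclude_post_ids, $post_id = 0 ) {
    $always_hidden = array( 1, 1337 ); // IDs taken from the slide
    $merged        = array_merge( (array) $exclude_post_ids, $always_hidden );
    // Remove duplicates and renumber the keys.
    return array_values( array_unique( $merged ) );
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_filter_exclude_post_ids', 'my_rp_exclude_ids', 10, 2 );
}
```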
  34. array(
        'size'             => 3,
        'post_type'        => get_post_type(),
        'has_terms'        => array(),
        'date_range'       => array(),
        'exclude_post_ids' => array(),
    )
  35. array(
        array(
            'term' => array( 'tag.slug' => 'hhvm' )
        ),
        array(
            'not' => array(
                'term' => array( 'post_id' => 1337 )
            )
        ),
        …
    )
  37. jetpack_relatedposts_filter_filters
    array(
        array(
            'term' => array( 'tag.slug' => 'hhvm' )
        ),
        array(
            'not' => array(
                'term' => array( 'post_id' => 1337 )
            )
        ),
        …
    )
    developer.wordpress.com/docs/elasticsearch
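
For cases the argument filters do not cover, the Elasticsearch filter array itself can be extended. The sketch below appends the two filters shown on the slide; the callback name is hypothetical, and it assumes the hook passes the filter list first and the post ID second.

```php
<?php
// Hypothetical callback: append raw Elasticsearch filters.
function my_rp_es_filters( $filters, $post_id = 0 ) {
    $filters = (array) $filters;
    // Only match posts tagged "hhvm".
    $filters[] = array( 'term' => array( 'tag.slug' => 'hhvm' ) );
    // Never match post 1337.
    $filters[] = array( 'not' => array( 'term' => array( 'post_id' => 1337 ) ) );
    return $filters;
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_filter_filters', 'my_rp_es_filters', 10, 2 );
}
```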
  38. jetpack_relatedposts_filter_hits
    array(
        array( 'id' => 1337 ),
        array( 'id' => 631 ),
        array( 'id' => 1771 ),
        array( 'id' => 20 ),
        array( 'id' => 1491 ),
    )
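
Once Elasticsearch has answered, the list of hits can still be adjusted. As an illustration only, this hypothetical callback drops one post from the hits while keeping the order of the rest; the blocked ID is arbitrary and the `id` key follows the slide.

```php
<?php
// Hypothetical callback: remove unwanted posts from the returned hits.
function my_rp_filter_hits( $hits, $post_id = 0 ) {
    $blocked = array( 20 ); // arbitrary example ID
    $kept    = array();
    foreach ( (array) $hits as $hit ) {
        if ( isset( $hit['id'] ) && ! in_array( (int) $hit['id'], $blocked, true ) ) {
            $kept[] = $hit;
        }
    }
    return $kept;
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_filter_hits', 'my_rp_filter_hits', 10, 2 );
}
```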
  39. jetpack_relatedposts_returned_results
    [
      {
        "id": 1771,
        "url": "http://xyu.io/2013/08/summer/",
        "url_meta": { "origin": 2361, "position": 0 },
        "title": "Summer!",
        "format": false,
        "excerpt": "The cats of summer…",
        "context": "In 'cat pictures'",
        "img": {
          "src": "http://xyu.io/2013/08/summer.jpg",
          "width": 350, "height": 200
        }
      },
      …
    ]
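
The final result objects are where look-and-feel tweaks can happen. The sketch below assumes that, on the PHP side, each result is an associative array with the keys shown in the JSON above; the callback name and the replacement text are made up.

```php
<?php
// Hypothetical callback: replace the context line on every result.
function my_rp_tweak_results( $results, $post_id = 0 ) {
    foreach ( (array) $results as $i => $result ) {
        // Assumption: 'context' is the small caption shown with each post.
        $results[ $i ]['context'] = 'You might also like';
    }
    return $results;
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'jetpack_relatedposts_returned_results', 'my_rp_tweak_results', 10, 2 );
}
```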
  41. Using the Related Posts Raw Object
    $related = Jetpack_RelatedPosts::init_raw()
        ->set_query_name( 'my_rp' ) // Optional
        ->get_for_post_id(
            $post_id, // For post_id
            5,        // Get 5 results
            array(    // ES filters
                array(
                    'term' => array( 'tag.slug' => 'hhvm' )
                ),
                …
            )
        );
    developer.wordpress.com/docs/elasticsearch
  42. Using the Related Posts Raw Object
    $related = array(
        array( 'id' => 1337 ),
        array( 'id' => 631 ),
        array( 'id' => 1771 ),
        array( 'id' => 20 ),
        array( 'id' => 1491 ),
    )
    developer.wordpress.com/docs/elasticsearch
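
Since the raw object returns only post IDs, rendering is left to the caller. A minimal sketch of one way to turn that array into links follows; both helper names are hypothetical, while `get_permalink()`, `get_the_title()`, `esc_url()` and `esc_html()` are standard WordPress functions that are only available inside WordPress.

```php
<?php
// Hypothetical helper: pull the integer post IDs out of the raw result.
function my_rp_hit_ids( array $related ) {
    $ids = array();
    foreach ( $related as $hit ) {
        if ( isset( $hit['id'] ) ) {
            $ids[] = (int) $hit['id'];
        }
    }
    return $ids;
}

// Hypothetical helper: build one escaped link per related post.
// Call this only inside WordPress, where the template functions exist.
function my_rp_related_links( array $related ) {
    $links = array();
    foreach ( my_rp_hit_ids( $related ) as $id ) {
        $links[] = sprintf(
            '<a href="%s">%s</a>',
            esc_url( get_permalink( $id ) ),
            esc_html( get_the_title( $id ) )
        );
    }
    return $links;
}
```

In a theme template, the links could then be printed with something like `echo implode( ', ', my_rp_related_links( $related ) );`.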