Seek - Elasticsearch: From Hack to Production

Brett Christensen, Reza Yousefzadeh and Nivantha Mandawala from Seek.com.au talk about migrating from a legacy search system to Elasticsearch for the Seek.com.au 'Talent Search' service.

Elastic Co

April 28, 2016

Transcript

  1. About us
    • Brett Christensen ◦ https://www.linkedin.com/in/brett-christensen-a73a4010 ◦ @brett_c
    • Reza Yousefzadeh ◦ https://au.linkedin.com/in/reza-yousefzadeh-7052b939 ◦ @reza_yz
    • Nivantha Mandawala ◦ https://au.linkedin.com/in/nivantha ◦ @nivazone
  2. SEEK Search Team Mission: Provide world class search and matching
    • Highly customised search algorithms
    • Highly performant indexing/searching
    • World class consumer experience
  3. SEEK Search Team Continuous Delivery as a culture
    • Cloud first
    • Faster feedback cycles
    • Easy to diagnose/resolve issues
    • Zero downtime deployments
    • Speed to market++
    Credit to Łukasz Górnicki / @derberq
  4. The problem: Hitting roadblocks
    • Proprietary search engine
    • Complex queries were too slow
    • Not horizontally scalable
    • Not cloud ready
    • Limited set of APIs, limited SDKs
    • Very small community
  5. Pick a Target Why Talent Search?
    • Decoupled/Continuously Delivered
    • All the components already in Cloud
    • Less high profile compared to jobsearch
  6. Pick a Target Why Elasticsearch?
    • Free and easily available
    • Doco/community++
    • It’s Elastic!
    • Highly customizable
    • High team engagement
  7. Spiking it out: A hackathon project
    • 3 days for vertical slice of functionality
    • No experience with Elasticsearch as core search
    • Started out with Amazon hosted service
  8. Hackathon learnings about Elastic
    • Good documentation
    • Large community
    • Feature rich
    • Customisable
    • Highly performant
    • High confidence in estimating switching costs
  9. Next step: Get it on the roadmap Obstacles
    • Issues with current system known but...
    • Previous migration experience was painful
  10. Next step: Get it on the roadmap Approach
    • Start small and demonstrate value quickly
    • Get people on board early
    • Build it fast and show it off....
    • ...but not just a hack
    • Test with a small percentage of traffic
  11. Outcome
    • (Most) of Talent Search is now powered by Elasticsearch!
    • Job Search to follow...
  12. Continuous Delivery
    “In software, when something is painful, the way to reduce the pain is to do it more frequently, not less.”
    ― David Farley, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation
  13. Automated Testing
    • Smoke test
    • Stop the pipeline on failed tests
    • Performance test before switching
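
The smoke-test gate on slide 13 could look roughly like the sketch below. It is not taken from the talk: the function names `smoke_test` and `gate`, the accepted statuses and the node threshold are illustrative assumptions. The only Elasticsearch detail relied on is that `GET /_cluster/health` returns a JSON body containing a `status` field (`green`, `yellow` or `red`) and a `number_of_nodes` field. A pipeline step would exit with the returned code, so a non-zero value stops the deployment.

```python
import sys

# Hypothetical policy: which cluster states let the pipeline continue.
ACCEPTABLE_STATUSES = {"green", "yellow"}


def smoke_test(health, min_nodes=1):
    """Return a list of failure messages for a parsed _cluster/health body.

    `health` is the dict parsed from GET /_cluster/health. An empty list
    means the smoke test passed.
    """
    failures = []
    status = health.get("status")
    if status not in ACCEPTABLE_STATUSES:
        failures.append("cluster status is %r" % status)
    if health.get("number_of_nodes", 0) < min_nodes:
        failures.append("expected at least %d node(s)" % min_nodes)
    return failures


def gate(health, min_nodes=1):
    """Exit code for the pipeline step: 0 to continue, 1 to stop."""
    failures = smoke_test(health, min_nodes)
    for message in failures:
        print("SMOKE TEST FAILED: " + message, file=sys.stderr)
    return 1 if failures else 0
```

Keeping the check as a pure function over the parsed response means the gate itself can be unit tested without a running cluster.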
  14. Elasticsearch - Customizing Similarity Algorithm
    {
      "settings": {
        "similarity": {
          "algorithm_1_bm25": { "type": "BM25", "k1": 0.2, "b": 0.35 },
          "algorithm_2_bm25": { "type": "BM25", "k1": 0.001, "b": 0 }
        }
      }
    }
  15. Elasticsearch - Customizing Similarity Algorithm...
    {
      "mappings": {
        "profile": {
          "properties": {
            "jobTitle": { "type": "string", "similarity": "algorithm_1_bm25" },
            "jobDescription": { "type": "string", "similarity": "algorithm_2_bm25" }
          }
        }
      }
    }
  16. Elasticsearch - Tuning Relevance
    • Challenging and time consuming
    • Quick feedback cycle
    • Weightings/Boosting
    • Avoid re-indexing when we can
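
To make the relevance knobs on slides 14 to 16 concrete, the sketch below evaluates the textbook BM25 term-frequency factor for the two parameter sets from the settings slide, and builds a query-time field boost. It is an illustration under assumptions, not the team's code: Lucene's implementation differs in detail (for example, it stores field lengths in a lossy encoding), the field lengths are made up, and the boost value of 3 and both helper names are hypothetical. The `multi_match` query and the `field^boost` syntax are standard Elasticsearch query DSL.

```python
def bm25_tf_factor(tf, k1, b, doc_len, avg_doc_len):
    """Textbook BM25 term-frequency factor (IDF left out).

    k1 controls how quickly repeated terms stop adding score;
    b controls how strongly long fields are penalised.
    """
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


def boosted_query(text, title_boost=3):
    """Query body that weights jobTitle above jobDescription at query time.

    Boosts set in the query can be changed without re-indexing.
    The boost value is a hypothetical example.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["jobTitle^%d" % title_boost, "jobDescription"],
            }
        }
    }


if __name__ == "__main__":
    # Parameter sets from the slides; lengths are invented for the demo.
    for name, k1, b in [("algorithm_1_bm25", 0.2, 0.35),
                        ("algorithm_2_bm25", 0.001, 0.0)]:
        factors = [bm25_tf_factor(tf, k1, b, doc_len=50, avg_doc_len=50)
                   for tf in (1, 2, 10)]
        print(name, [round(f, 3) for f in factors])
```

With `k1` near zero and `b` at zero, the factor is almost the same for one occurrence as for ten and ignores field length, so a field scored this way behaves close to a match/no-match signal. That is a reading of the formula, not a statement from the speakers about their intent.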
  17. Elasticsearch - Tuning Performance
    • Memory intensive, Java heap size
    • Enable replicas after re-indexing
    • SSD instance storage over EBS
    • HTTP Compression
    • Cluster rebuilt/re-indexed in 2 hours!
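
The "enable replicas after re-indexing" bullet can be sketched as two settings updates around the bulk load. The helper names, the index name `profiles` and the replica count of 1 are assumptions for illustration; `number_of_replicas` is a dynamic index setting that can be changed through the `PUT /<index>/_settings` endpoint. The sketch only builds the requests, so it can be read and tested without a cluster.

```python
import json


def replica_settings(count):
    """Body for PUT /<index>/_settings that sets the replica count."""
    return {"index": {"number_of_replicas": count}}


def reindex_plan(index, replicas_after=1):
    """Ordered (method, path, body) steps for a rebuild.

    Replicas are switched off first so each document is written once
    during the load, then switched on so the finished shards are copied.
    """
    settings_path = "/%s/_settings" % index
    return [
        ("PUT", settings_path, replica_settings(0)),
        ("POST", "/_bulk", None),  # placeholder for the bulk indexing phase
        ("PUT", settings_path, replica_settings(replicas_after)),
    ]


if __name__ == "__main__":
    for method, path, body in reindex_plan("profiles"):
        print(method, path, json.dumps(body) if body else "<bulk payload>")
```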
  18. Prod vs. Dark, Champ vs. Challenger
    • Testing/tuning relevance with dummy data is quite hard
    • Why not create a “dark cluster” with prod data?
    • Site override, Split traffic 50/50...
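
One way to split traffic between a production ("champion") cluster and a dark ("challenger") cluster is to hash a stable request attribute into a bucket. The talk does not describe SEEK's mechanism; the function name, the use of a user ID as the key, and the cluster labels are assumptions. Hashing keeps each user on the same cluster between requests, and the percentage can start small and be raised to 50.

```python
import hashlib


def choose_cluster(user_id, challenger_percent=50, override=None):
    """Pick "champion" or "challenger" for a request.

    `override` models a manual site override that forces one cluster.
    Otherwise the user ID is hashed into a bucket from 0 to 99, so the
    same user always lands on the same side of the split.
    """
    if override in ("champion", "challenger"):
        return override
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "challenger" if bucket < challenger_percent else "champion"


if __name__ == "__main__":
    counts = {"champion": 0, "challenger": 0}
    for user_id in range(10000):
        counts[choose_cluster(user_id)] += 1
    print(counts)
```

A cryptographic hash is used here only because it spreads IDs evenly and gives the same result on every host, unlike Python's built-in `hash`, which is randomised per process for strings.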
  19. Summary
    • Elasticsearch is easy to work with, customizable and feature rich
    • Continuous delivery made a huge difference
    • Sell your idea to fellow team members, gather momentum
    • Start small but build it as if you are going to productionize it
  20. Keep in touch...
    • Brett Christensen ◦ https://www.linkedin.com/in/brett-christensen-a73a4010 ◦ @brett_c
    • Reza Yousefzadeh ◦ https://au.linkedin.com/in/reza-yousefzadeh-7052b939 ◦ @reza_yz
    • Nivantha Mandawala ◦ https://au.linkedin.com/in/nivantha ◦ @nivazone