[English] You Complete Me!

A highly scalable autocomplete for Europe's biggest recipe search in just three (short) days.

Topics: Scaling, Caching, Golang, DevOps, Docker, ELK

Follow me:
https://twitter.com/b00gizm
https://github.com/b00gizm

Pascal Cremer

March 10, 2017

Transcript

  1. METT-A-THON
    • Internal company-wide hackathon
    • A whole work week (1 discovery day + 3.5 days of "hacking")
    • Not only for developers, but for the whole staff
    • Projects MUST go live during the week (otherwise rm -rf)
    • In the following two weeks, all live projects are evaluated to decide whether to continue or sunset them
    • http://instagram.com/chefkoch.de
  2. Project pitch during the discovery day: "An autocomplete functionality for our recipe search would support our users in different ways, e.g. with faster and more targeted searches by displaying relevant suggestions, which would lead to meaningful results, or just as inspiration."
  3. Project hypothesis: "With a more optimized search, fewer users will search via Google; instead, they will stay longer on our platform and use our own native search."
  4. class RecipesController extends Controller
     {
         public function suggestAction(Request $request)
         {
             // Doctrine DBAL connection from the service container
             $conn = $this->get('database_connection');
             $queryBuilder = $conn->createQueryBuilder();
             $queryBuilder
                 ->select('title')
                 ->from('recipes')
                 ->where('title LIKE :term')
                 ->setMaxResults(10)
                 // Prefix match: titles that start with the typed term
                 ->setParameter('term', $request->query->get('term').'%')
             ;

             $results = array_map(function ($row) {
                 return $row['title'];
             }, $queryBuilder->execute()->fetchAll());

             return new JsonResponse(['suggestions' => $results]);
         }
     }
  5. • 189.53 million* page impressions (per month)
    • 36.87 million* unique visits (per month)
    • 300,000 uses of our search (per day, web)
    • * December 2016, source: www.ivw.eu
  6. class RecipesController extends Controller
     {
         public function suggestAction(Request $request)
         {
             // Doctrine DBAL connection from the service container
             $conn = $this->get('database_connection');
             $queryBuilder = $conn->createQueryBuilder();
             $queryBuilder
                 ->select('title')
                 ->from('recipes')
                 ->where('title LIKE :term')
                 ->setMaxResults(10)
                 // Prefix match: titles that start with the typed term
                 ->setParameter('term', $request->query->get('term').'%')
             ;

             $results = array_map(function ($row) {
                 return $row['title'];
             }, $queryBuilder->execute()->fetchAll());

             return new JsonResponse(['suggestions' => $results]);
         }
     }

     JUST. NOPE!
  7. Search auto-suggest
    • Search engine instead of MySQL text search
    • High performance (hundreds of req/s)
    • Easily scalable
    • Monitoring / tracking
    • Realized as a microservice
  8. $ docker run -d --name my-nginx -p 8080:80 nginx
     $ curl -I http://localhost:8080
     HTTP/1.1 200 OK
     Server: nginx/1.11.9
     Date: Mon, 13 Feb 2017 09:47:45 GMT
     Content-Type: text/html
     Content-Length: 612
     Last-Modified: Tue, 24 Jan 2017 14:02:19 GMT
     Connection: keep-alive
     ETag: "58875e6b-264"
     Accept-Ranges: bytes
  9. DAY #1 - BE ELASTIC, BABY!
    • Elasticsearch as search engine
    • Lucene-based, REST API, autocomplete out of the box
    • PHP script for periodic index updates
    • Periodic fetches of delta change sets from the Chefkoch API
    • jQuery Autocomplete plugin
  10. Architecture diagram:
    • FRONT VARNISH
    • SEARCH-SUGGEST
    • ELASTIC (serves suggestions)
    • INDEXER (PHP) (fetches change sets from the CK-API, fills the index via cron job)
    • CK-API
    • Request: POST /search-suggestions/_search
  11. DAY #1 - MEH!
    • Quality of suggestions was kind of poor
    • We don't want recipe titles as suggestions, we want real search phrases from real users
    • Responses were not cached
    • We'll need a public API for our iOS & Android apps, but the raw Elasticsearch REST API is way too chatty
  12. POST /search-suggestions/_search
      {
        "suggest": {
          "recipes-suggest": {
            "prefix": "spag",
            "completion": { "field": "suggest" }
          }
        }
      }
  13. {
        "_shards": { "total": 5, "successful": 5, "failed": 0 },
        "hits": ...
        "took": 2,
        "timed_out": false,
        "suggest": {
          "recipes-suggest": [ {
            "text": "spag",
            "offset": 0,
            "length": 4,
            "options": [ {
              "text": "Spaghetti Bolognese",
              "_index": "recipes",
              "_type": "recipe",
              "_id": "1",
              "_score": 1.0,
              "_source": { "suggest": ["Spaghetti Bolognese"] }
            } ]
          } ]
        }
      }
  14. $(function() {
          $("#inputfield_quicksearch").autocomplete({
              source: function(request, response) {
                  $.ajax({
                      type: "POST",
                      url: ...,
                      data: JSON.stringify({
                          suggestions: {
                              prefix: request.term,
                              completion: { field: "suggest" }
                          }
                      }),
                      success: function(data) {
                          var suggestions = [];
                          $.each(data.suggestions, function() {
                              $.each(this.options, function() {
                                  suggestions.push(this.text);
                              });
                          });
                          response(suggestions);
                      }
                  });
              },
              success: function() {}
          })
      });
  15. Architecture diagram:
    • FRONT VARNISH
    • SEARCH-SUGGEST
    • ELASTIC (serves suggestions)
    • INDEXER (PHP) (fetches change sets from the CK-API, fills the index via cron job)
    • CK-API
    • Request: POST /search-suggestions/_search

    NOT THERE YET!
  16. DAY #2 - IT'S JUST A FACADE...
    • API facade to only expose what we really need to expose
    • High-performance proxy between our Varnish and Elasticsearch for transforming requests and responses
    • Use of our front Varnish for caching
    • HTTP caching headers will be added by the API facade
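
The last bullet, adding HTTP caching headers in the facade, can be sketched in Go with a small piece of middleware. This is a minimal standard-library sketch and not the deck's actual `middlewares.Cache()`; the names `cacheControl` and `withCacheHeaders` and the TTL value are hypothetical. By default Varnish takes its TTL from `s-maxage` or `max-age` in `Cache-Control`, so a `max-age` value should be enough for the front Varnish to answer repeated prefixes itself.

```go
package main

import (
	"fmt"
	"net/http"
)

// cacheControl builds the Cache-Control value for a TTL given in seconds.
func cacheControl(maxAgeSeconds int) string {
	return fmt.Sprintf("public, max-age=%d", maxAgeSeconds)
}

// withCacheHeaders wraps a handler and sets the same Cache-Control header
// on every response before the wrapped handler runs.
// Assumption: a single TTL for all suggestions is acceptable.
func withCacheHeaders(maxAgeSeconds int, next http.Handler) http.Handler {
	value := cacheControl(maxAgeSeconds)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", value)
		next.ServeHTTP(w, r)
	})
}
```

A handler would be registered as `http.Handle("/search-suggestions", withCacheHeaders(300, suggestHandler))`, where `suggestHandler` is whatever serves the suggestions. A production version would probably skip the header on error responses so that failures are not cached.
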
  17. The "Facade" design pattern (Wikipedia): "The Facade design pattern is often used when a system is very complex or difficult to understand because the system has a large number of interdependent classes or its source code is unavailable. This pattern hides the complexities of the larger system and provides a simpler interface to the client. It typically involves a single wrapper class that contains a set of members required by the client. These members access the system on behalf of the facade client and hide the implementation details."
  19. Before (raw Elasticsearch API):
      POST /search-suggestions/_search
      {
        "suggest": {
          "recipes-suggest": {
            "prefix": "spag",
            "completion": { "field": "suggest" }
          }
        }
      }

      After (API facade):
      GET /search-suggestions?t=spag
  20. Before (raw Elasticsearch response):
      {
        "_shards": { "total": 5, "successful": 5, "failed": 0 },
        "hits": ...
        "took": 2,
        "timed_out": false,
        "suggest": {
          "recipes-suggest": [ {
            "text": "spag",
            "offset": 0,
            "length": 4,
            "options": [ {
              "text": "Spaghetti Bolognese",
              "_index": "recipes",
              "_type": "recipe",
              "_id": "1",
              "_score": 1.0,
              "_source": { "suggest": ["Spaghetti Bolognese"] }
            } ]
          } ]
        }
      }

      After (API facade response):
      { "suggestions": [ "Spaghetti Bolognese", ... ] }
  21. Golang: A statically typed, compiled, C-like programming language with an easy-to-learn syntax, strong performance, and an awesome concurrency model.
  22. r := gin.New()
      r.Use(gin.Recovery())
      r.Use(middlewares.Cache())

      r.GET("/search-suggestions", func(c *gin.Context) {
          // The search term arrives as query parameter, e.g. ?t=spag
          searchTerm, found := c.GetQuery("t")
          if !found {
              c.JSON(404, gin.H{
                  "error": "Missing search term",
              })
              return
          }

          // Ask Elasticsearch for suggestions matching the term
          suggestion, err := elasticSuggester.Fetch(searchTerm)
          if err != nil {
              c.JSON(500, gin.H{
                  "error": "Error while fetching suggestions",
              })
              return
          }

          c.JSON(200, suggestion)
      })

      r.Run("0.0.0.0:8080")
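
The deck does not show `elasticSuggester.Fetch`. One step it has to perform is turning the short `t` parameter into the verbose completion-suggester request body from the earlier slide. The following is a minimal sketch of that step, assuming the suggester name `recipes-suggest` and the field `suggest` seen in that request; `buildSuggestQuery` is a hypothetical helper name, not the author's code.

```go
package main

import "encoding/json"

// buildSuggestQuery turns the facade's search term into an Elasticsearch
// completion-suggester request body.
// Assumption: suggester name and field match the request shown in the deck.
func buildSuggestQuery(prefix string) ([]byte, error) {
	body := map[string]interface{}{
		"suggest": map[string]interface{}{
			"recipes-suggest": map[string]interface{}{
				"prefix": prefix,
				"completion": map[string]interface{}{
					"field": "suggest",
				},
			},
		},
	}
	// json.Marshal escapes the user input, so the term cannot break the JSON.
	return json.Marshal(body)
}
```

Building the body through `json.Marshal` rather than string concatenation means quotes or backslashes typed by a user end up escaped inside the `prefix` value.
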
  23. Architecture diagram:
    • FRONT VARNISH
    • SEARCH-SUGGEST
    • FACADE (transforms req/res, adds HTTP cache header)
    • INDEXER (PHP)
    • ELASTIC (serves suggestions)
    • ?!?
    • Request: GET /search-suggestions?t=spag
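
On the response side, "transforms req/res" means reducing the chatty Elasticsearch reply to the compact `{"suggestions": [...]}` payload. A minimal Go sketch under the assumption that only the `text` of each option is needed; the type `esSuggestResponse` and the function `flattenSuggestions` are hypothetical names, not taken from the deck.

```go
package main

import "encoding/json"

// esSuggestResponse models only the parts of the Elasticsearch reply that
// the facade needs; all other fields (_shards, hits, took, ...) are ignored.
type esSuggestResponse struct {
	Suggest map[string][]struct {
		Options []struct {
			Text string `json:"text"`
		} `json:"options"`
	} `json:"suggest"`
}

// flattenSuggestions extracts the option texts of one named suggester and
// encodes them as {"suggestions": [...]}.
func flattenSuggestions(raw []byte, suggester string) ([]byte, error) {
	var res esSuggestResponse
	if err := json.Unmarshal(raw, &res); err != nil {
		return nil, err
	}
	texts := []string{} // non-nil, so "no hits" encodes as [] instead of null
	for _, entry := range res.Suggest[suggester] {
		for _, opt := range entry.Options {
			texts = append(texts, opt.Text)
		}
	}
	return json.Marshal(map[string][]string{"suggestions": texts})
}
```

Decoding into a narrow struct keeps internal details such as `_index`, `_id` and `_score` out of the public API without having to strip them explicitly.
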
  24. DAY #3 - SCALE ALL THE THINGS!
    • A better index for our search suggestions through Elasticsearch
    • CSV export of real user search phrases from our BI Hadoop cluster
    • Scaling the API facade and Elasticsearch to an arbitrary number of instances
    • New Elasticsearch instances shall be able to (re-)index themselves
    • Integration of logging and monitoring
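
How the CSV export is fed into Elasticsearch is not shown in the deck. One plausible route is converting it into the newline-delimited body of the Elasticsearch bulk API, using each phrase as completion input and its frequency as weight. The sketch below assumes a two-column CSV of `phrase,count` without a header row and a completion field named `suggest`; the column layout and the names `csvToBulk` and `bulkFromString` are assumptions for illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"encoding/json"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// suggestDoc is one document for a completion field named "suggest".
type suggestDoc struct {
	Suggest struct {
		Input  string `json:"input"`
		Weight int    `json:"weight"`
	} `json:"suggest"`
}

// csvToBulk reads "phrase,count" rows and emits a bulk body: one action
// line followed by one document line per row.
func csvToBulk(r io.Reader) ([]byte, error) {
	reader := csv.NewReader(r)
	reader.FieldsPerRecord = 2 // reject rows that do not have two columns
	var buf bytes.Buffer
	for {
		rec, err := reader.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		count, err := strconv.Atoi(rec[1])
		if err != nil {
			return nil, fmt.Errorf("bad count for %q: %w", rec[0], err)
		}
		var doc suggestDoc
		doc.Suggest.Input = rec[0]
		doc.Suggest.Weight = count // more frequent phrases rank higher
		line, err := json.Marshal(doc)
		if err != nil {
			return nil, err
		}
		buf.WriteString("{\"index\":{}}\n")
		buf.Write(line)
		buf.WriteByte('\n')
	}
	return buf.Bytes(), nil
}

// bulkFromString is a convenience wrapper for small in-memory inputs.
func bulkFromString(data string) ([]byte, error) {
	return csvToBulk(strings.NewReader(data))
}
```

The empty `{"index":{}}` action lines rely on the index (and, on older Elasticsearch versions, the type) being given in the URL of the `_bulk` request. Because the body is generated purely from the export, a fresh Elasticsearch instance could replay it at startup to (re-)index itself.
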
  25. Architecture diagram:
    • Hadoop Cluster (CSV export)
    • BAMBOO (CI), NIGHTLY BUILD
    • INDEXER (PHP), ELASTICSEARCH
    • Chefkoch Docker Registry
    • FRONT VARNISH
    • SEARCH-SUGGEST: several API FACADE instances, several ELASTICSEARCH instances
  26. LEARNINGS!
    • Agility is your friend!
    • Keep striving for the simplest solution, which is almost always the best
    • Use the right tool for the right job
    • Seek feedback as early as possible
    • Power of Proof
    • Don't have your last Gin Tonic at 5am on release day. No really, that was a shitty idea