Slide 1

Slide 1 text

HotelTonight Jatinder Singh and Paul Sorensen

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Jatinder Singh
Director of Engineering, Platform
Twitter: @rubymerchant
Email: [email protected]

Slide 4

Slide 4 text

HotelTonight?

Slide 5

Slide 5 text

World’s first mobile-only, last-minute hotel booking app.

Slide 6

Slide 6 text

HotelTonight
Hotels: 40% of rooms unsold.
Customers: 20% to 70% discounts.

Slide 7

Slide 7 text

Make the world more spontaneous

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Spontaneity Brings Technical Challenges

Slide 11

Slide 11 text

Simplicity
1. Few taps and a swipe.
2. Just a small list of hotels.
3. Just the very best hotels for you.
4. Fast.

Slide 12

Slide 12 text

Perishable Inventory
1. Availability and pricing change all the time.
2. Real-time

Slide 13

Slide 13 text

Ad

Slide 14

Slide 14 text

Jatinder Singh
Director of Engineering, Platform
Twitter: @rubymerchant
Email: [email protected]

Slide 15

Slide 15 text

Finding the Best Hotels in the Moment
How HotelTonight uses Elasticsearch to power its hotel search algorithm

Slide 16

Slide 16 text

Hi

Slide 17

Slide 17 text

About Me (Paul Sorensen)
● Platform Engineer at HotelTonight
● Work on our hotel ranking algorithm using Elasticsearch
● Currently fascinated with scaling web apps
Twitter: @paulnsorensen

Slide 18

Slide 18 text

About HotelTonight
● Hotels compete for display
● We show you the best deals; this is where the ranking comes in
● Book up to 7 days in advance for up to a 5-night stay

Slide 19

Slide 19 text

What if I told you...
● We increased our inventory records by 50x
● Our system can handle 10x more traffic
● We cut our response times by 150%
That’s what we did.

Slide 20

Slide 20 text

How did we do it?
gem install elasticsearch
rake scale:hotels
That’s it.
THANKS FOR COMING!!!
PROFIT
JUST KIDDING

Slide 21

Slide 21 text

there is no silver bullet

Slide 22

Slide 22 text

scaling is hard

Slide 23

Slide 23 text

Scope

Slide 24

Slide 24 text

Overview
● Why Elasticsearch?
● What is Elasticsearch?
● The awesome challenges we get to work on

Slide 25

Slide 25 text

What this is not
● Not a technical deep dive
● Not an objective comparison between tech

Slide 26

Slide 26 text

Impetus

Slide 27

Slide 27 text

we grew up

Slide 28

Slide 28 text

from 3 cities to 2000 cities

Slide 29

Slide 29 text

we’re never not booking rooms

Slide 30

Slide 30 text

Early 2014: Our system was reaching its capacity
● "gimme nearby hotels"
● MySQL: geolocation query
● O(n^2 log n) ranking over hundreds of ActiveRecord objects

Slide 31

Slide 31 text

Early 2014: Our system was reaching its capacity
● "gimme nearby hotels"
● MySQL: geolocation query (x2)
● O(n^2 log n) ranking over hundreds of ActiveRecord objects

Slide 32

Slide 32 text

Early 2014: Our system was reaching its capacity
● "gimme nearby hotels"
● MySQL: geolocation query (x3)
● O(n^2 log n) ranking over hundreds of ActiveRecord objects

Slide 34

Slide 34 text

Early 2014: Our system was reaching its capacity
● "gimme nearby hotels"
● MySQL: geolocation... NOPE. I’M OUT
● O(n^2 log n) ranking over hundreds of ActiveRecord objects

Slide 37

Slide 37 text

we cannot have downtime

Slide 38

Slide 38 text

meanwhile

Slide 39

Slide 39 text

we still needed to grow

Slide 40

Slide 40 text

Later 2014: We wanted to expand our booking window from 1 to 7 days
● Same-day
● Advance booking: 6 more days, 7x data
HOW ARE WE GOING TO RUN GEO QUERIES!?

Slide 41

Slide 41 text

scaling is hard

Slide 42

Slide 42 text

scaling is unique

Slide 43

Slide 43 text

What were our choices?
• More caching?
• Use OpenGIS on MySQL (geospatial index extension)?
• Switch to PostgreSQL and use PostGIS?
• Find something from Hacker News?
• Use Elasticsearch? The full-text indexing engine?

Slide 44

Slide 44 text

Elasticsearch use cases
● Full-text search
● Analytics: Elastic’s ELK (Elasticsearch, Logstash, Kibana)
● Spell-checking, autocomplete
● Ranking hotel rooms?

Slide 45

Slide 45 text

Elasticsearch!

Slide 46

Slide 46 text

how does elasticsearch work?

Slide 47

Slide 47 text

Documents:
{ "_id": 4492, "description": "The quick brown fox jumps over lazy dogs" },
{ "_id": 4493, "description": "The slow red fox doesn't say anything" }

Slide 48

Slide 48 text

Match Query:
{ "query": { "match": { "description": "quick fox" } } }
=> returns documents matching “quick” and/or “fox”

Slide 49

Slide 49 text

How is it stored?
Inverted index:
{ "fox" => [4492, 4493], "brown" => [4492], "red" => [4493] }
THIS MAKES IT FAST
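To make the inverted index concrete, here is a minimal Ruby sketch that builds one from the two example documents and answers a match-style lookup. The tokenizer and the count-based ordering are simplifying assumptions for illustration; Elasticsearch's real analysis and relevance scoring are considerably more involved.

```ruby
# Toy inverted index: maps each term to the ids of the documents containing it.
DOCS = {
  4492 => "The quick brown fox jumps over lazy dogs",
  4493 => "The slow red fox doesn't say anything"
}.freeze

# Lowercase and keep runs of letters/apostrophes (a stand-in for a real analyzer).
def tokenize(text)
  text.downcase.scan(/[a-z']+/)
end

def build_inverted_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |id, text|
    tokenize(text).uniq.each { |term| index[term] << id }
  end
  index
end

# Match-style lookup: ids of documents containing any query term,
# ordered by how many query terms they contain (a crude relevance score).
def match(index, query)
  hits = Hash.new(0)
  tokenize(query).uniq.each do |term|
    index.fetch(term, []).each { |id| hits[id] += 1 }
  end
  hits.sort_by { |id, count| [-count, id] }.map(&:first)
end

INDEX = build_inverted_index(DOCS)
puts match(INDEX, "quick fox").inspect
```

The lookup never scans document text: it only reads the posting lists for "quick" and "fox", which is why this structure is fast.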

Slide 50

Slide 50 text

Elasticsearch supports many filters
A few examples we can use:
● term - exact match
● bool - combine filters
● various geo filters
● range

Slide 51

Slide 51 text

Independent filter caching
● queries cache individual filter matches*
● very fast to check if a document matches
*but not geo, range or script filters

Slide 52

Slide 52 text

how can we use this?

Slide 53

Slide 53 text

run cheap filters first THEN run geo
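The idea can be illustrated outside Elasticsearch with a small Ruby sketch: apply the inexpensive exact-match and range checks first, and pay for the distance calculation only on the survivors. The `Hotel` struct, its fields, and the sample data are hypothetical, and this is plain in-memory Ruby rather than Elasticsearch's filter execution.

```ruby
# Hypothetical in-memory stand-ins for indexed hotel documents.
Hotel = Struct.new(:id, :available, :price, :lat, :lon)

HOTELS = [
  Hotel.new(1, true,  120, 37.7879, -122.4075),
  Hotel.new(2, false,  90, 37.7880, -122.4080),
  Hotel.new(3, true,  400, 37.7890, -122.4010),
  Hotel.new(4, true,  150, 40.7580,  -73.9855)
].freeze

EARTH_RADIUS_KM = 6371.0

# Great-circle distance (haversine formula); the "expensive" check.
def distance_km(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Returns [matching ids, number of distance computations performed].
def nearby(hotels, lat:, lon:, radius_km:, max_price:)
  geo_checks = 0
  # Cheap checks first: exact match (available) and range (price).
  candidates = hotels.select { |h| h.available && h.price <= max_price }
  # Geo check only on the hotels that survived the cheap checks.
  hits = candidates.select do |h|
    geo_checks += 1
    distance_km(lat, lon, h.lat, h.lon) <= radius_km
  end
  [hits.map(&:id), geo_checks]
end

ids, geo_checks = nearby(HOTELS, lat: 37.7879, lon: -122.4075,
                         radius_km: 5, max_price: 200)
puts "ids=#{ids.inspect} geo_checks=#{geo_checks}"
```

With this sample data only two of the four hotels reach the distance calculation, because the other two are rejected by the cheap checks.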

Slide 54

Slide 54 text

but wait, how do you rank documents?

Slide 55

Slide 55 text

Elasticsearch orders documents by relevance
● Define your own scoring functions
● Let Elasticsearch determine the most relevant documents
● Don’t have to load ActiveRecord objects into memory to rank them anymore
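One way to express custom scoring is Elasticsearch's `function_score` query. The Ruby sketch below only builds the request body as a Hash and prints it as JSON. The field names (`discount_pct`, `location`), the chosen functions, the scale, and the result size are all assumptions for illustration; the slides do not show HotelTonight's actual ranking.

```ruby
require "json"

# Builds a function_score request body. Field names, functions and
# parameters are hypothetical, not HotelTonight's real ranking.
def ranking_query(lat:, lon:)
  {
    query: {
      function_score: {
        query: { match_all: {} },
        functions: [
          # Favor deeper discounts (square root dampens very large values).
          { field_value_factor: { field: "discount_pct", modifier: "sqrt" } },
          # Favor hotels near the user; the score decays with distance.
          { gauss: { location: { origin: { lat: lat, lon: lon }, scale: "2km" } } }
        ],
        score_mode: "sum",     # how the function results are combined
        boost_mode: "replace"  # use the combined value as the final score
      }
    },
    size: 15
  }
end

puts JSON.pretty_generate(ranking_query(lat: 37.7879, lon: -122.4075))
```

Because the scoring runs inside Elasticsearch, the application receives hotels already ordered and does not need to load and sort objects itself.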

Slide 56

Slide 56 text

less memory == faster

Slide 57

Slide 57 text

we wanna go fast

Slide 58

Slide 58 text

Alright — Let’s use Elasticsearch

Slide 59

Slide 59 text

✓ prototype ✓ perf test ✓ provision it

Slide 60

Slide 60 text

How it’s designed
[Diagram: price updates ($$), MySQL, denormalization, Docs, Elasticsearch]

Slide 61

Slide 61 text

How it’s designed
[Diagram: generate query, Elasticsearch, MySQL, generate response]

Slide 62

Slide 62 text

Our biggest challenge

Slide 63

Slide 63 text

Elasticsearch and MySQL must be kept in sync

Slide 64

Slide 64 text

Changing fields on a document type requires a new index
[Diagram: MySQL, Elasticsearch, Elasticsearch]

Slide 65

Slide 65 text

If Elasticsearch goes down, we go down
[Diagram: generate query, Elasticsearch, MySQL, generate response]

Slide 67

Slide 67 text

we cannot have downtime

Slide 68

Slide 68 text

we cannot have inconsistency

Slide 69

Slide 69 text

We are conquering these challenges
● We have to minimize consistency delays
● Defend against them when they do happen
● Zero-downtime mapping changes

Slide 70

Slide 70 text

Zero-downtime Mapping Changes
[Diagram: MySQL, denormalization, Docs, Elasticsearch, Elasticsearch, track changes, load documents from database]
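The slide names two ingredients: loading documents from the database and tracking changes. The in-memory Ruby sketch below shows one way those pieces could fit together. It is an assumption, not HotelTonight's implementation: the `Reindexer` class, the index names, the `price_bucket` field, and the alias switch at the end are all hypothetical, and plain hashes stand in for MySQL and Elasticsearch.

```ruby
# Denormalize a database row into a document for a given mapping version.
# Version 2 adds a hypothetical price_bucket field.
def to_doc(row, version)
  doc = { name: row[:name], price: row[:price] }
  doc[:price_bucket] = row[:price] < 100 ? "budget" : "standard" if version >= 2
  doc
end

class Reindexer
  attr_reader :indices, :aliases

  def initialize(db)
    @db = db # stand-in for MySQL, the source of truth
    @indices = { "hotels_v1" => db.map { |id, row| [id, to_doc(row, 1)] }.to_h }
    @aliases = { "hotels" => "hotels_v1" } # searches go through the alias
    @changed_ids = nil # non-nil only while a rebuild is in progress
  end

  # Writes go to the database and the live index; during a rebuild the id
  # is also recorded so the new index can catch up later.
  def update_price(id, price)
    @db[id][:price] = price
    live = @aliases["hotels"]
    @indices[live][id] = to_doc(@db[id], live == "hotels_v2" ? 2 : 1)
    @changed_ids << id if @changed_ids
  end

  # Start tracking changes and take a snapshot for the bulk load.
  def start_rebuild
    @changed_ids = []
    @snapshot = Marshal.load(Marshal.dump(@db))
    @indices["hotels_v2"] = {}
  end

  # Load documents from the (possibly stale) snapshot into the new index.
  def bulk_load
    @snapshot.each { |id, row| @indices["hotels_v2"][id] = to_doc(row, 2) }
  end

  # Replay tracked changes from the database, then switch the alias.
  def finish_rebuild
    @changed_ids.uniq.each { |id| @indices["hotels_v2"][id] = to_doc(@db[id], 2) }
    @aliases["hotels"] = "hotels_v2"
    @changed_ids = nil
  end

  def search(id)
    @indices[@aliases["hotels"]][id]
  end
end

r = Reindexer.new({ 1 => { name: "Hotel A", price: 120 },
                    2 => { name: "Hotel B", price: 90 } })
r.start_rebuild
r.update_price(1, 80) # arrives while the new index is being built
r.bulk_load           # loads the stale snapshot (price 120)
r.finish_rebuild      # replays the tracked change, then switches over
puts r.search(1).inspect
```

Searches keep hitting the old index until the switch, and the replay step is what stops the price update from being lost. A real system would also have to handle writes that land between the replay and the switch, which this sketch ignores.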

Slide 71

Slide 71 text

scaling is hard

Slide 72

Slide 72 text

scaling is unique

Slide 73

Slide 73 text

More is always sometimes better
● 6 more days of booking
● 50x inventory
● 10x traffic
● 150% quicker response times
PROFIT

Slide 74

Slide 74 text

scaling is awesome

Slide 75

Slide 75 text

Thanks
Try Elasticsearch (with us? we’re hiring)
Twitter: @paulnsorensen
Email: [email protected]
$25 Off First Booking: PAUL