Slide 1

Slide 1 text

Helping Grubhub Diners Find the Perfect Meal

Slide 2

Slide 2 text


Slide 3

Slide 3 text

Our Journey to Elasticsearch

Slide 4

Slide 4 text

Where we were before Elasticsearch
● Used Solr
● Deployed servers under an ELB
● Multi datacenter hot-hot
● Re-build indices and ship them into production with 0 downtime
● We were growing and needed to change ….

Slide 5

Slide 5 text

Elasticsearch Features: Distributed Index
[Diagram: three data nodes, each holding a shard]

Slide 6

Slide 6 text

Elasticsearch Features: Data sources
[Diagram labels: Java Client / Hadoop Connector; S3; Cassandra; SQS/SNS; Kinesis; Elasticsearch]

Slide 7

Slide 7 text

Elasticsearch Features: Relational Data
[Diagram labels: Spatial; Temporal; Analytics; Food]

Slide 8

Slide 8 text

Deployment, Discovery & Upgrades

Slide 9

Slide 9 text

Elasticsearch Deployment
● AWS Cloud Plugin
● Netflix Eureka
● Metrics in Datadog
● Tags in Eureka
● Eureka aware Client
● Index & Node Discovery
● Shard allocation aware
● Snapshots in S3
[Diagram: an AWS Region containing AWS Availability Zones, with labels for Elasticsearch, Master Node, Data Nodes, Eureka, App and S3]
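The "Shard allocation aware" bullet can be illustrated with a minimal sketch. The slides do not show the actual configuration, so the attribute name `aws_availability_zone` and the helper `awareness_settings` are assumptions made for illustration. The setting key itself is Elasticsearch's standard allocation-awareness setting.

```python
import json


def awareness_settings(attribute="aws_availability_zone"):
    """Build a cluster-settings body that enables shard allocation awareness.

    The attribute name is an assumption: it must match a node attribute
    that each node actually exposes (for example, its availability zone).
    """
    return {
        "persistent": {
            "cluster.routing.allocation.awareness.attributes": attribute
        }
    }


# The body would be sent as: PUT /_cluster/settings
body = awareness_settings()
print("PUT /_cluster/settings")
print(json.dumps(body, indent=2))
```

With awareness enabled, Elasticsearch tries to place a primary shard and its replicas on nodes with different values of the attribute, so losing one availability zone does not remove every copy of a shard.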

Slide 10

Slide 10 text

Elasticsearch Snapshots
● Snapshots for everyone
● Emulating a Production dataset:
  ○ Snapshot of the market
  ○ Test out a sort
● Performance testing:
  ○ Index in production
  ○ Replay production load
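As a rough sketch of the snapshot workflow above, the following builds the request bodies for Elasticsearch's snapshot and restore REST endpoints. The repository, bucket, snapshot and index names are hypothetical, and the `s3` repository type assumes the S3 repository plugin is installed. The slides do not show the real setup.

```python
import json


def s3_repository_body(bucket, base_path):
    """Body for registering an S3-backed snapshot repository."""
    return {"type": "s3", "settings": {"bucket": bucket, "base_path": base_path}}


def restore_body(indices, rename_suffix):
    """Body for restoring indices under new names, so a test cluster
    can hold a copy of production data next to its own indices."""
    return {
        "indices": ",".join(indices),
        "include_global_state": False,
        "rename_pattern": "(.+)",
        "rename_replacement": "$1" + rename_suffix,
    }


# Hypothetical names throughout; each tuple is (method, path, body).
calls = [
    ("PUT", "/_snapshot/example_repo",
     s3_repository_body("example-bucket", "snapshots/search")),
    ("PUT", "/_snapshot/example_repo/example_snapshot?wait_for_completion=true",
     {"indices": "restaurants"}),
    ("POST", "/_snapshot/example_repo/example_snapshot/_restore",
     restore_body(["restaurants"], "_test")),
]

for method, path, body in calls:
    print(method, path)
    print(json.dumps(body, indent=2))
```

Restoring under a renamed index is one way to emulate a production dataset in a test environment without overwriting an index that already exists there.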

Slide 11

Slide 11 text

Search and Relevance

Slide 12

Slide 12 text

Data Collection and Feedback
[Diagram labels: Search; Cassandra; S3; Elasticsearch; A/B Test; Search Logs; Clickstream Events; Impressions; Data Science]

Slide 13

Slide 13 text

Improving Search Features
● Create consumable data
● What are your KPIs?
● What user attributes can you use to influence KPIs?
● Decouple moving parts to allow for independent testing
● Measure, make changes and improve
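To make the "measure" step concrete, here is a minimal sketch that computes one possible KPI, click-through rate, per A/B test variant from impression and click events. The slides do not name specific KPIs or an event schema, so the `(variant, event_type)` format and the choice of click-through rate are assumptions.

```python
from collections import defaultdict


def click_through_rate(events):
    """Return clicks / impressions for each variant.

    `events` is an iterable of (variant, event_type) pairs, where
    event_type is "impression" or "click" (an assumed, simplified schema).
    """
    counts = defaultdict(lambda: {"impression": 0, "click": 0})
    for variant, event_type in events:
        if event_type in ("impression", "click"):
            counts[variant][event_type] += 1
    return {
        variant: (c["click"] / c["impression"] if c["impression"] else 0.0)
        for variant, c in counts.items()
    }


# Made-up sample data, for illustration only.
sample = [
    ("A", "impression"), ("A", "impression"), ("A", "click"),
    ("B", "impression"), ("B", "impression"),
    ("B", "impression"), ("B", "impression"), ("B", "click"),
]
rates = click_through_rate(sample)
print(rates)
```

A real comparison would also need enough traffic per variant and a significance test before declaring a winner.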

Slide 14

Slide 14 text

Tuning for Production

Slide 15

Slide 15 text

Metrics that matter
● Index store size and memory
● Active searcher threads
● Balance Read and Write
● Query distribution / Profiler

Slide 16

Slide 16 text

Balancing Shards

Slide 17

Slide 17 text

Location Based Sharding
● Balance data in each shard.
● Tuned for index size.
● Tuned for query throughput.
● Indices behind an alias to allow for greater flexibility.
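The two ideas above can be sketched as follows. The slides do not describe how locations map to shards, so the coarse grid-cell routing key below is purely an illustrative assumption, as are the function, alias and index names. The alias swap uses the body format of Elasticsearch's `_aliases` endpoint.

```python
import json
import math


def routing_key(lat, lon, cell_degrees=5.0):
    """Map a coordinate to a coarse grid cell (hypothetical scheme).

    Passing the result as the `routing` value on index and search requests
    keeps documents from the same cell on the same shard.
    """
    row = math.floor((lat + 90.0) / cell_degrees)
    col = math.floor((lon + 180.0) / cell_degrees)
    return f"cell-{row}-{col}"


def alias_swap_body(alias, old_index, new_index):
    """Body for POST /_aliases that moves an alias to a new index
    in a single atomic request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


print(routing_key(41.88, -87.63))
print(json.dumps(alias_swap_body("restaurants", "restaurants_v1", "restaurants_v2"), indent=2))
```

Because clients query the alias instead of a concrete index, an index can be rebuilt with a different shard count or routing scheme and swapped in without any client change.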

Slide 18

Slide 18 text

Beyond Elasticsearch
● Impressions and click tracking
● Relevance tuning
● Data analytics
● Machine learning
● A/B testing

Slide 19

Slide 19 text
