Slide 1

Slide 1 text

Finding Cars and Hunting Down Logs: Elasticsearch @ AutoScout24
AutoScout24, 24 Nov 2016
Philipp Garbe | Lead Developer | [email protected]
Juri Smarschevski | Team Lead | [email protected]

Slide 2

Slide 2 text

Search: the AutoScout24 search journey in a nutshell

Slide 3

Slide 3 text

Who are we? Unique monthly visitors in Europe

Slide 4

Slide 4 text

Some numbers
•  Search index contains ~2.6M classifieds
•  Unique visitors (monthly): ~10M
•  Search requests per day: ~36M
•  Index updates per day: ~400,000 classifieds
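
For a feel of what these daily figures mean in steady-state load, a quick back-of-envelope calculation (averages only; real traffic is peaky):

```python
# Rough per-second load implied by the daily numbers above.
searches_per_day = 36_000_000
updates_per_day = 400_000
seconds_per_day = 24 * 60 * 60

searches_per_sec = searches_per_day / seconds_per_day
updates_per_sec = updates_per_day / seconds_per_day
print(f"~{searches_per_sec:.0f} searches/s on average")   # → ~417 searches/s on average
print(f"~{updates_per_sec:.1f} updates/s on average")     # → ~4.6 updates/s on average
```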

Slide 5

Slide 5 text

Status quo, March 2013: Endeca used as the search engine
Use case: providing search results and facets for the entire AS24 platform
Problems:
•  Under new product requirements, Endeca's performance kept degrading
•  Time to market for required features was not sufficient
•  Maintenance was complex and expensive

Slide 6

Slide 6 text

Possible candidates
Solr?
•  Too complex installation / configuration
Sphinx?
•  Support situation was unclear
Elasticsearch?
•  Fresh buzzword
•  Built for distributed systems from the beginning (rumor)
•  Easy installation / configuration (fact)

Slide 7

Slide 7 text

POC goals
•  Performance should be comparable with Endeca
•  The solution should be scalable

Slide 8

Slide 8 text

Rollout plan: 03.2013 – 11.2013
[Timeline figure, 02.2013 – 11.2013: phases POC, training, implementation & migration, go-live]
#real_project_picture_squeezed

Slide 9

Slide 9 text

Results after 8 months of work, Endeca vs. Elasticsearch (0.9.x):
•  Number of machines: 60 vs. 20 (300%)
•  [Re]index time: ~180 min vs. ~45 min (400%)
•  Deploy to live: up to 2 days vs. < 3 hours (1000%)
•  Effort to test an issue on a local machine: 4 h vs. 1 h (400%)
•  Performance: on par
•  Product / dev satisfaction: :( vs. :)

Slide 10

Slide 10 text

No problems after migration?
Cluster split brain
•  In fact it had little to do with Elasticsearch itself; it was more related to the learning phase at AS24
Deep pagination
•  Elasticsearch 5.x release notes: "Deep pagination of search results is now possible with the search_after feature, which efficiently skips over previously returned results to return just the next page."
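
A sketch of what a `search_after` request looks like (index and field names are invented for illustration; the sort must include a tiebreaker field, and the `search_after` values are the sort values of the last hit on the previous page):

```
GET /classifieds/_search
{
  "size": 20,
  "sort": [
    { "price": "asc" },
    { "_uid":  "asc" }
  ],
  "search_after": [12500, "classified#184712"]
}
```

Unlike `from`/`size`, this skips previously returned results without collecting them, which is what makes deep pages cheap.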

Slide 11

Slide 11 text

Status quo, November 2014: project "Tatsu" has started
.NET => JVM
C# => Scala
IIS / Windows => Play / Linux
Local data center => AWS
Monolith => Microservices
Windows workstations => Mac notebooks
... => ...

Slide 12

Slide 12 text

Status quo, November 2014: project "Tatsu" has started
.NET => JVM
C# => Scala
IIS / Windows => Play / Linux
Local data center => AWS
Monolith => Microservices
Windows workstations => Mac notebooks
? => ? (2015)

Slide 13

Slide 13 text

Elasticsearch clusters: "lift & shift" to AWS?
•  AWS Elasticsearch Service?
•  Elasticsearch as a service (SaaS)?
•  Own hosting in AWS?

Slide 14

Slide 14 text

Rolling update in detail (a possible scenario). Step 1: initial state.

Slide 15

Slide 15 text

Rolling update in detail (a possible scenario). Step 1: initial state. Step 2: a node has been replaced (~60 sec).

Slide 16

Slide 16 text

Rolling update in detail (a possible scenario). Step 1: initial state. Step 2: a node has been replaced. Step 3: the master has been killed.

Slide 17

Slide 17 text

Rolling update in detail (a possible scenario). Step 1: initial state. Step 2: a node has been replaced. Step 3: the master has been killed. Step 4: master election.

Slide 18

Slide 18 text

Rolling update in detail (a possible scenario). Step 1: initial state. Step 2: a node has been replaced. Step 3: the master has been killed. Step 4: master election. Step 5: the last node has been replaced.
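
The scenario above can be sketched as a toy simulation (pure Python, not real Elasticsearch code; node names and the "replace the master last" ordering are assumptions chosen so the cluster goes through exactly one election):

```python
def rolling_update(nodes, master):
    """Replace every node once, leaving the master for last, so the
    cluster goes through a single master election (steps 2-5 above)."""
    elections = 0
    order = [n for n in nodes if n != master] + [master]
    for old in order:
        nodes.remove(old)
        if old == master:
            elections += 1                 # step 3/4: master killed, election
            master = nodes[0]              # surviving node takes over
        nodes.append(old + "-new")         # step 2/5: node replaced
    return nodes, master, elections

nodes, master, elections = rolling_update(["node-1", "node-2", "node-3"], "node-1")
print(f"{elections} election(s); new master: {master}")
# → 1 election(s); new master: node-2-new
```

Replacing the current master first instead would trigger an election at the very start and possibly again whenever the freshly elected master is next in line.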

Slide 19

Slide 19 text

Rolling update findings: does a killed master equal an outage?
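
One common guard in that era against split brain and surprise elections during rolling updates was a master quorum; a minimal sketch, assuming three master-eligible nodes (the deck does not show the actual settings):

```yaml
# elasticsearch.yml (pre-7.0 Zen discovery) -- hypothetical values
# quorum = (master_eligible_nodes / 2) + 1, so 3 masters => 2
discovery.zen.minimum_master_nodes: 2
# how long a node pings before declaring the master gone (default 3s)
discovery.zen.ping_timeout: 3s
```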

Slide 20

Slide 20 text

Rolling update findings

Slide 21

Slide 21 text

Logging: continuously deployed, immutable and stateful

Slide 22

Slide 22 text

Some numbers
•  7.4 billion documents
•  36 TB EBS
•  18 nodes à m4.4xlarge (64 GB RAM / 53.5 ECU)
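
A back-of-envelope view of the per-node load these cluster totals imply:

```python
# Per-node load implied by the logging-cluster numbers above.
docs = 7.4e9          # total documents
storage_tb = 36       # total EBS storage in TB
nodes = 18            # m4.4xlarge data nodes

docs_per_node = docs / nodes
tb_per_node = storage_tb / nodes
print(f"~{docs_per_node / 1e6:.0f}M docs and {tb_per_node:.1f} TB per node")
# → ~411M docs and 2.0 TB per node
```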

Slide 23

Slide 23 text

Unified Logs

Slide 24

Slide 24 text

Challenge: Deployment time

Slide 25

Slide 25 text

Rolling updates

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

Challenge: Costs

Slide 32

Slide 32 text

First setup
•  18× m4.4xlarge
•  18× 2 TB gp2
•  3 TB/day cross-zone traffic

Slide 33

Slide 33 text

Cost/usage-optimized setup
•  15× m4.2xlarge
•  15× 384 GB gp2
•  6× Spot Fleet
•  6× 4 TB st1
•  9 TB/day cross-zone traffic
Savings: ~40%
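
The gp2/st1 split suggests a hot/warm layout; a sketch of how such a split is typically expressed in Elasticsearch via shard-allocation filtering (attribute name and index name are invented; the deck does not show the actual configuration):

```yaml
# elasticsearch.yml on an st1-backed (warm) node -- hypothetical sketch
node.attr.box_type: warm
```

Fresh indices are written on gp2-backed nodes tagged `hot`; once an index ages out of the heavy-write phase, it is moved by setting `index.routing.allocation.require.box_type: warm` on it.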

Slide 34

Slide 34 text

Future: what's next?
•  Percolator (saved search)
•  Elastic Graph (recommendations)
•  Free-text search
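
A sketch of how the percolator could back a saved-search feature, using the Elasticsearch 5.x request shape (index, type, and field names are invented for illustration; assumes the index maps `query` as a `percolator` field):

```
# Register a saved search as a document:
PUT /saved-searches/search/1
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "make": "audi" } },
        { "range": { "price": { "lte": 15000 } } }
      ]
    }
  }
}

# When a new classified arrives, find all saved searches it matches:
GET /saved-searches/_search
{
  "query": {
    "percolate": {
      "field": "query",
      "document_type": "search",
      "document": { "make": "audi", "price": 12000 }
    }
  }
}
```

This inverts normal search: queries are indexed, and each incoming document is matched against them, which is exactly the shape a "notify me about new matching cars" feature needs.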

Slide 35

Slide 35 text

Conclusion
A simple question: if we could go back in time and start the same Elasticsearch journey again, would we do it the same way?

Slide 36

Slide 36 text

Q & A