Slide 1

Slide 1 text

lessons learned Solr @jeroenrosenberg

Slide 2

Slide 2 text

Frontend of Lucene

Slide 3

Slide 3 text

Lucene + XML/JSON API + field types + caching + faceting + grouping + ...

Slide 4

Slide 4 text

Indexing

Slide 5

Slide 5 text

Indexing

Slide 6

Slide 6 text

Lucene's inverted index

Slide 7

Slide 7 text

Efficient when many docs share the same value
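To make the inverted index idea concrete, here is a toy sketch in Python. It is not Lucene's actual data structure (which adds a term dictionary, compressed postings and much more); the function name and the sample documents are made up for illustration. It only shows why a term shared by many docs is stored once, with a list of doc ids behind it.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents: doc id -> text
docs = {
    1: "hotel amsterdam",
    2: "hotel curacao",
    3: "apartment curacao",
}

index = build_inverted_index(docs)
print(index["hotel"])    # [1, 2]
print(index["curacao"])  # [2, 3]
```

The term "hotel" appears once in the index, no matter how many docs contain it; only the list of doc ids grows.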

Slide 8

Slide 8 text

Field types

Slide 9

Slide 9 text

Field type definition

Slide 10

Slide 10 text

Field type definition

Slide 11

Slide 11 text

Field type definition

Slide 12

Slide 12 text

Field type definition

Slide 13

Slide 13 text

Schemaless

Slide 14

Slide 14 text

Segments

Slide 15

Slide 15 text

Tune the merge factor

Slide 16

Slide 16 text

Max. # of segments
Low: faster search, but slower indexing
High: faster indexing, but slower search

Slide 17

Slide 17 text

Don't commit. Ever.

Slide 18

Slide 18 text

Don't commit often.

Slide 19

Slide 19 text

Sharding

Slide 20

Slide 20 text

Manual distribution

Slide 21

Slide 21 text

[Diagram: an index distributor and three cores (core1, core2, core3), each holding index foo; replication]

Slide 22

Slide 22 text

Look Ma, no downtime!

Slide 23

Slide 23 text

q=name:hotel1&shards=solr2:7070/solr/foo,solr3:7070/solr/foo&partialResults=true
Distributed search
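What a distributed search has to do, conceptually, is ask every shard for its best hits and merge them into one ranked list. The sketch below is a simplification under stated assumptions: the doc ids, scores and the function name `merge_shard_results` are hypothetical, and Solr's real implementation does more (for example, fetching stored fields in a second phase).

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, rows):
    """Merge per-shard hit lists (each sorted by descending score) into the top `rows` hits."""
    merged = heapq.merge(*shard_results, key=lambda hit: hit[1], reverse=True)
    return list(islice(merged, rows))


# Hypothetical (doc id, score) hits, as two shards might return them
shard_solr2 = [("doc-a", 3.2), ("doc-c", 1.1)]
shard_solr3 = [("doc-b", 2.5), ("doc-d", 0.4)]

top = merge_shard_results([shard_solr2, shard_solr3], rows=3)
print(top)  # [('doc-a', 3.2), ('doc-b', 2.5), ('doc-c', 1.1)]
```

Each shard only needs to return its own top hits, so the merge stays cheap even when the shards are large.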

Slide 24

Slide 24 text

10 * true solr2:7070/solr/foo,solr3:7070/solr/foo Distributed search config

Slide 25

Slide 25 text

10 * true solr2:7070/solr/foo,solr3:7070/solr/foo Distributed search config

Slide 26

Slide 26 text

10 * true solr2:7070/solr/foo,solr3:7070/solr/foo Distributed search config

Slide 27

Slide 27 text

q=name:hotel1&qt=distributedSearch
Distributed search

Slide 28

Slide 28 text

Caching

Slide 29

Slide 29 text

Caches: field value, filter, document, query result

Slide 30

Slide 30 text

Filter cache: doc ids of results per filter query
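The filter cache idea (doc ids of results per filter query) can be sketched as a dictionary keyed by the filter query string. This is a toy model, not Solr's implementation: the class `FilterCache`, the `evaluate` callback and the doc ids are assumptions for illustration.

```python
class FilterCache:
    """Toy filter cache: remembers the set of matching doc ids per filter query string."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # function: fq string -> iterable of doc ids
        self._cache = {}
        self.misses = 0

    def doc_ids(self, fq):
        # Only evaluate a filter query the first time it is seen.
        if fq not in self._cache:
            self.misses += 1
            self._cache[fq] = frozenset(self._evaluate(fq))
        return self._cache[fq]

    def intersect(self, fqs):
        """Doc ids matching all given filter queries."""
        sets = [self.doc_ids(fq) for fq in fqs]
        return frozenset.intersection(*sets) if sets else frozenset()


# Hypothetical matches per filter query
matches = {
    "country:AN": {1, 2, 3},
    "duration:[1 TO *]": {2, 3, 4},
}

cache = FilterCache(lambda fq: matches[fq])
first = cache.intersect(["country:AN", "duration:[1 TO *]"])
second = cache.intersect(["country:AN"])  # served from the cache
print(sorted(first), sorted(second), cache.misses)  # [2, 3] [1, 2, 3] 2
```

Because each filter query is cached on its own, a filter that is reused across requests costs nothing after its first evaluation.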

Slide 31

Slide 31 text

Field value cache: field names (facets) mapped to a mapping of doc ids to terms

Slide 32

Slide 32 text

Query result cache: ordered set of doc ids of top N results

Slide 33

Slide 33 text

Document cache: stored fields for each doc

Slide 34

Slide 34 text

Autowarming

Slide 36

Slide 36 text

q=*:*&fq=country:AN&fq=duration:[1 TO *]&fq=date:[NOW TO 2013-07-01T00:00:00Z]
Filter queries...

Slide 37

Slide 37 text

q=*:*&fq=country:AN&fq=duration:[1 TO *]&fq=date:[NOW TO 2013-07-01T00:00:00Z]
Match all documents: q=*:*

Slide 38

Slide 38 text

q=*:*&fq=country:AN&fq=duration:[1 TO *]&fq=date:[NOW TO 2013-07-01T00:00:00Z]
Filter by field value: fq=country:AN

Slide 39

Slide 39 text

q=*:*&fq=country:AN&fq=duration:[1 TO *]&fq=date:[NOW TO 2013-07-01T00:00:00Z]
Range query with wildcard: fq=duration:[1 TO *]
Range query using DateMath syntax: fq=date:[NOW TO 2013-07-01T00:00:00Z]
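The query above contains spaces, brackets and colons, so it needs URL-encoding before it goes over HTTP. A minimal sketch using only Python's standard library; the helper name `build_filter_query` is made up, and the parameter values are the ones from the slide.

```python
from urllib.parse import urlencode, parse_qs


def build_filter_query(main_query, filter_queries):
    """Return a URL-encoded query string with one fq parameter per filter."""
    params = [("q", main_query)] + [("fq", fq) for fq in filter_queries]
    return urlencode(params)


query_string = build_filter_query(
    "*:*",
    ["country:AN", "duration:[1 TO *]", "date:[NOW TO 2013-07-01T00:00:00Z]"],
)

# Decoding gives back the original values, with fq repeated three times.
decoded = parse_qs(query_string)
print(decoded["q"])
print(decoded["fq"])
```

Passing the parameters as a list of pairs rather than a dict is what allows `fq` to appear more than once.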

Slide 40

Slide 40 text

q=*:*&rows=10000000
Getting all results

Slide 41

Slide 41 text

Faceting

Slide 42

Slide 42 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
A facet query...

Slide 43

Slide 43 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
Enable faceting: facet=true

Slide 44

Slide 44 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
Suppress document results: rows=0

Slide 45

Slide 45 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
Specify a field name: facet.field=departureairport
...and another one: facet.field=touroperator

Slide 46

Slide 46 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
Unlimited field values (globally): facet.limit=-1

Slide 47

Slide 47 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
Unlimited field values (globally): facet.limit=-1
Basically, always a good idea

Slide 48

Slide 48 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
Override global limit for specific field names: f.touroperator.facet.limit=2

Slide 49

Slide 49 text

rows=0&facet=true&facet.field=departureairport&facet.field=touroperator&facet.limit=-1&facet.mincount=1&f.touroperator.facet.limit=2
At least 1 document per field value: facet.mincount=1
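To see what the limit and mincount parameters do to the facet counts, here is a toy model in plain Python. It is a sketch, not Solr's algorithm: the function `facet_counts`, the sample docs and the ordering (by count, then value) are assumptions made for illustration.

```python
def facet_counts(all_docs, matching_docs, field, limit=-1, mincount=0):
    """Count values of `field` over the matching docs.

    Every value present in the index starts at 0, so mincount has something to filter.
    A negative limit means "no limit", mirroring facet.limit=-1.
    """
    counts = {doc[field]: 0 for doc in all_docs if field in doc}
    for doc in matching_docs:
        if field in doc:
            counts[doc[field]] += 1
    ranked = sorted(
        ((value, n) for value, n in counts.items() if n >= mincount),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked if limit < 0 else ranked[:limit]


# Hypothetical index; the last doc does not match the current query
all_docs = [
    {"departureairport": "AMS", "touroperator": "A"},
    {"departureairport": "AMS", "touroperator": "B"},
    {"departureairport": "EIN", "touroperator": "A"},
    {"departureairport": "AMS", "touroperator": "C"},
    {"departureairport": "RTM", "touroperator": "D"},
]
matching = all_docs[:4]

airports = facet_counts(all_docs, matching, "departureairport", limit=-1, mincount=1)
operators = facet_counts(all_docs, matching, "touroperator", limit=2, mincount=1)
print(airports)   # [('AMS', 3), ('EIN', 1)]
print(operators)  # [('A', 2), ('B', 1)]
```

With `mincount=0` the airport RTM would show up with a count of 0, which is rarely what a UI wants.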

Slide 50

Slide 50 text

q=*:*&fq={!tag=country}country:AN&facet=true&facet.field={!ex=country}country&facet.limit=-1&facet.mincount=1
Multi-select faceting...

Slide 51

Slide 51 text

q=*:*&fq={!tag=country}country:AN&facet=true&facet.field={!ex=country}country&facet.limit=-1&facet.mincount=1
Tag a filter query: fq={!tag=country}country:AN
...and exclude it for a field value: facet.field={!ex=country}country
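The effect of tagging and excluding can be modelled in a few lines: the document results honour every filter, while the facet counts for a field ignore the filter carrying the excluded tag. A toy sketch under stated assumptions; `multi_select_facet`, the filter representation and the docs are hypothetical.

```python
from collections import Counter


def multi_select_facet(docs, filters, facet_field, exclude_tag=None):
    """filters maps a tag to a (field, value) pair.

    Returns (results, facet counts); the counts skip the filter tagged `exclude_tag`.
    """
    def matches(doc, active):
        return all(doc.get(field) == value for field, value in active)

    all_filters = list(filters.values())
    facet_filters = [fv for tag, fv in filters.items() if tag != exclude_tag]

    results = [doc for doc in docs if matches(doc, all_filters)]
    facet_base = [doc for doc in docs if matches(doc, facet_filters)]
    counts = Counter(doc[facet_field] for doc in facet_base if facet_field in doc)
    return results, dict(counts)


docs = [
    {"id": 1, "country": "AN"},
    {"id": 2, "country": "AN"},
    {"id": 3, "country": "ES"},
]
filters = {"country": ("country", "AN")}

results, counts = multi_select_facet(docs, filters, "country", exclude_tag="country")
print([doc["id"] for doc in results])  # [1, 2]
print(counts)                          # {'AN': 2, 'ES': 1}
```

Without the exclusion the facet would only list AN, so the user could never pick a second country.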

Slide 52

Slide 52 text

FACET ALL THE THINGS!

Slide 53

Slide 53 text

Grouping

Slide 54

Slide 54 text

group=true&group.field=accoid&group.sort=price asc&sort=popularity asc&group.facets=UNGROUPED
A grouping query...

Slide 55

Slide 55 text

group=true&group.field=accoid&group.sort=price asc&sort=popularity asc&group.facets=UNGROUPED
Enable grouping: group=true

Slide 56

Slide 56 text

group=true&group.field=accoid&group.sort=price asc&sort=popularity asc&group.facets=UNGROUPED
Specify the field name: group.field=accoid

Slide 57

Slide 57 text

group=true&group.field=accoid&group.sort=price asc&sort=popularity asc&group.facets=UNGROUPED
Determines group head: group.sort=price asc
Determines order of document results: sort=popularity asc

Slide 58

Slide 58 text

group=true&group.field=accoid&group.sort=price asc&sort=popularity asc&group.facets=UNGROUPED
Determines group head: group.sort=price asc
Determines order of document results: sort=popularity asc
Only group heads are returned!
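The interplay of the two sorts is easy to get wrong, so here is a toy model. It assumes the semantics described in the Solr documentation as I read them: the group sort picks the head within each group, and the main sort orders the groups by the best-ranked doc of each group under that sort. The function `group_heads` and the sample docs are hypothetical.

```python
def group_heads(docs, group_field, group_sort_key, sort_key):
    """Return one head per group.

    Heads are chosen by group_sort_key (ascending); groups are ordered by the
    best (lowest) sort_key value found among all docs of the group.
    """
    groups = {}
    for doc in docs:
        groups.setdefault(doc[group_field], []).append(doc)
    ordered = sorted(
        groups.values(),
        key=lambda members: min(sort_key(doc) for doc in members),
    )
    return [min(members, key=group_sort_key) for members in ordered]


docs = [
    {"id": "a1", "accoid": "A", "price": 500, "popularity": 7},
    {"id": "a2", "accoid": "A", "price": 450, "popularity": 9},
    {"id": "b1", "accoid": "B", "price": 300, "popularity": 8},
    {"id": "b2", "accoid": "B", "price": 350, "popularity": 2},
]

heads = group_heads(
    docs,
    group_field="accoid",
    group_sort_key=lambda doc: doc["price"],     # group.sort=price asc
    sort_key=lambda doc: doc["popularity"],      # sort=popularity asc
)
print([doc["id"] for doc in heads])  # ['b1', 'a2']
```

Group B comes first because of b2's popularity, yet the doc returned for it is b1, the cheapest one. The doc that decides a group's position need not be the doc that is shown.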

Slide 59

Slide 59 text

ONE DOES NOT SIMPLY EXPLAIN SOLR QUERIES

Slide 60

Slide 60 text

debugQuery=true

Slide 61

Slide 61 text

Solr 4.3 is coming
http://docs.lucidworks.com/display/solr/Major+Changes+from+Solr+3+to+Solr+4

Slide 62

Slide 62 text

Queries?