Slide 1

Slide 1 text

Searching in Neos
Sebastian Kurfürst @skurfuerst

Slide 2

Slide 2 text

Sebastian Kurfürst @skurfuerst

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

exply.io Enterprise Search meets Business Intelligence

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Slide 8

Slide 8 text

Tree of Nodes (diagram)
neostypo3org (Page)
  features (Page)
    main (ContentCollection)
      … (Headline)
      … (Text)
  roadmap (Page)
  unsere-codesprints (Page)
Nodes are annotated with their language variants (de, en).

Slide 9

Slide 9 text

TYPO3CR is great for Tree Traversal

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Find all articles written by Sebastian.
Display the first three locations tagged with ConferenceLocation.
What are the newest pages in a certain category?

Slide 12

Slide 12 text

neostypo3org (Page)
  features (Page)
    main (ContentCollection)
      … (Headline)
      … (Text)
  roadmap (Page)
  unsere-codesprints (Page)

Slide 13

Slide 13 text

neostypo3org (Page)
  features (Page)
    main (ContentCollection)
      … (Headline)
      … (Text)
  roadmap (Page)
  unsere-codesprints (Page)

Slide 14

Slide 14 text

Set diagram: "All Documents & Content", containing the subset "Currently Relevant Content".

Slide 15

Slide 15 text

Currently, TYPO3CR does not yet effectively provide this set-based view on nodes.
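
The difference between tree traversal and a set-based view can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the sample data, and the author names are hypothetical and not part of TYPO3CR. Answering "all articles by Sebastian" on a tree means visiting every node, while an index maps the property value straight to the matching nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a content repository node."""
    name: str
    node_type: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def walk(node):
    """Tree traversal: yield a node and all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_by_author(root, author):
    """Answer a set-based question by scanning the whole tree."""
    return [n.name for n in walk(root)
            if n.node_type == "Article" and n.properties.get("author") == author]


site = Node("site", "Page", children=[
    Node("a1", "Article", {"author": "Sebastian"}),
    Node("blog", "Page", children=[
        Node("a2", "Article", {"author": "Sebastian"}),
        Node("a3", "Article", {"author": "Alice"}),
    ]),
])

# A search index precomputes the set-based view: value -> matching nodes.
author_index = {}
for n in walk(site):
    if n.node_type == "Article":
        author_index.setdefault(n.properties["author"], []).append(n.name)
```

With the index in place, `author_index["Sebastian"]` is a single lookup, whereas `find_by_author(site, "Sebastian")` has to touch every node.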

Slide 16

Slide 16 text

ElasticSearch to the rescue!

Slide 17

Slide 17 text

Getting Started

Slide 18

Slide 18 text

1. Set up ElasticSearch

# ElasticSearch 1.4.4 - config/elasticsearch.yml
script.disable_dynamic: sandbox
script.groovy.sandbox.class_whitelist: java.util.LinkedHashMap
script.groovy.sandbox.receiver_whitelist: java.util.Iterator, java.lang.Object, java.util.Map, java.util.Map$Entry
script.groovy.sandbox.enabled: true
cluster.name: [PUT_YOUR_CUSTOM_NAME_HERE]
network.host: 127.0.0.1
index.number_of_shards: 1
index.number_of_replicas: 0

Slide 19

Slide 19 text

2. Start ElasticSearch

bin/elasticsearch

Slide 20

Slide 20 text

2. Require the CR Adaptor

composer require --prefer-source typo3/typo3cr-search @dev
composer require --prefer-source flowpack/elasticsearch-contentrepositoryadaptor @dev

TODO: no @dev anymore!

Slide 21

Slide 21 text

3. Indexing

./flow nodeindex:build

Slide 22

Slide 22 text

4. Debugging Tools

http://localhost:9200/_plugin/head/
http://localhost:9200/_plugin/sense/

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

composer require --prefer-source flowpack/searchplugin @dev

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

This is fulltext search.

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

Node Querying

Slide 31

Slide 31 text

Diagram: Category contains Article; Article is tagged with Tag.

Slide 32

Slide 32 text

1. Node References

# build up relation in NodeTypes.yaml
'Sandstorm.News:Article':
  superTypes: ['TYPO3.Neos:Document']
  ...
  properties:
    tags:
      type: references
      ui:
        label: 'Tags'
        inspector:
          editorOptions:
            # allow only references to tags
            nodeTypes: ['Sandstorm.News:Tag']

Slide 33

Slide 33 text

2. Query in TypoScript

# replace main content area by a custom TypoScript object
prototype(PrimaryContent).newsTag {
  condition = ${q(node).is('[instanceof Sandstorm.News:Tag]')}
  type = 'Sandstorm.News:Tag'
}

Slide 34

Slide 34 text

2. Query in TypoScript

# inherits from Template by default
prototype(Sandstorm.News:Tag) {
  latestArticlesTaggedWithTag = ${...}
}

Slide 35

Slide 35 text

2. Query in TypoScript

latestArticlesTaggedWithTag = ${Search.query(site)  # search underneath this site
  .nodeType('Sandstorm.News:Article')               # filter by node type
  .exactMatch('tags', node)                         # where tag == current tag
  .limit(3)                                         # first 3 results
  .sortDesc('publishDate')                          # and sort by publishing date desc
  .execute()}

Slide 36

Slide 36 text

3. Use in Template

Slide 37

Slide 37 text

${Search.query(site)
  .fulltext('Alice')
  .execute()}

Slide 38

Slide 38 text

Node References together with Search Queries

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

ElasticSearch Core Concepts

Slide 41

Slide 41 text

Normalized Data in a relational DB: one record A, linked to T1, T2 and T3.
Denormalized Data in an index: (A, T1), (A, T2), (A, T3).
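
As a rough illustration of the normalized/denormalized contrast, the following Python sketch flattens a join table into self-contained documents. The table names, the ids, and the `denormalize` helper are hypothetical; the slide only shows A, T1, T2 and T3, and reading A as an article and T1..T3 as its tags is an assumption.

```python
# Normalized: each fact is stored once and connected through a join table.
articles = {1: "A"}
tags = {1: "T1", 2: "T2", 3: "T3"}
article_tags = [(1, 1), (1, 2), (1, 3)]  # (article_id, tag_id)


def denormalize(articles, tags, article_tags):
    """Copy the related data into every document so no join is needed at query time."""
    return [{"article": articles[a], "tag": tags[t]} for a, t in article_tags]


documents = denormalize(articles, tags, article_tags)
```

Each resulting document repeats "A", which costs space but lets the index answer a query without consulting a second table.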

Slide 42

Slide 42 text

GET Index/Type/Document-ID

Slide 43

Slide 43 text

GET Index/Type/_mapping

Slide 44

Slide 44 text

GET Index/Type/_mapping _all

Slide 45

Slide 45 text

Index Aliases allow index rebuilds!

Alias "typo3cr" points to one of the timestamped indices typo3cr-1426882860 or typo3cr-1426885219.
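
Why an alias makes rebuilds safe can be shown with a small in-memory model. The `Cluster` class below is a hypothetical toy, not the ElasticSearch API: searches always go through the alias, a new index is built next to the old one, and only the final alias switch changes what readers see.

```python
class Cluster:
    """Toy model of indices and aliases (not the real ElasticSearch API)."""

    def __init__(self):
        self.indices = {}
        self.aliases = {}

    def create_index(self, name, documents):
        self.indices[name] = list(documents)

    def point_alias(self, alias, index_name):
        if index_name not in self.indices:
            raise KeyError(index_name)
        self.aliases[alias] = index_name

    def search(self, alias):
        """Readers only know the alias, never the timestamped index name."""
        return self.indices[self.aliases[alias]]


cluster = Cluster()
cluster.create_index("typo3cr-1426882860", ["old content"])
cluster.point_alias("typo3cr", "typo3cr-1426882860")

# Rebuild into a fresh index while the alias still serves the old one.
cluster.create_index("typo3cr-1426885219", ["new content"])
during_rebuild = cluster.search("typo3cr")

# Switch the alias once the rebuild has finished.
cluster.point_alias("typo3cr", "typo3cr-1426885219")
after_switch = cluster.search("typo3cr")
```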

Slide 46

Slide 46 text

Indexing Pipeline: "Hey, InspiringCon 2015" -> Tokenization -> Hey | InspiringCon | 2015 -> Token Filtering -> hey | inspiringcon | 2015
Search Pipeline: "InspiringCon 2015" -> Tokenization -> InspiringCon | 2015 -> Token Filtering -> inspiringcon | 2015
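
The two pipelines can be approximated in Python as follows. This is a simplified sketch, assuming a tokenizer that splits on non-word characters and a single lowercase token filter; real ElasticSearch analyzers are configurable and do more. The point is that the same analysis runs at index time and at search time, so the tokens line up.

```python
import re


def tokenize(text):
    """Tokenization: split the text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Token filtering: normalize every token to lowercase."""
    return [token.lower() for token in tokens]


def analyze(text):
    """Run the full pipeline: tokenization followed by token filtering."""
    return lowercase_filter(tokenize(text))


indexed_tokens = analyze("Hey, InspiringCon 2015")  # indexing pipeline
search_tokens = analyze("InspiringCon 2015")        # search pipeline

# A document matches when every search token was produced at index time.
matches = all(token in indexed_tokens for token in search_tokens)
```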

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

${Search.query(site)
  .fulltext('Alice')
  .log()
  .execute()}

Data/Logs/ElasticSearch.log:
15-03-23 07:20:50 1820 DEBUG Query Log (): {"query":{"filtered":{"query":{"bool":{"must":[{"match_all":[]},{"query_string":{"query":"Alice"}}]}},"filter":{"bool":{"must":[{"term":{"__parentPath":"\/sites\/neosdemotypo3org"}},{"terms":{"__workspace":["live"]}}],"should":[],"must_not":[{"term":{"_hidden":true}},{"range":{"_hiddenBeforeDateTime":{"gt":"now"}}},{"range":{"_hiddenAfterDateTime":{"lt":"now"}}}]}}}},"fields":["__path"],"highlight":{"fields":{"__fulltext*":{"fragment_size":150,"no_match_size":150,"number_of_fragments":2}}}} -- execution time: 10.998010635376 ms -- Total Results: 28
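
The structure of the logged request is easier to read when built up step by step. The Python sketch below reconstructs the same shape as a dictionary; the function name `build_fulltext_query` and its parameters are hypothetical, the `highlight` section is left out for brevity, and `match_all` is written as an empty object where the log shows an empty array.

```python
import json


def build_fulltext_query(site_path, search_term, workspaces=("live",)):
    """Rebuild the shape of the logged 'filtered' query (highlighting omitted)."""
    return {
        "query": {
            "filtered": {
                # Scoring part: the fulltext search itself.
                "query": {"bool": {"must": [
                    {"match_all": {}},
                    {"query_string": {"query": search_term}},
                ]}},
                # Non-scoring part: restrict to site, workspace and visible nodes.
                "filter": {"bool": {
                    "must": [
                        {"term": {"__parentPath": site_path}},
                        {"terms": {"__workspace": list(workspaces)}},
                    ],
                    "should": [],
                    "must_not": [
                        {"term": {"_hidden": True}},
                        {"range": {"_hiddenBeforeDateTime": {"gt": "now"}}},
                        {"range": {"_hiddenAfterDateTime": {"lt": "now"}}},
                    ],
                }},
            }
        },
        "fields": ["__path"],
    }


body = build_fulltext_query("/sites/neosdemotypo3org", "Alice")
encoded = json.dumps(body)
```

Only the `query_string` clause comes from `.fulltext('Alice')`; the filter clauses for site path, workspace and hidden nodes appear in the log without having been requested explicitly.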

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

Aggregations calculate statistical information about the current result. TODO Kibana Screenshot

Slide 52

Slide 52 text

Aggregations calculate statistical information about the current result.
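
What "statistical information about the current result" means can be illustrated with an in-memory analogue of a terms aggregation, which counts how often each value of a field occurs among the hits. The sample hits and the `terms_aggregation` helper are hypothetical; ElasticSearch computes such counts on the server as part of the search request.

```python
from collections import Counter

# Hypothetical search hits; only the aggregated field matters here.
results = [
    {"title": "a1", "tags": ["neos", "search"]},
    {"title": "a2", "tags": ["search"]},
    {"title": "a3", "tags": ["neos", "flow"]},
]


def terms_aggregation(hits, field):
    """Count how many hits carry each value of a multi-valued field."""
    counts = Counter()
    for hit in hits:
        counts.update(hit.get(field, []))
    return dict(counts)


tag_counts = terms_aggregation(results, "tags")
```

Counts like these are what typically drives faceted navigation, for example "search (2)" next to a tag filter.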

Slide 53

Slide 53 text

Fine-Tuning ElasticSearch+Neos

Slide 54

Slide 54 text

1. ElasticSearch Schema

Slide 55

Slide 55 text

1. ElasticSearch Schema

# Settings.yaml (defaults)
TYPO3:
  TYPO3CR:
    Search:
      defaultConfigurationPerType:
        string:
          elasticSearchMapping:
            type: string
            include_in_all: false
        boolean:
          elasticSearchMapping:
            type: boolean
        date:
          elasticSearchMapping:
            type: date
            format: 'date_time_no_millis'
            include_in_all: false

# NodeTypes.yaml (overrides)
'TYPO3.Neos:Node': &node
  properties:
    '__identifier':
      search:
        elasticSearchMapping:
          type: string
          index: not_analyzed
          include_in_all: false
        indexing: '${node.identifier}'

Slide 56

Slide 56 text

2. ElasticSearch Indexing

indexing: '${Indexing.buildAllPathPrefixes(node.parentPath)}'
indexing: '${node.identifier}'
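
The idea behind indexing all path prefixes can be sketched in Python. This is an assumption about what `Indexing.buildAllPathPrefixes` produces, namely every ancestor path of the given path, and not a copy of the actual helper. Storing all prefixes means "everything underneath this site" becomes a plain exact-match term filter.

```python
def build_all_path_prefixes(path):
    """Return every ancestor prefix of a node path, shortest first (assumed behaviour)."""
    if path == "":
        return []
    if path == "/":
        return ["/"]
    prefixes = []
    current = ""
    for part in path.strip("/").split("/"):
        current += "/" + part
        prefixes.append(current)
    return prefixes


parent_paths = build_all_path_prefixes("/sites/neosdemotypo3org/features")

# A term filter on the site path now matches without any wildcard query.
is_below_site = "/sites/neosdemotypo3org" in parent_paths
```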

Slide 57

Slide 57 text

3. Fulltext Searching

Fulltext Root: We at InspiringCon (Article)
  main (ContentCollection)
    … (Headline)
    … (Text)
The fulltext root collects all content of the nodes below it.

Slide 58

Slide 58 text

3. Fulltext Searching

We at InspiringCon (Article)
  main (ContentCollection)
    … (Headline)
    … (Text)
Fulltext buckets: h1, h2, ..., text

Slide 59

Slide 59 text

3. Fulltext Searching

# predefined in Neos
'TYPO3.Neos:Document':
  search:
    fulltext:
      isRoot: true

'TYPO3.Neos.NodeTypes:Text':
  properties:
    'text':
      search:
        fulltextExtractor: '${Indexing.extractHtmlTags(value)}'

'Sandstorm.News:Article':
  properties:
    'title':
      search:
        fulltextExtractor: '${Indexing.extractInto("h1", value)}'
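
How content might be sorted into fulltext buckets can be sketched with Python's standard `html.parser`. The functions below are hypothetical analogues of `extractHtmlTags` and `extractInto`, written under the assumption that headline text goes into a bucket named after its tag and everything else into `text`; the real helpers may behave differently.

```python
from html.parser import HTMLParser

HEADLINE_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class BucketExtractor(HTMLParser):
    """Collect text per bucket: headline tag name, or 'text' for everything else."""

    def __init__(self):
        super().__init__()
        self.buckets = {}
        self._open_headlines = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADLINE_TAGS:
            self._open_headlines.append(tag)

    def handle_endtag(self, tag):
        if self._open_headlines and self._open_headlines[-1] == tag:
            self._open_headlines.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        bucket = self._open_headlines[-1] if self._open_headlines else "text"
        self.buckets.setdefault(bucket, []).append(text)


def extract_html_tags(html):
    """Split an HTML fragment into fulltext buckets."""
    parser = BucketExtractor()
    parser.feed(html)
    parser.close()
    return {bucket: " ".join(parts) for bucket, parts in parser.buckets.items()}


def extract_into(bucket, value):
    """Put a plain value into one named bucket."""
    return {bucket: value}


body_buckets = extract_html_tags("<h2>Venue</h2><p>We meet at the conference.</p>")
title_buckets = extract_into("h1", "We at InspiringCon")
```

Keeping headlines in separate buckets is what would allow a match in an h1 to be weighted higher than a match in body text.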

Slide 60

Slide 60 text

Indexing Additional Data

Slide 61

Slide 61 text

Index Aliases allow linking multiple indices.

Alias "search" points to both typo3cr-1426882860 and products-2976886808.

Slide 62

Slide 62 text

ElasticSearch Rivers can poll data from other systems.

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

ElasticSearch too big for your project? Use SimpleSearch!

composer require --prefer-source flowpack/simplesearch-contentrepositoryadaptor @dev

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

Resources

http://www.elasticsearch.org/guide/
README of Flowpack.ElasticSearch.ContentRepositoryAdaptor

Slide 67

Slide 67 text

Thank You!

Slide 68

Slide 68 text

No content