Slide 1

Slide 1 text

This presentation is about ... 21 Jun 2013, Ard Schrijvers

Slide 2

Slide 2 text

RSI

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Long spinning wheels... Is there anything on your computer more annoying? More useless? Or even more violent?

Slide 5

Slide 5 text

Avoid spinning wheels...the 7.8 helps!

Slide 6

Slide 6 text

7.7 / 7.8 HST performance

Slide 7

Slide 7 text

Where is the improvement?

Slide 8

Slide 8 text

It is more subtle.
First of all: a blistering fast HST does not guarantee blistering fast websites for 7.7 ... nor for 7.8.

Slide 9

Slide 9 text

A durable fast delivery requires:
1. A blistering fast HST (+ repository)
2. Confident developers questioning and challenging customer requirements
3. Customers willing to KISS during the first iterations: don't start innovating right away
4. An efficient content model
5. No developer mistakes
6. Performance measurements upfront
7. Performance logging

Slide 10

Slide 10 text

So what improved in the 7.8?

Slide 11

Slide 11 text

7.7 Vulnerabilities / easy mistakes
1. Date range queries
2. Large HST configuration resulting in hiccups during changes
3. Many needless virtual nodes
4. Possibly expensive searches for CMS users with read access to a small part of the repository
5. Getting the total number of search results is expensive
6. CMS is not lazy
7. CMS template editor encourages incorrect usage of compounds
8. Repository based website menus loading large parts of the repository

Slide 12

Slide 12 text

7.7 Vulnerabilities / easy mistakes
1. No page caching / thundering herd vulnerability
2. Server side external (service) calls
3. CMS JCR events
4. Free text queries possibly blowing up the repository
5. Easy feedback / page diagnostics missing

Slide 13

Slide 13 text

Why extra performance focus with 7.8?

Slide 14

Slide 14 text

Enough enough enough

Slide 15

Slide 15 text

Fast Date Range Queries
1. Supported in 7.7.9 and higher
2. For 7.7.9, a Lucene re-index is needed
3. In hst-config.properties set the default with default.query.date.range.resolution =
4. Default in 7.8.x for HST is resolution DAY
5. Default in 7.7.9 for HST is resolution MILLISECOND
6. Resolution based queries are supported through the HstFilter API
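Why does a coarser resolution help? A minimal sketch, assuming that the cost of a range query grows with the number of distinct date terms the index holds inside the range. The class `DateResolutionSketch` and its sample data are hypothetical and only count terms; this is not HST or Jackrabbit code.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashSet;
import java.util.Set;

class DateResolutionSketch {

    /** Counts the distinct terms left after truncating every timestamp to the given unit. */
    static int distinctTerms(long[] epochMillis, ChronoUnit unit) {
        Set<Instant> terms = new HashSet<>();
        for (long millis : epochMillis) {
            terms.add(Instant.ofEpochMilli(millis).truncatedTo(unit));
        }
        return terms.size();
    }

    /** Hypothetical sample data: 'count' documents, one every 'stepMillis', starting at 'startMillis'. */
    static long[] sampleTimestamps(long startMillis, int count, long stepMillis) {
        long[] result = new long[count];
        for (int i = 0; i < count; i++) {
            result[i] = startMillis + i * stepMillis;
        }
        return result;
    }
}
```

With 10,000 documents published roughly one per minute, `distinctTerms(..., ChronoUnit.MILLIS)` sees one term per document, while `ChronoUnit.DAYS` collapses them to a handful of terms.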

Slide 16

Slide 16 text

How much faster?

Slide 17

Slide 17 text

Log scale

Slide 18

Slide 18 text

Bonus
"Gimme all documents of 2013" or "Gimme all documents of 2013/06/21" do not need range queries anymore!!
Filter#addEqualTo("my:date", Calendar.getInstance(), DateTools.Resolution.YEAR)
Filter#addEqualTo("my:date", Calendar.getInstance(), DateTools.Resolution.DAY)
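The idea behind an equality match at a given resolution can be sketched as follows, assuming each document is indexed under one extra term per resolution. `ResolutionTermIndex` is a hypothetical toy index; the slides do not describe how the repository stores these terms.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ResolutionTermIndex {

    // term -> ids of the documents carrying that term
    private final Map<String, List<String>> postings = new HashMap<>();

    /** Indexes the document under a YEAR term and a DAY term (assumed layout). */
    void add(String docId, LocalDate date) {
        put("YEAR:" + date.getYear(), docId);
        put("DAY:" + date, docId);
    }

    private void put(String term, String docId) {
        postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
    }

    /** "All documents of a year" becomes a single term lookup, not a range scan. */
    List<String> equalToYear(int year) {
        return postings.getOrDefault("YEAR:" + year, Collections.emptyList());
    }

    /** "All documents of a day" is a single term lookup as well. */
    List<String> equalToDay(LocalDate day) {
        return postings.getOrDefault("DAY:" + day, Collections.emptyList());
    }
}
```

The trade-off in this sketch is index size: every extra resolution adds one term per document.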

Slide 19

Slide 19 text

HST stale model support
1. Supported in 7.7.9 and higher
2. What it solves: hiccups during reloads of a large HST configuration

Slide 20

Slide 20 text

Virtual nodes reduced
I think Star Trek is stupid, but I just want to be one of the geeks, and I think managers and marketers like this because they then think I must be really smart and technical....
Ever wondered what 'trek' stands for? To make a sloooooow or arduous journey ... How dull

Slide 21

Slide 21 text

Virtual nodes reduced

Slide 22

Slide 22 text

Virtual nodes reduced

Slide 23

Slide 23 text

Authorized searches
Required by accessing canonical content instead of virtual nodes

Slide 24

Slide 24 text

How did we do it?
The Authorization Query

Slide 25

Slide 25 text

Does it perform? Oh yes! See my blog

Slide 26

Slide 26 text

Bonus
1. Users with little read access have instant authorized searches!
2. Correct, instant, authorized counts for faceted navigation!

Slide 27

Slide 27 text

Efficient total number of search hits
javax.jcr.query.QueryResult#getNodes()#getSize() is close to broken in Jackrabbit in practice....

Slide 28

Slide 28 text

Why?
QUIZ: Assume there are 1.000.000 newsdocs

    javax.jcr.query.QueryManager mngr = ...
    String q = "//element(*,my:newsdoc)";
    long size = mngr.createQuery(q, "xpath").execute().getNodes().getSize();
    System.out.println(size);
    q = "//element(*,my:newsdoc) order by @jcr:score";
    size = mngr.createQuery(q, "xpath").execute().getNodes().getSize();
    System.out.println(size);

output:
    -1
    1000000 (after an hour)

Slide 29

Slide 29 text

But... this looks really dumb
It isn't really...
The problem is actually very complex
Ask search gurus about it...
"Authorization...hmmm, yeah, I don't care"

Slide 30

Slide 30 text

But you have the Authorization Query
So you have instant correct authorized counts from Lucene?
Correct
Is Hippo R&D then smarter than all the search gurus and Jackrabbit developers?
No, not really...
We just chose a security model that could be mapped to Lucene queries
But isn't that super smart and bright?
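A minimal sketch of the difference between checking access per hit and expressing authorization as a query, assuming both the search hits and the readable documents can be represented as document-id sets. `AuthorizedCountSketch` is hypothetical and uses `java.util.BitSet` in place of a real index; the slides only state that the security model maps to Lucene queries.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

class AuthorizedCountSketch {

    /** Post-filtering: one access check per hit, so the cost grows with the result size. */
    static int countByPostFiltering(BitSet hits, IntPredicate mayRead) {
        int count = 0;
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            if (mayRead.test(doc)) {
                count++;
            }
        }
        return count;
    }

    /** Authorization as a query: intersect the two document sets, then count the result. */
    static int countByAuthorizationQuery(BitSet hits, BitSet readable) {
        BitSet both = (BitSet) hits.clone();
        both.and(readable);
        return both.cardinality();
    }
}
```

Both methods return the same count; the second never calls back into an access manager for individual hits.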

Slide 31

Slide 31 text

Authorized total search hits
HstQueryResult#getTotalSize() or HippoNodeIterator#getTotalSize()

Slide 32

Slide 32 text

CMS Laziness

Slide 33

Slide 33 text

CMS Laziness
Why wasn't this done before?
Well.... it's like the north-south line
Also realize: it is easy to state the obvious or to know the problem, but way harder to do something about it

Slide 34

Slide 34 text

HST Page cache
In a running environment it can be switched on/off per:
1. Virtual host
2. Channel (mount)
3. Sitemap item
4. Component

Slide 35

Slide 35 text

HST Page cache

Slide 36

Slide 36 text

HST Page cache bonus
Stampeding herd protection
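Stampeding herd protection means that when many requests miss the cache for the same page at once, only one of them renders it. A minimal sketch of that idea, assuming a cache keyed by URL; `StampedeSafeCache` is hypothetical, has no expiry or invalidation, and is not the HST implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class StampedeSafeCache {

    private final ConcurrentMap<String, CompletableFuture<String>> pages = new ConcurrentHashMap<>();
    private final Function<String, String> renderer;

    StampedeSafeCache(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    String get(String url) {
        CompletableFuture<String> mine = new CompletableFuture<>();
        CompletableFuture<String> existing = pages.putIfAbsent(url, mine);
        if (existing != null) {
            // Another request is rendering, or has rendered, this page: wait for its result.
            return existing.join();
        }
        try {
            // This request won the race, so it is the only one that renders.
            mine.complete(renderer.apply(url));
        } catch (RuntimeException e) {
            // Do not cache failures: drop the entry and wake up the waiting requests.
            pages.remove(url, mine);
            mine.completeExceptionally(e);
            throw e;
        }
        return mine.join();
    }
}
```

Waiting requests block on the future of the rendering request, so the backend sees one render per page however many visitors arrive together.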

Slide 37

Slide 37 text

What about partially cacheable pages?

Slide 38

Slide 38 text

And without client side ajax?

Slide 39

Slide 39 text

Caching and relevance / targeting?
Page caching works with targeting!
HOW?
number of uniquely rendered pages << unique number of site visitors
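One way to read that inequality: if visitors are grouped into a few segments and the cache key includes the segment, the number of renders is bounded by pages times segments, not by visitors. A minimal sketch under that assumption; `SegmentPageCache` and the segment names used with it are hypothetical, and the slides do not say how HST builds its cache keys.

```java
import java.util.HashMap;
import java.util.Map;

class SegmentPageCache {

    private final Map<String, String> cache = new HashMap<>();
    private int renderCount = 0;

    /** Returns the page for this URL and visitor segment, rendering it only on a cache miss. */
    String get(String url, String segment) {
        String key = url + "|" + segment;
        String html = cache.get(key);
        if (html == null) {
            html = render(url, segment);
            cache.put(key, html);
        }
        return html;
    }

    private String render(String url, String segment) {
        renderCount++;
        return "<html>" + url + " for " + segment + "</html>";
    }

    int renderCount() {
        return renderCount;
    }
}
```

A thousand visitors spread over three segments cause three renders of a page in this sketch.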

Slide 40

Slide 40 text

And 100% personalized blocks?

Slide 41

Slide 41 text

Server side external (service) calls

Slide 42

Slide 42 text

Server side external (service) calls
Suppose requesting a remote football JSON object... but the service cannot make up its mind...
Use ASYNC = true
Optionally ASYNCMODE = esi
Or asynchronous server side calls & caching
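The last option, asynchronous server side calls plus caching, can be sketched as a bounded wait with a fallback to the last good answer. This is a minimal sketch, not HST code: `AsyncServiceCall` and the timeout value are assumptions, and the remote call is any `Supplier<String>`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class AsyncServiceCall {

    // Last successfully fetched value, served when the service is slow or failing.
    private volatile String lastGood;

    AsyncServiceCall(String initialValue) {
        this.lastGood = initialValue;
    }

    /** Runs the remote call on another thread and waits at most timeoutMillis for it. */
    String fetch(Supplier<String> remoteCall, long timeoutMillis) {
        CompletableFuture<String> call = CompletableFuture.supplyAsync(remoteCall);
        try {
            String fresh = call.get(timeoutMillis, TimeUnit.MILLISECONDS);
            lastGood = fresh;
            return fresh;
        } catch (TimeoutException | ExecutionException e) {
            // Service too slow or failed: the page still renders, with the cached value.
            return lastGood;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return lastGood;
        }
    }
}
```

The request thread is held for the timeout at most, so a service that cannot make up its mind no longer decides the page response time.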

Slide 43

Slide 43 text

CMS JCR events
Much more efficient now
Ask this guy

Slide 44

Slide 44 text

Free text queries

Slide 45

Slide 45 text

Page Diagnostics
Configurable in the repository HST config

Slide 46

Slide 46 text

Page Diagnostics

Slide 47

Slide 47 text

7.8 Tackled vulnerabilities wrap up
1. Date range queries
2. Large HST configuration resulting in hiccups during changes
3. Many needless virtual nodes
4. Possibly expensive searches for CMS users with read access to a small part of the repository
5. Getting the total number of search results is expensive
6. CMS is not lazy
7. CMS template editor encourages incorrect usage of compounds
8. Repository based website menus loading large parts of the repository

Slide 48

Slide 48 text

7.8 Tackled vulnerabilities wrap up
1. No page caching / thundering herd vulnerability
2. Server side external (service) calls
3. CMS JCR events
4. Free text queries possibly blowing up the repository
5. Easy feedback / page diagnostics missing

Slide 49

Slide 49 text

7.8 improvements in progress / done
1. Fine grained publishing of HST configuration
2. Reduced memory consumption of the HST model

Slide 50

Slide 50 text

7.8 further improvements / features
1. Lazy loading HST model
2. Reduce usage of facetselect and mirrors with multivalued properties containing UUIDs
3. Stale page cache support
4. Cluster wide (Couchbase memcached bucket) shared HST page cache with stale page support

Slide 51

Slide 51 text

References
1. www.onehippo.com/en/resources/blogs/Author/Ard+Schrijvers
2. www.onehippo.com/en/resources/blogs?query=performance
3. www.onehippo.org/7_8/librarysearch/Version/7_8/Category/performance

Slide 52

Slide 52 text

No content