Slide 1

Slide 1 text

Simon Willnauer @ Lucene Revolution 2011. PMC Member & Core Committer, Apache Lucene. simonw@apache.org / simonw@jteam.nl. Exploiting Concurrency to Lucene Indexing

Slide 2

Slide 2 text

IndexWriter in 3.x [Diagram: multiple indexing threads, each with its own Thread State, feed documents (doc) into the DocumentsWriter inside the IndexWriter. Labels: merge segments in memory, Merge on flush, Flush to Disk, Multi-Threaded, Single-Threaded, Directory.]

Slide 3

Slide 3 text

Influence on Indexing - Throughput

Slide 4

Slide 4 text

Lucene 4 with DocumentsWriterPerThread [Diagram: multiple indexing threads, each feeding documents into its own DWPT (DocumentsWriterPerThread) within the DocumentsWriter inside the IndexWriter. Labels: Flush to Disk, Multi-Threaded, Directory.]
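
The DWPT idea on this slide can be sketched as a toy model. This is not Lucene code and not the Lucene API: `ToyIndexWriter`, `add_document`, `flush_current_thread`, `max_buffered_docs`, and the list standing in for the Directory are all hypothetical names chosen for illustration. The sketch assumes only what the slide shows: each indexing thread owns a private buffer and flushes it as its own segment, so a flush in one thread does not stop the others.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyIndexWriter:
    """Toy model only (hypothetical, not Lucene): every indexing thread owns
    a private buffer, playing the DWPT role, and flushes it as its own segment."""

    def __init__(self, max_buffered_docs=3):
        self.max_buffered_docs = max_buffered_docs
        self.directory = []                # flushed segments (stand-in for the Directory)
        self._dir_lock = threading.Lock()  # guards only the short "publish segment" step
        self._local = threading.local()    # one private buffer per thread

    def add_document(self, doc):
        buffer = getattr(self._local, "buffer", None)
        if buffer is None:
            buffer = self._local.buffer = []
        buffer.append(doc)
        if len(buffer) >= self.max_buffered_docs:
            self.flush_current_thread()

    def flush_current_thread(self):
        buffer = getattr(self._local, "buffer", None)
        if buffer:
            segment = tuple(buffer)  # build the segment from thread-private state
            buffer.clear()
            with self._dir_lock:     # other threads keep buffering while we publish
                self.directory.append(segment)


def index_batch(writer, docs):
    for doc in docs:
        writer.add_document(doc)
    writer.flush_current_thread()  # flush whatever is left in this thread's buffer


writer = ToyIndexWriter(max_buffered_docs=3)
batches = [[f"doc-{t}-{i}" for i in range(7)] for t in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda batch: index_batch(writer, batch), batches))

indexed = sorted(doc for segment in writer.directory for doc in segment)
```

In this model a segment only ever contains documents buffered by one thread, and the shared lock is held just long enough to append the finished segment. The real implementation decides when to flush based on memory usage and other policies that this sketch does not attempt to reproduce.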

Slide 5

Slide 5 text

Indexing Throughput with DWPT

Slide 6

Slide 6 text

Looking at nightly benchmarks - IMPRESSIVE!

Slide 7

Slide 7 text

Looking at nightly benchmarks - IMPRESSIVE!

Slide 8

Slide 8 text

Wanna know more? http://blog.jteam.nl/2011/05/03/lucene-indexing-gains-concurrency/ http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/

Slide 9

Slide 9 text

Thank you!