Scorch! a New Index for Bleve

Bleve, an open-source full-text search library for Go, has moved beyond the general-purpose key/value store and now implements its own custom binary index format, named Scorch. Learn about the data structures and Go libraries we've chosen to build this solution.

Marty Schoch

August 03, 2018

Transcript

  1. Scorch: a New Index for Bleve. Marty Schoch, @mschoch, 2018-08-03

  2. None
  3. None
  4. Too Big and Too Slow Bleve Index

  5. None
  6. Couchbase Full Text Search: Tai, Marty, Sreekanth, Abhinav, Steve, Aruna, Alex
  7. Design

  8. None
  9. None
  10. No Top-Level Bleve API Changes

  11. No Top-Level Bleve API Changes Bleve

  12. No Top-Level Bleve API Changes Bleve Apps

  13. No Top-Level Bleve API Changes Bleve Apps Couchbase FTS

  14. None
  15. Segmented Index

  16. Monolithic Key/Value Index

  17. Monolithic Key/Value Index Update

  18. Monolithic Key/Value Index Back Index Update

  19. Monolithic Key/Value Index Back Index Update

  20. Monolithic Key/Value Index Back Index Current Values Update

  21. Monolithic Key/Value Index Back Index New Values Current Values Update

  22. Monolithic Key/Value Index Back Index New Values Current Values Update

  23. Monolithic Key/Value Index Back Index Current Values Update

  24. Monolithic Key/Value Index Back Index Current Values Update

  25. Monolithic Key/Value Index Update

  26. Monolithic Key/Value Index Update

  27. Monolithic Key/Value Index Update

  28. Monolithic Key/Value Index

  29. Segmented Index

  30. Segmented Index Update

  31. Segmented Index Update

  32. Segmented Index Update

  33. New Storage Abstractions

  34. None
  35. None
  36. None
  37. None
  38. None
  39. None
  40. Term Dictionary

  41. Postings List

  42. None
  43. randomised

  44. randomised

  45. Levenshtein Edit Distance: randomised, randomized (~1)
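The edit distance on this slide can be checked with a small dynamic-programming routine. This is a minimal sketch of the textbook algorithm, not code taken from Bleve; the function names `levenshtein` and `min3` are chosen here for illustration.

```go
package main

import "fmt"

// levenshtein returns the minimum number of single-character insertions,
// deletions, and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// prev holds the distances for the previous row of the DP table,
	// curr the row being filled in.
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // turning "" into the first j runes of b takes j inserts
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i // turning the first i runes of a into "" takes i deletes
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

func main() {
	fmt.Println(levenshtein("randomised", "randomized")) // 1
	fmt.Println(levenshtein("tues", "thurs"))            // 2
}
```

A fuzzy query such as `randomised~1` accepts any term within distance 1, so it would match `randomized`. Computing the distance against every term is the brute-force approach; it is shown here only to make the definition concrete.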

  46. None
  47. None
  48. Finite State Transducer Term Dictionary, Levenshtein Automata: mon, tues, thurs; tues~2
  49. Vellum backed Term Dictionary

  50. Bitmap Postings List: 0 1 2 3 … n

  51. Roaring Bitmap backed Postings List: • Uncompressed Bitsets • Slice of Integers • Run-length encoding • Chunked Encoding, combining each
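To make the bitmap idea concrete, here is a minimal sketch of the simplest representation listed above, an uncompressed bitset, used as a postings list. It is not the Roaring implementation, which chooses among several container encodings per chunk; the `Bitset` type and its methods are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Bitset is an uncompressed bitmap: bit n is set when document number n
// contains the term.
type Bitset []uint64

// Add marks docNum as containing the term, growing the bitset if needed.
func (b *Bitset) Add(docNum uint32) {
	word := int(docNum / 64)
	for len(*b) <= word {
		*b = append(*b, 0)
	}
	(*b)[word] |= uint64(1) << (docNum % 64)
}

// Contains reports whether docNum is in the postings list.
func (b Bitset) Contains(docNum uint32) bool {
	word := int(docNum / 64)
	return word < len(b) && b[word]&(uint64(1)<<(docNum%64)) != 0
}

// And returns the documents present in both postings lists, which is how
// a conjunction of two terms can be evaluated.
func And(x, y Bitset) Bitset {
	n := len(x)
	if len(y) < n {
		n = len(y)
	}
	out := make(Bitset, n)
	for i := 0; i < n; i++ {
		out[i] = x[i] & y[i]
	}
	return out
}

// DocNums lists the set document numbers in ascending order.
func (b Bitset) DocNums() []uint32 {
	var out []uint32
	for i, w := range b {
		for w != 0 {
			out = append(out, uint32(i*64+bits.TrailingZeros64(w)))
			w &= w - 1 // clear the lowest set bit
		}
	}
	return out
}

func main() {
	var bleve, scorch Bitset
	for _, d := range []uint32{1, 3, 64, 200} {
		bleve.Add(d)
	}
	for _, d := range []uint32{3, 64, 65} {
		scorch.Add(d)
	}
	fmt.Println(And(bleve, scorch).DocNums()) // [3 64]
}
```

An uncompressed bitset costs one bit per document whether or not the term occurs there, which is wasteful for rare terms. That is the motivation for mixing bitsets with integer slices and run-length encoding, chunk by chunk.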
  52. Couchbase Release Schedule

  53. None
  54. Proof of Concept

  55. $

  56. $ mkdir scorch

  57. Memory Only Segments

  58. Index Snapshot Introducer Application Goroutine introduction channel read read/write

  59. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write

  60. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write

  61. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write

  62. Index Snapshot Introducer Application Goroutine Segment Introduction Batch() introduction channel read read/write

  63. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write

  64. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write Segment Introduction

  65. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write Segment Introduction

  66. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write Segment Introduction

  67. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write Segment Introduction

  68. Index Snapshot Introducer Application Goroutine Batch() introduction channel read read/write
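The introducer pattern in these slides can be sketched as a single goroutine that owns writes to the current index snapshot, with application goroutines handing it new segments over a channel. The sketch below rests on assumptions: the type and method names (`Index`, `segmentIntroduction`, `Batch`, `Snapshot`) are hypothetical stand-ins, and a real segment would hold indexed data rather than just an ID.

```go
package main

import (
	"fmt"
	"sync"
)

// Segment stands in for an immutable, in-memory index segment.
type Segment struct{ ID string }

// IndexSnapshot is an immutable view of the index at one epoch.
type IndexSnapshot struct {
	Epoch    uint64
	Segments []*Segment
}

// segmentIntroduction is the message sent over the introduction channel.
type segmentIntroduction struct {
	segment *Segment
	applied chan struct{} // closed once the segment is visible to readers
}

// Index holds the current root snapshot and the introduction channel.
type Index struct {
	mu            sync.RWMutex
	root          *IndexSnapshot
	introductions chan *segmentIntroduction
	done          chan struct{}
}

// NewIndex starts the introducer goroutine.
func NewIndex() *Index {
	idx := &Index{
		root:          &IndexSnapshot{},
		introductions: make(chan *segmentIntroduction),
		done:          make(chan struct{}),
	}
	go idx.introducerLoop()
	return idx
}

// introducerLoop is the only code that replaces the root snapshot.
func (idx *Index) introducerLoop() {
	defer close(idx.done)
	for intro := range idx.introductions {
		cur := idx.Snapshot()
		// Build a new snapshot instead of mutating the old one, so that
		// readers holding the previous epoch are unaffected.
		segs := append(append([]*Segment{}, cur.Segments...), intro.segment)
		next := &IndexSnapshot{Epoch: cur.Epoch + 1, Segments: segs}
		idx.mu.Lock()
		idx.root = next
		idx.mu.Unlock()
		close(intro.applied)
	}
}

// Batch builds a segment and blocks until the introducer has applied it.
func (idx *Index) Batch(segmentID string) {
	intro := &segmentIntroduction{
		segment: &Segment{ID: segmentID},
		applied: make(chan struct{}),
	}
	idx.introductions <- intro
	<-intro.applied
}

// Snapshot returns the current root snapshot for reading.
func (idx *Index) Snapshot() *IndexSnapshot {
	idx.mu.RLock()
	defer idx.mu.RUnlock()
	return idx.root
}

// Close stops the introducer goroutine.
func (idx *Index) Close() {
	close(idx.introductions)
	<-idx.done
}

func main() {
	idx := NewIndex()
	defer idx.Close()
	idx.Batch("a")
	idx.Batch("b")
	snap := idx.Snapshot()
	fmt.Println("epoch", snap.Epoch, "segments", len(snap.Segments)) // epoch 2 segments 2
}
```

Because every change goes through one goroutine, writers need no coordination beyond sending on the channel, and a search that grabbed a snapshot before a batch keeps a consistent view while it runs.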

  69. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26

  70. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26

  71. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26

  72. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26

  73. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt

  74. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt root.bolt 27 a, b, c, d, e, f

  75. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt root.bolt 27 a, b, c, d, e, f Persister - lastEpoch 27

  76. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt root.bolt 27 a, b, c, d, e, f Persister - lastEpoch 27

  77. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt root.bolt 27 a, b, c, d, e, f Persister - lastEpoch 27

  78. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt root.bolt 27 a, b, c, d, e, f Persister - lastEpoch 27

  79. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.bolt root.bolt 27 a, b, c, d, e, f Persister - lastEpoch 27

  80. None
  81. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write f Persister - lastEpoch 26 Persister - lastEpoch 26 f.zap root.bolt 27 a, b, c, d, e, f Persister - lastEpoch 27
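The persister sequence in these slides can be read as follows: compare the snapshot's epoch with the last persisted epoch, write any segment that is not yet on disk, record the segment list for the new epoch, then advance `lastEpoch`. Below is a rough sketch under that reading. The two maps are in-memory stand-ins for the segment files and for `root.bolt`, and all type and function names are hypothetical.

```go
package main

import "fmt"

// IndexSnapshot lists the segments that make up the index at one epoch.
type IndexSnapshot struct {
	Epoch    uint64
	Segments []string
}

// Persister tracks which epoch was last made durable.
type Persister struct {
	lastEpoch uint64
	files     map[string]bool     // stand-in for segment files on disk
	root      map[uint64][]string // stand-in for root.bolt: epoch -> segments
}

// NewPersister creates a persister that already has the given segments on disk.
func NewPersister(lastEpoch uint64, onDisk ...string) *Persister {
	p := &Persister{
		lastEpoch: lastEpoch,
		files:     map[string]bool{},
		root:      map[uint64][]string{},
	}
	for _, seg := range onDisk {
		p.files[seg+".zap"] = true
	}
	return p
}

// Persist makes snap durable if it is newer than the last persisted epoch.
// It returns the names of the segment files it had to write.
func (p *Persister) Persist(snap IndexSnapshot) []string {
	if snap.Epoch <= p.lastEpoch {
		return nil // nothing new to persist
	}
	var written []string
	for _, seg := range snap.Segments {
		name := seg + ".zap"
		if !p.files[name] {
			p.files[name] = true // only in-memory segments need writing
			written = append(written, name)
		}
	}
	// Record the full segment list for this epoch, then advance lastEpoch.
	p.root[snap.Epoch] = append([]string(nil), snap.Segments...)
	p.lastEpoch = snap.Epoch
	return written
}

func main() {
	p := NewPersister(26, "a", "b", "c", "d", "e")
	snap := IndexSnapshot{Epoch: 27, Segments: []string{"a", "b", "c", "d", "e", "f"}}
	fmt.Println(p.Persist(snap)) // [f.zap]
	fmt.Println(p.lastEpoch)     // 27
	fmt.Println(p.Persist(snap)) // []
}
```

Writing the segment file before recording the epoch means a crash between the two steps leaves the previous epoch's record intact, at the cost of an orphaned file to clean up later.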
  82. 4x Faster and 10x Smaller

  83. None
  84. None
  85. Team

  86. Merging Segments (Steve)

  87. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26

  88. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26

  89. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26

  90. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26

  91. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26 Merge Introduction g.zap

  92. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26

  93. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26 Merge Introduction g.zap

  94. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26 Merge Introduction g.zap Merger - lastPlanned 27

  95. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister f Merger - lastPlanned 26 Merge Introduction g.zap Merger - lastPlanned 27

  96. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister Merger - lastPlanned 26 Merge Introduction g.zap Merger - lastPlanned 27

  97. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister Merger - lastPlanned 26 Merge Introduction g.zap Merger - lastPlanned 27 g

  98. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister Merger - lastPlanned 26 Merge Introduction g.zap Merger - lastPlanned 27 g Index Snapshot - epoch 28

  99. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister Merger - lastPlanned 26 Merge Introduction g.zap Merger - lastPlanned 27 g Index Snapshot - epoch 28

  100. Index Snapshot - epoch 27 Introducer Application Goroutine introduction channel read read/write Persister Merger - lastPlanned 26 Merger - lastPlanned 27 g Index Snapshot - epoch 28
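The merge introduction shown in these slides swaps older segments for one merged segment and yields a new snapshot at the next epoch. Here is a minimal sketch of that swap. Which segments are merged into `g` is an assumption made for the example (the slides only show `f` being replaced), and the names `mergeIntroduction` and `applyMerge` are hypothetical.

```go
package main

import "fmt"

// IndexSnapshot lists the segments that make up the index at one epoch.
type IndexSnapshot struct {
	Epoch    uint64
	Segments []string
}

// mergeIntroduction describes the outcome of a finished merge: the segments
// it consumed and the single segment that replaces them.
type mergeIntroduction struct {
	replaced map[string]bool
	merged   string
}

// applyMerge returns a new snapshot in which the replaced segments are
// dropped and the merged segment is added. The input snapshot is not
// modified, so readers of the old epoch are unaffected.
func applyMerge(cur IndexSnapshot, m mergeIntroduction) IndexSnapshot {
	next := IndexSnapshot{Epoch: cur.Epoch + 1}
	for _, seg := range cur.Segments {
		if !m.replaced[seg] {
			next.Segments = append(next.Segments, seg)
		}
	}
	next.Segments = append(next.Segments, m.merged)
	return next
}

func main() {
	cur := IndexSnapshot{Epoch: 27, Segments: []string{"a", "b", "c", "d", "e", "f"}}
	// Assumption for illustration: segments e and f were merged into g.
	m := mergeIntroduction{
		replaced: map[string]bool{"e": true, "f": true},
		merged:   "g",
	}
	fmt.Println(applyMerge(cur, m)) // {28 [a b c d g]}
}
```

In the slides the merged segment is already written as `g.zap` before it is introduced, so the swap itself only touches the in-memory segment list.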
  101. Document Values (Sreekanth)

  102. Rollback (Abhinav)

  103. None
  104. None
  105. None
  106. None
  107. None
  108. None
  109. None
  110. None
  111. None
  112. Performance

  113. None
  114. None
  115. None
  116. REALLY NEEDS

  117. REALLY REALLY NEEDS

  118. None
  119. None
  120. None
  121. Introducer Persister Merger

  122. Introducer Persister Merger Loosely Coupled

  123. Introducer Persister Merger Loosely Coupled

  124. Introducer Persister Merger Loosely Coupled increases files decreases files

  125. Introducer Persister Merger Loosely Coupled increases files decreases files increases memory

  126. Introducer Persister Merger Loosely Coupled increases files decreases files increases memory

  127. None
  128. Reflection

  129. None
  130. None
  131. None
  132. None
  133. Thanks