Bleve — Go Israel

In this talk we'll start with an overview of the functionality provided by Bleve. Next we'll look at some examples of how you can integrate Bleve with your Go applications. Finally, we'll talk about Scorch, the latest index scheme used by Bleve, and how it fits into the future of the project.

Marty Schoch

October 29, 2018

Transcript

  1. About Marty: 7 years with Couchbase • Databases • Indexing • Search • Go • Distributed Systems
  2. Searching the Index: Apply the same text analysis at search time that we used at index time.

    [Diagram: the search text "engineers" is analyzed to the term "engineer", which is then found by exact match in the inverted index (terms "engineer" ... "wise").]
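To make the point of slide 2 concrete, here is a minimal sketch using only the Go standard library. It is not how Bleve is implemented: the `analyze` function, the `invertedIndex` type, and the sample documents are all hypothetical, and the "strip a trailing s" rule is a crude stand-in for real stemming. The only idea it demonstrates is that the query passes through the same analysis as the indexed text, so the lookup in the inverted index can be an exact match.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// analyze is a deliberately naive stand-in for a real analyzer:
// it lowercases, splits on anything that is not a-z, and strips
// a trailing "s" from words longer than three letters.
func analyze(text string) []string {
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return r < 'a' || r > 'z'
	})
	terms := make([]string, 0, len(words))
	for _, w := range words {
		if len(w) > 3 && strings.HasSuffix(w, "s") {
			w = strings.TrimSuffix(w, "s")
		}
		terms = append(terms, w)
	}
	return terms
}

// invertedIndex maps each analyzed term to the set of document IDs
// that contain it.
type invertedIndex map[string]map[string]bool

// add analyzes the text and records docID under every resulting term.
func (idx invertedIndex) add(docID, text string) {
	for _, t := range analyze(text) {
		if idx[t] == nil {
			idx[t] = map[string]bool{}
		}
		idx[t][docID] = true
	}
}

// search analyzes the query with the same function used at index time,
// then looks up each resulting term by exact match.
func (idx invertedIndex) search(query string) []string {
	seen := map[string]bool{}
	for _, t := range analyze(query) {
		for id := range idx[t] {
			seen[id] = true
		}
	}
	ids := make([]string, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	idx := invertedIndex{}
	idx.add("p1", "A wise engineer writes tests.")
	idx.add("p2", "Go is fun.")
	fmt.Println(idx.search("Engineers")) // prints [p1]
}
```

Without the shared analysis step, the query "Engineers" would never match the stored term "engineer", because the index only supports exact term lookups.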
  3. package main

     import (
         "fmt"
         "log"

         "github.com/blevesearch/bleve"
     )

     // WebPage is the document type being indexed.
     type WebPage struct {
         Content string
     }

     func main() {
         // Create a new on-disk index with the default mapping.
         mapping := bleve.NewIndexMapping()
         index, err := bleve.New("website.bleve", mapping)
         if err != nil {
             log.Fatal(err)
         }

         // Index one document under the ID "p1".
         page := WebPage{"..."}
         err = index.Index("p1", page)
         if err != nil {
             log.Fatal(err)
         }
         fmt.Println("Indexed Document")
     }
  9. package main

     import (
         "fmt"
         "log"

         "github.com/blevesearch/bleve"
     )

     func main() {
         // Open the existing on-disk index.
         index, err := bleve.Open("website.bleve")
         if err != nil {
             log.Fatal(err)
         }

         // Build a match query for "bleve" and run it as a search request.
         query := bleve.NewMatchQuery("bleve")
         request := bleve.NewSearchRequest(query)
         result, err := index.Search(request)
         if err != nil {
             log.Fatal(err)
         }
         fmt.Println(result)
     }
  13. Other Types of Queries • Phrase Queries • Multi-Phrase Queries • Fuzzy Queries • Regular Expression • Wildcard
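As an illustration of what a fuzzy query means, the sketch below matches terms within a bounded Levenshtein edit distance of the search term. It uses only the standard library and scans a small hypothetical dictionary linearly; the names `editDistance` and `fuzzyMatch` and the sample terms are assumptions made for this example, and a real engine would use a far more efficient strategy than comparing against every term.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// editDistance returns the Levenshtein distance between a and b:
// the minimum number of single-character insertions, deletions,
// and substitutions needed to turn one into the other.
func editDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// fuzzyMatch returns the dictionary terms within maxEdits of term,
// in dictionary order.
func fuzzyMatch(dict []string, term string, maxEdits int) []string {
	var out []string
	for _, candidate := range dict {
		if editDistance(candidate, term) <= maxEdits {
			out = append(out, candidate)
		}
	}
	return out
}

func main() {
	dict := []string{"bleve", "believe", "blend", "index"}
	fmt.Println(fuzzyMatch(dict, "bleve", 1)) // prints [bleve]
	fmt.Println(fuzzyMatch(dict, "bleve", 2)) // prints [bleve believe blend]
}
```

Raising the allowed distance from 1 to 2 widens the set of candidate terms, which is why higher fuzziness generally costs more at query time.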
  14. Bleve Beyond Text • Exact String Comparison • Numeric Range Queries • Date Range Queries • Geo Point Distance Queries • Combine them with AND/OR • Bleve’s Inverted Index is very well suited for this
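One way to see why an inverted index suits AND/OR combinations: each clause (a text term, a numeric range, a date range) yields a sorted list of matching document IDs, and combining clauses reduces to merging those lists. The sketch below shows that merge with the standard library only. The document IDs and the two clause names are made up for illustration, and this is not a description of Bleve's internal data structures.

```go
package main

import "fmt"

// intersect returns the IDs present in both sorted lists (AND).
func intersect(a, b []int) []int {
	out := []int{}
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

// union returns the IDs present in either sorted list (OR),
// without duplicates.
func union(a, b []int) []int {
	out := []int{}
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		switch {
		case j == len(b) || (i < len(a) && a[i] < b[j]):
			out = append(out, a[i])
			i++
		case i == len(a) || b[j] < a[i]:
			out = append(out, b[j])
			j++
		default: // a[i] == b[j]
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	textHits := []int{1, 3, 5, 8}  // hypothetical: docs matching a text term
	rangeHits := []int{3, 4, 8, 9} // hypothetical: docs matching a numeric range
	fmt.Println(intersect(textHits, rangeHits)) // prints [3 8]
	fmt.Println(union(textHits, rangeHits))     // prints [1 3 4 5 8 9]
}
```

Both merges run in time linear in the combined list lengths, regardless of whether a clause came from text, numbers, or dates.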
  15. No. Bleve is a library. But… It makes building a distributed search engine possible.
  16. Using IndexAlias to Search Multiple Nodes/Indexes

    [Diagram: bleve clients send query requests through index aliases to multiple bleve indexes.]
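In Bleve itself, `bleve.NewIndexAlias` groups several indexes behind a single searchable handle. The scatter-gather idea behind that can be sketched with the standard library alone, as below. Everything here is hypothetical (the `hit` and `searcher` types, `searchAll`, and the two fake shards), and it leaves out what a real alias must handle, such as errors, paging, and merging facets.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// hit is one scored search result.
type hit struct {
	ID    string
	Score float64
}

// searcher is a hypothetical stand-in for one index, local or remote.
type searcher func(query string) []hit

// searchAll sends the query to every shard concurrently, merges the
// results, and returns the top `size` hits by descending score.
func searchAll(query string, size int, shards ...searcher) []hit {
	results := make([][]hit, len(shards))
	var wg sync.WaitGroup
	for i, s := range shards {
		wg.Add(1)
		go func(i int, s searcher) {
			defer wg.Done()
			results[i] = s(query) // each goroutine writes its own slot
		}(i, s)
	}
	wg.Wait()

	var merged []hit
	for _, r := range results {
		merged = append(merged, r...)
	}
	sort.SliceStable(merged, func(a, b int) bool {
		return merged[a].Score > merged[b].Score
	})
	if len(merged) > size {
		merged = merged[:size]
	}
	return merged
}

func main() {
	// Two fake shards that ignore the query and return fixed hits.
	shardA := func(string) []hit { return []hit{{"a1", 0.9}, {"a2", 0.2}} }
	shardB := func(string) []hit { return []hit{{"b1", 0.5}} }
	for _, h := range searchAll("bleve", 2, shardA, shardB) {
		fmt.Println(h.ID, h.Score)
	}
	// prints:
	// a1 0.9
	// b1 0.5
}
```

Because the caller sees one merged result list, it does not need to know how many indexes sit behind the call or where they live.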
  17. Scorch Indexing Performance

      Metric                                           5.5.0-2780   5.5.0-2780 Scorch   Improved (> 1 is better)
      Index size (MB), 1 node, 1M docs                      5,063                 973                       5.20
      Index build time (sec), 1 node, 1M docs                  36                  28                       1.29
      Index size (MB), 3 nodes, 1M docs                     5,427               1,003                       5.41
      Index build time (sec), 3 nodes, 1M docs                 35                  23                       1.52
      Index size (MB), 2 nodes, 10M docs, DGM              53,334              14,034                       3.80
      Index build time (sec), 2 nodes, 10M docs, DGM          687                 274                       2.51
  18. Scorch Query Latency: 80th percentile query latency (ms), no kv-load, wiki 1M x 1KB, 1 node, FTS. Performance compared to upside_down/moss:

      • Fuzzy-1 Searches ~92%
      • Term date facet Searches ~77%
      • Phrase Searches ~50%
      • High Frequency Conjunction Searches ~33%
      • Fuzzy-2 Searches ~24%
      • Prefix/Wildcard Searches ~25%
  19. Scorch Query Throughput: Average Throughput (q/sec), no kv-load, wiki 1M x 1KB, 1 node, FTS. Performance compared to upside_down/moss:

      • Fuzzy-1 Searches ~4X
      • Term date facet Searches ~3X
      • Fuzzy-2 Searches ~2X
      • Phrase Searches ~2X
      • Medium/Low Frequency Term Searches 25/40%
      • Prefix/Wildcard Searches 25%
  20. Future: Overhaul Release Management

      • 1.0 - Formalize official non-scorch release
      • 1.1 / 2.0 - First officially supported (and default to) Scorch