How it works. Indexes | Kirill Duborenko

Kirill Duborenko
Meetup #6

Minsk MongoDB User Group

September 18, 2012

Transcript

  1. Index

    A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space.
  2. Index (Example)

    B • B-tree • Bitmap index
    C • Compressed suffix array
    G • Grid (spatial index)
    I • Incremental encoding • Database index • Indexed file • Inverted index
    K • K-d tree
    M • M-tree
    O • Octree
    P • Priority R-tree
    Q • Quadtree
    R • R* tree • R+ tree • Reverse index
    S • Substring index
    U • UB-tree
    X • X-tree
    Z • Z-order curve
  3. MongoDB Indexes API

    • db.people.ensureIndex( {title : 1}, {sparse : true} )
    • db.stats()
    • db.collection.stats()
    • db.collection.getIndexes()
    • db.system.indexes.find()
  4. Types of Indexes

    • Basic/Simple
    • Compound
    • Unique
    • Sparse
    • Index on Array field
    • Shard key
  5. Types of Indexes (Primary Key)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    Index keys (diagram): 1 2 3
  6. Types of Indexes (Basic)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    ensureIndex( { x:1 } )
    Index keys (diagram): null 4 5
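
To make the picture above concrete, here is a minimal sketch in plain JavaScript of a single-field ascending index over the three sample documents. It is a toy model (a sorted array with binary search), not MongoDB's actual B-tree; the helper names `compareKeys`, `buildIndex` and `lookup` are made up for illustration, and treating null as smaller than any number is an assumption taken from the key order on the slide.

```javascript
// Toy model of ensureIndex({ x: 1 }); not MongoDB's real B-tree.
const docs = [
  { _id: 1, x: 4, y: 5, z: [3, null] },
  { _id: 2, x: 5, y: 5, z: [4, 8, 2] },
  { _id: 3, x: null, y: 6, z: [5, 3] },
];

// Order keys so that null sorts before any number (as on the slide).
function compareKeys(a, b) {
  if (a === null && b === null) return 0;
  if (a === null) return -1;
  if (b === null) return 1;
  return a - b;
}

// Build sorted { key, id } entries for one field.
function buildIndex(collection, field) {
  return collection
    .map((doc) => ({ key: doc[field] ?? null, id: doc._id }))
    .sort((a, b) => compareKeys(a.key, b.key));
}

// Binary search for the first entry >= key, then collect all equal keys.
function lookup(index, key) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (compareKeys(index[mid].key, key) < 0) lo = mid + 1;
    else hi = mid;
  }
  const ids = [];
  while (lo < index.length && compareKeys(index[lo].key, key) === 0) {
    ids.push(index[lo].id);
    lo++;
  }
  return ids;
}

const xIndex = buildIndex(docs, "x");
console.log(xIndex.map((e) => e.key)); // [ null, 4, 5 ]
console.log(lookup(xIndex, 5)); // [ 2 ]
```

The lookup touches only a logarithmic number of entries, while every insert has to keep the array sorted, which mirrors the read/write trade-off stated on the first slide.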
  7. Types of Indexes (Compound)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    ensureIndex( { y:1, x:1 } )
    Index keys (diagram): 5 6 4 5 null
  8. Types of Indexes (Compound)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    ensureIndex( { y:1, x:-1 } )
    Index keys (diagram): 5 6 5 4 null
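
The difference between the two compound slides is only the direction of the second field. A small sketch, assuming a toy sorted-array model rather than MongoDB's real index structure (the function names are hypothetical), shows how the sort direction changes the order of the compound keys:

```javascript
// Toy model of compound index key ordering; not MongoDB's implementation.
const docs = [
  { _id: 1, x: 4, y: 5, z: [3, null] },
  { _id: 2, x: 5, y: 5, z: [4, 8, 2] },
  { _id: 3, x: null, y: 6, z: [5, 3] },
];

// Assumption: null sorts before any number.
function compareValues(a, b) {
  if (a === null && b === null) return 0;
  if (a === null) return -1;
  if (b === null) return 1;
  return a - b;
}

// spec looks like an ensureIndex argument, e.g. { y: 1, x: -1 }.
function compoundIndex(collection, spec) {
  const fields = Object.entries(spec); // [[field, direction], ...]
  return collection
    .map((doc) => ({
      key: fields.map(([field]) => doc[field] ?? null),
      id: doc._id,
    }))
    .sort((a, b) => {
      // Compare field by field; a direction of -1 reverses that field only.
      for (let i = 0; i < fields.length; i++) {
        const c = compareValues(a.key[i], b.key[i]) * fields[i][1];
        if (c !== 0) return c;
      }
      return 0;
    });
}

const ascKeys = compoundIndex(docs, { y: 1, x: 1 }).map((e) => e.key);
const descKeys = compoundIndex(docs, { y: 1, x: -1 }).map((e) => e.key);
console.log(ascKeys); // [ [ 5, 4 ], [ 5, 5 ], [ 6, null ] ]
console.log(descKeys); // [ [ 5, 5 ], [ 5, 4 ], [ 6, null ] ]
```

Within the group y = 5 the x values appear as 4, 5 for `{ y:1, x:1 }` and as 5, 4 for `{ y:1, x:-1 }`, which matches the key order drawn on the two slides.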
  9. Types of Indexes (Unique)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    ensureIndex( { y:1, x:1 }, { unique:true } )
    Index keys (diagram): 5 6
  10. Types of Indexes (Sparse)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    ensureIndex( { y:1, x:1 }, { sparse:true } )
    Index keys (diagram): 4 5
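
The unique and sparse options can be modelled with a small toy class. This is only a sketch under stated assumptions: `ToyIndex`, `canIndexAll` and the `people` documents are hypothetical, "sparse" is taken to mean that documents lacking the field get no index entry, and a non-sparse index is assumed to store a missing field under the key null.

```javascript
// Toy model of the unique and sparse options; not MongoDB's implementation.
class ToyIndex {
  constructor(field, { unique = false, sparse = false } = {}) {
    this.field = field;
    this.unique = unique;
    this.sparse = sparse;
    this.entries = new Map(); // key -> array of _id values
  }

  insert(doc) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, this.field);
    // Sparse: documents that lack the field get no index entry at all.
    if (!hasField && this.sparse) return;
    // Non-sparse (assumption): a missing field is indexed under null.
    const key = hasField ? doc[this.field] : null;
    const ids = this.entries.get(key) ?? [];
    if (this.unique && ids.length > 0) {
      throw new Error(`duplicate key: ${key}`);
    }
    ids.push(doc._id);
    this.entries.set(key, ids);
  }
}

// Hypothetical documents: only one of them carries an "email" field.
const people = [{ _id: 1, email: "a@example.com" }, { _id: 2 }, { _id: 3 }];

// Returns true when every document could be indexed without a conflict.
function canIndexAll(index, collection) {
  try {
    collection.forEach((doc) => index.insert(doc));
    return true;
  } catch (err) {
    return false;
  }
}

const uniqueOnly = canIndexAll(new ToyIndex("email", { unique: true }), people);
const sparseUnique = canIndexAll(
  new ToyIndex("email", { unique: true, sparse: true }),
  people
);
console.log(uniqueOnly); // false: two documents share the key null
console.log(sparseUnique); // true: documents without "email" are skipped
```

In this model a plain unique index rejects the second document without an email, while the sparse and unique combination accepts both because neither is indexed.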
  11. Types of Indexes (By Array Field)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    Index keys (diagram): 3 3 4 2 null 5 8
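
An index on an array field holds one entry per array element, so a single document can appear under several keys. The following sketch models that for the field z of the sample documents; it is a toy sorted array, the function names are hypothetical, and ordering null before numbers is an assumption.

```javascript
// Toy model of an index on an array field; not MongoDB's implementation.
const docs = [
  { _id: 1, x: 4, y: 5, z: [3, null] },
  { _id: 2, x: 5, y: 5, z: [4, 8, 2] },
  { _id: 3, x: null, y: 6, z: [5, 3] },
];

// Assumption: null sorts before any number.
function compareKeys(a, b) {
  if (a === null && b === null) return 0;
  if (a === null) return -1;
  if (b === null) return 1;
  return a - b;
}

// Emit one { key, id } entry per array element, then sort by key.
function arrayFieldIndex(collection, field) {
  const entries = [];
  for (const doc of collection) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, id: doc._id });
  }
  return entries.sort((a, b) => compareKeys(a.key, b.key));
}

const zIndex = arrayFieldIndex(docs, "z");
const zKeys = zIndex.map((e) => e.key);
const idsWithThree = zIndex.filter((e) => e.key === 3).map((e) => e.id);
console.log(zKeys); // [ null, 2, 3, 3, 4, 5, 8 ]
console.log(idsWithThree); // [ 1, 3 ]
```

The seven keys are the same seven values shown in the diagram; the key 3 occurs twice because it appears in the arrays of two different documents.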
  12. Types of Indexes (Shard Key)

    { _id: 1, x: 4, y: 5, z: [3, null] }
    { _id: 2, x: 5, y: 5, z: [4, 8, 2] }
    { _id: 3, x: null, y: 6, z: [5, 3] }

    Key ranges (diagram): 0 .. 2, 3 .. 5
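
The key ranges on this slide suggest how a shard key routes a document to a chunk. A minimal sketch, assuming `_id` is the shard key and using hypothetical shard names; the ranges are treated as inclusive integers as drawn, which is a simplification of how chunk boundaries are actually stored:

```javascript
// Toy model of range-based routing by shard key; shard names are made up.
const chunks = [
  { min: 0, max: 2, shard: "shard-a" },
  { min: 3, max: 5, shard: "shard-b" },
];

// Find the chunk whose inclusive range contains the key.
function route(key) {
  const chunk = chunks.find((c) => key >= c.min && key <= c.max);
  if (!chunk) throw new RangeError(`no chunk covers key ${key}`);
  return chunk.shard;
}

// Route the three sample documents by _id.
const placement = [1, 2, 3].map((id) => ({ id, shard: route(id) }));
console.log(placement);
// [ { id: 1, shard: 'shard-a' },
//   { id: 2, shard: 'shard-a' },
//   { id: 3, shard: 'shard-b' } ]
```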
  13. Techniques

    • Covered Indexes
      MongoDB can return data from the index alone when the query involves only keys that are present in the index. Not inspecting the actual documents can speed up responses considerably, since the index is compact and usually fits in RAM or is located sequentially on disk.
    • Sparse + Unique Index
      You can combine sparse with unique to produce a unique constraint that ignores documents with missing fields.
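
The covered-index idea can be illustrated by counting how often the documents themselves are read. This is a toy model, not MongoDB's query planner: the index on y and x is written out by hand, and `findByY` and `documentFetches` are hypothetical names.

```javascript
// Toy model of a covered query; not MongoDB's query planner.
const docs = new Map([
  [1, { _id: 1, x: 4, y: 5, z: [3, null] }],
  [2, { _id: 2, x: 5, y: 5, z: [4, 8, 2] }],
  [3, { _id: 3, x: null, y: 6, z: [5, 3] }],
]);

// Index on { y: 1, x: 1 }: each entry stores the indexed values and the _id.
const indexFields = ["y", "x"];
const index = [
  { y: 5, x: 4, id: 1 },
  { y: 5, x: 5, id: 2 },
  { y: 6, x: null, id: 3 },
];

let documentFetches = 0;

// Find documents with the given y, returning only the projected fields.
function findByY(y, projection) {
  const covered = projection.every((f) => indexFields.includes(f));
  return index
    .filter((entry) => entry.y === y)
    .map((entry) => {
      let source = entry; // covered: the index entry has everything needed
      if (!covered) {
        documentFetches++; // not covered: the document must be loaded
        source = docs.get(entry.id);
      }
      return Object.fromEntries(projection.map((f) => [f, source[f]]));
    });
}

const coveredResult = findByY(5, ["x", "y"]);
const fetchesAfterCovered = documentFetches;
const uncoveredResult = findByY(5, ["x", "z"]);
const fetchesAfterUncovered = documentFetches;
console.log(coveredResult, fetchesAfterCovered); // [ { x: 4, y: 5 }, { x: 5, y: 5 } ] 0
console.log(fetchesAfterUncovered); // 2
```

In the shell, the covered case corresponds roughly to a query such as find({y:5}, {x:1, y:1, _id:0}); the _id field has to be excluded from the projection because it is not part of this index.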
  14. Geospatial Indexes

    db.people.ensureIndex({loc:"2d"})

    • Document examples:
      { loc: [10, 10] }
      { loc: { a:10, b:10 } }
    • Query examples:
      find({loc:{$near:[9,9]}})
      runCommand({geoNear:...})
      find({loc:{$within:{$center:[...]}}})
      find({loc:{$within:{$box:[...]}}})
      find({loc:{$within:{$polygon:[...]}}})
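
What the $near and $within/$box queries return can be sketched with a brute-force model over a few hypothetical points. This only illustrates the query semantics on a flat plane; a real 2d index exists precisely to avoid scanning every point, and the names `places`, `near` and `withinBox` are made up.

```javascript
// Brute-force model of $near and $within/$box semantics on a flat plane.
// A real 2d index avoids scanning every point; this sketch does not.
const places = [
  { _id: 1, loc: [10, 10] },
  { _id: 2, loc: [0, 0] },
  { _id: 3, loc: [9, 8] },
];

function distance([x1, y1], [x2, y2]) {
  return Math.hypot(x1 - x2, y1 - y2);
}

// Like $near: return documents ordered from nearest to farthest.
function near(collection, point) {
  return [...collection].sort(
    (a, b) => distance(a.loc, point) - distance(b.loc, point)
  );
}

// Like $within with $box: [[lower-left corner], [upper-right corner]].
function withinBox(collection, [[minX, minY], [maxX, maxY]]) {
  return collection.filter(
    ({ loc: [x, y] }) => x >= minX && x <= maxX && y >= minY && y <= maxY
  );
}

const nearestIds = near(places, [9, 9]).map((d) => d._id);
const boxIds = withinBox(places, [[8, 8], [10, 10]]).map((d) => d._id);
console.log(nearestIds); // [ 3, 1, 2 ]
console.log(boxIds); // [ 1, 3 ]
```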
  15. Geospatial Indexes (Limitations)

    • You may only have one geospatial index per collection, for now
    • In a compound index, the geospatial part should come first
    • Sharding on a geo key isn't recommended: MongoDB can't use a two-dimensional index to route queries
    • The $near operator does not work on sharded collections: use the "geoNear" command rather than $near
  16. "Count" Problem

    • Performance
      The "count" operation iterates over all matched values in the index to calculate its result. Sometimes this can take too much time.
    • "Count" ignores ordering
      The actual number of retrieved documents can differ from the result of the "count" operation:
      db.t.save({x:1})
      db.t.save({x:1,y:2})
      db.t.save({x:1,y:3})
      db.t.find().sort({y:1}).toArray().length // 2
      db.t.find().sort({y:1}).count() // 3
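
The slide does not show any index definition, so the following is only one possible explanation, stated as an assumption: if the sort walks a sparse index on y, documents without y never appear in the sorted result, whereas the count ignores the sort and counts every document matching the (empty) query. A toy model of that situation:

```javascript
// Toy model. ASSUMPTION (not shown on the slide): the sort is served by a
// sparse index on y, so documents without y are missing from that index.
const docs = [{ x: 1 }, { x: 1, y: 2 }, { x: 1, y: 3 }];

// Sparse index on y: only documents that actually have the field.
const sparseYIndex = docs
  .filter((doc) => Object.prototype.hasOwnProperty.call(doc, "y"))
  .sort((a, b) => a.y - b.y);

// Sorting by walking the sparse index can only return indexed documents.
const sortedLength = sparseYIndex.length;

// The count ignores the sort and counts every document matching {}.
const count = docs.length;

console.log(sortedLength); // 2
console.log(count); // 3
```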
  17. Indexes + Sharding

    Machine 1 • Machine 2 • Machine 3

    Chunk ranges (diagram): Alabama → Arizona • Colorado → Florida • Arkansas → California • Indiana → Kansas • Idaho → Illinois • Georgia → Hawaii • Maryland → Michigan • Kentucky → Maine • Minnesota → Missouri • Montana → Montana • Nebraska → New Jersey • Ohio → Pennsylvania • New Mexico → North Dakota • Rhode Island → South Dakota • Tennessee → Utah • Vermont → West Virginia • Wisconsin → Wyoming
  18. Tools

    • "Explain" method
      Displays "explain plan" information about a query from the database:
      db.coll.find({...}).explain()
    • Dex, the Index Bot
      Dex is a MongoDB performance tuning tool that compares queries to the available indexes in the queried collection(s) and generates index suggestions based on simple heuristics.