Slide 1

Slide 1 text

Geoindexing with MongoDB. Leszek Krupiński, WebClusters 2012

Slide 2

Slide 2 text

About me

Slide 3

Slide 3 text

On-line since 1997

Slide 4

Slide 4 text

Funny times

Slide 5

Slide 5 text

1 hr of internet for 1 USD

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

First social site: GeoCities

Slide 9

Slide 9 text

My first web page

Slide 10

Slide 10 text

What do I do now

Slide 11

Slide 11 text

Day job: managing a team of developers for the Polish Air Force

Slide 12

Slide 12 text

On the side: consulting, optimizing, designing

Slide 13

Slide 13 text

Buzzwords incoming!

Slide 14

Slide 14 text

The Internet 2008

Slide 15

Slide 15 text

Web 2.0

Slide 16

Slide 16 text

http://en.wikipedia.org/wiki/File:Web_2.0_Map.svg CC-BY-SA-2.5

Slide 17

Slide 17 text

Be social in your bedroom

Slide 18

Slide 18 text

alone.

Slide 19

Slide 19 text

The Internet 2012

Slide 20

Slide 20 text

Web 3.0

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Why geospatial?

Slide 23

Slide 23 text

Needs shifted

Slide 24

Slide 24 text

Why? Because they could.

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

How to implement?

Slide 29

Slide 29 text

Database. Duh.

Slide 30

Slide 30 text

Keep, but also query

Slide 31

Slide 31 text

Is there a person at 53.438522,14.52198? Nope.
Is there a person at 53.438522,14.52199? Nope.
Is there a person at 53.438522,14.52200? Yeah, here’s Johnny!

Slide 32

Slide 32 text

Not too useful.

Slide 33

Slide 33 text

Give me nearby homies.
Within the range of 1 km there is:
• Al Gore (53.438625,14.52103)
• Bill Clinton (53.432531,14.55127)
• Johnny Bravo (53.438286,14.52363)

Slide 34

Slide 34 text

Now that’s better.

Slide 35

Slide 35 text

Geoindexing. Nothing new.

Slide 36

Slide 36 text

Oracle, PostgreSQL, Lucene/Solr, even MySQL (via extensions)

Slide 37

Slide 37 text

Oracle:
SELECT c.holding_company, c.location
FROM competitor c, bank b
WHERE b.site_id = 1604
  AND SDO_WITHIN_DISTANCE(c.location, b.location, 'distance=2 unit=mile') = 'TRUE'

Slide 38

Slide 38 text

SQL is so last year

Slide 39

Slide 39 text

Let’s use something cool

Slide 40

Slide 40 text

MongoDB. Because all the cool kids use NoSQL now

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

Why MongoDB?

Slide 43

Slide 43 text

Choose your NoSQL wisely.

Slide 44

Slide 44 text

NoSQL in MongoDB
• Document-based
• Queries (JS-like syntax)
• JSON-like storage

Slide 45

Slide 45 text

Why MongoDB?
Use cases
• Archiving
• Event logging
• Document and CMS
• Gaming
• High volume sites
• Mobile
• Operational datastore
• Agile development
• Real-time stats
Features
• Ad hoc queries
• Indexing
• Replication
• Load balancing
• File storage
• Aggregation
• Server-side JavaScript
• Capped collections
Source: http://en.wikipedia.org/wiki/Mongodb

Slide 46

Slide 46 text

Back to geo.

Slide 47

Slide 47 text

{ loc: [ 52.0, 21.0 ], name: "Warsaw", type: "City" }

Slide 48

Slide 48 text

db.nodes.ensureIndex({loc: '2d'})

Slide 49

Slide 49 text

That’s it.

Slide 50

Slide 50 text

Query
• Exact: db.places.find( { loc : [50,50] } )
• Near: db.places.find( { loc : { $near : [50,50] } } )
• Limit: db.places.find( { loc : { $near : [50,50] } } ).limit(20)
• Distance: db.places.find( { loc : { $near : [50,50] , $maxDistance : 5 } } ).limit(20)
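
To make the query semantics concrete, here is a plain-JavaScript sketch of what a flat (non-spherical) near query with a maximum distance and a limit computes. It runs without MongoDB; the sample points and the helper names `flatDistance` and `near` are made up for illustration and are not part of the MongoDB API.

```javascript
// Hypothetical sample documents, shaped like the ones in the slides.
const places = [
  { name: "A", loc: [50, 50] },
  { name: "B", loc: [52, 52] },
  { name: "C", loc: [51, 50] },
  { name: "D", loc: [60, 60] },
];

// Straight-line distance between two [x, y] pairs,
// in the same units as the coordinates themselves.
function flatDistance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// Return up to `limit` documents within `maxDistance` of `center`,
// nearest first. This mimics $near + $maxDistance + limit().
function near(docs, center, maxDistance = Infinity, limit = 100) {
  return docs
    .map((doc) => ({ doc, dis: flatDistance(doc.loc, center) }))
    .filter((hit) => hit.dis <= maxDistance)
    .sort((x, y) => x.dis - y.dis)
    .slice(0, limit)
    .map((hit) => hit.doc);
}

const hits = near(places, [50, 50], 5, 20);
console.log(hits.map((p) => p.name)); // A, C, B (D is too far away)
```

The real index avoids scanning every document; the sketch only shows which documents come back and in what order.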

Slide 51

Slide 51 text

Compound index
• db.places.ensureIndex( { location : "2d" , category : 1 } );
• db.places.find( { location : { $near : [50,50] }, category : 'coffee' } );

Slide 52

Slide 52 text

Bound queries
• box = [ [40.73083, -73.99756], [40.741404, -73.988135] ]
• db.places.find( {"loc" : {"$within" : {"$box" : box }} } )
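
As a rough illustration of the test behind a box query, the following plain-JavaScript sketch checks whether a point lies inside the rectangle spanned by two opposite corners. `inBox` is a hypothetical helper, not a MongoDB function, and the two test points are invented.

```javascript
// The box from the slide: two opposite corners.
const box = [[40.73083, -73.99756], [40.741404, -73.988135]];

// True if `point` lies inside (or on the edge of) the rectangle.
// Using min/max makes the check independent of corner order.
function inBox(point, [[x1, y1], [x2, y2]]) {
  const [x, y] = point;
  return (
    x >= Math.min(x1, x2) && x <= Math.max(x1, x2) &&
    y >= Math.min(y1, y2) && y <= Math.max(y1, y2)
  );
}

console.log(inBox([40.735, -73.99], box)); // true: inside the box
console.log(inBox([40.75, -73.99], box));  // false: first coordinate too large
```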

Slide 53

Slide 53 text

Problems

Slide 54

Slide 54 text

Units

Slide 55

Slide 55 text

Coordinates in arc units
Distance in kilometers

Slide 56

Slide 56 text

In query

Slide 57

Slide 57 text

earthRadius = 6378 // km
multi = 180.0 / (earthRadius * PI) // degrees per km
range = 3000 // km
… maxDistance : range * multi …
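
A minimal runnable sketch of the unit conversion, assuming a spherical Earth with the 6378 km radius used above. One degree of great-circle arc then corresponds to roughly 111 km, so kilometers are divided by that factor to get degrees and degrees are multiplied by it to get kilometers. The constant and function names are illustrative only.

```javascript
const EARTH_RADIUS_KM = 6378;

// Length of one degree of great-circle arc, about 111.3 km.
const KM_PER_DEGREE = (EARTH_RADIUS_KM * Math.PI) / 180;

// Kilometers -> degrees, for passing a range into a flat 2d query.
function kmToDegrees(km) {
  return km / KM_PER_DEGREE;
}

// Degrees -> kilometers, for reading distances back from the results.
function degreesToKm(degrees) {
  return degrees * KM_PER_DEGREE;
}

const range = 3000; // km
const maxDistance = kmToDegrees(range);
console.log(maxDistance.toFixed(2)); // roughly 27 degrees
```

The same factor is used in both directions, which keeps query input and result output consistent.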

Slide 58

Slide 58 text

In results

Slide 59

Slide 59 text

pointDistance = distances[0].dis / multi

Slide 60

Slide 60 text

Earth is not flat.

Slide 61

Slide 61 text

Problem: can’t use linear distance
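
One way to see the problem: the length of a degree of longitude shrinks with the cosine of the latitude, so the same difference in coordinates means different distances in different places. The sketch below assumes a spherical Earth with a 6378 km radius; the function name is illustrative.

```javascript
const EARTH_RADIUS_KM = 6378;

// Length in km of one degree of longitude at a given latitude.
function kmPerDegreeOfLongitude(latitudeDeg) {
  const latitudeRad = (latitudeDeg * Math.PI) / 180;
  const kmPerDegreeAtEquator = (EARTH_RADIUS_KM * Math.PI) / 180;
  return kmPerDegreeAtEquator * Math.cos(latitudeRad);
}

// Equator, roughly the latitude of Warsaw, and far north.
for (const lat of [0, 52, 80]) {
  console.log(lat, kmPerDegreeOfLongitude(lat).toFixed(1));
}
```

At 60 degrees of latitude a degree of longitude is only half as long as at the equator.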

Slide 62

Slide 62 text

Earth isn’t flat either.

Slide 63

Slide 63 text

Solution? Use approximation.

Slide 64

Slide 64 text

MongoDB has it built-in
distances = db.runCommand( {
  geoNear : "points",
  near : [0, 0],
  spherical : true,
  maxDistance : range / earthRadius /* to radians */
} ).results
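
For intuition, here is a standalone sketch of the kilometers-to-radians conversion and of the haversine formula, one common way to approximate distance on a sphere. The slides do not say which formula MongoDB uses internally, so treat this as an illustration rather than a description of the server. Points are given as [longitude, latitude], the order the MongoDB documentation asks for in spherical queries; all names are illustrative.

```javascript
const EARTH_RADIUS_KM = 6378;

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Kilometers -> radians, as in `range / earthRadius` above.
const kmToRadians = (km) => km / EARTH_RADIUS_KM;

// Great-circle distance in km between two [longitude, latitude] points.
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
      Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

console.log(kmToRadians(3000).toFixed(3));               // about 0.470
console.log(haversineKm([0, 0], [0, 1]).toFixed(1));     // one degree of latitude, about 111.3 km
```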

Slide 65

Slide 65 text

Focus: runCommand
distances = db.runCommand({ geoNear : "points" …

Slide 66

Slide 66 text

Sort by distance: only with runCommand

Slide 67

Slide 67 text

Automatically sorted
• db.runCommand( { geoNear : "places" , near : [50,50], num : 10 } );
• {
    "ns" : "test.places",
    "results" : [
      { "dis" : 69.29646421910687, "obj" : … },
      { "dis" : 69.29646421910687, "obj" : … },
      …
    ],
    …
  }

Slide 68

Slide 68 text

Demo

Slide 69

Slide 69 text

OpenStreetMap database of Poland imported into MongoDB

Slide 70

Slide 70 text

14,411,552 nodes

Slide 71

Slide 71 text

3GB of raw XML data

Slide 72

Slide 72 text

PHP in virtual machine

Slide 73

Slide 73 text

Imported about 100,000 nodes every 10 s.
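
A back-of-the-envelope estimate of the total import time, assuming the rate quoted here stayed roughly constant for the whole run (the slides do not confirm that):

```javascript
const totalNodes = 14411552;  // nodes in the Poland extract
const nodesPerBatch = 100000; // approximate figure from the slide
const secondsPerBatch = 10;

const estimatedSeconds = (totalNodes / nodesPerBatch) * secondsPerBatch;
const estimatedMinutes = estimatedSeconds / 60;

console.log(Math.round(estimatedMinutes) + " minutes"); // about 24 minutes
```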

Slide 74

Slide 74 text

Pretty cool, eh?

Slide 75

Slide 75 text

Kudos to Derick Rethans. Part of this talk was inspired by his talk.

Slide 76

Slide 76 text

Questions?

Slide 77

Slide 77 text

Thanks! Rate me at https://joind.in/talk/view/6475

Slide 78

Slide 78 text

Geoindexing with MongoDB: supplement. Leszek Krupiński, WebClusters 2012

Slide 79

Slide 79 text

Why MongoDB?

Slide 80

Slide 80 text

Evaluate.

Slide 81

Slide 81 text

PostGIS is cool too. (but it’s SQL, meh)

Slide 82

Slide 82 text

Why MongoDB?
Use cases
• Archiving
• Event logging
• Document and CMS
• Gaming
• High volume sites
• Mobile
• Operational datastore
• Agile development
• Real-time stats
Features
• Ad hoc queries
• Indexing
• Replication
• Load balancing
• File storage
• Aggregation
• Server-side JavaScript
• Capped collections
Source: http://en.wikipedia.org/wiki/Mongodb

Slide 83

Slide 83 text

If you need other features of MongoDB, use it.

Slide 84

Slide 84 text

If you don’t, evaluate.

Slide 85

Slide 85 text

Evaluate.

Slide 86

Slide 86 text

Demo (hopefully)

Slide 87

Slide 87 text

Questions?

Slide 88

Slide 88 text

Please leave feedback! Rate me at https://joind.in/6475