Slide 1

Slide 1 text

Using the MongoDB Monitoring Service (MMS)
Mark Hillick, Engineer, 10gen - @markofu – [email protected]
#mongodbdays

Slide 2

Slide 2 text

Agenda
1.  What, where, numbers?
2.  How?
3.  Measure Me!!!
4.  Alerting
5.  Security
6.  Documentation
7.  Futures
8.  Conclusion
9.  Questions?
MongoDB Monitoring Service, Mark Hillick

Slide 3

Slide 3 text

What, where, numbers?

Slide 4

Slide 4 text

What is MMS?
•  MongoDB monitoring SaaS solution with:
   –  Granularity: min, hour, day
   –  Alerting: host up / down, metrics etc
   –  Event tracking (server restart, step down, …)
•  Host management (auto discover)
•  Profiling
•  Hardware stats also

Slide 5

Slide 5 text

Why use MMS? (1)

Slide 6

Slide 6 text

Why use MMS? (2) •  Overview – Bird’s Eye –  Macro •  Drill down (minute by minute) –  Micro

Slide 7

Slide 7 text

Why use MMS? (3) •  Haz all teh things :) •  Tailored specifically for MongoDB •  Incredibly helpful for 10gen Support when troubleshooting

Slide 8

Slide 8 text

A few numbers … •  40k writes per second •  400 metrics per ping packet •  9 billion metrics recorded per day

Slide 9

Slide 9 text

How?

Slide 10

Slide 10 text

Set up MMS – it’s easy
•  Go to http://mms.10gen.com
   –  Create a new account or sign in with your JIRA user
   –  Pick an explicit company name
   –  Download and run the agent
   –  From the MMS dashboard, add a host to monitor

Slide 11

Slide 11 text

The MMS client (agent) •  Small Python app •  A single agent process –  Failover – multiple agents •  Connect to mms.10gen.com (SSL over TCP 443)

Slide 12

Slide 12 text

Host

Slide 13

Slide 13 text

Operational Stats

Slide 14

Slide 14 text

Measure me!!!

Slide 15

Slide 15 text

Metrics •  Source : http://www.kaushik.net/avinash/wp-content/uploads/2007/10/metrics.jpg

Slide 16

Slide 16 text

opcounters •  Count of every operation per second •  getMore – each batch of a query
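A minimal sketch of how per-second operation rates could be derived from two samples of the cumulative `opcounters` section of `serverStatus`. The helper name `opcounter_rates` and the sample numbers are hypothetical, and this is not how the MMS agent itself is implemented.

```python
# Hypothetical helper: turn two cumulative "opcounters" snapshots into
# per-second rates, roughly what an opcounters chart plots.
OPS = ("insert", "query", "update", "delete", "getmore", "command")


def opcounter_rates(prev, curr, elapsed_seconds):
    """Return operations per second for each counter between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for op in OPS:
        delta = curr.get(op, 0) - prev.get(op, 0)
        # Counters restart from zero when mongod restarts, so clamp a
        # negative delta to 0 instead of reporting a negative rate.
        rates[op] = max(delta, 0) / elapsed_seconds
    return rates


# Made-up sample values, taken 60 seconds apart.
prev = {"insert": 1000, "query": 5000, "update": 200,
        "delete": 10, "getmore": 300, "command": 900}
curr = {"insert": 1600, "query": 8000, "update": 260,
        "delete": 10, "getmore": 420, "command": 1500}

print(opcounter_rates(prev, curr, 60))
```

In practice the two snapshots would come from running `serverStatus` twice, one sampling interval apart.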

Slide 17

Slide 17 text

memory •  Mapped: total size of the data files on disk •  Virtual memory: 2 x mapped (with journaling) + process overhead •  Resident memory: data in RAM actively used
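The rule of thumb on this slide can be checked against the `mem` section of `serverStatus`, which reports `resident`, `virtual` and `mapped` in megabytes. The sketch below is illustrative only: `memory_summary` is a hypothetical name and the sample figures are invented.

```python
# Hypothetical helper: compare virtual memory with the "2 x mapped" rule of
# thumb for a journaled mongod. All values are in megabytes, as in the
# "mem" section of serverStatus.
def memory_summary(mem, journaling=True):
    mapped = mem["mapped"]
    # With journaling the data files are mapped twice, so the expected
    # baseline for virtual memory is twice the mapped size.
    expected_virtual = mapped * 2 if journaling else mapped
    return {
        "resident_mb": mem["resident"],
        "mapped_mb": mapped,
        "virtual_mb": mem["virtual"],
        # Whatever is left over is process overhead (connections, heap, ...).
        "overhead_mb": mem["virtual"] - expected_virtual,
    }


# Made-up sample values.
sample_mem = {"resident": 3000, "virtual": 16500, "mapped": 8000}
print(memory_summary(sample_mem))
```

An overhead figure that keeps growing while mapped stays flat is worth a closer look, for example at the number of open connections.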

Slide 18

Slide 18 text

Page faults •  Disk IO •  Readahead
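As a rough illustration, a page fault rate can be computed from the cumulative `extra_info.page_faults` counter that `serverStatus` reports on Linux. The function name and the sample values are assumptions made for this sketch.

```python
# Hypothetical helper: page faults per second between two serverStatus
# samples. "extra_info.page_faults" is a cumulative counter.
def page_fault_rate(prev_status, curr_status, elapsed_seconds):
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    prev = prev_status["extra_info"]["page_faults"]
    curr = curr_status["extra_info"]["page_faults"]
    # Clamp to zero in case the counter was reset by a restart.
    return max(curr - prev, 0) / elapsed_seconds


# Made-up samples taken 60 seconds apart.
before = {"extra_info": {"page_faults": 1200}}
after = {"extra_info": {"page_faults": 1500}}
print(page_fault_rate(before, after, 60))
```

A sustained high rate suggests the working set no longer fits in RAM, which is when disk IO and readahead settings start to matter.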

Slide 19

Slide 19 text

Lock % •  Percentage of time spent in the write lock •  From 2.2: each database has its own lock
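One way to approximate a global lock percentage is to compare how much `globalLock.lockTime` grew against how much `globalLock.totalTime` grew between two `serverStatus` samples (both in microseconds). This sketch assumes those field names as reported by mongod versions of this era, and `lock_percent` is a hypothetical helper. Per-database figures from 2.2 onwards are reported elsewhere in the output and are not covered here.

```python
# Hypothetical helper: percentage of wall-clock time spent holding the
# global write lock between two serverStatus samples.
# Assumes "globalLock.totalTime" and "globalLock.lockTime" are cumulative
# microsecond counters.
def lock_percent(prev_status, curr_status):
    prev = prev_status["globalLock"]
    curr = curr_status["globalLock"]
    total = curr["totalTime"] - prev["totalTime"]
    locked = curr["lockTime"] - prev["lockTime"]
    if total <= 0:
        # No time elapsed, or the counters were reset by a restart.
        return 0.0
    return 100.0 * locked / total


# Made-up samples: 60 s elapsed, 3 s of it spent in the write lock.
s0 = {"globalLock": {"totalTime": 0, "lockTime": 0}}
s1 = {"globalLock": {"totalTime": 60000000, "lockTime": 3000000}}
print(lock_percent(s0, s1))
```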

Slide 20

Slide 20 text

Background flush •  Flush every 60 seconds •  Watch: if flush time gets close to sync delay
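The "watch" rule on this slide could be expressed as a small check against the `backgroundFlushing.last_ms` value from `serverStatus`. The 60-second default mirrors the flush interval mentioned above. The function name and the 0.5 warning threshold are arbitrary assumptions for illustration.

```python
# Hypothetical helper: flag when the last background flush took a large
# fraction of the sync delay. The threshold is an arbitrary example value.
def flush_pressure(background_flushing, syncdelay_seconds=60, threshold=0.5):
    last_seconds = background_flushing["last_ms"] / 1000.0
    ratio = last_seconds / syncdelay_seconds
    # A ratio near 1.0 means one flush barely finishes before the next starts.
    return ratio, ratio >= threshold


# Made-up samples.
slow = {"last_ms": 45000, "average_ms": 30000, "flushes": 120}
fast = {"last_ms": 1200, "average_ms": 900, "flushes": 120}
print(flush_pressure(slow))
print(flush_pressure(fast))
```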

Slide 21

Slide 21 text

Replication •  On primary: amount of time in oplog •  On secondary: replication delay to primary
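Both replication numbers reduce to timestamp arithmetic. The sketch below assumes the oplog's first and last entry times are already known, and that member documents look like the `members` array of `replSetGetStatus` (`name`, `stateStr`, `optimeDate`). The helper names and all sample data are hypothetical.

```python
from datetime import datetime


# Hypothetical helper: how much time the oplog currently covers.
def oplog_window_seconds(first_entry_time, last_entry_time):
    return (last_entry_time - first_entry_time).total_seconds()


# Hypothetical helper: how far each secondary is behind the primary,
# based on the last operation time each member reports.
def replication_lag_seconds(members):
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Made-up sample data.
window = oplog_window_seconds(datetime(2013, 1, 1, 0, 0, 0),
                              datetime(2013, 1, 2, 12, 0, 0))
members = [
    {"name": "db1:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2013, 1, 2, 12, 0, 0)},
    {"name": "db2:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2013, 1, 2, 11, 59, 45)},
]
print(window)
print(replication_lag_seconds(members))
```

A secondary whose lag approaches the oplog window risks falling off the end of the oplog and needing a full resync.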

Slide 22

Slide 22 text

Metrics that we discussed •  Opcounters •  Memory •  Page Faults •  Lock % •  Background Flush •  Replication

Slide 23

Slide 23 text

Metrics for performance •  Resident memory: how much data is in RAM? •  Page Faults: paging to disk? Readahead? •  Journal commits in write lock: put the journal on a separate disk •  High background flush time: reduce the sync delay to smooth out writes

Slide 24

Slide 24 text

Alerting

Slide 25

Slide 25 text

Alerts - Config

Slide 26

Slide 26 text

All good 

Slide 27

Slide 27 text

Alerts - Closed

Slide 28

Slide 28 text

Events

Slide 29

Slide 29 text

Events

Slide 30

Slide 30 text

Security

Slide 31

Slide 31 text

What is sent? •  Purely metadata •  HTTPS & connections are outbound only (from the agent) •  Log transfer has to be turned on •  If profiling in db & MMS, then profiling data is sent

Slide 32

Slide 32 text

On-premise MMS •  Locally Hosted in Customer Infrastructure •  PCI, HIPAA, SOX etc •  Enterprise Customers (2.4)

Slide 33

Slide 33 text

Documentation

Slide 34

Slide 34 text

Docs? Where? •  Manual : https://mms.10gen.com/help/ •  FAQ : https://mms.10gen.com/docs/faq

Slide 35

Slide 35 text

Futures

Slide 36

Slide 36 text

Feature Request •  JIRA Ticket : MMSSUPPORT •  Non-CS : Google Group

Slide 37

Slide 37 text

Coming up… •  Data visualization •  New agent - Python to Go •  On-Premise Features

Slide 38

Slide 38 text

Conclusion

Slide 39

Slide 39 text

Conclusion •  Easy to use •  Macro & micro •  Detailed monitoring features •  Helps 10gen Support immensely

Slide 40

Slide 40 text

Thanks •  Thanks to Mike Stimpson for the awesome pics :) http://imgur.com/a/0XvKw (I bought his book, btw!)

Slide 41

Slide 41 text

Questions?
Mark Hillick, Engineer, 10gen & Star Wars fan - @markofu – [email protected]
#mongodbdays