Slide 1

Slide 1 text

@KeithResar _FROM METRICS TO INSIGHT_ Prometheus by Example Keith Resar, Red Hat

Slide 2

Slide 2 text

_WHAT IS PROMETHEUS_ Architecture and Components

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Prometheus at a Glance
● A multi-dimensional data model with time series data identified by metric name and key/value pairs
● Time series collection happens via a pull model over HTTP
● No reliance on distributed storage; single server nodes are autonomous
● PromQL, a flexible query language to leverage this dimensionality
● Pushing time series is supported via an intermediary gateway
● Targets are discovered via service discovery or static configuration
● Multiple modes of graphing and dashboarding support
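
To make the first bullet concrete, here is a minimal Python sketch of a data model in which a time series is identified by its metric name plus its key/value label pairs. The names `series_key`, `append`, and `store`, and the instance values `a:9104` and `b:9104`, are hypothetical and chosen only for illustration; this is not how Prometheus itself is implemented.

```python
from typing import Dict, FrozenSet, List, Tuple

# A series is identified by its metric name plus its set of label pairs.
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]
Sample = Tuple[float, float]  # (unix timestamp, value)

store: Dict[SeriesKey, List[Sample]] = {}


def series_key(name: str, **labels: str) -> SeriesKey:
    """Build an identity that does not depend on label order."""
    return name, frozenset(labels.items())


def append(name: str, timestamp: float, value: float, **labels: str) -> None:
    """Append one sample to the series identified by name and labels."""
    store.setdefault(series_key(name, **labels), []).append((timestamp, value))


# Same metric name, different label values: two distinct time series.
append("mysql_requests_total", 1549918896.0, 10, job="metrics", instance="a:9104")
append("mysql_requests_total", 1549918911.0, 12, job="metrics", instance="a:9104")
append("mysql_requests_total", 1549918896.0, 7, job="metrics", instance="b:9104")

print(len(store))
```

Each distinct label combination gets its own list of samples, which is the dimensionality that PromQL queries can then select and aggregate on.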

Slide 5

Slide 5 text

Major Prometheus Components...
Prometheus (Core)
Self-contained monitoring system and time-series database. Pulls metrics from external sources, pushes alerts to external services, and accepts inbound requests via HTTP.

Slide 6

Slide 6 text

Storage and TSDB
1. Incoming samples are not persisted immediately; they are first stored in memory.
2. A write-ahead log (WAL) secures incoming samples and can be replayed in the event of a server crash.
3. Ingested samples are compacted and persisted in the TSDB in files with 2-hour chunks.
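
The write-ahead idea on this slide can be sketched with a toy store: every sample is appended to a log on disk before it is added to the in-memory list, so a restarted process can rebuild its memory by replaying the log. The class `TinyStore` and the JSON-lines file layout are assumptions made for this sketch and bear no relation to the real Prometheus WAL format; compaction into 2-hour chunks is left out.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy store: log each sample to disk first, then keep it in memory."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.head = []  # in-memory samples, lost if the process dies
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state from the log after a restart.
        if os.path.exists(self.wal_path):
            with open(self.wal_path) as wal:
                self.head = [json.loads(line) for line in wal if line.strip()]

    def append(self, series, timestamp, value):
        record = [series, timestamp, value]
        # Write and sync the log entry before touching memory.
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps(record) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        self.head.append(record)


wal_file = os.path.join(tempfile.mkdtemp(), "wal.jsonl")
store = TinyStore(wal_file)
store.append("up", 1549918896.263, 1)
store.append("up", 1549918911.263, 1)

# Simulate a crash and restart: a new instance replays the same log.
recovered = TinyStore(wal_file)
print(len(recovered.head))
```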

Slide 7

Slide 7 text

Prometheus (Core) Major Prometheus Components...

Slide 8

Slide 8 text

Service Discovery
Discover new targets via direct Kubernetes integration, or by modifying a watched JSON file.
[Diagram: Prometheus (Core), Service Discovery (K8s)]
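
For the watched-file route, the following Python sketch writes a target file in the shape used by file-based service discovery: a JSON list of groups, each with `targets` and optional `labels`. The file name `targets.json`, the helper `write_targets`, and the choice of target are assumptions for illustration; the Prometheus configuration that points at the file is not shown.

```python
import json
import os
import tempfile


def write_targets(path, target_groups):
    """Write the target list to a temp file, then rename it into place."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(target_groups, tmp, indent=2)
    # os.replace swaps the file in one step on the same filesystem.
    os.replace(tmp_path, path)


groups = [
    {"targets": ["45.76.27.239:9104"], "labels": {"job": "metrics"}},
]

targets_file = os.path.join(tempfile.mkdtemp(), "targets.json")
write_targets(targets_file, groups)

with open(targets_file) as handle:
    loaded = json.load(handle)
print(loaded[0]["targets"])
```

Renaming a finished temp file into place is a cautious choice here so that a watcher never reads a half-written file.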

Slide 9

Slide 9 text

Exporters
Prometheus scrapes data from exporter targets. Dozens of these are available, including: databases, hardware, storage, HTTP, APIs, logging, Kubernetes.
[Diagram: the scraper in Prometheus (Core) pulls from Exporters; Service Discovery (K8s)]
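
What a scrape actually pulls is plain text. As a rough sketch, the function below renders one metric in the Prometheus text exposition format, which an exporter would serve over HTTP (conventionally at `/metrics`). The helper `render_metric`, the help string, and the value 42 are made up for this example, and label-value escaping is deliberately left out.

```python
def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as text exposition lines.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
    ]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        selector = "{" + label_str + "}" if label_str else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


page = render_metric(
    "mysql_requests_total",
    "Total requests handled.",
    "counter",
    [({"job": "metrics"}, 42)],
)
print(page)
```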

Slide 10

Slide 10 text

HTTP Interface
Web UI: Simple Web UI Console
API Interface: consumed by your own scripts, Grafana, Web UI, and other Prometheus servers
[Diagram: Prometheus (Core), Exporters, Service Discovery (K8s), HTTP Interface]
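
As an example of "your own scripts" consuming the API, this Python sketch builds a URL for the instant-query endpoint `/api/v1/query` and parses a response body of the documented shape. The base URL `http://localhost:9090` and the sample response body are assumptions for illustration; no request is actually sent.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the HTTP API."""
    params = urlencode({"query": promql})
    return f"{base_url}/api/v1/query?{params}"


def parse_instant_vector(body):
    """Return (labels, value) pairs from an instant-vector response."""
    payload = json.loads(body)
    if payload["status"] != "success":
        raise RuntimeError("query failed")
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


url = instant_query_url("http://localhost:9090", 'up{job="metrics"}')

# Illustrative response body; a real one would come from the server.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "localhost:9090"},
             "value": [1549918896.263, "1"]},
        ],
    },
})
rows = parse_instant_vector(sample_body)
print(url)
print(rows)
```

Sample values arrive as strings in the JSON, which is why the sketch converts them with `float`.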

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

_QUERYING_ Finding and manipulating your metric data

Slide 13

Slide 13 text

Metric Types
● Counter: Monotonically increasing value that can only increase or be reset to zero on restart.
● Gauge: Numerical value that can arbitrarily go up and down.
● Histogram: Samples observations and counts them in buckets. It also provides a sum of all observed values.
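
The three types can be illustrated with toy Python classes. This is a sketch of the semantics only, not the API of any Prometheus client library; the class names, bucket bounds, and observed values are assumptions. The histogram uses cumulative buckets, where each bucket counts observations less than or equal to its upper bound.

```python
class Counter:
    """Value that only goes up (a restart would reset it to zero)."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


class Gauge:
    """Value that can be set, raised, or lowered freely."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Counts observations in cumulative buckets and tracks their sum."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.bucket_counts = [0] * len(self.bounds)
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        self.count += 1
        self.sum += value
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.bucket_counts[i] += 1


requests = Counter()
requests.inc()
requests.inc(2)

temperature = Gauge()
temperature.set(21.5)
temperature.dec(1.5)

latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 2.0):
    latency.observe(seconds)

print(requests.value, temperature.value, latency.bucket_counts, latency.count)
```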

Slide 14

Slide 14 text

Metric Naming and Labeling
mysql_requests_total{job="metrics"}

Slide 15

Slide 15 text

Metric Naming and Labeling
mysql_requests_total{job="metrics"} (highlighted: mysql)
Application Namespace Prefix
Most applications add a single-word prefix to metrics to define an application namespace. This is typically omitted from standardized metrics exported by client libraries.

Slide 16

Slide 16 text

Metric Naming and Labeling
mysql_requests_total{job="metrics"} (highlighted: requests)
Metric Base Name
Names the thing being measured; here, requests.

Slide 17

Slide 17 text

Metric Naming and Labeling
mysql_requests_total{job="metrics"} (highlighted: total)
Suffix
Describes the unit, given in plural form (e.g. seconds, bytes). In this example, total is unit-less and represents a count.

Slide 18

Slide 18 text

Metric Naming and Labeling
mysql_requests_total{job="metrics"} (highlighted: job="metrics")
Labels
Use labels to differentiate the characteristics of the thing that is being measured. Here the label job has the value "metrics".

Slide 19

Slide 19 text

Metric Naming and Labeling
mysql_requests_total{job="metrics"}
Time Series
Every unique combination of metric name and label values represents a new time series. Best practice is to avoid labels whose values have high cardinality (e.g. user IDs, email addresses).
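
A quick calculation shows why high-cardinality labels are discouraged. The sketch below computes an upper bound on the number of time series for one metric as the product of the distinct values per label. The label names `method`, `status`, and `user_id`, and the figure of 10,000 users, are hypothetical.

```python
from math import prod


def max_series(label_values):
    """Upper bound on series count: product of distinct values per label."""
    return prod(len(values) for values in label_values.values())


low_cardinality = {
    "job": ["metrics"],
    "method": ["GET", "POST"],
    "status": ["200", "404", "500"],
}

# Adding one label with a value per user multiplies the bound.
high_cardinality = dict(
    low_cardinality,
    user_id=[str(i) for i in range(10_000)],
)

print(max_series(low_cardinality))
print(max_series(high_cardinality))
```

With these assumed values the bound grows from 6 to 60,000 series for a single metric name.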

Slide 20

Slide 20 text

Query Data Types - Instant Vector
PromQL> up
Element                                  Value
up{instance="45.76.27.239:9104", ...}    1
up{instance="localhost:9090", ...}       1

Slide 21

Slide 21 text

Query Data Types - Range Vector
PromQL> up[1m]
Element                                  Value
up{instance="45.76.27.239:9104", ...}    1 @1549918896.263
                                         1 @1549918911.263
                                         1 @1549918926.263
                                         1 @1549918941.263
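
The difference between the two query data types can be sketched in Python using the four samples shown on this slide. An instant vector yields the most recent sample per series at the evaluation time, while a range vector yields every sample inside the window. The evaluation time `now`, the helper names, and the exact boundary handling are simplifying assumptions; the 5-minute lookback mirrors the Prometheus default.

```python
# (timestamp, value) samples for one series, taken from the slide.
samples = [
    (1549918896.263, 1),
    (1549918911.263, 1),
    (1549918926.263, 1),
    (1549918941.263, 1),
]


def instant_value(series, at, lookback=300.0):
    """Most recent sample at or before `at`, within the lookback window."""
    candidates = [s for s in series if at - lookback < s[0] <= at]
    return candidates[-1] if candidates else None


def range_values(series, at, window):
    """All samples in the window ending at `at` (like up[1m] for 60s)."""
    return [s for s in series if at - window < s[0] <= at]


now = 1549918950.0  # assumed evaluation time

latest = instant_value(samples, now)
window = range_values(samples, now, 60.0)
print(latest)
print(len(window))
```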

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

_VISUALIZATION_ Visualizing Analysis and Trends

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

_RULES_

Slide 26

Slide 26 text

Recording Rules
Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series. Querying the precomputed result will then often be much faster than executing the original expression every time it is needed. This is especially useful for dashboards, which need to query the same expression repeatedly every time they refresh.
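
The idea of precomputing can be sketched in Python: an aggregation is evaluated once and its result stored under a new series name, so later reads are a plain lookup. This is a toy model, not the Prometheus rule syntax; the recorded name `job:mysql_requests:sum`, the instance labels, and the values are hypothetical.

```python
from collections import defaultdict


def sum_by(series, label):
    """Aggregate (labels, value) pairs by one label, like sum by (label)."""
    totals = defaultdict(float)
    for labels, value in series:
        totals[labels[label]] += value
    return dict(totals)


def evaluate_recording_rule(store, record, source, label):
    """Compute the aggregate once and save it as a new set of series."""
    totals = sum_by(store[source], label)
    store[record] = [({label: key}, total) for key, total in sorted(totals.items())]


store = {
    "mysql_requests_total": [
        ({"job": "metrics", "instance": "a:9104"}, 120.0),
        ({"job": "metrics", "instance": "b:9104"}, 80.0),
    ],
}

# Run on each rule evaluation; dashboards then read the stored result.
evaluate_recording_rule(store, "job:mysql_requests:sum", "mysql_requests_total", "job")
print(store["job:mysql_requests:sum"])
```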

Slide 27

Slide 27 text

Alertmanager
1. Prometheus emits alerts based on rules evaluated at the core and pushes them to an external Alertmanager.
2. Alertmanager then manages those alerts, including silencing, inhibition, aggregation and sending out notifications via methods such as email, PagerDuty and HipChat.
[Diagram: Prometheus (Core), Exporters, Service Discovery (K8s), push to Alertmanager]

Slide 28

Slide 28 text

Alerting Rules
Alerting rules allow you to define alert conditions based on PromQL expressions and to send notifications about firing alerts to an external service. Whenever the alert expression results in one or more vector elements at a given point in time, the alert counts as active for these elements' label sets.
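
The second sentence can be sketched in Python: the alert expression acts as a filter over an instant vector, and every element that survives makes the alert active for that element's label set. The condition mimics an expression such as `up == 0`; the helper name and the value 0 for the second instance are invented for this example.

```python
def active_alerts(vector, condition):
    """Return the label sets of all elements that satisfy the condition."""
    return [labels for labels, value in vector if condition(value)]


# Instant vector for `up`; the 0 is a made-up outage for illustration.
up = [
    ({"instance": "45.76.27.239:9104"}, 1),
    ({"instance": "localhost:9090"}, 0),
]

firing = active_alerts(up, lambda value: value == 0)
print(firing)
```

If no element satisfies the condition, the result is empty and the alert is not active for any label set.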

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

_THANKS_ @KeithResar