
Monitorama Cyanite Workshop


Pierre-Yves Ritschard

June 17, 2015




  2. @PYR
    CTO at exoscale, Swiss Cloud Hosting.
    Open source developer: pithos, cyanite, riemann, collectd.
  3. AIM OF THIS TALK
    Presenting graphite and its ecosystem
    Presenting cyanite
    Feedback on cyanite usage
  4. OUTLINE
    Graphite overview
    The problem with graphite
    Cyanite: approach and internals
    Production feedback

  6. FROM THE SITE
    Graphite does two things:
    1. Store numeric time-series data
    2. Render graphs of this data on demand
    http://graphite.readthedocs.org
  7. SCOPE
    A metrics tool
    Not a complete monitoring solution
    Interacts with metric submission tools
    Optional event storage
  8. GRAPHITE COMPONENTS
    whisper
    carbon
    graphite-web

  9. WHISPER
    RRD-like storage library
    Written in python
    Each file contains different roll-up periods and an aggregation method
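
To make "roll-up periods and an aggregation method" concrete, here is a minimal sketch that collapses fine-grained points into coarser buckets. It is not whisper's code: the helper name `roll_up` and the sample data are hypothetical, and the bucketing rule (align each timestamp down to a multiple of the step) is an assumption made for illustration.

```python
from statistics import mean

def roll_up(points, step, aggregate=mean):
    """Collapse (timestamp, value) points into buckets of `step` seconds.

    Hypothetical helper: each bucket is keyed by its start timestamp and
    reduced with `aggregate` (for example mean, max, min or sum).
    """
    buckets = {}
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % step)
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]

# Six points at 10-second resolution rolled up to 30-second resolution.
raw = [(0, 1.0), (10, 2.0), (20, 3.0), (30, 10.0), (40, 20.0), (50, 30.0)]
averaged = roll_up(raw, 30)
peaks = roll_up(raw, 30, aggregate=max)
```

The choice of aggregation matters: averaging smooths spikes away, while `max` keeps them at the cost of hiding typical values.
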
  10. CARBON
    Asynchronous (twisted) TCP and UDP service to input time-series data
    Simple storage rules
    Split across several daemons
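
For reference, carbon's plaintext input is one line per data point: metric path, value and Unix timestamp separated by spaces. The sketch below formats such a line and, optionally, sends it over TCP. The function names, the host `127.0.0.1` and the use of port 2003 as a default are assumptions for illustration.

```python
import socket
import time

def format_metric(path, value, timestamp=None):
    """Build one plaintext line: metric path, value, Unix timestamp, newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"

def send_metric(line, host="127.0.0.1", port=2003):
    """Send a formatted line over TCP. Host and port are assumptions."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))

line = format_metric("collectd.web01.cpu-0.user", 42.5, 1434528000)
# send_metric(line)  # uncomment when a carbon-compatible listener is reachable
```
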
  11. CARBON-CACHE
    Main carbon daemon
    Temporarily caches values to RAM
    Writes out to whisper
  12. CARBON-AGGREGATOR
    Aggregates data and forwards to carbon-cache
    Less I/O strain on the filesystem
    At the expense of resolution
  13. CARBON-RELAY
    Provides sharding and replication
    Forwards to appropriate carbon-cache processes based on a provided hashing method
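
One way to picture hash-based forwarding is a small consistent-hash ring, sketched below. This is a simplification written for illustration, not carbon-relay's implementation: the class name `HashRing`, the node names, the replica count and the choice of MD5 positions are all assumptions.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring; a simplification, not carbon-relay's code."""

    def __init__(self, nodes, replicas=100):
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._position(f"{node}:{i}"), node))
        self._ring.sort()

    @staticmethod
    def _position(key):
        # The first 4 hex digits of the MD5 digest give a position in [0, 65535].
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:4], 16)

    def node_for(self, metric_path):
        """Return the node owning the first ring entry at or after the metric."""
        position = self._position(metric_path)
        index = bisect.bisect_left(self._ring, (position, ""))
        return self._ring[index % len(self._ring)][1]

nodes = ["cache-a:2004", "cache-b:2004", "cache-c:2004"]  # hypothetical daemons
ring = HashRing(nodes)
target = ring.node_for("collectd.web01.cpu-0.user")
```

Every relay and every reader has to build the same ring from the same node list, which is why the topology must be known up front on all nodes.
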
  14. GRAPHITE-WEB
    Simple Django-based HTTP API
    Persists configuration to SQL
    Data query and manipulation through a very simple DSL
    Graph rendering
    Composer client interface to build graphs

        ## sum CPU values
        sumSeries("collectd.web01.cpu-*")
        ## provide memory percentage
        alias(asPercent(web01.mem.used, sumSeries(web01.mem.*)), "mem percent")
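
DSL expressions like the ones on this slide are passed to graphite-web as the `target` parameter of its `/render` endpoint. A minimal sketch of building such a URL follows; the helper name `render_url` and the host `http://graphite-host` are placeholders, not names from the talk.

```python
from urllib.parse import urlencode

def render_url(base_url, target, since="-1h", fmt="json"):
    """Build a /render URL for a single target expression."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{base_url.rstrip('/')}/render?{query}"

# "http://graphite-host" is a placeholder host.
url = render_url("http://graphite-host", "sumSeries(collectd.web01.cpu-*)")
```
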



  18. MODULARITY IN GRAPHITE
    Recently improved
    A module can implement a storage strategy for graphite-web
    Carbon modularity is a bit harder
  19. THE GRAPHITE ECOSYSTEM
    A wealth of tools are now graphite

  20. STATSD
    Very popular metric service to integrate within applications.
    Aggregates events in n second windows
    Ships off to graphite

        statsd.increment 'session.open'
        statsd.gauge 'session.active', 370
        statsd.timing 'pdf.convert', 320
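
The "n second windows" idea can be sketched as follows: increments that fall into the same window are summed and flushed as one value. This only illustrates windowing for counters and is not statsd's implementation; the function name and event tuples are hypothetical.

```python
from collections import defaultdict

def aggregate_counters(events, window):
    """Sum counter increments per (window start, metric name).

    `events` is an iterable of (timestamp, name, increment) tuples.
    """
    totals = defaultdict(int)
    for timestamp, name, increment in events:
        window_start = timestamp - (timestamp % window)
        totals[(window_start, name)] += increment
    return dict(totals)

events = [
    (100, "session.open", 1),
    (103, "session.open", 1),
    (111, "session.open", 1),
]
flushed = aggregate_counters(events, window=10)
```
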
  21. COLLECTD
    Very popular collection daemon with a graphite destination
    Every conceivable system metric
    A wealth of additional metric sources (such as a fast statsd server)

        <plugin write_graphite>
          <carbon>
            Host "graphite-host"
          </carbon>
        </plugin>
  22. GRAPHITE-API
    Alternative to graphite-web
    Shares data manipulation code
    No persistence of configuration
  23. GRAFANA
    Quickly becoming the default graphite visualization front-end
    Inspired by the kibana project for logstash
    Optional persistence to elasticsearch for configuration
  24. RIEMANN
    Distributed system monitoring solution

        (def graph! (graphite {:host "graphite-server"}))
        (streams
          (where (service "http.404")
            (rate 5 graph!)))
  25. AND A LOT MORE
    syslog-ng
    logstash
    descartes
    tasseo
    jmxtrans

  26. GREAT PROJECT
    Active and friendly developer community
    Growing ecosystem
    Very few contenders

  28. ESSENTIALLY A SINGLE-HOST SOLUTION
    Built in the days when cacti reigned
    Innovative project at the time, which decoupled collection from storage and display
  29. THE WHISPER FILE FORMAT
    One file per metric
    Optimized for space, not speed
    Plenty of seeks
    Only shared storage option is NFS…
    In many ways can be seen as RRD in python
  30. SCALING STRATEGIES
    Tacked on after the fact
    The decoupled architecture means that both graphite-web and carbon need upfront knowledge on the locations of shard

  32. IT GETS A BIT HAIRY
    Cluster topology must be stored on all nodes
    Manual replication mechanism (through carbon-relay)
    Changing cluster topology means re-assigning shards by hand
  33. WHAT GRAPHITE CAN KEEP
    Persistence of configuration
    Local data manipulation

  34. WHAT GRAPHITE WOULD NEED
    Automatic shard assignment
    Replication
    Easy management
    Easy cluster topology changes (horizontal scalability)
  35. THE CYANITE APPROACH
    Leveraging Apache Cassandra to store time-series
    Leveraging Graphite for the interface
  36. A CASSANDRA-BACKED CARBON REPLACEMENT
    Written in clojure
    Async I/O & Threads
    No more whisper files
    Horizontally scalable (stateless!)
    Interfaced with graphite-web through graphite-cyanite
  37. CYANITE DUTIES
    Providing graphite-compatible input methods (carbon listeners)
    Providing a way to retrieve metric names and metric time-series
    A metric-store
    A path-store
    Both pluggable
    The rest is up to the graphite ecosystem, through graphite-cyanite
    The recommended companion is graphite-api
  38. GETTING UP AND RUNNING
    A simple configuration file

        carbon:
          host: "127.0.0.1"
          port: 2003
          readtimeout: 30
          rollups:
            - period: 60480
              rollup: 10
            - period: 105120
              rollup: 600
        http:
          host: "0.0.0.0"
          port: 8080
        logging:
          level: info
          files:
            - "/var/log/cyanite/cyanite.log"
        store:
          cluster: 'localhost'
          keyspace: 'metric'
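
A quick way to read the two `rollups` entries in the configuration is to multiply them out. The sketch assumes `rollup` is the resolution in seconds and `period` the number of points kept at that resolution; that reading is an assumption not spelled out on the slide, and the helper name is hypothetical.

```python
def retention_seconds(period, rollup):
    """Time span covered: number of points times seconds per point (assumed)."""
    return period * rollup

rollups = [{"period": 60480, "rollup": 10}, {"period": 105120, "rollup": 600}]
for entry in rollups:
    days = retention_seconds(entry["period"], entry["rollup"]) / 86400
    print(f"{entry['rollup']}s resolution kept for {days:g} days")
```

Under that assumption the configuration keeps 10-second points for 7 days and 10-minute points for 730 days.
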

        index:
          use: "io.cyanite.es_path/es-rest"
          index: "cyanite_paths"
          url: "http://search.internal.example.com:9200"
  40. GRAPHITE-CYANITE
    with graphite-web:

        STORAGE_FINDERS = ('cyanite.CyaniteFinder',)
        CYANITE_URLS = ('http://host:port',)

    with graphite-api:

        cyanite:
          urls:
            - http://cyanite-host:port
        finders:
          - cyanite.CyaniteFinder
  41. LEADING ARCHITECTURE DRIVERS
    Simplicity
    Optimize for speed
    As few moving parts as possible
    Multi-tenancy
    Resource efficiency
    Remain compatible with the graphite ecosystem


  44. CASSANDRA
    Good at workloads with a high write-to-read ratio
    No manual shard allocation or reassignment
    Wide columns

        CREATE TABLE "metric" (
          tenant text,
          period int,
          rollup int,
          path text,
          time bigint,
          data list<double>,
          PRIMARY KEY ((tenant, period, rollup, path), time)
        )
  46. WIDE COLUMNS
    Each row has a key, called the partitioning key
    Here a composite of tenant, period, rollup and path
    Each row has an arbitrary number of columns (not homogeneous)
    Columns are sorted by a clustering key
    Here, the timestamp
    Columns may have TTLs

  48. WORK IN PROGRESS ITEMS
    DSL support
    Pickle support
    Path storage
    Event storage
    Input methods
    Integrations
    Docs
  49. DSL SUPPORT
    Still on-going. Parser is finished.
    Multi-method based implementation

        (defmethod apply-transform :absolute
          [_ series]
          (map-values (fn [point] (Math/abs point)) series))
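
For readers less familiar with Clojure multimethods, the same dispatch-by-name idea can be sketched in Python with a registry of functions. This is an analogy only, not cyanite's code; the registry, decorator and function names are hypothetical.

```python
TRANSFORMS = {}

def transform(name):
    """Register a function under a DSL function name (hypothetical registry)."""
    def register(func):
        TRANSFORMS[name] = func
        return func
    return register

@transform("absolute")
def absolute(series):
    """Replace every point with its absolute value."""
    return [abs(point) for point in series]

def apply_transform(name, series):
    """Dispatch on the function name, much as a multimethod dispatches on a keyword."""
    return TRANSFORMS[name](series)

result = apply_transform("absolute", [-1.5, 2.0, -3.0])
```

Adding a DSL function then means registering one more entry, without touching the parser or the dispatcher.
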
  50. PICKLE SUPPORT
    Pickle is a painful protocol
    100% clojure implementation: https://github.com/pyr/pickler
    Input is ready but not fully integrated.
    This is the easy way in for cyanite in your infra
    Via carbon-relay
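
On the wire, carbon's pickle protocol is a batch of `(path, (timestamp, value))` tuples, pickled and prefixed with a 4-byte big-endian length. The sketch below shows that framing from the Python side, which is what a non-Python receiver has to decode; the function names and sample metrics are hypothetical.

```python
import pickle
import struct

def frame_metrics(metrics):
    """Encode [(path, (timestamp, value)), ...] as one length-prefixed message."""
    payload = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(payload))  # 4-byte big-endian length prefix
    return header + payload

def unframe_metrics(message):
    """Decode a single framed message; only unpickle data you trust."""
    (length,) = struct.unpack("!L", message[:4])
    return pickle.loads(message[4:4 + length])

batch = [
    ("web01.cpu", (1434528000, 0.5)),
    ("web01.mem.used", (1434528000, 1024.0)),
]
message = frame_metrics(batch)
```
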
  51. PATH STORAGE
    Suboptimal at the moment
    Having to go to ES is sad
    Leveraging user-provided secondary indexes in Cassandra would be great
    Won't work out of the box
  52. EVENT STORAGE
    Unplanned at the moment.
    Should it really be the graphite ecosystem's responsibility?
  53. ALTERNATIVE INPUT METHODS
    Support queue input of metrics
    Collectd & Logstash already support shipping graphite data to Apache Kafka & RabbitMQ.
    Support the statsd protocol directly.
  54. PROVIDE A CYANITE LIBRARY
    Easy, standard-compliant storage from JVM based

  55. STANDARD BATCH OPERATION RECIPES
    Compactions of rolled up series
    Dynamic thresholds
    Great opportunity to leverage the cassandra & spark interaction
  56. DOCS
    Current state: a disgrace


  58. A WORD OF WARNING
    "Breaking news, building a scalable OSS TSDB is not as easy as bolting on NoSQL."
    Jason Dixon (@obfuscurity), 5:48 AM, 31 May 2015
  59. WHAT YOU GET
    Trading off the complexity of dealing with whisper for the complexity of dealing with cassandra (and optionally ES).
  60. HOW WE USE IT
    Not 100% dogfood
    Still some metrics in carbon/whisper.
    Gradually moving all input to happen through Kafka.
    Cyanite used for lookup.
  61. ELSEWHERE
    Few known installations (in the 10s).
    Always large.
    Most used cassandra previously.
  62. SCHEMA HURDLES
    Partitioning for collection intervals < 10s.
    Cassandra CQL collection types have significant overhead.
    Cells should be more compact.
    It is a trade-off to avoid read-then-write.
    Kafka helps solve this elegantly.
    It's a big requirement list for "just" metrics though.
  63. MAINTENANCE (CYANITE)
    Prune old metrics with cyanite-utils: https://github.com/WrathOfChris/cyanite-utils
    Whisper to Cyanite conversion: https://gist.github.com/deniszh/7986974
  64. MAINTENANCE (CASSANDRA)
    The usual applies
    Schedule regular repairs of your clusters
    Follow releases
    Best supported version: 2.1.x
    Use DateTieredCompactionStrategy
  65. SCALING
    Cyanite is stateless
    Colocate cassandra and cyanite daemons
    Split Data/Proxy nodes for huge deployments
    Haproxy to distribute queries
  66. OVERALL SENTIMENT
    A few pending things to go in
    Not (yet) for the faint of heart
    Gets the job done, better maintenance story especially if used to Cassandra & ES.
  67. THANKS!
    Cyanite owes a lot to:
    Max Penet (@mpenet) for the great alia & jet libraries
    Bruno Renie (@brutasse) for graphite-api, graphite-cyanite and the initial nudge
    Datastax for the awesome cassandra java-driver
    Its contributors
    Apache Cassandra obviously
    @pyr