How to monitor Cosmos validator by Prometheus

Rico Chen

June 30, 2022

Transcript

  1. About me
    • Translated Google's The Site Reliability Workbook for the Taiwan community
    • DevOps Taiwan volunteer
    • AWS Taiwan User Group volunteer
    • Writer of recommended DevOps articles for Starbugs Weekly
    • Gamer
    • Disney Frozen lover
  2. Agenda
    1. Blockchain Basic Mechanism
    2. Cosmos-SDK Basic Mechanism
    3. How to monitor?
    4. How to monitor remotely?
    5. Our architecture
    6. Reference
  3. Blockchain layers
    • Networking layer - Makes sure that each node receives transactions
    • Consensus layer - Makes sure that each node agrees on the same transactions to modify its local state
      ◦ PoW (Proof of Work)
      ◦ PoS (Proof of Stake)
    • Application layer - Given a transaction and the current state, returns a new state
      ◦ Bitcoin - Account balance
      ◦ Ethereum - EVM
      ◦ Chainlink - Provides external data
    Proof of Work (PoW) vs. Proof of Stake (PoS)
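The application layer described above is a state transition function: it takes the current state and a transaction and returns a new state. A minimal sketch for the account-balance case, assuming a hypothetical `Transfer` transaction type and a plain dictionary as the state (neither comes from the slides or from any real chain):

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Transfer:
    """Hypothetical transaction: move `amount` from sender to recipient."""
    sender: str
    recipient: str
    amount: int


def apply_transfer(state: Dict[str, int], tx: Transfer) -> Dict[str, int]:
    """Return a new balance map; the input state is left untouched."""
    if tx.amount <= 0:
        raise ValueError("amount must be positive")
    if state.get(tx.sender, 0) < tx.amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # copy so the old state stays valid
    new_state[tx.sender] = new_state[tx.sender] - tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state
```

Every node that applies the same transactions in the same order ends up with the same state, which is what the consensus layer has to guarantee.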
  4. Cosmos SDK
    • The goal of the Cosmos SDK is to allow developers to easily create custom blockchains from scratch that can natively interoperate with other blockchains
    • Blockchains built with the Cosmos SDK are generally referred to as application-specific blockchains
      ◦ ICO
      ◦ DeFi
      ◦ GameFi
    • Written in the Go programming language (gopher!!)
  5. Cosmos-SDK usage example
    • Cosmos-SDK covers the networking and consensus layers and part of the application layer.
      ◦ Accounts, balances, governance, etc.
    • Developers only need to focus on business logic.
      ◦ For example: DeFi, GameFi, ICO, etc.
  6. Prometheus
    • Prometheus is the monitoring solution for Cosmos-SDK
    • We can leverage the Prometheus cloud-native ecosystem
      ◦ node-exporter and process-exporter
      ◦ Grafana
      ◦ Alertmanager
      ◦ Kubernetes
  7. Tendermint metrics
    • Make sure our nodes keep syncing
    • The P2P peer count matches our expectations
      ◦ Sentry P2P peers > 20
      ◦ Validator P2P peers > 2
    • Does our validator miss blocks?
    • We only set up alert rules on these metrics
    Tendermint metrics doc
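The peer-count expectations above can be sketched as a small check against a node's metrics output. This is only an illustration, not the deck's actual alert rules: it assumes the node exposes a gauge named `tendermint_p2p_peers` in the Prometheus text format, and `read_gauge`, `peer_alert`, and `MIN_PEERS` are hypothetical names. In practice this comparison would normally live in a Prometheus alert rule handled by Alertmanager.

```python
from typing import Optional

# Thresholds from the slide: a sentry should have more than 20 peers,
# a validator more than 2.
MIN_PEERS = {"sentry": 20, "validator": 2}


def read_gauge(exposition: str, metric: str) -> Optional[float]:
    """Return the first sample of `metric` in Prometheus text format, if any."""
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, rest = line.partition("{")
        if rest:
            # Labelled sample: name{labels} value [timestamp]
            _, _, rest = rest.rpartition("}")
        else:
            # Unlabelled sample: name value [timestamp]
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if name.strip() == metric and fields:
            return float(fields[0])
    return None


def peer_alert(role: str, exposition: str) -> Optional[str]:
    """Return an alert message if the peer count is too low, else None."""
    peers = read_gauge(exposition, "tendermint_p2p_peers")  # assumed metric name
    if peers is None:
        return f"{role}: peer metric missing"
    if peers <= MIN_PEERS[role]:
        return f"{role}: only {peers:.0f} peers, expected more than {MIN_PEERS[role]}"
    return None
```

A missing metric is reported as an alert too, since a node that stops exporting metrics is at least as worrying as one with too few peers.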
  8. Cosmos-SDK built-in feature metrics
    • These metrics focus on the application layer. Most of the time, only developers look at these metrics on the dashboards
    Cosmos-SDK metrics
  9. Custom metrics
    • These metrics focus on the application layer. Most of the time, only developers look at these metrics on the dashboards
    • Not all Cosmos-SDK chains create Prometheus metrics for their custom features
      ◦ For example: Agoric custom metrics
  10. Telemetry monitoring
    • The mechanism is similar to that of a blockchain scanner website
    • No need to access the production environment
    • Be careful about the large number of requests generated by the exporter itself
      ◦ It can hit the rate limit of a public data node
      ◦ It can DoS (denial-of-service) your own data node
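One common way to limit the request volume mentioned above is to cache each response for a short time, so that repeated scrapes do not all reach the data node. The sketch below is an assumption about how this could be done, not something taken from the deck or from any particular exporter; `CachedFetcher` and its parameters are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedFetcher:
    """Wrap a fetch function so each key is re-fetched at most once per TTL."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no request to the data node
        value = self._fetch(key)  # stale or missing: one real request
        self._cache[key] = (now, value)
        return value
```

With a TTL close to the chain's block time, the exporter sends roughly one request per endpoint per block regardless of how often Prometheus scrapes it.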
  11. Prometheus exporters
    • solarlabsteam/cosmos-exporter
      ◦ Pros:
        ▪ It can be used on basically all Cosmos-SDK chains
        ▪ Includes consensus and application layer metrics
      ◦ Cons:
        ▪ Needs both gRPC and Tendermint RPC (most public nodes don't support gRPC)
  12. stakefish Twitter
    If you would like to learn more about stakefish or staking in general, please don't hesitate to reach out.