
What Does OpenTelemetry Have in Store for Us?

Ilya Kaznacheev

March 04, 2020

Transcript

  1. What does OpenTelemetry have in store for us?

  2. What is distributed tracing?

  3. 2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"


  4. 2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"


  5. 2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"
    2009/11/10 23:00:01 starting application...
    2009/11/10 23:00:02 connecting to database
    2009/11/10 23:00:03 connecting to file storage
    2009/11/10 23:00:04 connecting to meow API 1...
    2009/11/10 23:00:05 connecting to meow API 2...
    2009/11/10 23:00:06 connecting to meow API 3...
    2009/11/10 23:00:22 GET /meow -> [200] "meow"
    2009/11/10 23:00:33 GET /meow -> [200] "meow"
    2009/11/10 23:00:44 GET /woof -> [400] "bad request"



  10. distributed tracing



  15. OpenTracing vs OpenCensus

  16. OpenTracing
    - Created by the CNCF in late 2016
    - API only
    - Developed by LightStep, Uber, New Relic, Datadog, Red Hat…
    - Widely used

  17. OpenCensus
    - Created by Google in early 2018
    - API and implementation
    - Developers from Google, Microsoft, Splunk, Honeycomb…
    - Widely used

  18. Problems
    - Tighter coupling between microservices
    - Vendor lock-in
    - OSS libraries also want to do tracing
    without being vendor-specific
    - As a library developer, you (probably) do not want to tie
    yourself to a particular vendor either; you want to let the user
    make that choice

  19. Solution


  21. OpenTelemetry
    - Created by the same teams in mid-2019
    - API and implementation
    - The API is separated from the implementations (spec + libs)
      - library developers can use the API alone
      - application developers can use the API and an implementation
      - a “canonical” implementation for each programming language
    - Vendor-agnostic
    - Pluggable
    - Includes everything you need
      - tracing
      - monitoring
      - logging
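
The split between an API and its implementations can be sketched roughly as follows. This is a minimal illustration in plain Python, not the OpenTelemetry API itself: `Tracer`, `SpanHandle`, `NoopTracer`, `RecordingTracer`, `get_tracer`, and `set_tracer` are hypothetical names chosen for this example. The library function depends only on the API and keeps working whether or not the application installs an implementation.

```python
from abc import ABC, abstractmethod


class SpanHandle:
    """No-op span handle (hypothetical); an implementation may subclass it."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions raised inside the span


class Tracer(ABC):
    """API surface: the only thing a library author depends on."""

    @abstractmethod
    def start_span(self, name: str) -> SpanHandle:
        ...


class NoopTracer(Tracer):
    """Default used until the application installs an implementation."""

    def start_span(self, name: str) -> SpanHandle:
        return SpanHandle()


_tracer: Tracer = NoopTracer()


def get_tracer() -> Tracer:
    return _tracer


def set_tracer(tracer: Tracer) -> None:
    global _tracer
    _tracer = tracer


# --- library code: uses the API only ---
def fetch_meow() -> str:
    with get_tracer().start_span("fetch_meow"):
        return "meow"


# --- application code: chooses an implementation ---
class RecordingTracer(Tracer):
    """Toy implementation that just remembers span names."""

    def __init__(self) -> None:
        self.started = []

    def start_span(self, name: str) -> SpanHandle:
        self.started.append(name)
        return SpanHandle()
```

Calling `fetch_meow()` before `set_tracer(...)` runs against the no-op tracer; after `set_tracer(RecordingTracer())` the same library code starts reporting spans without any change on the library side.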

  22. OpenTracing architecture

  23. Backward compatibility
    OpenTracing:
    - first via bridges
    - then via the API
    - changes:
      - tags -> attributes
      - logs -> events
    OpenCensus:
    - first via bridges
    - first as part of OC, but via the API
    - then separately, via the API
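
The renames listed on this slide (tags -> attributes, logs -> events) suggest what a bridge has to do at the data level. The sketch below assumes a made-up dictionary layout for an OpenTracing-style span; `bridge_span` and the dictionary keys are hypothetical and are not taken from any real bridge package.

```python
def bridge_span(ot_span: dict) -> dict:
    """Map a hypothetical OpenTracing-style span dict to the new naming.

    Assumed input layout (illustrative only):
      {"operation_name": str,
       "tags": {key: value},
       "logs": [(timestamp, {key: value}), ...]}
    """
    return {
        "name": ot_span["operation_name"],
        # tags -> attributes
        "attributes": dict(ot_span.get("tags", {})),
        # logs -> events
        "events": [
            {"timestamp": ts, "attributes": dict(fields)}
            for ts, fields in ot_span.get("logs", [])
        ],
    }


old_span = {
    "operation_name": "GET /woof",
    "tags": {"http.status_code": 400},
    "logs": [(1583280000.0, {"message": "bad request"})],
}
new_span = bridge_span(old_span)
```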

  24. Support
    - OpenTelemetry is the next version of OpenTracing and OpenCensus
    - OpenTelemetry is backward compatible with OpenTracing and OpenCensus
    - Once a stable OpenTelemetry release ships for a given language,
    development of OpenTracing and OpenCensus for it stops
    - OpenTracing and OpenCensus are supported for 2 years (from summer 2019)
    - Migration to OpenTelemetry is possible right away via bridges
    - New projects are expected to use the API

  25. Supported languages
    - Python
    - Java
    - JS (Node + browser)
    - Go
    - PHP
    - Ruby
    - .NET
    - Erlang
    - C++
    - Rust
    Status:
    - pre-release
    - production-ready in the second half of 2020
    - timelines differ between languages


  27. Tracing API Metrics API


  28. Tracing API
    Tracer - keeps track of the active Span.
    Functionality:
    - creates a new Span
    - returns the active Span
    - makes a given Span active
    SpanContext - serializable data passed between layers of the
    application (between different Spans).
    Contains TraceId, SpanId, TraceFlags, and other fields.
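
A minimal sketch of the two concepts on this slide, written in plain Python rather than against the real OpenTelemetry packages. The class and method names (`start_span`, `current_span`, `activate`, `deactivate`) are hypothetical. The slide does not name a wire format for `SpanContext`; the W3C Trace Context `traceparent` header layout (`version-traceid-spanid-flags`) is used here as an assumption.

```python
import secrets
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: int         # 128-bit identifier shared by the whole trace
    span_id: int          # 64-bit identifier of one span
    trace_flags: int = 1  # 1 = sampled

    def to_traceparent(self) -> str:
        # Assumed serialization: W3C Trace Context "traceparent" layout.
        return f"00-{self.trace_id:032x}-{self.span_id:016x}-{self.trace_flags:02x}"

    @classmethod
    def from_traceparent(cls, header: str) -> "SpanContext":
        _version, trace_id, span_id, flags = header.split("-")
        return cls(int(trace_id, 16), int(span_id, 16), int(flags, 16))


@dataclass
class Span:
    name: str
    context: SpanContext
    parent: Optional[SpanContext] = None


class Tracer:
    """Tracks the active Span with a simple stack (single-threaded sketch)."""

    def __init__(self) -> None:
        self._active: List[Span] = []

    def start_span(self, name: str, parent: Optional[SpanContext] = None) -> Span:
        # Without an explicit parent, the new span becomes a child of the active one.
        if parent is None and self._active:
            parent = self._active[-1].context
        trace_id = parent.trace_id if parent else secrets.randbits(128)
        return Span(name, SpanContext(trace_id, secrets.randbits(64)), parent)

    def current_span(self) -> Optional[Span]:
        return self._active[-1] if self._active else None

    def activate(self, span: Span) -> None:
        self._active.append(span)

    def deactivate(self) -> None:
        self._active.pop()


tracer = Tracer()
root = tracer.start_span("GET /meow")
tracer.activate(root)
child = tracer.start_span("connecting to database")
header = child.context.to_traceparent()  # what would cross a service boundary
```

A receiving service would call `SpanContext.from_traceparent(header)` and pass the result as `parent`, so its spans join the same trace.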

  29. Tracing API
    Span - represents a single operation in a trace (a function, a method, etc.).
    Spans nest inside one another along the stack (tree) of the request's path.
    Includes:
    - Name
    - Parent Span (Span, SpanContext, or null)
    - start timestamp
    - end timestamp
    - Map of Attributes
    - List of Events
    - Links to other Spans
    - Status and other metadata

  30. Tracing API
    A Span can:
    - Add an Attribute (key-value) - attributes/parameters of the Span
    - Add an Event (name, set of Attributes, timestamp) - a log entry inside the Span
    - Set the status (Ok, Cancelled, NotFound, and 14 more)
    - Change its name
    - Return its context
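
The Span fields and operations described in the slides can be modelled as a small data class. This is an illustrative sketch only: the method names (`set_attribute`, `add_event`, `set_status`, `update_name`, `end`) are assumptions made for this example, the status is a plain string rather than a full status enum, and context handling is left out for brevity.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Event:
    name: str
    attributes: Dict[str, Any]
    timestamp: float


@dataclass
class Span:
    name: str
    parent: Optional["Span"] = None
    start_time: float = field(default_factory=time.time)
    end_time: Optional[float] = None
    attributes: Dict[str, Any] = field(default_factory=dict)
    events: List[Event] = field(default_factory=list)
    links: List["Span"] = field(default_factory=list)
    status: str = "Ok"

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value

    def add_event(self, name: str, attributes: Optional[Dict[str, Any]] = None,
                  timestamp: Optional[float] = None) -> None:
        # An event is a timestamped log entry that lives inside the span.
        ts = time.time() if timestamp is None else timestamp
        self.events.append(Event(name, dict(attributes or {}), ts))

    def set_status(self, status: str) -> None:
        self.status = status

    def update_name(self, name: str) -> None:
        self.name = name

    def end(self) -> None:
        self.end_time = time.time()


root = Span("GET /meow")
child = Span("lookup", parent=root)
child.set_attribute("db.table", "cats")
child.add_event("row missing", {"id": 42}, timestamp=1.0)
child.set_status("NotFound")
child.update_name("lookup_cat")
child.end()
```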

  31. Metrics API
    Instruments - metric collection tools on the application side
    - Counter - a counter that sums values synchronously
      - Add() function
    - Measure - synchronous measurements of a varying quantity
      - Record() function
    - Observer - asynchronous snapshots/observations of a value
      - Observe() callback
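
The difference between the three instruments is mostly about who pushes the value and when. The following is a simplified sketch under that reading; the classes are hypothetical stand-ins written for this example, not the OpenTelemetry metrics API, and the `Observer` callback here simply returns a value.

```python
from typing import Callable, List


class Counter:
    """Synchronous: the caller adds increments to a running sum."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.total = 0

    def add(self, value: int) -> None:
        self.total += value


class Measure:
    """Synchronous: the caller records individual measurements."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.values: List[float] = []

    def record(self, value: float) -> None:
        self.values.append(value)


class Observer:
    """Asynchronous: a value is pulled through a callback at collection time."""

    def __init__(self, name: str, callback: Callable[[], float]) -> None:
        self.name = name
        self._callback = callback

    def observe(self) -> float:
        # In this sketch the (hypothetical) SDK calls this, not application code.
        return self._callback()


requests_total = Counter("requests_total")
latency_ms = Measure("request_latency_ms")
queue = ["job-1", "job-2", "job-3"]
queue_depth = Observer("queue_depth", lambda: len(queue))

for elapsed in (12.5, 7.0):
    requests_total.add(1)       # one more request handled
    latency_ms.record(elapsed)  # how long that request took
```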

  32. Metrics API
    Metric Event Format
    - Context (a Span or something else)
    - timestamp (captured implicitly by the SDK)
    - instrument definition (name, kind, options)
    - label set (key-value pairs for later use)
    - value (signed integer or floating point)
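
As a data structure, the metric event described on this slide might look like the sketch below. The field and class names are assumptions chosen to mirror the bullet points; the context is reduced to an optional string identifier for simplicity.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass
class InstrumentDefinition:
    name: str
    kind: str  # e.g. "counter", "measure", "observer"
    options: Dict[str, str] = field(default_factory=dict)


@dataclass
class MetricEvent:
    instrument: InstrumentDefinition
    value: Union[int, float]
    labels: Dict[str, str] = field(default_factory=dict)
    context: Optional[str] = None  # e.g. an identifier of the active Span
    # Filled in implicitly, the way the slide says the SDK does it.
    timestamp: float = field(default_factory=time.time)


event = MetricEvent(
    instrument=InstrumentDefinition("requests_total", "counter"),
    value=1,
    labels={"route": "/meow", "status": "200"},
)
```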

  33. Metrics API
    Aggregations - metric processing tools on the SDK side
    - Sum for Counter
    - MinMaxSumCount for Measure
    - LastValue for Observer
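
Each of the three aggregations keeps only a small running state instead of every raw value. A minimal sketch, assuming each aggregator exposes a hypothetical `update(value)` method:

```python
class Sum:
    """Running total, suited to Counter values."""

    def __init__(self) -> None:
        self.value = 0

    def update(self, value) -> None:
        self.value += value


class MinMaxSumCount:
    """Four running statistics, suited to Measure values."""

    def __init__(self) -> None:
        self.min = None
        self.max = None
        self.sum = 0
        self.count = 0

    def update(self, value) -> None:
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)
        self.sum += value
        self.count += 1


class LastValue:
    """Most recent observation only, suited to Observer values."""

    def __init__(self) -> None:
        self.value = None

    def update(self, value) -> None:
        self.value = value


total, stats, last = Sum(), MinMaxSumCount(), LastValue()
for v in (12.5, 7.0, 30.0):
    total.update(1)  # count one request
    stats.update(v)  # latency of that request
    last.update(v)   # latest reading
```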

  34. Pluggable instrumentation
    - code can be instrumented without connecting a collector
    - code can be written with tracing from the start and hooked up later
    - less work for developers, more reuse,
    standard practices
    - the same API across services, less coupling
    - less work for DevOps: different services (tracing, monitoring)
    go through one API for different services and languages
    - Vendor-agnostic
    - libraries can be instrumented without being tied to a vendor
    -> no need to sacrifice observability when pulling in 3rd-party libraries
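
The idea of instrumenting first and connecting a backend later can be sketched as follows. `Telemetry`, `InMemoryExporter`, and `handle_meow` are hypothetical names for this example; a real setup would plug in a vendor or collector exporter instead of the in-memory one.

```python
import time
from contextlib import contextmanager
from typing import List, Tuple


class Telemetry:
    """Instrumentation entry point; does nothing until an exporter is added."""

    def __init__(self) -> None:
        self._exporters = []

    def add_exporter(self, exporter) -> None:
        self._exporters.append(exporter)

    @contextmanager
    def span(self, name: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            # With no exporters registered this loop is a no-op.
            for exporter in self._exporters:
                exporter.export(name, elapsed)


class InMemoryExporter:
    """Stand-in for a real backend: keeps finished spans in a list."""

    def __init__(self) -> None:
        self.finished: List[Tuple[str, float]] = []

    def export(self, name: str, elapsed: float) -> None:
        self.finished.append((name, elapsed))


telemetry = Telemetry()


def handle_meow() -> str:
    # Instrumented once; unaware of which backend, if any, is connected.
    with telemetry.span("GET /meow"):
        return "meow"
```

`handle_meow()` behaves the same before and after `telemetry.add_exporter(InMemoryExporter())`; only the second case produces data, and swapping the exporter does not touch the handler.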

  35. Links
    https://github.com/open-telemetry/opentelemetry-specification
    https://medium.com/opentelemetry
    https://opentelemetry.io

  36. Contacts
    kaznacheev.me
    github.com/ilyakaznacheev
    dev.to/ilyakaznacheev
    linkedin.com/in/ilyakaznacheev
    t.me/ilyakaznacheev
    t.me/kaznacheev_feed