How Bitso Empowers Its Devs to Troubleshoot K8s Independently

More often than not, Kubernetes is behaving as it should: fast, agile and scalable. But what happens when K8s misbehaves? Do the responders have the right tools to deal with the situation? Or are they just part of an escalation chain that always leads to the same small group of experts?

Meet Bitso, a billion-dollar company and the largest crypto platform in Latin America, which has mastered the shift-left approach and empowered its dev team to troubleshoot K8s independently.

Join this session to hear Bitso’s engineering lead, Juan Jose Mejia, and Komodor’s head of solution architects, Oren Ninio, as they share Bitso’s journey and how the company was able to:

Create a lean (and mean) K8s troubleshooting workflow that reduced MTTR by 75%
Relieve DevOps bottlenecks and save more than 30 DevOps hours every week
Use a mix of developer-friendly tools and training strategies to bridge the K8s knowledge gap across the organization within just a few months

Komodor

May 01, 2022

Transcript

  1. HOW BITSO EMPOWERS ITS DEVS TO TROUBLESHOOT
    K8s INDEPENDENTLY

  2. WHO ARE WE?
    Oren Ninio
    Head of Solution Architecture
    @Komodor

    Juan Jose Mejia
    Engineer Lead
    @Bitso

  3. ABOUT BITSO
    Bitso is a fully remote organization with Bitsonauts in
    over 38 countries around the world. Bitso’s mission is to
    make crypto useful by providing millions of Latin
    Americans with an alternative way to access fast and
    reliable financial services powered by crypto.

    This will address Latin America’s financial inclusion
    issues for the 70% of unbanked and underrepresented
    communities in the region.

    With over 4 million users, Bitso is the leading
    cryptocurrency platform in Latin America.

  4. PART 1
    A GLIMPSE INTO BITSO’S ARCHITECTURE

  5. TRANSITIONING FROM A STARTUP TO A MEDIUM-SIZED COMPANY
    What did our existing architecture look like before?

    ● From monolithic services, each living in a K8s pod, to
    microservices living in their own pods.

    ● From unclustered relational databases to clustered
    databases.

    ● From a synchronous messaging architecture to an
    asynchronous event-driven architecture.

  6. PART 2
    BITSO’S 3 CHALLENGES OF IMPLEMENTING KUBERNETES

  7. CHALLENGE #1
    The migration to K8s meant we needed to re-evaluate
    our legacy monolithic architecture.

    ● Services running on VMs from a custom provider.

    ● Logs only accessed via kubectl.

    ● Jenkins CI added a manual extra step to run a codebuild.

    ● Standalone VPNs required a lot of maintenance.

  8. CHALLENGE #2
    Our dev processes were no longer relevant.

    The migration to K8s also meant we needed to re-evaluate our
    current dev processes, for example:

    ● Manual merges to dev/stage.

    ● Creation of new services on Kubernetes.

    ● Pull Request interaction.

    ● Logging in to AWS.

  9. CHALLENGE #3
    We noticed a lack of K8s expertise and knowledge.

    K8s is an emerging technology, and few
    engineers have a deep understanding of
    its inner workings.

    This knowledge gap impacted the speed
    of our development processes and our
    troubleshooting capabilities.

    Source: The State of Kubernetes 2021 (VMware)

  10. PART 3
    BITSO’S 3 BEST PRACTICES TO IMPROVE THE DAY-TO-DAY WORK WITH K8S

  11. SOLUTION #1
    Implemented K8s-friendly tools.

    Replaced our standalone VPNs with AWS VPNs in order to
    improve our uptime and reliability.

    Replaced Jenkins CI with CircleCI in order to have
    automated pipelines without human intervention.

  12. SOLUTION #1
    Implemented K8s-friendly tools.

    Implemented Komodor to help our devs troubleshoot issues in
    production clusters and lower environments. With Komodor, we had
    access to pod health statuses, deployment and health change
    events, and logs per pod instance, all in one central place.

    Instead of accessing logs via the kubectl console, we implemented
    Splunk to view our logs in a proper and more detailed way in
    the browser.
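
To make "pod health statuses in one central place" concrete, here is a minimal sketch in Python. It is not Komodor's or Bitso's implementation; it only assumes the input has the shape of `kubectl get pods -o json` output (the fields `status.phase`, `containerStatuses[].ready` and `containerStatuses[].restartCount` come from the Kubernetes Pod API). The pod names, the `SAMPLE` data and both function names are hypothetical.

```python
from typing import Any


def summarize_pod_health(pod_list: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one health record per pod from a PodList-shaped dict."""
    summary = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        summary.append({
            "name": pod["metadata"]["name"],
            "phase": status.get("phase", "Unknown"),
            # A pod with no reported containers is treated as not ready.
            "ready": bool(containers) and all(c.get("ready", False) for c in containers),
            "restarts": sum(c.get("restartCount", 0) for c in containers),
        })
    return summary


def unhealthy(summary: list[dict[str, Any]]) -> list[str]:
    """Names of pods that are not Running or have a container that is not ready.

    Simplification: completed Job pods (phase "Succeeded") are also flagged.
    """
    return [p["name"] for p in summary if p["phase"] != "Running" or not p["ready"]]


# Hypothetical sample data in the shape of `kubectl get pods -o json`.
SAMPLE = {
    "items": [
        {"metadata": {"name": "orders-7d9f"},
         "status": {"phase": "Running",
                    "containerStatuses": [{"ready": True, "restartCount": 0}]}},
        {"metadata": {"name": "payments-5c2a"},
         "status": {"phase": "Running",
                    "containerStatuses": [{"ready": False, "restartCount": 4}]}},
    ]
}

if __name__ == "__main__":
    records = summarize_pod_health(SAMPLE)
    for record in records:
        print(record)
    print("needs attention:", unhealthy(records))
```

In practice the input could come from `json.load` on saved `kubectl` output; a dedicated tool adds the deployment history and change events that this sketch leaves out.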

  13. SOLUTION #2
    Adapted our processes to our cloud native architecture.

    ● Implemented a bot that automatically merges branches and undoes merges.

    ● Set up sync meetings to anticipate and plan the creation of new services.

    ● Changed the process so that teams agree on a solution together
    before coding, so the resulting PR is closer to what the team expects.

    ● Used solutions like saml2aws to simplify AWS CLI interaction.

  14. SOLUTION #3
    Continuously bridging the K8s knowledge gap.

    We established a dedicated weekly knowledge-sharing
    session for our engineering teams.

    These sessions have multiple objectives, including:

    ● Sharing specific use cases with K8s.

    ● Discussing best practices the devs would like to share.

    ● Presenting specific issues that were troubleshot and
    the lessons learnt from these scenarios.

  15. Q&A

  16. If you want to be part of Bitso,
    check out our open roles!
