How we learned to love the Data Center Operating System

by Saulius Valatka
DevOps Pro Vilnius 2016

June 01, 2016

Transcript

  1. HOW WE LEARNED TO LOVE
    THE DATA CENTER OPERATING SYSTEM
    SAULIUS VALATKA / ADFORM

  4. Online Advertising Full Stack Platform
    Realtime “smart” ads
    Forecasting, fraud detection, etc.
    1 million QPS, each answered in under 100 ms
    1 TB of data daily

  5. EVOLUTION

  6. MIDDLE AGES
    ctrtrain.ec2-aws.com test2.ec2-aws.com modelling.ec2-aws.com

  8. THE TORTURE
    # yum install python R libboost-3.12
    $ scp script.R test.aws.com:/script.R
    # crontab -e
    “strange, worked on my machine …”

  9. RENAISSANCE
    ctrtrain.ec2-aws.com test2.ec2-aws.com worker-1.adform.com worker-2.adform.com
    ab34na3n ar2afga3n

  11. CONTAINERIZE !
    self contained artifacts
    isolated runtime
    basically no overhead
    unified deployment

  12. BUT WAIT …
    what about configuration ?

  13. The twelve-factor app stores config in environment variables
    Env vars are easy to change between deploys without changing any code
    There is little chance of them being checked into the code repo accidentally
    They are a language- and OS-agnostic standard
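
The environment-variable approach quoted above can be sketched in a few lines of Python. This is only an illustration: the variable names (`APP_DB_URL`, `APP_LOG_LEVEL`, `APP_WORKERS`) and their defaults are hypothetical and do not come from the talk.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    db_url: str      # hypothetical setting
    log_level: str   # hypothetical setting
    workers: int     # hypothetical setting


def load_config(env=None):
    """Build the config from environment variables, falling back to defaults.

    `env` can be any mapping, which keeps the function easy to test;
    by default the real process environment is used.
    """
    env = os.environ if env is None else env
    return AppConfig(
        db_url=env.get("APP_DB_URL", "postgres://localhost/app"),
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
        workers=int(env.get("APP_WORKERS", "1")),
    )


if __name__ == "__main__":
    print(load_config())
```

The same container image can then run unchanged in every environment; only the variables passed in at deploy time differ.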

  14. BUT WAIT …
    where do I log ?
    and what about metrics ?

  16. MODERN ERA
    e34sadf
    ab34na3n
    af4f5a4r
    aafde33a
    fa45daws
    faes4fa3
    aaf444a2
    fas3rfa4

  18. MARATHON
    the init of the DCOS
    constraints
    deployment
    {
      "id": "my-nginx",
      "container": {
        "type": "DOCKER",
        "docker": {
          "image": "nginx:1.7.7",
          "network": "BRIDGE"
        }
      },
      "instances": 1,
      "cpus": 0.5,
      "mem": 128
    }
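
As a rough sketch of how such a definition might be submitted, the Python below builds the same app definition and wraps it in a POST request for Marathon's `/v2/apps` endpoint. The Marathon address is hypothetical, the `hostname`/`UNIQUE` constraint is just one example of the "constraints" bullet, and the request is only constructed here, not sent.

```python
import json
import urllib.request

MARATHON_URL = "http://marathon.example.com:8080"  # hypothetical address


def nginx_app(instances=1, unique_hosts=False):
    """Return the app definition from the slide as a Python dict."""
    app = {
        "id": "my-nginx",
        "container": {
            "type": "DOCKER",
            "docker": {"image": "nginx:1.7.7", "network": "BRIDGE"},
        },
        "instances": instances,
        "cpus": 0.5,
        "mem": 128,
    }
    if unique_hosts:
        # Example constraint: place at most one instance on each host.
        app["constraints"] = [["hostname", "UNIQUE"]]
    return app


def build_deploy_request(app, base_url=MARATHON_URL):
    """Build (but do not send) the HTTP request that would create the app."""
    return urllib.request.Request(
        url=base_url + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_deploy_request(nginx_app(instances=2, unique_hosts=True))
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would perform the actual deployment against a reachable Marathon instance.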

  19. SPRINT
    the exec of the DCOS
    will be open sourced
    scheduler to follow!

  22. MANAGING RESOURCES
    how much memory do I really need ?
    and CPUs ?
    what does 0.5 CPUs mean anyway ?
    and what happens with the network ?
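
One way to read "0.5 CPUs" is as cgroup settings. The sketch below assumes the convention commonly associated with Mesos CPU isolation: 1024 CPU shares per requested CPU and, where CFS bandwidth limits are enabled, a 100 ms scheduling period. Both constants are assumptions here, not figures from the talk.

```python
CPU_SHARES_PER_CPU = 1024   # assumption: shares granted per whole CPU
CFS_PERIOD_US = 100_000     # assumption: 100 ms CFS period, in microseconds


def cpu_shares(cpus):
    """Relative weight: only matters when tasks compete for the CPU."""
    return int(cpus * CPU_SHARES_PER_CPU)


def cfs_quota_us(cpus, period_us=CFS_PERIOD_US):
    """Hard cap: CPU time (microseconds) the task may use per period."""
    return int(cpus * period_us)


if __name__ == "__main__":
    print("cpu.shares        =", cpu_shares(0.5))
    print("cpu.cfs_quota_us  =", cfs_quota_us(0.5))
    print("cpu.cfs_period_us =", CFS_PERIOD_US)
```

Under these assumptions, shares alone make 0.5 a soft guarantee (an idle host lets the task use more), while a CFS quota makes it a ceiling of 50 ms of CPU time in every 100 ms window.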

  24. ISOLATION
    cgroups:
    cpu
    cpuset
    memory
    blkio
    net_cls
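
To see which of these controllers apply to a process, Linux exposes `/proc/<pid>/cgroup`, where each line has the form `hierarchy-id:controller-list:path`. The parser below is a minimal sketch for the cgroup v1 layout; the sample text and the container id in it are made up for illustration.

```python
def parse_cgroup_lines(text):
    """Map each cgroup controller name to the cgroup path it is attached to."""
    result = {}
    for line in text.strip().splitlines():
        # Split into at most three fields; the path itself may contain ':'.
        _, controllers, path = line.split(":", 2)
        for name in controllers.split(","):
            if name:  # skip the empty controller list used by cgroup v2
                result[name] = path
    return result


# Hypothetical contents of /proc/self/cgroup inside a container.
SAMPLE = """\
5:net_cls:/docker/abc123
4:memory:/docker/abc123
3:cpuset:/docker/abc123
2:cpu,cpuacct:/docker/abc123
1:blkio:/docker/abc123
"""

if __name__ == "__main__":
    for controller, path in sorted(parse_cgroup_lines(SAMPLE).items()):
        print(controller, path)
```

On a real host, passing the contents of `/proc/self/cgroup` instead of `SAMPLE` shows the groups of the current process.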

  25. NETWORK ISOLATION
    Layer 3 routing, software-defined networks

  26. ATOMIC AGE
    a4faw3f
    4afsdgg
    asdf4faf
    se4faw
    aw3d3ff
    g4aefgsd
    5gsdgr54s
    a4rff4afa 4f4qaf4

  27. SERVICE DISCOVERY
    where is my app ? how do I reach it ?
    won’t containers conflict over ports ?

  28. MARATHON-LB
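
The lookup a load balancer such as marathon-lb needs, from an app to the hosts and ports of its running tasks, can be sketched as below. The sample assumes a response shaped like the one Marathon returns for an app's task list (a `tasks` array whose entries carry `host` and `ports`); the port numbers are invented, and the hostnames reuse the worker names from the earlier slide.

```python
import json

# Hypothetical task list for the app "/my-nginx".
SAMPLE_TASKS = json.loads("""
{"tasks": [
  {"appId": "/my-nginx", "host": "worker-1.adform.com", "ports": [31045]},
  {"appId": "/my-nginx", "host": "worker-2.adform.com", "ports": [31112]}
]}
""")


def endpoints(tasks_response):
    """Return (host, port) pairs, using the first port assigned to each task."""
    result = []
    for task in tasks_response.get("tasks", []):
        ports = task.get("ports") or []
        if ports:  # tasks without an assigned port cannot be routed to
            result.append((task["host"], ports[0]))
    return result


if __name__ == "__main__":
    for host, port in endpoints(SAMPLE_TASKS):
        print(f"{host}:{port}")
```

Because the host ports are assigned per task, two containers that both listen on port 80 internally do not collide; clients reach them through a stable front-end address that forwards to these pairs.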

  29. PERSISTENCE
    so … where do I store my data ?
    on the host ? won’t it disappear ?

  30. PERSISTENCE
    / /opt/app/cache
    /var/lib/docker/devicemapper
    /var/lib/mesos/slave/volumes

  31. PERSISTENCE
    / /opt/app/cache /opt/app/profile
    /var/lib/docker/devicemapper
    /var/lib/mesos/slave/volumes
    /mnt/sdc
    network block storage
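
A minimal sketch of how such a mount could be expressed, assuming the `volumes` list (`containerPath`, `hostPath`, `mode`) that Marathon accepts inside an app's `container` section. Pairing `/opt/app/profile` with `/mnt/sdc` is a guess at what the slide depicts, not something the transcript states.

```python
def host_volume(container_path, host_path, mode="RW"):
    """Describe one host directory mounted into the container."""
    if mode not in ("RW", "RO"):
        raise ValueError("mode must be 'RW' or 'RO'")
    return {"containerPath": container_path, "hostPath": host_path, "mode": mode}


def with_volumes(app, volumes):
    """Return a copy of the app definition with the volumes appended."""
    updated = dict(app)
    container = dict(updated.get("container", {}))
    container["volumes"] = list(container.get("volumes", [])) + list(volumes)
    updated["container"] = container
    return updated


if __name__ == "__main__":
    app = {"id": "my-app", "container": {"type": "DOCKER"}}  # hypothetical app
    # Assumed pairing: profile data lives on the network block device mount.
    print(with_volumes(app, [host_volume("/opt/app/profile", "/mnt/sdc")]))
```

Data written under the container path then lands on the mounted storage and survives the container being replaced, provided the replacement is scheduled where that storage is available.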

  32. FUTURE PLANS
    DC/OS
    IP per container
    Containerize all the things

  33. @adforminsider
