How have we been building a containers-based PaaS these last 5 years?

A talk about building a PaaS and a container scheduling system

Soulou

June 26, 2018

Transcript

  1. 4.

    What is Scalingo? Third-party application hosting platform → Images built from source code. Database as a Service hosting → Pre-built images available on Docker Hub. No Docker knowledge required.
  2. 6.

    Constraints: PaaS provider objective ≠ Software service company. Infrastructure optimization: high level of consolidation, efficient scheduling → Avoid paying for unused resources.
  3. 9.

    Tooling timeline. Tools: Docker 0.1.0 → 2013-03-23, etcd 0.1.0 → 2013-08-11. Orchestration: Swarm 1.0.0 → 2015-11-03, Kubernetes 1.0.6 → 2015-09-11.
  4. 14.
  5. 15.

    A (small) Swiss Army knife: a little of everything, not efficient everywhere, lock-in → A tool for us: isolation helper, apps distribution.
  6. 17.

    Logging - demo
    $ docker run -it --log-opt max-file=1 --log-opt max-size=1m ubuntu:16.04 bash
    $ for i in $(seq 100000) ; do perl -e "print 'xxxxxxxxxxxxxxxxxxxxxxxxx$i' x 10, \"\n\"" ; done
    $ docker logs --follow <id>
    Demo: breaking docker logs
  7. 18.

    Logging (architecture diagram): each host runs an agent shipping container logs over TCP + TLS to a message bus. Docker 1.10.0 (2016-02-04): syslog driver with TCP+TLS.
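
    A minimal sketch of wiring a container to such a pipeline with the syslog driver's TLS options (the endpoint, certificate paths and tag below are placeholders, not Scalingo's actual configuration):
    $ docker run -d \
        --log-driver syslog \
        --log-opt syslog-address=tcp+tls://logs.example.com:6514 \
        --log-opt syslog-tls-ca-cert=/etc/docker/certs/ca.pem \
        --log-opt syslog-tls-cert=/etc/docker/certs/client-cert.pem \
        --log-opt syslog-tls-key=/etc/docker/certs/client-key.pem \
        --log-opt tag="{{.Name}}" \
        ubuntu:16.04 sleep 3600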
  8. 22.

    Networking throttling
    $ interface_id=$(echo 'ip link show eth0' | nsenter --target #{pid} --net --pid)
    $ iface=$(ip link show | grep "^${interface_id}:")
    $ tc qdisc replace dev "${iface}" root tbf rate "${limit}"mbit latency 200ms burst "${burst}"MB
    No hook possible with Docker → Our job
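
    A fuller sketch of the same idea, assuming the container PID comes from docker inspect (the interface resolution, rate and burst values are illustrative, not the exact production script):
    # PID of the container's init process
    $ pid=$(docker inspect --format '{{.State.Pid}}' <container_id>)
    # Inside the container's netns, eth0 shows as "eth0@ifN": N is the host-side peer index
    $ peer_index=$(nsenter --target "${pid}" --net ip -o link show eth0 | sed -n 's/.*@if\([0-9]*\):.*/\1/p')
    # Resolve the host veth interface carrying that index
    $ iface=$(ip -o link show | awk -F': ' -v idx="${peer_index}" '$1 == idx {print $2}' | cut -d'@' -f1)
    # Throttle egress with a token bucket filter
    $ tc qdisc replace dev "${iface}" root tbf rate 100mbit latency 200ms burst 10mb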
  9. 25.

    Security Concerns: unprivileged users only, 1 container → 1 user. Apps are built with 1 layer → 1 tarball. Ability to patch the base image.
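
    To illustrate those two points (image name, tarball and UID are hypothetical), a single tarball can be imported as a one-layer image and run under an unprivileged user:
    $ docker import app-rootfs.tar.gz registry.example.com/app:v1    # 1 tarball → 1 layer
    $ docker run -d --user 1001:1001 registry.example.com/app:v1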
  10. 30.

    Storage - Beginnings: NFS → Good enough to start. Downsides: • SPoF • HA tricky to get • Hard to scale • No blkio cgroup!
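
    For comparison, the kind of blkio limits that only work when container data sits on a real block device, and are therefore unavailable over NFS (device path and rates are illustrative):
    $ docker run -d \
        --device-read-bps /dev/sda:50mb \
        --device-write-bps /dev/sda:50mb \
        --blkio-weight 500 \
        ubuntu:16.04 sleep 3600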
  11. 31.

    Storage - NFS HA (architecture diagram): two NFS servers replicated with DRBD behind a VIP, serving the database containers, with master failover.
  12. 32.

    Software Defined Storage: GlusterFS*, Ceph*, OpenSDS*, OpenEBS*, Rook*, StorageOS, Hedvig, ScaleIO (Dell), Kasten, Virtuoso, Datacore, Diamanti, Hatchway (VMWare), Portworx, Quobyte, Datera, Robin Cloud Platform, ... (* FOSS)
  13. 33.

    Storage: attached disks from a SAN, ≥ 100 volumes per host, LVM magic, good cloning tools (e.g. https://github.com/Scalingo/go-fssync), reattach on host failure.
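
    A rough sketch of the kind of LVM plumbing this implies, with a snapshot used as the basis for cloning (volume group, names and sizes are made up for illustration; a tool such as go-fssync can then copy the filesystem contents):
    # Dedicated logical volume for one container's data
    $ lvcreate --size 10G --name app-1234 vg_containers
    $ mkfs.ext4 /dev/vg_containers/app-1234
    $ mount /dev/vg_containers/app-1234 /mnt/volumes/app-1234
    # Point-in-time snapshot of the source volume, e.g. before cloning it
    $ lvcreate --snapshot --size 10G --name app-1234-snap /dev/vg_containers/app-1234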
  14. 34.

    Scheduling - Little reminder. Our goal: optimize server usage, automate all the things, include business logic.
  15. 35.

    Scheduling - Online vs Offline: offline means stopping the world and moving things around; online means placing a new container into an existing topology.
  16. 37.

    Scheduling - Online
    - E. Arzuaga and D. R. Kaeli. Quantifying load imbalance on virtualized enterprise servers. ACM Press, p. 235.
    - D. Ferrari and S. Zhou. An empirical investigation of load indices for load balancing applications.
    - T. Wood, P. Shenoy, A. Venkataramani, and M. Yousif. Sandpiper: Black-box and gray-box resource management for virtual machines. 53(17): 2923–2938.
    Current strategy → Sandpiper
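
    For reference, the Sandpiper paper cited above defines a host's load volume as 1 / ((1 - cpu) * (1 - net) * (1 - mem)), each metric being a fraction of capacity; a quick way to compute it (the metric values below are made up):
    $ cpu=0.70; net=0.40; mem=0.85
    $ awk -v cpu="$cpu" -v net="$net" -v mem="$mem" \
        'BEGIN { printf "volume = %.2f\n", 1 / ((1 - cpu) * (1 - net) * (1 - mem)) }'
    volume = 37.04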
  17. 41.

    Why not existing tech today? Business logic and support → Control needed on the core infrastructure → Technology mastering → Need to patch K8s? → Scheduling strategy (with overcommitting) → "Unstable" solution
  18. 42.

    Why not existing tech today? Owned apps vs third-party → Not designed for third-party → Billing requirements → Fine-grained resource control