Containerization primitives
Sam Kottler
November 05, 2014
Transcript
CONTAINERIZATION PRIMITIVES Sam Kottler @samkottler
ABOUT ME
• Work at DigitalOcean as a systems engineer
• Formerly of Red Hat, Venmo, Acquia
• Committer/core for Puppet, Ansible, Fedora, CentOS, RubyGems, Bundler
WE’RE GONNA BE TALKING ABOUT LINUX
GOOD TO KNOW
• What is a syscall
• Basic understanding of Linux networking
• Containers vs. virtualization
WHY DO WE CARE ABOUT ANY OF THIS?
CONTAINERS ARE THE PAST*, PRESENT, AND FUTURE
* Most of the Linux ideas are poached from other OSes
VIRTUALIZATION HAS BECOME MASSIVELY POPULAR BECAUSE OF ITS ECONOMICS
CONTAINERS ARE BECOMING MASSIVELY POPULAR BECAUSE THEY ALLOW LOGICAL SEPARATION
APPLICATION VS. FULL CONTAINERS
NETWORKS, USERS, AND PROCESSES
NAMESPACES
• mnt: filesystem
• pid: process
• net: network
• ipc: SysV IPC
• uts: hostname
• user: UID
THE BASICS
• Namespaces do not have names
• Six inodes exist under /proc/<pid>/ns
• Each namespace has a unique inode
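The /proc/<pid>/ns entries above can be inspected from userspace. What follows is a minimal Python sketch, assuming a Linux /proc filesystem. The helper names (parse_ns_link, list_namespaces, same_namespace) are illustrative and not from the talk. Each entry is a symlink whose target has the form type:[inode], and two processes share a namespace when the inode numbers match.

```python
import os
import re

# Symlink targets under /proc/<pid>/ns look like "net:[4026531956]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace symlink target into (type, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def list_namespaces(pid="self"):
    """Return {entry_name: inode} for every entry under /proc/<pid>/ns."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result


def same_namespace(pid_a, pid_b, ns_type):
    """True when both processes point at the same namespace inode."""
    return list_namespaces(pid_a)[ns_type] == list_namespaces(pid_b)[ns_type]
```

For example, same_namespace("self", 1, "net") would report whether the current process shares a network namespace with PID 1. Reading another process's entries may need elevated privileges.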
USERSPACE TOOLING
• iproute2
• util-linux
• systemd
NAMESPACE SYSCALLS
• unshare(): moves the existing process into a new namespace
• clone(): creates a new process and namespace
• setns(): joins an existing namespace
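As a rough illustration of the syscalls above, unshare() can be reached from Python through libc with ctypes. This is a sketch under assumptions: the CLONE_NEW* values are the ones defined in <linux/sched.h>, the helper names (flags_for, unshare) are made up for this example, and the call normally needs CAP_SYS_ADMIN unless a user namespace is requested.

```python
import ctypes
import os

# Namespace flags as defined in <linux/sched.h>.
NS_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}


def flags_for(*ns_types):
    """OR together the clone flags for the requested namespace types."""
    flags = 0
    for ns_type in ns_types:
        flags |= NS_FLAGS[ns_type]
    return flags


def unshare(*ns_types):
    """Call unshare(2) via libc for the given namespace types (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(flags_for(*ns_types)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Calling unshare("net", "uts") as root would give the current process a fresh network and hostname namespace. Nothing here is executed at import time, so the flag helper can be used on its own.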
NETWORK ISOLATION
• Each networking device belongs to one namespace
• Single default namespace, init_net (struct net)
• A lo device is included in every network namespace
NETWORK NAMESPACES IN PRACTICE
• ip netns add testns1: creates /var/run/netns/testns1
• Route management per-NS
• Prevents cross-NS bonds
• setns(int fd, int nstype): validates namespace type vs. FD
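Joining a named network namespace comes down to opening the file that ip netns add created and handing the descriptor to setns(). The sketch below assumes the /var/run/netns layout from the slide and calls setns through libc with ctypes. The names netns_path and join_netns are hypothetical helpers. Passing CLONE_NEWNET as nstype asks the kernel to reject the descriptor if it does not refer to a network namespace.

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # nstype for network namespaces, from <linux/sched.h>
NETNS_DIR = "/var/run/netns"  # where "ip netns add" places named namespaces


def netns_path(name):
    """Build the path of a named network namespace, rejecting odd names."""
    if "/" in name or name in ("", ".", ".."):
        raise ValueError(f"invalid namespace name: {name!r}")
    return os.path.join(NETNS_DIR, name)


def join_netns(name):
    """Move the calling thread into the named network namespace (needs root)."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = os.open(netns_path(name), os.O_RDONLY)
    try:
        # The kernel checks that fd really is a network namespace.
        if libc.setns(fd, CLONE_NEWNET) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)
```

After join_netns("testns1") succeeds, sockets opened by the caller live in testns1 and see only its devices and routes.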
SOCKET ISOLATION
• Sockets are mapped into network namespaces
• Each socket is part of a single network namespace
• sk_net is part of the sock struct
• sock_net()/sock_net_set() getter/setter
SOCKET ACTIVATION
• Listen on a socket, but have no services behind it
• Request arrives, service is spun up, responds
• Enabling 10k+ low-usage services on a VM
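The slide does not name an implementation. One common one is systemd's socket activation protocol, which the sketch below assumes: the supervisor holds the listening socket, starts the service on the first request, and passes the descriptor starting at fd 3, with LISTEN_PID and LISTEN_FDS set in the environment. The function names inherited_fds and serve_once are illustrative.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the systemd protocol


def inherited_fds(environ, pid):
    """Return the descriptors handed over by the supervisor, or [] if none."""
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the variables were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve_once():
    """Accept a single connection on the first inherited listening socket."""
    fds = inherited_fds(os.environ, os.getpid())
    if not fds:
        raise SystemExit("not started via socket activation")
    listener = socket.socket(fileno=fds[0])
    conn, _ = listener.accept()
    with conn:
        conn.sendall(b"hello\n")
```

Because the service process only exists while there is work to do, many rarely used services can share one machine at little idle cost.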
USER ISOLATION
• Allows non-privileged usage
• Often used as the start of a namespace chain
• Unmapped UIDs come from the overflow rules (overflowuid)
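To make the overflow rule concrete: a user namespace translates IDs through /proc/<pid>/uid_map, where each line reads "inside-start outside-start length". An ID with no mapping is reported as the overflow UID, which defaults to 65534 in /proc/sys/kernel/overflowuid. The sketch below models that lookup in plain Python. The function names are hypothetical, and the code only mimics the translation without touching the kernel.

```python
OVERFLOW_UID = 65534  # kernel default from /proc/sys/kernel/overflowuid


def parse_uid_map(text):
    """Parse uid_map content into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            ranges.append((inside, outside, length))
    return ranges


def outside_to_inside(uid, ranges, overflow=OVERFLOW_UID):
    """Translate a parent-namespace UID into the UID seen inside the namespace."""
    for inside, outside, length in ranges:
        if outside <= uid < outside + length:
            return inside + (uid - outside)
    return overflow  # no mapping covers this UID
```

With the single mapping "0 1000 1", the unprivileged user 1000 appears as root inside the namespace, while every other outside UID shows up as 65534.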
CGROUPS
• Resource management
• Around since 2006/2007
• Widely used by userspace management tools
CGROUPS + NAMESPACES
• “This PID can only see part of the filesystem”
• “This PID can only see part of the filesystem, use 384 MB of memory, and utilize a single CPU.”
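The second statement above could be expressed as writes to cgroup control files. This sketch assumes the cgroup v1 file names memory.limit_in_bytes, cpuset.cpus, and tasks. The directory arguments are placeholders for wherever the memory and cpuset hierarchies are mounted, and the helper names are made up. A real cpuset group also needs cpuset.mems populated before tasks can be attached, which is omitted here.

```python
import os


def limit_settings(memory_mb, cpu_index):
    """Control-file values for a memory cap and a single pinned CPU."""
    return {
        "memory.limit_in_bytes": str(memory_mb * 1024 * 1024),
        "cpuset.cpus": str(cpu_index),
    }


def write_control(cgroup_dir, name, value):
    """Write one value into a control file of the given cgroup directory."""
    with open(os.path.join(cgroup_dir, name), "w") as handle:
        handle.write(value)


def confine(pid, memory_dir, cpuset_dir, memory_mb=384, cpu_index=0):
    """Apply the limits, then attach the PID to both cgroups."""
    settings = limit_settings(memory_mb, cpu_index)
    write_control(memory_dir, "memory.limit_in_bytes",
                  settings["memory.limit_in_bytes"])
    write_control(cpuset_dir, "cpuset.cpus", settings["cpuset.cpus"])
    for directory in (memory_dir, cpuset_dir):
        write_control(directory, "tasks", str(pid))
```

The filesystem-visibility half of the statement comes from a mount namespace, not from cgroups, so a container runtime would combine both mechanisms.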
CGROUP IMPLEMENTATION
• Hooks into fork() and exit()
• VFS of a new type called “cgroup”
• More complex descriptors for task_struct
• Procfs entry in /proc/<pid>/cgroup
• All actions take place on the FS
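The procfs entry mentioned above is plain text, one line per hierarchy, in the form hierarchy-ID:controller-list:path. Here is a small Python sketch for reading it, assuming a Linux /proc. The function names are illustrative only.

```python
def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into (hierarchy_id, controllers, path)."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # An empty controller field (as on the unified hierarchy) means no names.
    names = controllers.split(",") if controllers else []
    return int(hierarchy_id), names, path


def read_cgroups(pid="self"):
    """Return the parsed cgroup membership of a process."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return [parse_cgroup_line(line) for line in handle if line.strip()]
```

Splitting on the first two colons only keeps paths that themselves contain a colon intact.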
CGROUP MANAGEMENT
• 4 files per-cgroup
• tasks
• cgroup.procs
• cgroup.event_control
• notify_on_release
CPU
• Split into “shares”
• Default is 1024 shares
• Linear CPU time use
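"Linear" here can be read as proportional: when every group is busy and competing for the same CPUs, each one receives CPU time in proportion to its share count. The sketch below works that arithmetic out with exact fractions. The group names and share values are invented for the example, and the model ignores idle groups, whose unused time is handed to the others.

```python
from fractions import Fraction


def cpu_fractions(shares):
    """Map each group to its fraction of CPU time under full contention."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive share count")
    return {name: Fraction(value, total) for name, value in shares.items()}
```

For instance, cpu_fractions({"web": 1024, "batch": 512}) gives web two thirds of the CPU time and batch one third. Doubling both values changes nothing, since only the ratio matters.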
MEMORY
• Exposes most of the memory subsystem
• NUMA management
• Most complex type of cgroup
LET’S TALK ABOUT SECURITY…
SHARING A KERNEL IS INHERENTLY LESS SECURE
KERNEL VULNERABILITIES AROUND BREAKOUT ARE USUALLY MITIGATED BY RUNNING SERVICES NON-PRIVILEGED
THANKS!
• @samkottler
• https://github.com/skottler
• [email protected]