Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Containerization primatives
Search
Sam Kottler
November 05, 2014
Technology
0
150
Containerization primatives
Sam Kottler
November 05, 2014
Tweet
Share
More Decks by Sam Kottler
See All by Sam Kottler
This is your database on Linux
skottler
0
290
How to Debug Anything - DevOpsDay PGH
skottler
1
1.2k
Icinga at DigitalOcean
skottler
1
1k
PuppetConf '14
skottler
0
230
Configuration Management Anti-Patterns
skottler
2
1.2k
Other Decks in Technology
See All in Technology
ビズリーチ求職者検索におけるPLMとLLMの活用 / Search Engineering MEET UP_2-1
visional_engineering_and_design
1
160
ガバメントクラウドの概要と自治体事例(名古屋市)
techniczna
3
240
dbtとBigQuery MLで実現する リクルートの営業支援基盤のモデル開発と保守運用
recruitengineers
PRO
3
130
「改善」ってこれでいいんだっけ?
ukigmo_hiro
0
360
ソフトウェアエンジニアの生成AI活用と、これから
lycorptech_jp
PRO
0
520
OAuthからOIDCへ ― 認可の仕組みが認証に拡張されるまで
yamatai1212
0
140
速習AGENTS.md:5分で精度を上げる "3ブロック" テンプレ
ismk
6
1.8k
現場データから見える、開発生産性の変化コード生成AI導入・運用のリアル〜 / Changes in Development Productivity and Operational Challenges Following the Introduction of Code Generation AI
nttcom
0
300
HonoとJSXを使って管理画面をサクッと型安全に作ろう
diggymo
0
120
Node.js 2025: What's new and what's next
ruyadorno
0
550
今この時代に技術とどう向き合うべきか
gree_tech
PRO
2
2.1k
なぜAWSを活かしきれないのか?技術と組織への処方箋
nrinetcom
PRO
5
990
Featured
See All Featured
Statistics for Hackers
jakevdp
799
220k
Building Flexible Design Systems
yeseniaperezcruz
329
39k
We Have a Design System, Now What?
morganepeng
53
7.8k
Agile that works and the tools we love
rasmusluckow
331
21k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
52
5.6k
Optimizing for Happiness
mojombo
379
70k
Embracing the Ebb and Flow
colly
88
4.9k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.7k
Why You Should Never Use an ORM
jnunemaker
PRO
59
9.6k
Writing Fast Ruby
sferik
629
62k
Mobile First: as difficult as doing things right
swwweet
225
10k
Facilitating Awesome Meetings
lara
56
6.6k
Transcript
CONTAINERIZATION PRIMITIVES Sam Kottler @samkottler
ABOUT ME • Work at DigitalOcean as a systems engineer
• Formerly of Red Hat, Venmo, Acquia • Committer/core for Puppet, Ansible, Fedora, CentOS, RubyGems, Bundler
WE’RE GONNA BE TALKING ABOUT LINUX
GOOD TO KNOW’S • What is a syscall • Basic
understanding of linux networking • Containers vs. virtualization
WHY DO WE CARE ABOUT ANY OF THIS?
CONTAINERS ARE THE PAST *, PRESENT, AND FUTURE * Most
of the linux ideas are poached from other OS’s
VIRTUALIZATION HAS BECOME MASSIVELY POPULAR BECAUSE OF ITS ECONOMICS
CONTAINERS ARE BECOMING MASSIVELY POPULAR BECAUSE THEY ALLOW LOGICAL SEPARATION
APPLICATION VS. FULL CONTAINERS
NETWORKS, USERS, AND PROCESSES
NAMESPACES • mnt: filesystem • pid: process • net: network
• ipc: SysV IPC • uts: hostname • user: UID
THE BASICS • Namespaces do not have names • Six
inodes exist under /proc/<pid>/ns • Each namespace has a unique inode
USERSPACE TOOLING • iproute2 • util-linux • systemd
NAMESPACE SYSCALLS • unshare() • moves existing process into a
new namespace • clone() • creates new process and namespace • setns() • joins an existing namespace
NETWORK ISOLATION • One namespace per networking device • Single
default namespace, init_net(*nets) • A lo device is included in every ns_net.
NETWORK NAMESPACES IN PRACTICE • ip netns add testns1 •
creates /var/run/netns/testns1 • route management per-NS • prevents cross-NS bonds • setns(int fd, int nstype) • validates namespace type vs. FD
SOCKET ISOLATION • Sockets are mapped into network namespaces •
Also part of a single network namespace • sk_net is part of the sock struct • sock_net()/sock_net_set() getter/setter
SOCKET ACTIVATION • Listen on a socket, but have no
services behind it • Request arrives, service is spun up, responds • Enabling 10k+ low-usage services on a VM
USER ISOLATION • Allows non-privileged usage • Often used as
the start of a namespace chain • UID’s come from the overflow rules
CGROUPS • Resource management • Around since 2006/2007 • Widely
used by userspace management tools
CGROUPS + NAMESPACES • “This PID can only see part
of the filesystem” • “This PID can only see part of the filesystem, use 384mb of memory, and utilize a single CPU.”
CGROUP IMPLEMENTATION • Hooks into fork() and exit() • VFS
of a new type called “cgroup” • More complex descriptors for task_struct • Procfs entry in /proc/<pid>/cgroup • All actions take place on the FS
CGROUP MANAGEMENT • 4 files per-cgroup • tasks • cgroup.procs
• cgroup.event_control • notify_on_release
CPU • Split into “shares” • Default is 2048 shares
• Linear CPU time use
MEMORY • Exposes most of the memory subsystem • NUMA
management • Most complex type of cgroup
LETS TALK ABOUT SECURITY…
SHARING A KERNEL IS INHERENTLY LESS SECURE
KERNEL VULNERABILITIES AROUND BREAKOUT ARE USUALLY MITIGATED BY RUNNING SERVICES
NON- PRIVILEGED
THANKS! • @samkottler • https://github.com/skottler •
[email protected]