Slide 1

Slide 1 text

Implementing configuration management primitives (as of 2024) Alexis Mousset 6th February 2024

Slide 2

Slide 2 text

CfgMgmt Camp CfgMgmtCamp 2024 2

Slide 3

Slide 3 text

CfgMgmt Camp YAML Camp

Slide 4

Slide 4 text

CfgMgmt Camp YAML Camp checkApply() Camp

Slide 5

Slide 5 text

checkApply

def checkApply():
    if is_ok(state):
        do_nothing()
    else:
        fix(state)

• Idempotency building block
• State is a global variable
• This is what infra automation is all about
• What’s new in the checkApply world?
• And is it still relevant?
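
As an illustration only, the checkApply pattern above can be made concrete for a single file's content. This is a minimal sketch: the function name `check_apply`, the "ok"/"repaired" return strings and the `demo` helper are assumptions for the example, not something taken from a specific tool.

```python
import tempfile
from pathlib import Path


def check_apply(path, expected):
    """Ensure `path` contains exactly `expected` and report what happened."""
    # Check: read the current state (a missing file counts as "not ok").
    current = path.read_text() if path.exists() else None
    if current == expected:
        return "ok"  # nothing to do, the state is already correct
    # Apply: converge the state to the expected value.
    path.write_text(expected)
    return "repaired"


def demo():
    """Run the same resource twice to show idempotency."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "motd"
        first = check_apply(target, "hello\n")
        second = check_apply(target, "hello\n")
        return first, second
```

Running it twice in a row is the whole point: the first call changes the system, the second one finds nothing to do.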

Slide 6

Slide 6 text

Who am I?
• System developer
• Developer at Rudder since 2015
• CFEngine contributor
• Ansible & Puppet user

Slide 7

Slide 7 text

Configuration management primitives

Slide 8

Slide 8 text

Raw material?

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Raw material: interfaces & configuration data
• configuration data -> checkApply() -> system APIs
• Interact with existing “programming” interfaces
• Rather high-level
• Glue, really
• Configuration data can be declarative or a program
• Not different at the lowest level

Slide 11

Slide 11 text

Which interface do we use on Linux systems?

Slide 12

Slide 12 text

Which interface do we use on Linux systems?
Unix v7, really (actually ~ POSIX)

Slide 13

Slide 13 text

The Unix programming interface
• C (too difficult)
• Text interfaces
• Commands (in shells)
• Maybe even sockets
• Everything is a file
• That’s (pretty much) it!

Slide 14

Slide 14 text

Since Unix 7
• More orientation towards declarative configuration and data
• init scripts -> systemd units
• dedicated configuration languages -> YAML/JSON/TOML/etc.
• Libraries
• systemd
• package managers

Slide 15

Slide 15 text

What does Cfg Mgmt do?
• All CM tools parse command output with regexes and a lot of special cases
• The internals are usually messy
• They hide this pretty successfully from the user
• But the abstractions are leaky
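
To give a feel for what "regexes and special cases" means in practice, here is a small sketch that parses the output of the `id` command. The function name and the returned dictionary shape are assumptions for the example; the special case handled is that `id` prints a bare number when no name is known for it.

```python
import re

# Matches "uid=1000(alice)" as well as the bare "uid=1000" form.
# The \b keeps "euid=" and "egid=" from being picked up by mistake.
_ID_FIELD = re.compile(r"\b(uid|gid)=(\d+)(?:\(([^)]*)\))?")


def parse_id_output(output):
    """Extract the uid and gid fields from one line of `id` output."""
    result = {}
    for key, number, name in _ID_FIELD.findall(output):
        # An unmatched optional group comes back as an empty string.
        result[key] = {"id": int(number), "name": name or None}
    return result
```

Even this tiny parser already needs two special cases, which is how resource internals tend to grow messy.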

Slide 16

Slide 16 text

Side note: configuration surface
What does configuring a “system” mean?

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Configuration surface of a Linux system
• A kernel, including a lot of runtime config
• 1629 values in sysctl
• nftables rules
• eBPF programs
• A file system
• States everywhere
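
A rough way to get a number like the sysctl count above is to walk `/proc/sys`, where each file corresponds to one sysctl key. This is a sketch with a hypothetical function name; the result depends on the kernel version and the loaded modules, so it will not match the slide's figure on every machine.

```python
import os


def count_sysctl_values(root="/proc/sys"):
    """Count the files under `root`; on Linux each one is a sysctl key."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total
```

Calling `count_sysctl_values()` on a Linux host gives the size of just this one slice of the configuration surface.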

Slide 19

Slide 19 text

Configuration surface of a Linux system
• Config Management only manages a tiny part of it
• For the rest, we don’t know
• Is it a problem?
• Are containers a solution?

Slide 20

Slide 20 text

Has Cfg Mgmt (partially) failed?
• Immutable infrastructure
• Dockerfiles are shell scripts
• Trading flexibility and power for simplicity
• Could Cfg Mgmt have provided an interoperable and reusable higher-level interface as a commodity?
• Was the promise of abstracting system configuration kept?

Slide 21

Slide 21 text

Solution space
• Config Mgmt = an engine passing parameters to resource providers
• What are the choices we make when implementing resources?
• What are the tradeoffs?
• Are there unexplored areas?

Slide 22

Slide 22 text

Engine
• Handles data management
• Load properties
• String substitution
• Out of scope here
• Calls the resources
• Usually a big checkApply
• A stack structure usually
• A graph sometimes?

Slide 23

Slide 23 text

A basic resource implementation

def checkApply():
    if is_ok(state):
        do_nothing()
    else:
        fix(state)

Slide 24

Slide 24 text

A first choice

def engine():
    if not check():
        apply()

####

def check():
    return is_ok(state)

def apply():
    fix(state)

Slide 25

Slide 25 text

Dry-run / audit / check
• Lets you see what would change if the agent ran

def checkApply(audit):
    if is_ok(state):
        do_nothing()
    else:
        if audit:
            error()
        else:
            fix(state)
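
The audit flag can be sketched in runnable form as follows. The dictionary standing in for the system state, the function name and the three return strings are all assumptions made for the example.

```python
def check_apply(state, key, expected, audit):
    """Check one key of `state`; only modify it when not in audit mode."""
    if state.get(key) == expected:
        return "ok"
    if audit:
        # Audit mode: report the non-compliance but leave the state untouched.
        return "error"
    state[key] = expected
    return "repaired"
```

The important property is that an audit run never mutates anything, so it can be used safely to preview what an enforcing run would change.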

Slide 26

Slide 26 text

Static checks

fn checkApply(audit: bool, expectedState: DirectoryResource) {
    if state.is_ok(expectedState) {
        do_nothing()
    } else {
        if audit {
            error()
        } else {
            state.fix(expectedState)
        }
    }
}

Slide 27

Slide 27 text

Specs and typing for parameters
• User experience
• Reliability
• Shifting problems left

struct DirectoryResource {
    path: Path,
    state: Presence,
    permissions: Permissions,
}
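
One possible way to "shift problems left" is to validate untyped policy data into a typed structure before any resource runs. The sketch below mirrors the `DirectoryResource` fields from the slide in Python; the `from_dict` constructor, the `Presence` values and the octal-string convention for permissions are assumptions for the example.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import PurePosixPath


class Presence(Enum):
    PRESENT = "present"
    ABSENT = "absent"


@dataclass(frozen=True)
class DirectoryResource:
    path: PurePosixPath
    state: Presence
    permissions: int  # for example 0o755

    @classmethod
    def from_dict(cls, raw):
        """Validate raw policy data; raise ValueError on bad input."""
        path = PurePosixPath(raw["path"])
        if not path.is_absolute():
            raise ValueError(f"path must be absolute: {path}")
        state = Presence(raw["state"])  # ValueError on unknown values
        permissions = int(str(raw["permissions"]), 8)  # parse as octal
        if not 0 <= permissions <= 0o7777:
            raise ValueError(f"invalid permissions: {raw['permissions']}")
        return cls(path, state, permissions)
```

With this in place, a typo such as a relative path or an unknown state is rejected when the policy is loaded, not halfway through an agent run.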

Slide 28

Slide 28 text

Return value
• We need standardization
• Required for proper user feedback and conditional actions
• Most basic version

enum ResourceOutcome {
    Ok(...),
    Repaired(...),
    Error(...),
}
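
To show why a standardized outcome enables conditional actions, here is a small sketch. The `Status` and `ResourceOutcome` types loosely follow the enum above, and `should_restart_service` is a hypothetical example of a decision an engine could make from it.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OK = "ok"
    REPAIRED = "repaired"
    ERROR = "error"


@dataclass
class ResourceOutcome:
    status: Status
    message: str = ""  # human-readable feedback for the user


def should_restart_service(config_outcome):
    """Restart only when the configuration file was actually changed."""
    return config_outcome.status is Status.REPAIRED
```

Because every resource returns the same shape, the engine can chain actions without knowing anything about the resource that produced the outcome.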

Slide 29

Slide 29 text

Generalization
• To what extent do we generalize?
• Should resources be multi-platform?
• Should we cover all options?
• Or a subset that provides convenient escape hatches?
• A “package” resource or dnf, apt, etc.
• A “container” resource or “docker”

Slide 30

Slide 30 text

Agent vs. Agentless
• What is actually the difference?
• There is always a kind of agent when the software runs
• It can be copied over SSH and removed afterwards
• At a low level, no difference
• Mainly the presence of a long-running process on the system
• Push/pull

Slide 31

Slide 31 text

Engine
• To what extent are the engine and the resource implementation interleaved?

Slide 32

Slide 32 text

Connectivity
• Are resources connected?
• If so, how?
• Can resource instances be grouped?
• Install several packages at once automatically
• Can a resource trigger something?
• Can a resource get information from another resource?
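
Grouping resource instances could look like the sketch below, which merges individual package resources into one install command per package manager. The function name and the `(manager, name)` input shape are assumptions for the example, and the commands are only built, never executed.

```python
from collections import defaultdict


def group_install_commands(packages):
    """Merge (manager, package) pairs into one command line per manager."""
    by_manager = defaultdict(list)
    for manager, name in packages:
        if name not in by_manager[manager]:  # drop duplicate requests
            by_manager[manager].append(name)
    # One argv list per manager, in first-seen order.
    return [[manager, "install", "-y", *names]
            for manager, names in by_manager.items()]
```

The trade-off is that the engine now needs to know which resources can be merged, which couples it more tightly to the resource implementations.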

Slide 33

Slide 33 text

Programming language
• Write the resources in the language used for the policies
• Better for “dev-oriented” users
• Flexibility
• Use a data format or a DSL, and a different language for resources
• Hides the complexity
• Can limit user ability to hack the system

Slide 34

Slide 34 text

Side note: Tech stack & values
Technological choices matter
• Properties given by the tech stack
• Fast, Portable, Beginner-friendly, etc.
• Values carried with the tech stack
• Attracts different people
• Helps shape the community

Slide 35

Slide 35 text

Language in infra software
• C
• Ruby
• Python
• Go
• C++
• Rust

Slide 36

Slide 36 text

Impact
• System administrators are not developers
• Some languages are oriented towards software developers
• Prevents extension by users
• Worse is better?

Slide 37

Slide 37 text

Add a custom resource to software
• Hardcoded resources may be hard to add
• Extension APIs
• Sometimes a “language” version (library) and a “data” version (YAML, etc.)

Slide 38

Slide 38 text

Ansible
• AnsibleModule library for Python
• JSON for other languages

Slide 39

Slide 39 text

CFEngine
• JSON protocol over stdin/stdout
• Any binary can implement it
• Spawn the binary at agent start, call N times, close.

{
  "operation": "evaluate_promise",
  "log_level": "info",
  "promise_type": "git",
  "promiser": "/opt/cfengine/masterfiles",
  "attributes": {"repo": "https://github.com/cfengine/masterfiles"}
}
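
The module side of such an exchange could be sketched as below. This is a simplified illustration, not the complete CFEngine protocol: it ignores the initial handshake, validation and termination requests, and the response fields other than the kept/repaired/not_kept result are assumptions. The `is_ok` and `fix` callbacks are hypothetical stand-ins for the real resource logic.

```python
import json


def evaluate(request, is_ok, fix):
    """Turn one request into a response carrying the promise outcome."""
    response = {
        "operation": request["operation"],
        "promiser": request["promiser"],
    }
    try:
        if is_ok(request["promiser"], request["attributes"]):
            response["result"] = "kept"
        else:
            fix(request["promiser"], request["attributes"])
            response["result"] = "repaired"
    except Exception:
        # A real module would also report the failure through its logging.
        response["result"] = "not_kept"
    return response


def handle_line(line, is_ok, fix):
    """Decode one JSON request line and encode the JSON response line."""
    return json.dumps(evaluate(json.loads(line), is_ok, fix))
```

Since the contract is only JSON over stdin/stdout, the same logic could be written in any language that can read and write lines.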

Slide 40

Slide 40 text

CFEngine
• Logging protocol
• Outcome
• kept
• repaired
• not_kept

Slide 41

Slide 41 text

CFEngine
• Called like a standard resource

bundle agent main
{
  my_custom_type:
    "promiser"
      parameter1 => "value1",
      parameter2 => "value2",
      parameter3 => "value3";
}

Slide 42

Slide 42 text

mgmt

type Res interface {
    CheckApply(apply bool) (checkOK bool, err error)
    Default() Res
    Validate() error
    Init(*Init) error
    Close() error
    Watch() error
    ...
}

Slide 43

Slide 43 text

Jet
• Ansible-like in Rust
• Speed
• Reliability
• Interesting choices
• Shell as unique system interface
• (Now abandoned AFAIK)

Slide 44

Slide 44 text

Windows
• Interesting stuff is happening
• DSC 3.0
• Multiplatform (Windows, macOS, Linux)
• Open-source
• No dependency on PowerShell
• Resources can be written in any language
• JSON instead of MOF
• YAML policies

Slide 45

Slide 45 text

Observability
• Growing infrastructure complexity
• Describe state and changes in a structured way
• Text-based is not enough
• The hardest part in implementation
• Extract information
• Comprehensive error handling

Slide 46

Slide 46 text

Security/Traceability
• Supply-chain security
• No news is not good news?

Slide 47

Slide 47 text

Isn’t immutability the solution?
• Infrastructures are not immutable
• A way to model changes
• Move the mutation to a higher abstraction layer
• Mutability is light and fast
• Immutable means frozen, not understood or observable

Slide 48

Slide 48 text

• Parallel with programming languages?
• Mutability is a major source of bugs
• Immutability is a way to prevent them
• Most programs are written with mutability everywhere
• E.g. Rust aims at managing mutability better instead of preventing it

Slide 49

Slide 49 text

In Rudder
• We have a Unix agent (cf-agent) and a Windows agent (based on Windows native technologies)
• We want to be able to share code between agents
• We believe in strong typing (Scala, Rust, Elm)
• We look for reliability and performance
• Our users are (mostly) not devs
• No big-bang, transition should be progressive
• No new agent!

Slide 50

Slide 50 text

In Rudder
Choices
• Standalone pluggable resources
• Implemented in Rust
• Using extensibility APIs
• A resource: one or more checkApply functions
• Try to enrich existing data models
• Independent resources for now
• No magic features like mgmt

Slide 51

Slide 51 text

In Rudder
CFEngine Custom Promise Types
• Almost perfect for our needs
• Allows moving the complexity from the DSL/data to a programming language
• We had to add an intermediate API, made possible by the arbitrary JSON data passed
• The first level in the parameters JSON is interpreted by the library
• A subkey is passed to the resource implementation
• Allows passing metadata (machine ID, public key, temp dir, etc.)
• Not a problem as we compile .cf policies from YAML
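
The two-level parameter layout described above can be sketched as follows. The key names used here (`data`, `node_id`, `temporary_dir`) and the function name are purely hypothetical and chosen for illustration; they are not taken from Rudder's actual implementation.

```python
def split_attributes(attributes, payload_key="data"):
    """Separate library-level metadata from the resource's own parameters."""
    if payload_key not in attributes:
        raise KeyError(f"missing '{payload_key}' subkey in attributes")
    # Everything at the first level except the payload is metadata
    # interpreted by the library.
    metadata = {k: v for k, v in attributes.items() if k != payload_key}
    # Only the subkey content reaches the resource implementation.
    return metadata, attributes[payload_key]
```

For instance, `split_attributes({"node_id": "abc", "temporary_dir": "/tmp/x", "data": {"path": "/etc/app"}})` hands only `{"path": "/etc/app"}` to the resource, while the library keeps the rest.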

Slide 52

Slide 52 text

In Rudder
Language Server Protocol
• A standard for implementing IDE features for a language
• From VS Code
• JSON communication with a binary
• Allows plugging into a lot of editors with a single implementation
• Could we take a form of inspiration from this?
• But different incentives and economy

Slide 53

Slide 53 text

In Rudder
Rudder
• Rust library for building binaries
• Implements several agents' extension APIs
• Observability and traceability focus
• Integrate with existing engines
• Additional static validation
• YAML data with additional typing and validation

Slide 54

Slide 54 text

In Rudder

pub trait ModuleType {
    fn metadata(&self) -> ModuleTypeMetadata;
    fn init(&mut self) -> ProtocolResult;
    fn validate(&self, _parameters: &Parameters) -> ValidateResult;
    fn check_apply(&mut self, mode: PolicyMode, parameters: &Parameters) -> CheckApplyResult;
    fn terminate(&mut self) -> ProtocolResult;
}

pub type ValidateResult = Result<()>;

pub enum Outcome {
    Success(Option<String>),
    Repaired(String),
}

Slide 55

Slide 55 text

In Rudder
Inventory
• Module types can be a source of inventory data
• Either targeted or global
• Structured output in checkApply
• Diff

Slide 56

Slide 56 text

In Rudder
Other perspectives
• NixOS/NixOps, Guix
• Controlled mutability
• Microkernel
• Specialized systems
• e.g.: Talos Linux
• No SSH
• Only an HTTP API
• Specialized use case

Slide 57

Slide 57 text

In Rudder
Thank you!
• Keep exploring
• Not over yet
• Have fun!
These slides will be available at: amousset.me/checkApply