Slide 1

Slide 1 text

© 2024 Wantedly, Inc. Getting along with YAML comments with Psych. May 16, 2024 - Masaki Hara @ RubyKaigi 2024

Slide 2

Slide 2 text

Masaki Hara (@qnighy)

Slide 3

Slide 3 text

Masaki Hara @ Wantedly. Stop by the Wantedly booth!

Slide 4

Slide 4 text

I made a neat little library! That is what my talk is about.

Slide 5

Slide 5 text

Overview
● Introducing the psych-comments gem
● Intro to YAML
● Implementation
● Our use-case
● Story on the library’s scope

Slide 6

Slide 6 text

Introducing psych-comments

Slide 7

Slide 7 text

Example: you want this (simplified example)

Before:
    env:
      # build
      - NPM_TOKEN
      # deploy
      - GH_TOKEN

After:
    env:
      # build
      - NPM_TOKEN
      # deploy
      - GH_TOKEN
      - NEW_TOKEN

Slide 10

Slide 10 text

Psych (a.k.a. YAML)

    obj = YAML.load(input)
    obj["env"] << "NEW_TOKEN"
    output = YAML.dump(obj)

Slide 11

Slide 11 text

Psych (a.k.a. YAML)

    obj = Psych.load(input)
    obj["env"] << "NEW_TOKEN"
    output = Psych.dump(obj)

Slide 12

Slide 12 text

Psych (a.k.a. YAML)

Input:
    env:
      # build
      - NPM_TOKEN
      # deploy
      - GH_TOKEN

Output:
    ---
    env:
    - NPM_TOKEN
    - GH_TOKEN
    - NEW_TOKEN
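
A minimal sketch of the round trip on this slide, using only the standard Psych API. The `input` string is an assumption that mirrors the slide's example.

```ruby
require "yaml"

# Assumed input, mirroring the example on the slide.
input = <<~YAML
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
YAML

obj = YAML.load(input)        # comments are discarded while parsing
obj["env"] << "NEW_TOKEN"
output = YAML.dump(obj)       # nothing is left to write the comments back

puts output
```

The data survives the round trip, but both comments are gone from `output`.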

Slide 13

Slide 13 text

Psych-comments

    bundle add psych-comments

    require "psych/comments"

    s = Psych::Comments.parse_stream(input)
    env = s.children[0].children[0].children[1]
    env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
    output = Psych::Comments.emit_yaml(s)

Slide 14

Slide 14 text

Psych-comments

Input:
    env:
      # build
      - NPM_TOKEN
      # deploy
      - GH_TOKEN

Output:
    env:
      # build
      - NPM_TOKEN
      # deploy
      - GH_TOKEN
      - NEW_TOKEN

Slide 15

Slide 15 text

A library for YAML comments. That’s it!

Slide 16

Slide 16 text

Intro to YAML

Slide 17

Slide 17 text

YAML myth: “YAML is a fancy JSON”

Slide 18

Slide 18 text

YAML myth: “YAML is a fancy JSON” 🙅

Slide 19

Slide 19 text

YAML family tree
    XML (1998) → YAML 1.0: YAML as “simple XML”
    Perl Marshaller (Data::Denter, around 2001) → YAML 1.0: YAML as “Portable Marshaller”
    JSON (2001, launch of json.org) → YAML 1.2: YAML as “JSON upper-compat”
    YAML 1.0: 2004
    YAML 1.2: 2009

Slide 21

Slide 21 text

YAML and Marshal
YAML is aware of…
● custom objects
● aliases
● cyclic references
● streams

Slide 22

Slide 22 text

YAML and Marshal: custom objects

    YAML.unsafe_load("!ruby/regexp /[a-z]/")
    Marshal.load("\x04\x08I/\x0A[a-z]\x00\x06:\x06EF")

Both return: /[a-z]/

Slide 23

Slide 23 text

YAML and Marshal: aliases

    YAML.unsafe_load("[&x [], *x]")
    Marshal.load("\x04\x08[\x07[\x00@\x06")

Both return: [[]] * 2

Slide 24

Slide 24 text

YAML and Marshal: cyclic references

    YAML.unsafe_load("&x [*x]")
    Marshal.load("\x04\x08[\x06@\x00")

Both return: [[...]]
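
A short check of the alias and cycle behavior, as a sketch. It assumes a Psych version that provides `YAML.unsafe_load` (the call used on the slides); the variable names are illustrative.

```ruby
require "yaml"

# Aliases: both elements are the very same Array object.
shared = YAML.unsafe_load("[&x [], *x]")
same_object = shared[0].equal?(shared[1])

# Cyclic reference: the array contains itself.
cyclic = YAML.unsafe_load("&x [*x]")
self_contained = cyclic[0].equal?(cyclic)

# Marshal keeps the same sharing across a dump/load round trip.
copy = Marshal.load(Marshal.dump(shared))
still_shared = copy[0].equal?(copy[1])

p same_object, self_contained, still_shared
```

Object identity, not just equality, is what both formats preserve here.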

Slide 25

Slide 25 text

YAML and Marshal: streams

    "---\n1\n---\n2"                  # YAML
    "\x04\x08i\x06\x04\x08i\x07"      # Marshal

Both contain two documents: 1 and 2
NOTE: RGSS save data are one such example
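
The two streams can be read back with standard-library calls only, as sketched below. Reading Marshal through a `StringIO` is one way to consume several dumps in a row; the variable names are illustrative.

```ruby
require "yaml"
require "stringio"

# A YAML stream holds several documents separated by "---".
yaml_docs = YAML.load_stream("---\n1\n---\n2")

# A Marshal stream is simply several dumps written back to back.
marshal_bytes = Marshal.dump(1) + Marshal.dump(2)
io = StringIO.new(marshal_bytes)
marshal_docs = [Marshal.load(io), Marshal.load(io)]

p yaml_docs, marshal_docs
```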

Slide 26

Slide 26 text

Abstraction! Fight against complexity

Slide 27

Slide 27 text

Three-fold YAML processing

    Layer            Data
    Native           Ruby Object
    Representation   Node Graph
    Serialization    Event Tree
    Presentation     YAML

Slide 28

Slide 28 text

Three-fold YAML processing

    Layer            Data          Example
    Native           Ruby Object   [1, true].tap { |a| a << a }
    Representation   Node Graph    !!seq [...] containing !!int "1", !!bool "true", and a link back to the sequence itself
    Serialization    Event Tree    [...] containing "1", "true", *a
    Presentation     YAML          "&a [1, true, *a]"

Slide 29

Slide 29 text

Three-fold YAML processing

    Layer            Data
    Native           Ruby Object
    Representation   Node Graph
    Serialization    Event Tree
    Presentation     YAML

    Event Tree → Node Graph: process aliases and anchors (cycles etc.)
    Node Graph → Ruby Object: process tags (custom objects)
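
To see two of these stages from Ruby, the sketch below parses the slide's example into Psych's node tree and then converts it to a Ruby object. It uses only `Psych.parse` and `to_ruby`; the variable names are illustrative.

```ruby
require "yaml"

doc  = Psych.parse("&a [1, true, *a]")   # Psych::Nodes::Document
root = doc.root                           # Psych::Nodes::Sequence

# In the parsed tree, anchors and aliases are still explicit,
# and scalars are still plain strings without a resolved type.
anchor = root.anchor
kinds  = root.children.map { |n| n.class.name.split("::").last }
values = root.children.grep(Psych::Nodes::Scalar).map(&:value)

# Converting to Ruby connects the alias and resolves the tags.
ruby = doc.to_ruby

p anchor, kinds, values
p ruby[0], ruby[1], ruby[2].equal?(ruby)
```

After `to_ruby`, the third element is the array itself, which matches `[1, true].tap { |a| a << a }`.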

Slide 30

Slide 30 text

YAML Kind: Sequence, Mapping, Scalar

Slide 31

Slide 31 text

YAML Kind

Sequence:
    [1, 2, 3]

    - 1
    - 2
    - 3

Mapping:
    { a: b }

    a: b

    ? [1, 2]
    : [3, 4]

Scalar:
    foo

    1

    "2"

    >
      foo
      bar

Slide 32

Slide 32 text

YAML Kind and Tags
    Sequence: !!seq, !!omap, !!pairs
    Mapping: !!map, !!set, !ruby/object
    Scalar: !!str, !!binary, !!null, !!bool, !!int, !!float, !!timestamp, !!yaml, !ruby/symbol, !!merge, !!value

Slide 33

Slide 33 text

Tag resolution
Default for Sequences, Mappings, and quoted Scalars

    Source       Resolved
    [1, 2, 3]    !!seq [1, 2, 3]
    { a: b }     !!map { a: b }
    "foo"        !!str "foo"

Slide 34

Slide 34 text

Plain scalar resolution
Plain scalars, defined by a “Schema”

    Source    Resolved
    null      !!null "null"
    false     !!bool "false"
    123       !!int "123"
    123.45    !!float "123.45"

Slide 35

Slide 35 text

Plain scalar resolution
Schema = Pairs of (Regexp, tag)

    Pattern                                        Tag to be resolved
    /^(null|Null|NULL|~)?$/                        !!null
    /^(true|True|TRUE|false|False|FALSE)$/         !!bool
    /^([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$/      !!int
    (omit)                                         !!float
    /^.*$/                                         !!str

Slide 36

Slide 36 text

Plain scalar resolution
Application-specific schema

    Pattern                                        Tag to be resolved
    /^(null|Null|NULL|~)?$/                        !!null
    /^(true|True|TRUE|false|False|FALSE)$/         !!bool
    /^([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$/      !!int
    (omit)                                         !!float
    /^:.*$/                                        !ruby/symbol
    /^<<$/                                         !!merge
    /^.*$/                                         !!str
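
The "pairs of (Regexp, tag)" idea can be sketched as a first-match lookup. This is an illustration, not Psych's implementation: `CORE_LIKE_SCHEMA`, `APP_SCHEMA`, and `resolve_plain_scalar` are hypothetical names, and the `!!float` row is left out just as it is on the slides, so decimals fall through to `!!str` here.

```ruby
# Ordered (pattern, tag) pairs; the first matching pattern wins.
# \A and \z replace the slide's ^ and $ because Ruby's ^/$ match per line.
CORE_LIKE_SCHEMA = [
  [/\A(null|Null|NULL|~)?\z/, "!!null"],
  [/\A(true|True|TRUE|false|False|FALSE)\z/, "!!bool"],
  [/\A([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)\z/, "!!int"],
  # The !!float pattern is omitted, as on the slide.
  [/\A.*\z/m, "!!str"],
].freeze

# Application-specific schema: extra rows inserted before the !!str fallback.
APP_SCHEMA = (CORE_LIKE_SCHEMA[0...-1] + [
  [/\A:.*\z/, "!ruby/symbol"],
  [/\A<<\z/, "!!merge"],
  CORE_LIKE_SCHEMA.last,
]).freeze

# Returns the tag of the first pattern that matches the plain scalar text.
def resolve_plain_scalar(text, schema = CORE_LIKE_SCHEMA)
  schema.each { |pattern, tag| return tag if pattern.match?(text) }
  nil
end

p resolve_plain_scalar("null")
p resolve_plain_scalar(":foo")
p resolve_plain_scalar(":foo", APP_SCHEMA)
```

Row order matters: the catch-all `!!str` row has to stay last, which is why the application-specific rows are spliced in before it.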

Slide 37

Slide 37 text

YAML is…
● YAML is (in a way) a portable Marshal
● Abstraction layers to sort out complexity
  ○ Parsing: remove syntax details
  ○ Deserializing: connect anchors and aliases
  ○ Interpreting: resolve and realize tags

Slide 38

Slide 38 text

Implementation

Slide 39

Slide 39 text

Recap: Three-fold YAML processing

    Layer            Data
    Native           Ruby Object
    Representation   Node Graph
    Serialization    Event Tree
    Presentation     YAML

Slide 40

Slide 40 text

Psych’s API levels

    Layers: Native (Ruby Object), Representation (Node Graph), Serialization (Event Tree), Presentation (YAML)

    High-level API: Psych.load, Psych.dump
    Mid-level API: Psych.parse, #to_yaml
    Low-level API: Psych::Parser, Psych::Emitter
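
For a feel of the low-level API, the sketch below feeds a document to `Psych::Parser` with a handler that only records event names. `EventRecorder` is a hypothetical class written for this illustration; `Psych::Handler` and `Psych::Parser` are the standard classes.

```ruby
require "yaml"

# Records the name of every event the low-level parser emits.
class EventRecorder < Psych::Handler
  attr_reader :events

  def initialize
    super
    @events = []
  end

  %i[start_stream start_document start_sequence end_sequence
     start_mapping end_mapping scalar alias
     end_document end_stream].each do |name|
    define_method(name) { |*_args| @events << name }
  end
end

recorder = EventRecorder.new
Psych::Parser.new(recorder).parse("&a [1, true, *a]")

p recorder.events
```

The alias shows up as its own `alias` event: at this level nothing has been connected or resolved yet.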

Slide 41

Slide 41 text

Psych’s API levels and psych-comments

    Layers: Native (Ruby Object), Representation (Node Graph), Serialization (Event Tree), Presentation (YAML)

    Mid-level API: Psych.parse, #to_yaml
    psych-comments’ API

Slide 42

Slide 42 text

Recap: Psych (a.k.a. YAML): high-level API

    obj = YAML.load(input)
    obj["env"] << "NEW_TOKEN"
    output = YAML.dump(obj)

Slide 43

Slide 43 text

Recap: Psych (a.k.a. YAML): high-level API

    obj = Psych.load(input)
    obj["env"] << "NEW_TOKEN"
    output = Psych.dump(obj)

Slide 44

Slide 44 text

Psych (a.k.a. YAML): mid-level API

    s = Psych.parse_stream(input)
    env = s.children[0].children[0].children[1]
    env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
    output = s.to_yaml
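
The mid-level snippet can be run as-is once `input` is defined. In this sketch the `input` string is an assumption mirroring the earlier example; everything else is the standard Psych API.

```ruby
require "yaml"

# Assumed input, mirroring the earlier example.
input = <<~YAML
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
YAML

s = Psych.parse_stream(input)                 # Psych::Nodes::Stream
# stream -> document -> root mapping -> value of the "env" key
env = s.children[0].children[0].children[1]   # Psych::Nodes::Sequence
env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
output = s.to_yaml

puts output
```

Editing the node tree keeps the structure, but the comments are still missing from `output`, because the tree built by Psych never contained them.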

Slide 45

Slide 45 text

Recap: Psych-comments

    s = Psych::Comments.parse_stream(input)
    env = s.children[0].children[0].children[1]
    env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
    output = Psych::Comments.emit_yaml(s)

Slide 46

Slide 46 text

We cannot simply extend it!

Slide 47

Slide 47 text

Psych (lack of) extensibility
Psych uses libyaml (a C library)

    Psych::Nodes::Node: extendable from Ruby
    libyaml (between the nodes and the YAML text): not extendable from Ruby

Slide 48

Slide 48 text

Parser: divide the passes

Slide 49

Slide 49 text

Extending the parser: 2-pass parser

    Pass 1 (parse nodes): YAML text → libyaml → Psych::Nodes::Node without comments, plus source locations
    Pass 2 (parse comments): psych-comments → Psych::Nodes::Node with comments

Slide 50

Slide 50 text

Comment scanning algorithm
Step 1: remember the cursor position

    - # egg
      # pork
      bar

Slide 51

Slide 51 text

Comment scanning algorithm
Step 2: select the subtext running to the next position

    - # egg
      # pork
      bar

Slide 52

Slide 52 text

Comment scanning algorithm
Step 3: extract “#”

    - # egg
      # pork
      bar

Slide 53

Slide 53 text

Comment scanning algorithm
Step 4: attach them to the following node

    - # egg
      # pork
      bar

Slide 54

Slide 54 text

Comment scanning algorithm
Step 5: recurse into all descendants and repeat

    - # egg
      # pork
      bar
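
The steps above can be sketched in a few lines on top of the source locations that Psych records on each node. This is a simplified illustration, not the gem's actual code: `line_starts` and `scan_comments` are hypothetical helpers, only leading comments are handled, and the edge cases discussed next are ignored.

```ruby
require "yaml"

# Absolute character offset of the start of every line.
def line_starts(src)
  starts = [0]
  src.each_line { |line| starts << starts.last + line.length }
  starts
end

# Walks the tree in document order, keeping a cursor into the source text.
# Comments found between the cursor and a node's start are attached to it.
def scan_comments(node, src, starts, state)
  node_start = starts[node.start_line] + node.start_column
  gap = src[state[:cursor]...node_start] || ""
  found = gap.scan(/#.*$/)
  state[:comments][node] = found unless found.empty?
  state[:cursor] = [state[:cursor], node_start].max

  case node
  when Psych::Nodes::Sequence, Psych::Nodes::Mapping
    node.children.each { |child| scan_comments(child, src, starts, state) }
  else
    # Scalars and aliases: jump past them so a "#" inside is never scanned.
    state[:cursor] = starts[node.end_line] + node.end_column
  end
end

src = "- # egg\n  # pork\n  bar\n"
state = { cursor: 0, comments: {}.compare_by_identity }
scan_comments(Psych.parse(src).root, src, line_starts(src), state)

attached = state[:comments].map { |node, comments| [node.value, comments] }
p attached
```

Both comments end up attached to the `bar` scalar, the node that follows them. The comments are kept in a side table here so that the parsed nodes stay untouched.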

Slide 55

Slide 55 text

Comment scanning edge case 1: unwanted occurrence of #
foo#bar: < # baz

Slide 56

Slide 56 text

Comment scanning edge case 1, solution: skip over scalars
foo#bar: < # baz

Slide 57

Slide 57 text

Comment scanning edge case 2: comments before delimiters

    [
      1,
      2,
      # foo
    ]

Slide 58

Slide 58 text

Comment scanning edge case 2, solution: attach as trailing comments

    [
      1,
      2,
      # foo
    ]

Slide 59

Slide 59 text

Comment scanning edge case 3: comments on a key-value pair

    # foo
    foo: 1
    bar:
      # bar
      2

NOTE: Psych has no node type for key-value pairs; keys and values alternate in one flat array instead.

Slide 60

Slide 60 text

Comment scanning edge case 3: comments on a key-value pair

    Mapping children: Key 0, Val 0, Key 1, Val 1, Key 2, Val 2, …

Flat array! (in Psych)
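
The flat layout is easy to confirm with plain Psych. In this small sketch, pairing the children back up with `each_slice(2)` is one possible way to recover key-value pairs; the variable names are illustrative.

```ruby
require "yaml"

mapping = Psych.parse("foo: 1\nbar: 2").root   # Psych::Nodes::Mapping

# Keys and values alternate in one array.
flat = mapping.children.map(&:value)

# Pairs have to be reconstructed by the caller.
pairs = mapping.children.each_slice(2).map { |key, value| [key.value, value.value] }

p flat, pairs
```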

Slide 61

Slide 61 text

Comment scanning edge case 3, solution: attach them to the key

    # foo
    foo: 1
    bar:
      # bar
      2

Slide 62

Slide 62 text

Comment scanning edge case 4: comment on a bullet

    root:
      # foo
      - foo: 1
      - # bar
        bar: 2

Slide 63

Slide 63 text

Comment scanning edge case 4, solution: attach to the whole element

    root:
      # foo
      - foo: 1
      - # bar
        bar: 2

NOTE: “foo: 1” implicitly generates a Mapping node, which the comment attaches to.

Slide 64

Slide 64 text

Psych-comments parser
All of this came down to only 100 LOC!
https://github.com/wantedly/psych-comments/blob/v0.1.1/lib/psych/comments/parsing.rb

Slide 65

Slide 65 text

Generator: reimplement it 💪

Slide 66

Slide 66 text

Recap: Psych (lack of) extensibility
Psych uses libyaml (a C library)

    Psych::Nodes::Node: extendable from Ruby
    libyaml (between the nodes and the YAML text): not extendable from Ruby

Slide 67

Slide 67 text

Extending the generator
Just reimplement the generator 💪 (except for scalars)

    Psych::Nodes::Node → psych-comments → YAML text
    Collections: generated by psych-comments
    Scalars: delegated to libyaml

Slide 68

Slide 68 text

Internal commands for YAML formatting
Prepared utilities for formatting

    print "foo"          Print after generating reserved spaces and indentation
    space! / newline!    Reserve spaces or indentation
    indented do … end    Bump indent level

NOTE: there is one more “virtual indentation” util for bullets

Slide 69

Slide 69 text

Internal commands for YAML formatting
space! reserves a space

    foo:

    foo:

Slide 70

Slide 70 text

Internal commands for YAML formatting
space! reserves a space

    foo:
    - bar

    foo:␣bar

Slide 71

Slide 71 text

Internal commands for YAML formatting
newline! reserves indentation

    - foo: bar

    - foo: bar

Slide 72

Slide 72 text

Internal commands for YAML formatting
newline! reserves indentation

    - foo: bar
    - baz

    - foo: bar
      bar: baz
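
The "reserve now, emit on the next print" idea can be sketched with a tiny formatter. `TinyFormatter` is a hypothetical class written for this illustration; it borrows the command names from the slides but is not the gem's emitter, and the two-space indent step is an assumption.

```ruby
# Whitespace is only reserved by space!/newline!; it is written out
# when the next print call actually produces text.
class TinyFormatter
  attr_reader :out

  def initialize
    @out = +""
    @indent = 0
    @pending = nil   # nil, :space, or :newline
  end

  # Reserve a single space unless a newline is already reserved.
  def space!
    @pending ||= :space
  end

  # Reserve a line break followed by the current indentation.
  def newline!
    @pending = :newline
  end

  # Bump the indent level for the duration of the block.
  def indented
    @indent += 2
    yield
  ensure
    @indent -= 2
  end

  # Flush whatever was reserved, then write the text.
  def print(text)
    case @pending
    when :space   then @out << " "
    when :newline then @out << "\n" << " " * @indent
    end
    @pending = nil
    @out << text
  end
end

f = TinyFormatter.new
f.print "foo:"
f.space!
f.print "bar"
f.newline!
f.print "baz:"
f.indented do
  f.newline!
  f.print "- qux"
end
f.newline!   # reserved but never flushed, so no trailing whitespace

puts f.out
```

Deferring the whitespace means the caller never has to know what comes next: a reservation that is not followed by a print simply disappears.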

Slide 73

Slide 73 text

Indentation adjustment: a sequence in a mapping is special
- - baz - bar: baz foo: - baz foo: bar: baz 4 4 2 4

Slide 74

Slide 74 text

Generating bullet comments
Hoisting map comments above bullets

    root:
      # foo
      - foo: 1
      - # bar
        bar: 2

Slide 75

Slide 75 text

Generating bullet comments
Hoisting map comments above bullets →
● Look ahead in the tree and generate the comments
● Then avoid duplication via a queue
  ○ Note that we should not mutate the input

Slide 76

Slide 76 text

Psych-comments generator
Reimplementation took only 300 LOC 💪💪💪
https://github.com/wantedly/psych-comments/blob/v0.1.1/lib/psych/comments/emitter.rb

Slide 77

Slide 77 text

Our use-case

Slide 78

Slide 78 text

config/locales

    I18n.t("our_new_service.try_it_out")

Slide 79

Slide 79 text

config/locales

    en:
      our_new_service:
        try_if_out: "Try it out"

    ja:
      our_new_service:
        try_it_out: "試してみる"

Slide 80

Slide 80 text

config/locales: translation mistakes
Problem 1: mistakes and oversight

    en:
      our_new_service:
        try_if_out: "Try it out"

    ja:
      our_new_service:
        try_it_out: "試してみる"

Slide 81

Slide 81 text

config/locales: translation mistakes
Solution 1: check for correspondence

    Synchronizing locales...
    Generating en.our_new_service.try_it_out
    Generating ja.our_new_service.try_if_out

Slide 82

Slide 82 text

config/locales: translation mistakes
Solution 1 (continued): and generate boilerplate

    en:
      our_new_service:
        try_it_out: !todo "試してみる"

    ja:
      our_new_service:
        try_it_out: "試してみる"

Slide 83

Slide 83 text

config/locales: intentional absence
Problem 2: intentionally limiting the languages to support

    en:
      # No data due to
      # translation
      # cost

    ja:
      japan_only:
        start: "開始"

Slide 84

Slide 84 text

config/locales: intentional absence
Solution 2: (ab)use YAML tags

    en:
      # No data due to
      # translation
      # cost

    ja:
      japan_only:
        start: !only:ja "開始"

cf. https://github.com/creasty/i18n_flow

Slide 85

Slide 85 text

config/locales: tag syntax error
Problem 3: roundtrip failure
!only:en,ja ! YAML 1.1 / libyaml 0.2.4 YAML 1.2 / libyaml 0.2.5 ✅ ✅ ✅ ❌
cf. https://github.com/yaml/libyaml/pull/179

Slide 86

Slide 86 text

config/locales: tag syntax error
Problem 3: roundtrip failure
!only:en,ja ! YAML 1.1 / libyaml 0.2.4 YAML 1.2 / libyaml 0.2.5 ✅ ✅ ✅ ❌
libyaml still generates this!

Slide 87

Slide 87 text

config/locales: tag syntax error
Problem 3’s root cause: tag abuse

Legitimate tagging examples:
    !ruby/symbol foo
    !!omap [foo: 1, bar: 2]
    !!set { foo, bar }

Slide 88

Slide 88 text

config/locales: tag syntax error
Solution 3: use comments for tags 👍

    en:
      # No data due to
      # translation
      # cost

    ja:
      japan_only:
        # i18n:only:ja
        start: "開始"

Slide 90

Slide 90 text

config/locales: resolution
Problem 4: Psych lacks comment support
→ Solution 4: implement it myself 💪💪💪💪

Slide 91

Slide 91 text

Story on the library’s scope

Slide 92

Slide 92 text

Comment position
Psych-comments (0.1.1) comment position

    # foo      ← leading comments
    foo
    # baz      ← trailing comments

Slide 93

Slide 93 text

Comment position
Then line-end comments

    # foo      ← leading comments
    foo # bar  ← line-end comments
    # baz      ← trailing comments

Slide 94

Slide 94 text

PR for line-end comments. Thanks!
https://github.com/wantedly/psych-comments/pull/2

Slide 95

Slide 95 text

Comment positioning problem

    - 1 # foo
    - 12 # bar

    - 1 # foo
    - 12 # bar

    - 1 # foo
    - 12 # bar

Psych::Nodes::Node

Slide 96

Slide 96 text

Comment positioning problem

Slide 97

Slide 97 text

Scopes: there is no end to spacing details
foo: - 1 - 2 foo: - 1 - 2 - 1 - 12 - 1 - 12 # foo - a: b # foo - a: b foo: - 1 - 2

Slide 98

Slide 98 text

Scopes: there is no end to feature requests
https://prettier.io/docs/en/option-philosophy

Slide 99

Slide 99 text

Recap: Three-fold YAML processing

    Layer            Data          Example
    Native           Ruby Object   [1, true].tap { |a| a << a }
    Representation   Node Graph    !!seq [...] containing !!int "1", !!bool "true", and a link back to the sequence itself
    Serialization    Event Tree    [...] containing "1", "true", *a
    Presentation     YAML          "&a [1, true, *a]"

Slide 100

Slide 100 text

Levels of abstraction
    Layers: Presentation, Serialization, Representation
    Details: Anchors & aliases, Non-specific tags, Scalar content formatting, Directives, Node style, Comments, Spacing, Key ordering, Tag style, Escapes, Tags, Node links, Scalar content

Slide 101

Slide 101 text

Levels of abstraction – Psych
    Layers: Presentation, Serialization, Representation
    Details: Anchors & aliases, Non-specific tags, Scalar content formatting, Directives, Node style, Comments, Spacing, Key ordering, Tag style, Escapes, Tags, Node links, Scalar content
    Highlighted: Psych Mid-level API

Slide 102

Slide 102 text

Levels of abstraction – Psych
    Layers: Presentation, Serialization, Representation
    Details: Anchors & aliases, Non-specific tags, Scalar content formatting, Directives, Node style, Comments, Spacing, Key ordering, Tag style, Escapes, Tags, Node links, Scalar content
    Highlighted: Psych-comments

Slide 103

Slide 103 text

Recap: Psych-comments layer
    Layers: Presentation, Serialization, Representation
    Details: Anchors & aliases, Non-specific tags, Scalar content formatting, Directives, Node style, Comments, Spacing, Key ordering, Tag style, Escapes, Tags, Node links, Scalar content
    Highlighted: Psych-comments

Slide 104

Slide 104 text

Scopes: deciding and communicating

Slide 105

Slide 105 text

Wrap up

Slide 106

Slide 106 text

Wrap up
● I made the psych-comments gem.
● It processes YAML comments.
● It neatly solves your problem, partly reusing Psych’s own algorithms.
● Thankfully, people are interested in expanding it, but as a responsible maintainer, I’m going to limit its scope.