Getting along with YAML comments with Psych
by
Masaki Hara
Slide 1
Slide 1 text
© 2024 Wantedly, Inc. Getting along with YAML comments with Psych. May 16, 2024 - Masaki Hara @ RubyKaigi 2024
Slide 2
Slide 2 text
Masaki Hara (@qnighy)
Slide 3
Slide 3 text
Masaki Hara @ Wantedly. Stop by the Wantedly booth!
Slide 4
Slide 4 text
I made a neat little library! That is what my talk is about.
Slide 5
Slide 5 text
Overview
● Introducing the psych-comments gem
● Intro to YAML
● Implementation
● Our use-case
● Story on the library's scope
Slide 6
Slide 6 text
Introducing psych-comments
Slide 7
Slide 7 text
Example: you want this (simplified example)
Before:
    env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
After:
    env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
    - NEW_TOKEN
Slide 10
Slide 10 text
Psych (a.k.a. YAML)
    obj = YAML.load(input)
    obj["env"] << "NEW_TOKEN"
    output = YAML.dump(obj)
Slide 11
Slide 11 text
Psych (a.k.a. YAML)
    obj = Psych.load(input)
    obj["env"] << "NEW_TOKEN"
    output = Psych.dump(obj)
Slide 12
Slide 12 text
Psych (a.k.a. YAML)
Input:
    env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
Output (the comments are gone):
    ---
    env:
    - NPM_TOKEN
    - GH_TOKEN
    - NEW_TOKEN
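The comment loss shown on this slide can be reproduced with a minimal sketch that uses only the standard library. The input string mirrors the slide; the exact spacing of the dumped text may vary with the Psych version, so no output is asserted here.

```ruby
require "yaml"

# Input from the slide: a sequence whose items carry comments.
input = <<~YAML
  env:
  # build
  - NPM_TOKEN
  # deploy
  - GH_TOKEN
YAML

obj = YAML.load(input)    # comments are discarded while parsing
obj["env"] << "NEW_TOKEN"
output = YAML.dump(obj)   # the Hash holds no comments to emit

puts output
```

Because the high-level API goes all the way to plain Ruby objects, there is nowhere for a comment to be stored in between.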
Slide 13
Slide 13 text
Psych-comments
Setup:
    bundle add psych-comments
    require "psych/comments"
Usage:
    s = Psych::Comments.parse_stream(input)
    env = s.children[0].children[0].children[1]
    env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
    output = Psych::Comments.emit_yaml(s)
Slide 14
Slide 14 text
Psych-comments
Input:
    env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
Output (the comments are kept):
    env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
    - NEW_TOKEN
Slide 15
Slide 15 text
A library for YAML comments. That's it!
Slide 16
Slide 16 text
Intro to YAML
Slide 17
Slide 17 text
YAML myth: “YAML is a fancy JSON”
Slide 18
Slide 18 text
YAML myth: “YAML is a fancy JSON” 🙅
Slide 19
Slide 19 text
YAML family tree
● XML (1998) → YAML as “simple XML”
● Perl marshaller (Data::Denter, around 2001) → YAML as “Portable Marshaller”
● JSON (2001, launch of json.org) → YAML as “JSON upper-compat”
● YAML 1.0 (2004) → YAML 1.2 (2009)
Slide 21
Slide 21 text
YAML and Marshal
YAML is aware of…
● custom objects
● aliases
● cyclic references
● streams
Slide 22
Slide 22 text
YAML and Marshal: custom objects
    YAML.unsafe_load("!ruby/regexp /[a-z]/")
    Marshal.load("\x04\x08I/\x0A[a-z]\x00\x06:\x06EF")
Result: /[a-z]/
Slide 23
Slide 23 text
YAML and Marshal: aliases
    YAML.unsafe_load("[&x [], *x]")
    Marshal.load("\x04\x08[\x07[\x00@\x06")
Result: [[]] * 2
Slide 24
Slide 24 text
YAML and Marshal: cyclic references
    YAML.unsafe_load("&x [*x]")
    Marshal.load("\x04\x08[\x06@\x00")
Result: [[...]]
Slide 25
Slide 25 text
YAML and Marshal: streams
    "---\n1\n---\n2" # YAML
    "\x04\x08i\x06\x04\x08i\x07" # Marshal
Result: 1, then 2
NOTE: RGSS save data is one such example.
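The four Marshal-like features above can be checked with a short sketch. It assumes a Psych version that provides `unsafe_load` (Psych 4, and late 3.x releases); the Marshal half builds its byte strings with `Marshal.dump` rather than reusing the literal bytes from the slides.

```ruby
require "yaml"
require "stringio"

# Custom objects: a tag selects the Ruby class to build.
regexp = YAML.unsafe_load("!ruby/regexp /[a-z]/")

# Aliases: both elements are the very same Array object.
shared = YAML.unsafe_load("[&x [], *x]")
same_object = shared[0].equal?(shared[1])

# Cyclic references: the array contains itself.
cyclic = YAML.unsafe_load("&x [*x]")
contains_itself = cyclic[0].equal?(cyclic)

# Streams: several documents in one text.
documents = YAML.load_stream("---\n1\n---\n2")

# Marshal covers the same cases: sharing survives a round trip, and
# consecutive dumps can be read back one by one from a single IO.
copy = Marshal.load(Marshal.dump(shared))
io = StringIO.new(Marshal.dump(1) + Marshal.dump(2))
marshalled = [Marshal.load(io), Marshal.load(io)]

p regexp, same_object, contains_itself, documents, marshalled
```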
Slide 26
Slide 26 text
Abstraction! Fight against complexity.
Slide 27
Slide 27 text
Three-fold YAML processing
Ruby Object (Native) ⇄ Node Graph (Representation) ⇄ Event Tree (Serialization) ⇄ YAML (Presentation)
Slide 28
Slide 28 text
Three-fold YAML processing: one value at each stage
● Ruby Object (Native): [1, true].tap { |a| a << a }
● Node Graph (Representation): !!seq [...] containing !!int "1", !!bool "true", and a link back to the sequence itself
● Event Tree (Serialization): [...] containing "1", "true", *a
● YAML (Presentation): "&a [1, true, *a]"
Slide 29
Slide 29 text
Three-fold YAML processing
Ruby Object (Native) ⇄ Node Graph (Representation) ⇄ Event Tree (Serialization) ⇄ YAML (Presentation)
● Ruby Object ⇄ Node Graph: process tags (custom objects)
● Node Graph ⇄ Event Tree: process aliases and anchors (cycles etc.)
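As a rough illustration of these stages, Psych lets you stop at the node level and inspect it before converting to Ruby objects. This is a sketch using the standard `Psych.parse` and `Node#to_ruby` calls; how Psych's node classes line up with the stage names on the slide is my reading, not something the slides state.

```ruby
require "psych"

doc = Psych.parse("&a [1, true, *a]") # Psych::Nodes::Document
seq = doc.root                        # Psych::Nodes::Sequence

# Before conversion: the anchor and alias are still explicit nodes,
# and scalars are still plain strings such as "1" and "true".
p seq.anchor
p seq.children.map(&:class)
p seq.children[0].value

# After conversion: tags are resolved (1 and true are real Ruby
# values) and the alias has become an actual self-reference.
obj = doc.to_ruby
p obj[2].equal?(obj)
```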
Slide 30
Slide 30 text
YAML Kind: Sequence, Mapping, Scalar
Slide 31
Slide 31 text
YAML Kind
● Sequence: [1, 2, 3] / - 1 - 2 - 3
● Mapping: { a: b } / a: b / ? [1, 2] : [3, 4]
● Scalar: foo / 1 / "2" / > foo bar
Slide 32
Slide 32 text
YAML Kind and Tags
● Sequence: !!seq, !!omap, !!pairs
● Mapping: !!map, !!set, !ruby/object
● Scalar: !!str, !!binary, !!null, !!bool, !!int, !!float, !!timestamp, !!yaml, !ruby/symbol, !!merge, !!value
Slide 33
Slide 33 text
Tag resolution: defaults for Sequences, Mappings, and quoted Scalars
    [1, 2, 3]   →  !!seq [1, 2, 3]
    { a: b }    →  !!map { a: b }
    "foo"       →  !!str "foo"
Slide 34
Slide 34 text
Plain scalar resolution: plain scalars are resolved as defined by a “Schema”
    null     →  !!null "null"
    false    →  !!bool "false"
    123      →  !!int "123"
    123.45   →  !!float "123.45"
Slide 35
Slide 35 text
Plain scalar resolution: Schema = pairs of (Regexp, tag)
    Pattern                                        Tag to be resolved
    /^(null|Null|NULL|~)?$/                        !!null
    /^(true|True|TRUE|false|False|FALSE)$/         !!bool
    /^([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$/      !!int
    (omitted)                                      !!float
    /^.*$/                                         !!str
Slide 36
Slide 36 text
Plain scalar resolution: application-specific schema
    Pattern                                        Tag to be resolved
    /^(null|Null|NULL|~)?$/                        !!null
    /^(true|True|TRUE|false|False|FALSE)$/         !!bool
    /^([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$/      !!int
    (omitted)                                      !!float
    /^:.*$/                                        !ruby/symbol
    /^<<$/                                         !!merge
    /^.*$/                                         !!str
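The "pairs of (Regexp, tag)" idea can be sketched as a first-match lookup. This is an illustration, not Psych's actual resolver: `SCHEMA` and `resolve_plain_scalar` are hypothetical names, the patterns use `\A`/`\z` instead of `^`/`$` so that multi-line text cannot match line by line, and the float pattern (omitted on the slide) is a simplified form of the YAML 1.2 core schema rule without `.inf` and `.nan`.

```ruby
# Ordered list of (pattern, tag) pairs; the first matching pattern wins.
SCHEMA = [
  [/\A(null|Null|NULL|~)?\z/, "!!null"],
  [/\A(true|True|TRUE|false|False|FALSE)\z/, "!!bool"],
  [/\A([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)\z/, "!!int"],
  # Assumption: simplified float rule, filled in for this sketch only.
  [/\A[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?\z/, "!!float"],
  # Application-specific additions from the slide.
  [/\A:.*\z/m, "!ruby/symbol"],
  [/\A<<\z/, "!!merge"],
].freeze

# Resolves the tag of a plain (unquoted) scalar; anything unmatched
# falls through to !!str, like the catch-all last row of the table.
def resolve_plain_scalar(text)
  SCHEMA.each { |pattern, tag| return tag if pattern.match?(text) }
  "!!str"
end

%w[null false 123 123.45 :foo << foo].each do |text|
  puts "#{text.inspect} resolves to #{resolve_plain_scalar(text)}"
end
```

Ordering matters: the integer row must come before the float row, because the float pattern also accepts plain digits.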
Slide 37
Slide 37 text
YAML is…
● YAML is (in a way) a portable Marshal
● Abstraction layers sort out the complexity
  ○ Parsing: remove syntax details
  ○ Deserializing: connect anchors and aliases
  ○ Interpreting: resolve and realize tags
Slide 38
Slide 38 text
Implementation
Slide 39
Slide 39 text
Recap: Three-fold YAML processing
Ruby Object (Native) ⇄ Node Graph (Representation) ⇄ Event Tree (Serialization) ⇄ YAML (Presentation)
Slide 40
Slide 40 text
Psych's API levels
● High-level API: Psych.load / Psych.dump (Ruby Object ⇄ YAML)
● Mid-level API: Psych.parse / #to_yaml (node tree ⇄ YAML)
● Low-level API: Psych::Parser / Psych::Emitter (events ⇄ YAML)
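To give a feel for the low-level API, the sketch below feeds a document through `Psych::Parser` and records the events it reports. `EventLogger` is a hypothetical class written for this example; it overrides only the `Psych::Handler` callbacks it cares about and relies on the inherited no-op defaults for the rest.

```ruby
require "psych"

# Collects a simplified log of parser events.
class EventLogger < Psych::Handler
  attr_reader :events

  def initialize
    super
    @events = []
  end

  def start_sequence(anchor, tag, implicit, style)
    @events << [:start_sequence, anchor]
  end

  def end_sequence
    @events << [:end_sequence]
  end

  def scalar(value, anchor, tag, plain, quoted, style)
    @events << [:scalar, value]
  end

  def alias(anchor)
    @events << [:alias, anchor]
  end
end

logger = EventLogger.new
Psych::Parser.new(logger).parse("&a [1, true, *a]")
logger.events.each { |event| p event }
```

At this level nothing has been typed or linked yet: scalars arrive as strings, and the alias is only a name.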
Slide 41
Slide 41 text
Psych's API levels and psych-comments
● Mid-level API: Psych.parse / #to_yaml (node tree ⇄ YAML)
● psych-comments' API works at this same level
Slide 42
Slide 42 text
Recap: Psych (a.k.a. YAML): high-level API
    obj = YAML.load(input)
    obj["env"] << "NEW_TOKEN"
    output = YAML.dump(obj)
Slide 43
Slide 43 text
Recap: Psych (a.k.a. YAML): high-level API
    obj = Psych.load(input)
    obj["env"] << "NEW_TOKEN"
    output = Psych.dump(obj)
Slide 44
Slide 44 text
Psych (a.k.a. YAML): mid-level API
    s = Psych.parse_stream(input)
    env = s.children[0].children[0].children[1]
    env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
    output = s.to_yaml
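Here is the same mid-level edit as a self-contained sketch, with the `children` chain unpacked step by step. The input string is the simplified example from earlier in the talk. Plain Psych still drops the comments at this level, because libyaml never reports them.

```ruby
require "psych"

input = "env:\n# build\n- NPM_TOKEN\n# deploy\n- GH_TOKEN\n"

stream = Psych.parse_stream(input)   # Psych::Nodes::Stream
document = stream.children[0]        # Psych::Nodes::Document
mapping = document.children[0]       # Psych::Nodes::Mapping
env = mapping.children[1]            # value of the first key ("env")

# Edit the tree in place, then turn it back into text.
env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
output = stream.to_yaml

puts output
```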
Slide 45
Slide 45 text
Recap: Psych-comments
    s = Psych::Comments.parse_stream(input)
    env = s.children[0].children[0].children[1]
    env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
    output = Psych::Comments.emit_yaml(s)
Slide 46
Slide 46 text
We cannot simply extend it!
Slide 47
Slide 47 text
Psych's (lack of) extensibility: Psych uses libyaml (a C library)
● Psych::Nodes::Node: extendable from Ruby
● libyaml (between the nodes and the YAML text): not extendable from Ruby
Slide 48
Slide 48 text
Parser: divide the passes
Slide 49
Slide 49 text
Extending the parser: 2-pass parser
● Pass 1, parse nodes (libyaml): YAML text → Psych::Nodes::Node without comments, with source locations
● Pass 2, parse comments (psych-comments): YAML text + source locations → Psych::Nodes::Node with comments
Slide 50
Slide 50 text
Comment scanning algorithm: remember the cursor position
    - # egg
      # pork
      bar
Slide 51
Slide 51 text
Comment scanning algorithm: select the subtext running to the next position
    - # egg
      # pork
      bar
Slide 52
Slide 52 text
Comment scanning algorithm: extract “#”
    - # egg
      # pork
      bar
Slide 53
Slide 53 text
Comment scanning algorithm: attach them to the following node
    - # egg
      # pork
      bar
Slide 54
Slide 54 text
Comment scanning algorithm: recurse into all descendants and repeat
    - # egg
      # pork
      bar
Slide 55
Slide 55 text
Comment scanning edge case 1: unwanted occurrences of #
    foo#bar: < # baz
Slide 56
Slide 56 text
Comment scanning edge case 1, solution: skip over scalars
    foo#bar: < # baz
Slide 57
Slide 57 text
Comment scanning edge case 2: comments before delimiters
    [
      1,
      2,
      # foo
    ]
Slide 58
Slide 58 text
Comment scanning edge case 2, solution: attach them as trailing comments
    [
      1,
      2,
      # foo
    ]
Slide 59
Slide 59 text
Comment scanning edge case 3: comments on a key-value pair
    # foo
    foo: 1
    bar:
      # bar
      2
NOTE: Psych has no node type for key-value pairs; keys and values hang alternately in one flat array.
Slide 60
Slide 60 text
Comment scanning edge case 3: comments on a key-value pair
Mapping children: Key 0, Val 0, Key 1, Val 1, Key 2, Val 2, … (a flat array in Psych!)
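The flat layout is easy to observe with plain Psych. A small sketch; pairing the children up with `each_slice(2)` is just one convenient way to recover key-value pairs.

```ruby
require "psych"

mapping = Psych.parse("foo: 1\nbar: 2\n").root # Psych::Nodes::Mapping

# Keys and values alternate in one flat children array.
flat = mapping.children.map(&:value)
p flat

# Recover the pairs by walking the array two nodes at a time.
pairs = mapping.children.each_slice(2).map { |key, value| [key.value, value.value] }
p pairs
```

Since no node represents the pair itself, a comment written above a pair needs some other node to live on.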
Slide 61
Slide 61 text
Comment scanning edge case 3, solution: attach them to the key
    # foo
    foo: 1
    bar:
      # bar
      2
Slide 62
Slide 62 text
Comment scanning edge case 4: comment on a bullet
    root:
    # foo
    - foo: 1
    - # bar
      bar: 2
Slide 63
Slide 63 text
Comment scanning edge case 4, solution: attach to the whole element
    root:
    # foo
    - foo: 1
    - # bar
      bar: 2
NOTE: “foo: 1” implicitly generates a Mapping node, which the comment attaches to.
Slide 64
Slide 64 text
Psych-comments parser: all of this came down to only 100 LOC!
https://github.com/wantedly/psych-comments/blob/v0.1.1/lib/psych/comments/parsing.rb
Slide 65
Slide 65 text
Generator: reimplement it 💪
Slide 66
Slide 66 text
Recap: Psych's (lack of) extensibility: Psych uses libyaml (a C library)
● Psych::Nodes::Node: extendable from Ruby
● libyaml (between the nodes and the YAML text): not extendable from Ruby
Slide 67
Slide 67 text
Extending the generator: just reimplement the generator 💪 (except for scalars)
● Psych::Nodes::Node → psych-comments → YAML text
● Collections: generated by psych-comments itself
● Scalars: delegated to libyaml
Slide 68
Slide 68 text
Internal commands for YAML formatting: prepared utilities for formatting
● print "foo": print after generating the reserved spaces and indentation
● space! / newline!: reserve a space or indentation
● indented do … end: bump the indent level
NOTE: there is one more “virtual indentation” utility for bullets.
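One way to picture these commands is a tiny formatter that defers whitespace until the next print. This is a hypothetical sketch (`TinyFormatter` and its internals are made up for illustration, and the "virtual indentation" utility is left out); it is not the gem's emitter.

```ruby
# Whitespace is only *reserved* by space! and newline!; it is written
# out when the next print happens, so none is left dangling at the end.
class TinyFormatter
  def initialize
    @out = +""
    @indent = 0
    @pending = nil # nil, :space, or :newline
  end

  # Writes any reserved whitespace, then the text itself.
  def print(text)
    case @pending
    when :space then @out << " "
    when :newline then @out << "\n" << ("  " * @indent)
    end
    @pending = nil
    @out << text
  end

  # Reserves a single space, unless a newline is already reserved.
  def space!
    @pending ||= :space
  end

  # Reserves a line break plus indentation; overrides a reserved space.
  def newline!
    @pending = :newline
  end

  # Raises the indent level for the duration of the block.
  def indented
    @indent += 1
    yield
  ensure
    @indent -= 1
  end

  def to_s
    @out
  end
end

f = TinyFormatter.new
f.print "foo:"
f.space!
f.print "bar"
f.newline!
f.print "baz:"
f.space!          # reserved, then replaced by the newline below
f.indented do
  f.newline!
  f.print "qux: 1"
end
f.newline!        # reserved but never written
puts f.to_s
```

Reserving instead of writing is what keeps `baz:` free of a trailing space when a line break follows.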
Slide 69
Slide 69 text
Internal commands for YAML formatting: space! reserves a space
    foo:
Slide 70
Slide 70 text
Internal commands for YAML formatting: space! reserves a space
Followed by a line break, the space is not written:
    foo:
    - bar
Followed by text, the space is written:
    foo:␣bar
Slide 71
Slide 71 text
Internal commands for YAML formatting: newline! reserves indentation
    - foo: bar
Slide 72
Slide 72 text
Internal commands for YAML formatting: newline! reserves indentation
Followed by the next bullet:
    - foo: bar
    - baz
Followed by the next key of the same mapping:
    - foo: bar
      bar: baz
Slide 73
Slide 73 text
Indentation adjustment: a sequence in a mapping is special
Examples on the slide: "- - baz", "- bar: baz", "foo: - baz", "foo: bar: baz" (annotated 4, 4, 2, 4)
Slide 74
Slide 74 text
Generating bullet comments: hoisting map comments above bullets
    root:
    # foo
    - foo: 1
    - # bar
      bar: 2
Slide 75
Slide 75 text
Generating bullet comments: hoisting map comments above bullets →
● Look ahead in the tree and generate the comments early
● Then avoid duplicating them, via a queue
  ○ Note that we should not mutate the input
Slide 76
Slide 76 text
Psych-comments generator: the reimplementation took only 300 LOC 💪💪💪
https://github.com/wantedly/psych-comments/blob/v0.1.1/lib/psych/comments/emitter.rb
Slide 77
Slide 77 text
Our use-case
Slide 78
Slide 78 text
config/locales
    I18n.t("our_new_service.try_it_out")
Slide 79
Slide 79 text
config/locales
    en:
      our_new_service:
        try_if_out: "Try it out"
    ja:
      our_new_service:
        try_it_out: "試してみる"
Slide 80
Slide 80 text
config/locales: translation mistakes
Problem 1: mistakes and oversights
    en:
      our_new_service:
        try_if_out: "Try it out"
    ja:
      our_new_service:
        try_it_out: "試してみる"
Slide 81
Slide 81 text
config/locales: translation mistakes
Solution 1: check for correspondence
    Synchronizing locales...
    Generating en.our_new_service.try_it_out
    Generating ja.our_new_service.try_if_out
Slide 82
Slide 82 text
config/locales: translation mistakes
Solution 1: and generate boilerplate
    en:
      our_new_service:
        try_it_out: !todo "試してみる"
    ja:
      our_new_service:
        try_it_out: "試してみる"
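The correspondence check can be sketched as "flatten every locale into dotted key paths, then diff". This is a minimal illustration with a hypothetical `flatten_keys` helper, not the tool shown on the slides; it only reports which keys would have to be generated.

```ruby
require "yaml"

# Flattens a nested Hash into dotted key paths, e.g. "a.b.c".
def flatten_keys(hash, prefix = nil)
  hash.flat_map do |key, value|
    path = [prefix, key].compact.join(".")
    value.is_a?(Hash) ? flatten_keys(value, path) : [path]
  end
end

locales = YAML.load(<<~YAML)
  en:
    our_new_service:
      try_if_out: "Try it out"
  ja:
    our_new_service:
      try_it_out: "試してみる"
YAML

# Key paths present in each locale (an empty locale counts as no keys).
present = locales.transform_values { |tree| flatten_keys(tree || {}) }
all_keys = present.values.flatten.uniq

# Keys that exist in some locale but are missing from this one.
missing = present.to_h { |locale, keys| [locale, all_keys - keys] }

missing.each do |locale, keys|
  keys.each { |key| puts "Generating #{locale}.#{key}" }
end
```

The misspelled `try_if_out` shows up as a missing key in `ja`, which is how the typo becomes visible.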
Slide 83
Slide 83 text
config/locales: intentional absence
Problem 2: intentionally limiting the languages to support
    en:
      # No data due to
      # translation
      # cost
    ja:
      japan_only:
        start: "開始"
Slide 84
Slide 84 text
config/locales: intentional absence
Solution 2: (ab)use YAML tags
    en:
      # No data due to
      # translation
      # cost
    ja:
      japan_only:
        start: !only:ja "開始"
cf. https://github.com/creasty/i18n_flow
Slide 85
Slide 85 text
config/locales: tag syntax error. Problem 3: roundtrip failure !only:en,ja ! YAML 1.1 / libyaml 0.2.4 YAML 1.2 / libyaml 0.2.5 ✅ ✅ ✅ ❌ cf. https://github.com/yaml/libyaml/pull/179
Slide 86
Slide 86 text
config/locales: tag syntax error. Problem 3: roundtrip failure !only:en,ja ! YAML 1.1 / libyaml 0.2.4 YAML 1.2 / libyaml 0.2.5 ✅ ✅ ✅ ❌ libyaml still generates this!
Slide 87
Slide 87 text
config/locales: tag syntax error
Problem 3's root cause: tag abuse
Legitimate tagging examples:
    !ruby/symbol foo
    !!omap [foo: 1, bar: 2]
    !!set { foo, bar }
Slide 88
Slide 88 text
config/locales: tag syntax error
Solution 3: use comments for tags 👍
    en:
      # No data due to
      # translation
      # cost
    ja:
      japan_only:
        # i18n:only:ja
        start: "開始"
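Once comments are available on the nodes, a directive like the one above can be picked out with a small matcher. The sketch below is hypothetical: `ONLY_DIRECTIVE` and `only_locales` are illustrative names, and the comma-separated form is an assumption modeled on the earlier `!only:en,ja` tag.

```ruby
# Matches comments such as "# i18n:only:ja" or "# i18n:only:en,ja".
ONLY_DIRECTIVE = /\A#\s*i18n:only:(?<locales>[a-z]+(?:,[a-z]+)*)\s*\z/

# Returns the list of locales named by the directive, or nil when the
# comment is an ordinary one.
def only_locales(comment)
  match = ONLY_DIRECTIVE.match(comment)
  match && match[:locales].split(",")
end

p only_locales("# i18n:only:ja")
p only_locales("# No data due to")
```

Ordinary comments return nil, so free-form notes and directives can coexist on the same node.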
Slide 90
Slide 90 text
config/locales: resolution
Problem 4: Psych lacks comment support
→ Solution 4: implement it myself 💪💪💪💪
Slide 91
Slide 91 text
Story on the library's scope
Slide 92
Slide 92 text
Comment position: psych-comments (0.1.1) comment positions
    # foo    ← leading comments
    foo
    # baz    ← trailing comments
Slide 93
Slide 93 text
Comment position: then, line-end comments
    # foo        ← leading comments
    foo # bar    ← line-end comments
    # baz        ← trailing comments
Slide 94
Slide 94 text
PR for line-end comments. Thanks!
https://github.com/wantedly/psych-comments/pull/2
Slide 95
Slide 95 text
Comment positioning problem: - 1 # foo - 12 # bar - 1 # foo - 12 # bar - 1 # foo - 12 # bar Psych::Nodes::Node
Slide 96
Slide 96 text
Comment positioning problem
Slide 97
Slide 97 text
Scopes: there is no end to spacing details. foo: - 1 - 2 foo: - 1 - 2 - 1 - 12 - 1 - 12 # foo - a: b # foo - a: b foo: - 1 - 2
Slide 98
Slide 98 text
Scopes: there is no end to feature requests
https://prettier.io/docs/en/option-philosophy
Slide 99
Slide 99 text
Recap: Three-fold YAML processing: one value at each stage
● Ruby Object (Native): [1, true].tap { |a| a << a }
● Node Graph (Representation): !!seq [...] containing !!int "1", !!bool "true", and a link back to the sequence itself
● Event Tree (Serialization): [...] containing "1", "true", *a
● YAML (Presentation): "&a [1, true, *a]"
Slide 100
Slide 100 text
Levels of abstraction
Layers: Presentation, Serialization, Representation
Details placed on the layers: anchors & aliases, non-specific tags, scalar content formatting, directives, node style, comments, spacing, key ordering, tag style, escapes, tags, node links, scalar content
Slide 101
Slide 101 text
Levels of abstraction – Psych
Layers: Presentation, Serialization, Representation
Details placed on the layers: anchors & aliases, non-specific tags, scalar content formatting, directives, node style, comments, spacing, key ordering, tag style, escapes, tags, node links, scalar content
Highlighted: the part covered by Psych's mid-level API
Slide 102
Slide 102 text
Levels of abstraction – Psych-comments
Layers: Presentation, Serialization, Representation
Details placed on the layers: anchors & aliases, non-specific tags, scalar content formatting, directives, node style, comments, spacing, key ordering, tag style, escapes, tags, node links, scalar content
Highlighted: the part covered by psych-comments
Slide 103
Slide 103 text
Recap: Psych-comments layer
Layers: Presentation, Serialization, Representation
Details placed on the layers: anchors & aliases, non-specific tags, scalar content formatting, directives, node style, comments, spacing, key ordering, tag style, escapes, tags, node links, scalar content
Highlighted: the part covered by psych-comments
Slide 104
Slide 104 text
Scopes: deciding and communicating
Slide 105
Slide 105 text
Wrap up
Slide 106
Slide 106 text
Wrap up
● I made the psych-comments gem.
● It processes YAML comments.
● It solves the problem neatly, partly by reusing Psych's own algorithms.
● Thankfully, people are interested in expanding it, but as a responsible maintainer I am going to limit its scope.