Getting along with YAML comments with Psych
by
Masaki Hara
Slide 1
Slide 1 text
© 2024 Wantedly, Inc. Getting along with YAML comments with Psych. May 16, 2024 - Masaki Hara @ RubyKaigi 2024
Slide 2
Slide 2 text
© 2024 Wantedly, Inc. Masaki Hara @qnighy
Slide 3
Slide 3 text
© 2024 Wantedly, Inc. Masaki Hara @ Wantedly Stop by Wantedly booth!
Slide 4
Slide 4 text
© 2024 Wantedly, Inc. Made a neat little library! That is what my talk is about.
Slide 5
Slide 5 text
© 2024 Wantedly, Inc. Overview
● Introducing psych-comments gem
● Intro to YAML
● Implementation
● Our use-case
● Story on the library’s scope
Slide 6
Slide 6 text
© 2024 Wantedly, Inc. Introducing psych-comments
Slide 7
Slide 7 text
© 2024 Wantedly, Inc. Example: You want this (simplified example)
Before:
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
After:
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
    - NEW_TOKEN
Slide 10
Slide 10 text
© 2024 Wantedly, Inc. Psych (a.k.a. YAML)
  obj = YAML.load(input)
  obj["env"] << "NEW_TOKEN"
  output = YAML.dump(obj)
Slide 11
Slide 11 text
© 2024 Wantedly, Inc. Psych (a.k.a. YAML)
  obj = Psych.load(input)
  obj["env"] << "NEW_TOKEN"
  output = Psych.dump(obj)
Slide 12
Slide 12 text
© 2024 Wantedly, Inc. Psych (a.k.a. YAML)
Input:
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
Output (the comments are lost):
  ---
  env:
  - NPM_TOKEN
  - GH_TOKEN
  - NEW_TOKEN
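To make the problem on this slide concrete, here is a minimal runnable sketch of the high-level round trip using only Ruby's bundled Psych. The input string is the simplified example from the earlier slides; nothing here is specific to psych-comments.

```ruby
require "yaml"

input = <<~YAML
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
YAML

# High-level API: text -> Ruby object -> text.
obj = YAML.load(input)
obj["env"] << "NEW_TOKEN"
output = YAML.dump(obj)

puts output
# Comments never reach the Ruby object, so they cannot come back out.
puts output.include?("#")
```

The Hash returned by `YAML.load` has no place to keep comments, which is why the fix has to happen at a lower API level.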
Slide 13
Slide 13 text
© 2024 Wantedly, Inc. Psych-comments
Install:
  bundle add psych-comments
Usage:
  require "psych/comments"

  s = Psych::Comments.parse_stream(input)
  env = s.children[0].children[0].children[1]
  env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
  output = Psych::Comments.emit_yaml(s)
Slide 14
Slide 14 text
© 2024 Wantedly, Inc. Psych-comments
Input:
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
Output (the comments are kept):
  env:
    # build
    - NPM_TOKEN
    # deploy
    - GH_TOKEN
    - NEW_TOKEN
Slide 15
Slide 15 text
© 2024 Wantedly, Inc. A library for YAML comments That’s it!
Slide 16
Slide 16 text
© 2024 Wantedly, Inc. Intro to YAML
Slide 17
Slide 17 text
© 2024 Wantedly, Inc. YAML myth: “YAML is a fancy JSON”
Slide 18
Slide 18 text
© 2024 Wantedly, Inc. YAML myth: “YAML is a fancy JSON” 🙅
Slide 19
Slide 19 text
© 2024 Wantedly, Inc. YAML family tree
● XML (1998): YAML as “simple XML”
● Perl Marshaller (Data::Denter, around 2001): YAML as “Portable Marshaller”
● JSON (2001, launch of json.org): YAML as “JSON upper-compat”
● YAML 1.0 (2004)
● YAML 1.2 (2009)
Slide 21
Slide 21 text
© 2024 Wantedly, Inc. YAML and Marshal: YAML is aware of…
● custom objects
● aliases
● cyclic references
● streams
Slide 22
Slide 22 text
© 2024 Wantedly, Inc. YAML and Marshal: custom objects
  YAML.unsafe_load("!ruby/regexp /[a-z]/")
  Marshal.load("\x04\x08I/\x0A[a-z]\x00\x06:\x06EF")
  Both return: /[a-z]/
Slide 23
Slide 23 text
© 2024 Wantedly, Inc. YAML and Marshal: aliases
  YAML.unsafe_load("[&x [], *x]")
  Marshal.load("\x04\x08[\x07[\x00@\x06")
  Both return: [[]] * 2
Slide 24
Slide 24 text
© 2024 Wantedly, Inc. YAML and Marshal: cyclic references
  YAML.unsafe_load("&x [*x]")
  Marshal.load("\x04\x08[\x06@\x00")
  Both return: [[...]]
Slide 25
Slide 25 text
© 2024 Wantedly, Inc. YAML and Marshal: streams
  "---\n1\n---\n2"              # YAML
  "\x04\x08i\x06\x04\x08i\x07"  # Marshal
  Both contain two documents: 1 and 2
NOTE: RGSS save data are one such example
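The alias, cycle, and stream behaviour on the last few slides can be checked directly. The following is a small sketch using only Ruby's standard library; the variable names are illustrative, and the byte strings are the ones shown on the slides.

```ruby
require "yaml"
require "stringio"

# Aliases: both elements are the very same Array object, not two equal copies.
shared = YAML.unsafe_load("[&x [], *x]")
puts shared[0].equal?(shared[1])

# Cyclic references: the array contains itself.
cyclic = YAML.unsafe_load("&x [*x]")
puts cyclic[0].equal?(cyclic)

# Streams: several documents in a single input.
yaml_docs = YAML.load_stream("---\n1\n---\n2")

# Marshal can also be read repeatedly from one IO-like object.
io = StringIO.new("\x04\x08i\x06\x04\x08i\x07".b)
marshal_docs = [Marshal.load(io), Marshal.load(io)]

p yaml_docs
p marshal_docs
```

`equal?` is used instead of `==` because the point is object identity: the loader reconnects references rather than copying values.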
Slide 26
Slide 26 text
© 2024 Wantedly, Inc. Abstraction! Fight against complexity
Slide 27
Slide 27 text
© 2024 Wantedly, Inc. Three-fold YAML processing
  Native (Ruby Object) ⇄ Representation (Node Graph) ⇄ Serialization (Event Tree) ⇄ Presentation (YAML)
Slide 28
Slide 28 text
© 2024 Wantedly, Inc. Three-fold YAML processing
● Presentation (YAML): "&a [1, true, *a]"
● Serialization (Event Tree): [...] containing "1", "true", *a
● Representation (Node Graph): !!seq [...] containing !!int "1", !!bool "true", and a link (*) back to the sequence itself
● Native (Ruby Object): [1, true].tap { |a| a << a }
Slide 29
Slide 29 text
© 2024 Wantedly, Inc. Three-fold YAML processing
● Event Tree ⇄ Node Graph: process aliases and anchors (cycles etc.)
● Node Graph ⇄ Ruby Object: process tags (custom objects)
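As a rough illustration of these layers, the sketch below parses the slide's example with `Psych.parse` and then converts the tree to a Ruby object. It only uses Psych calls from Ruby's standard library; how exactly Psych's node classes line up with the layer names above is left open here.

```ruby
require "yaml"

doc = Psych.parse("&a [1, true, *a]")   # Psych::Nodes::Document
seq = doc.root                           # Psych::Nodes::Sequence

# At this level the anchor and the alias are still separate nodes.
puts "anchor: #{seq.anchor}"
seq.children.each do |child|
  label = child.is_a?(Psych::Nodes::Alias) ? "*#{child.anchor}" : child.value.inspect
  puts "#{child.class}: #{label}"
end

# Converting to a Ruby object connects the alias and turns "1" and "true"
# into an Integer and a boolean.
obj = doc.to_ruby
puts obj[2].equal?(obj)
```

Note that the scalars are still plain strings (`"1"`, `"true"`) in the tree; they only become `1` and `true` in the last step.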
Slide 30
Slide 30 text
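© 2024 Wantedly, Inc. YAML Kind: Sequence, Mapping, Scalar
Slide 31
Slide 31 text
© 2024 Wantedly, Inc. YAML Kind
● Sequence: [1, 2, 3] (flow style), or in block style:
    - 1
    - 2
    - 3
● Mapping: { a: b } (flow style), a: b (block style), or with complex keys:
    ? [1, 2]
    : [3, 4]
● Scalar: foo, 1, "2", or a folded block scalar:
    >
      foo
      bar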
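A quick way to see the three kinds is to parse each sample from this slide and look at the class of the root node. This sketch assumes nothing beyond Ruby's bundled Psych; the `sources` list simply restates the slide's examples as strings.

```ruby
require "yaml"

sources = [
  "[1, 2, 3]",             # flow sequence
  "- 1\n- 2\n- 3\n",       # block sequence
  "{ a: b }",              # flow mapping
  "a: b\n",                # block mapping
  "? [1, 2]\n: [3, 4]\n",  # mapping with a sequence as its key
  "foo",                   # plain scalar
  "\"2\"",                 # double-quoted scalar
  ">\n  foo\n  bar\n",     # folded block scalar
]

kinds = sources.map { |src| Psych.parse(src).root.class }
sources.zip(kinds).each { |src, kind| puts "#{src.inspect} => #{kind}" }

# Whatever the surface syntax, a scalar node carries just its text content.
folded = Psych.parse(">\n  foo\n  bar\n").root
puts folded.value.inspect
```

However a node is written in the text, it ends up as one of only three node classes.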
Slide 32
Slide 32 text
© 2024 Wantedly, Inc. YAML Kind and Tags
● Sequence: !!seq, !!omap, !!pairs
● Mapping: !!map, !!set, !ruby/object
● Scalar: !!str, !!binary, !!null, !!bool, !!int, !!float, !!timestamp, !!yaml, !ruby/symbol, !!merge, !!value
Slide 33
Slide 33 text
© 2024 Wantedly, Inc. Tag resolution: default for Sequences, Mappings, and quoted Scalars
  [1, 2, 3]  →  !!seq [1, 2, 3]
  { a: b }   →  !!map { a: b }
  "foo"      →  !!str "foo"
Slide 34
Slide 34 text
© 2024 Wantedly, Inc. Plain scalar resolution: plain scalars, defined by a “Schema”
  null    →  !!null "null"
  false   →  !!bool "false"
  123     →  !!int "123"
  123.45  →  !!float "123.45"
Slide 35
Slide 35 text
© 2024 Wantedly, Inc. Plain scalar resolution: Schema = Pairs of (Regexp, tag)
  Pattern                                      Tag to be resolved
  /^(null|Null|NULL|~)?$/                      !!null
  /^(true|True|TRUE|false|False|FALSE)$/       !!bool
  /^([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$/    !!int
  (omitted)                                    !!float
  /^.*$/                                       !!str
Slide 36
Slide 36 text
© 2024 Wantedly, Inc. Plain scalar resolution: application-specific schema
  Pattern                                      Tag to be resolved
  /^(null|Null|NULL|~)?$/                      !!null
  /^(true|True|TRUE|false|False|FALSE)$/       !!bool
  /^([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)$/    !!int
  (omitted)                                    !!float
  /^:.*$/                                      !ruby/symbol
  /^<<$/                                       !!merge
  /^.*$/                                       !!str
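The "pairs of (Regexp, tag)" idea can be sketched in a few lines of Ruby. This is an illustration of the table above, not Psych's actual resolver. `SCHEMA` and `resolve_plain_scalar` are hypothetical names, the `!!float` pattern is an assumption (the slide omits it; the one here is a simplified form without `.inf` and `.nan`), and `\A`/`\z` replace `^`/`$` because those match at line boundaries in Ruby.

```ruby
# Ordered list of (pattern, tag) pairs: the first matching pattern wins.
SCHEMA = [
  [/\A(null|Null|NULL|~)?\z/, "!!null"],
  [/\A(true|True|TRUE|false|False|FALSE)\z/, "!!bool"],
  [/\A([-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+)\z/, "!!int"],
  # Assumed pattern: the slide leaves the float regexp out.
  [/\A[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?\z/, "!!float"],
  # Application-specific additions from the second table.
  [/\A:.*\z/, "!ruby/symbol"],
  [/\A<<\z/, "!!merge"],
  # Fallback: everything else is a string.
  [/\A.*\z/m, "!!str"],
].freeze

# Returns the tag that a plain (unquoted, untagged) scalar resolves to.
def resolve_plain_scalar(text)
  SCHEMA.each { |pattern, tag| return tag if pattern.match?(text) }
  "!!str"
end

%w[null false 123 123.45 :sym << foo].each do |text|
  puts "#{text.inspect} => #{resolve_plain_scalar(text)}"
end
```

Because the list is ordered, adding application-specific rows just means inserting them before the catch-all `!!str` row.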
Slide 37
Slide 37 text
© 2024 Wantedly, Inc. YAML is…
● YAML is (in a way) a portable Marshal
● Abstraction layers to sort out complexity
  ○ Parsing: remove syntax details
  ○ Deserializing: connect anchors and aliases
  ○ Interpreting: resolve and realize tags
Slide 38
Slide 38 text
© 2024 Wantedly, Inc. Implementation
Slide 39
Slide 39 text
© 2024 Wantedly, Inc. Recap: Three-fold YAML processing
  Native (Ruby Object) ⇄ Representation (Node Graph) ⇄ Serialization (Event Tree) ⇄ Presentation (YAML)
Slide 40
Slide 40 text
© 2024 Wantedly, Inc. Psych’s API levels
● High-level API: Psych.load / Psych.dump
● Mid-level API: Psych.parse / #to_yaml
● Low-level API: Psych::Parser / Psych::Emitter
Layers: Native (Ruby Object) ⇄ Representation (Node Graph) ⇄ Serialization (Event Tree) ⇄ Presentation (YAML)
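The low-level API is the least familiar of the three, so here is a small sketch of it: a `Psych::Handler` subclass that records the events `Psych::Parser` emits. `EventLogger` is a hypothetical name; the handler callbacks and their argument lists are those of Psych's standard `Psych::Handler`.

```ruby
require "yaml"

# Records a simplified trace of the parser events it receives.
class EventLogger < Psych::Handler
  attr_reader :events

  def initialize
    @events = []
  end

  def start_sequence(anchor, tag, implicit, style)
    @events << [:start_sequence, anchor]
  end

  def end_sequence
    @events << [:end_sequence]
  end

  def scalar(value, anchor, tag, plain, quoted, style)
    @events << [:scalar, value]
  end

  def alias(anchor)
    @events << [:alias, anchor]
  end
end

logger = EventLogger.new
Psych::Parser.new(logger).parse("&a [1, true, *a]")
logger.events.each { |event| p event }
```

Callbacks that are not overridden (stream and document start/end, mappings) fall back to the no-op defaults in `Psych::Handler`. None of them report comments, which is why comment support needs a separate pass.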
Slide 41
Slide 41 text
© 2024 Wantedly, Inc. Psych’s API levels and psych-comments
● Mid-level API: Psych.parse / #to_yaml
● psych-comments’ API
Layers: Native (Ruby Object) ⇄ Representation (Node Graph) ⇄ Serialization (Event Tree) ⇄ Presentation (YAML)
Slide 42
Slide 42 text
© 2024 Wantedly, Inc. Recap: Psych (a.k.a. YAML): high-level API
  obj = YAML.load(input)
  obj["env"] << "NEW_TOKEN"
  output = YAML.dump(obj)
Slide 43
Slide 43 text
© 2024 Wantedly, Inc. Recap: Psych (a.k.a. YAML): high-level API
  obj = Psych.load(input)
  obj["env"] << "NEW_TOKEN"
  output = Psych.dump(obj)
Slide 44
Slide 44 text
© 2024 Wantedly, Inc. Psych (a.k.a. YAML): mid-level API
  s = Psych.parse_stream(input)
  env = s.children[0].children[0].children[1]
  env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
  output = s.to_yaml
Slide 45
Slide 45 text
© 2024 Wantedly, Inc. Recap: Psych-comments
  s = Psych::Comments.parse_stream(input)
  env = s.children[0].children[0].children[1]
  env.children << Psych::Nodes::Scalar.new("NEW_TOKEN")
  output = Psych::Comments.emit_yaml(s)
Slide 46
Slide 46 text
© 2024 Wantedly, Inc. We cannot simply extend it!
Slide 47
Slide 47 text
© 2024 Wantedly, Inc. Psych (lack of) extensibility: Psych uses libyaml (C library)
  YAML text ⇄ libyaml (not extendable from Ruby) ⇄ Psych::Nodes::Node (extendable from Ruby)
Slide 48
Slide 48 text
© 2024 Wantedly, Inc. Parser: divide the passes
Slide 49
Slide 49 text
© 2024 Wantedly, Inc. Extending the parser: 2-pass parser
● Pass 1 (parse nodes): YAML text → libyaml → Psych::Nodes::Node without comments, but with source location
● Pass 2 (parse comments): YAML text + source location → psych-comments → Psych::Nodes::Node with comments
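The second pass depends on the source locations that Psych already records on every node. A minimal sketch of reading them, using the zero-based `start_line`, `start_column`, `end_line`, and `end_column` accessors that Psych has provided since version 3.1:

```ruby
require "yaml"

source = "- # egg\n  # pork\n  bar\n"
doc = Psych.parse(source)
node = doc.root.children[0]   # the scalar "bar"

# Lines and columns are zero-based.
puts "value: #{node.value.inspect}"
puts "start: line #{node.start_line}, column #{node.start_column}"
puts "end:   line #{node.end_line}, column #{node.end_column}"
```

With these positions, anything in the source text between two nodes can be rescanned for comments without touching libyaml.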
Slide 50
Slide 50 text
© 2024 Wantedly, Inc. Comment scanning algorithm: Remember cursor position
  - # egg
    # pork
    bar
Slide 51
Slide 51 text
© 2024 Wantedly, Inc. Comment scanning algorithm: Select subtext running to the next position
  - # egg
    # pork
    bar
Slide 52
Slide 52 text
© 2024 Wantedly, Inc. Comment scanning algorithm: Extract “#”
  - # egg
    # pork
    bar
Slide 53
Slide 53 text
© 2024 Wantedly, Inc. Comment scanning algorithm: Attach them to the following node
  - # egg
    # pork
    bar
Slide 54
Slide 54 text
© 2024 Wantedly, Inc. Comment scanning algorithm: Recurse into all descendants and repeat
  - # egg
    # pork
    bar
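The five steps above can be sketched on top of plain Psych as follows. This is a simplified illustration, not the gem's parsing.rb: `leading_comments` is a hypothetical helper, it returns a side table instead of storing comments on the nodes, and it only covers leading comments (no trailing comments, no key-value or bullet handling).

```ruby
require "yaml"

# Returns { node => ["# comment", ...] } for comments found just before each node.
def leading_comments(source)
  # Offset of the first character of each line, to turn (line, column) into an index.
  line_starts = [0]
  source.each_line { |line| line_starts << line_starts.last + line.length }

  cursor = 0                                          # 1. remember cursor position
  found = {}.compare_by_identity

  walk = lambda do |node|
    start = line_starts[node.start_line] + node.start_column
    if start > cursor
      gap = source[cursor...start]                    # 2. subtext up to the next node
      comments = gap.scan(/#.*$/)                     # 3. extract "#" comments
      found[node] = comments unless comments.empty?   # 4. attach to the following node
      cursor = start
    end
    if node.is_a?(Psych::Nodes::Scalar)
      # Jump past the scalar so a "#" inside its text is never scanned.
      finish = line_starts[node.end_line] + node.end_column
      cursor = finish if finish > cursor
    else
      node.children.each(&walk)                       # 5. recurse into descendants
    end
  end

  walk.call(Psych.parse(source).root)
  found
end

result = leading_comments("- # egg\n  # pork\n  bar\n")
result.each { |node, comments| puts "#{node.class}: #{comments.inspect}" }
```

Only the gaps between nodes are scanned, so the sketch never has to understand YAML quoting rules itself; it relies on libyaml's positions for that.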
Slide 55
Slide 55 text
© 2024 Wantedly, Inc. Comment scanning edge case 1 Edge case 1: unwanted occurrence of # foo#bar: < # baz
Slide 56
Slide 56 text
© 2024 Wantedly, Inc. Comment scanning edge case 1 Edge case 1 solution: skip over scalars foo#bar: < # baz
Slide 57
Slide 57 text
© 2024 Wantedly, Inc. Comment scanning edge case 2: comments before delimiters
  [
    1,
    2,
    # foo
  ]
Slide 58
Slide 58 text
© 2024 Wantedly, Inc. Comment scanning edge case 2 solution: attach as trailing comments
  [
    1,
    2,
    # foo
  ]
Slide 59
Slide 59 text
© 2024 Wantedly, Inc. Comment scanning edge case 3: comments on a key-value pair
  # foo
  foo: 1
  bar:
    # bar
    2
NOTE: Psych lacks a node type for key-value pairs; instead it stores keys and values alternately in a flat array
Slide 60
Slide 60 text
© 2024 Wantedly, Inc. Comment scanning edge case 3: comments on a key-value pair
  Mapping children (a flat array in Psych!): Key 0, Val 0, Key 1, Val 1, Key 2, Val 2, …
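The flat layout is easy to confirm with plain Psych. In this small sketch, `each_slice(2)` is just one convenient way to regroup the children into pairs; it is not something the slides prescribe.

```ruby
require "yaml"

mapping = Psych.parse("foo: 1\nbar: 2\n").root   # Psych::Nodes::Mapping

# Keys and values alternate in a single children array.
flat = mapping.children.map(&:value)
p flat

# Regroup them into key-value pairs when needed.
pairs = mapping.children.each_slice(2).map { |key, value| [key.value, value.value] }
p pairs
```

Since there is no pair node to hold a comment, a comment written before a pair has to be attached to one of its two nodes.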
Slide 61
Slide 61 text
© 2024 Wantedly, Inc. Comment scanning edge case 3 solution: attach them to the key
  # foo
  foo: 1
  bar:
    # bar
    2
Slide 62
Slide 62 text
© 2024 Wantedly, Inc. Comment scanning edge case 4: comment on a bullet
  root:
    # foo
    - foo: 1
    - # bar
      bar: 2
Slide 63
Slide 63 text
© 2024 Wantedly, Inc. Comment scanning edge case 4 solution: attach to the whole element
  root:
    # foo
    - foo: 1
    - # bar
      bar: 2
NOTE: “foo: 1” implicitly generates a Mapping node, which the comment attaches to.
Slide 64
Slide 64 text
© 2024 Wantedly, Inc. Psych-comments parser: Those came down to only 100 LOC! https://github.com/wantedly/psych-comments/blob/v0.1.1/lib/psych/comments/parsing.rb
Slide 65
Slide 65 text
© 2024 Wantedly, Inc. Generator: reimplement it💪
Slide 66
Slide 66 text
© 2024 Wantedly, Inc. Recap: Psych (lack of) extensibility: Psych uses libyaml (C library)
  YAML text ⇄ libyaml (not extendable from Ruby) ⇄ Psych::Nodes::Node (extendable from Ruby)
Slide 67
Slide 67 text
© 2024 Wantedly, Inc. Extending the generator: Just reimplement the generator 💪 (except for scalars)
  Psych::Nodes::Node → psych-comments → YAML text
● Collections: generated by psych-comments
● Scalars: delegated to libyaml
Slide 68
Slide 68 text
© 2024 Wantedly, Inc. Internal commands for YAML formatting: prepared utilities for formatting
● print "foo": print after generating reserved spaces and indentation
● space! / newline!: reserve spaces or indentation
● indented do … end: bump indent level
NOTE: there is one more “virtual indentation” util for bullets
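To show how "reserving" whitespace can work, here is a toy printer with the same four command names. It is a sketch under assumptions, not the gem's emitter: `TinyPrinter` is a hypothetical class, it indents by two spaces per level, and it omits the "virtual indentation" utility for bullets.

```ruby
# Toy formatter: whitespace is only written when real text follows it.
class TinyPrinter
  attr_reader :out

  def initialize
    @out = +""
    @indent = 0
    @pending = nil   # nil, :space, or :newline
  end

  # Writes any reserved whitespace, then the text itself.
  def print(text)
    case @pending
    when :space then @out << " "
    when :newline then @out << "\n" << ("  " * @indent)
    end
    @pending = nil
    @out << text
  end

  # Reserves a single space (a reserved newline takes priority).
  def space!
    @pending ||= :space
  end

  # Reserves a line break plus indentation at the current level.
  def newline!
    @pending = :newline
  end

  # Bumps the indent level for the duration of the block.
  def indented
    @indent += 1
    yield
  ensure
    @indent -= 1
  end
end

pr = TinyPrinter.new
pr.print("foo:")
pr.space!
pr.print("bar")
pr.newline!
pr.print("baz:")
pr.indented do
  pr.newline!
  pr.print("qux: 1")
end
pr.newline!   # reserved but never written, so there is no trailing whitespace
puts pr.out
```

Deferring whitespace until the next `print` means the caller can reserve a space after `foo:` and later change its mind to a newline, without ever having to erase output.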
Slide 69
Slide 69 text
© 2024 Wantedly, Inc. Internal commands for YAML formatting space! reserves a space foo: foo:
Slide 70
Slide 70 text
© 2024 Wantedly, Inc. Internal commands for YAML formatting space! reserves a space foo: - bar foo:␣bar
Slide 71
Slide 71 text
© 2024 Wantedly, Inc. Internal commands for YAML formatting newline! reserves indentation - foo: bar - foo: bar
Slide 72
Slide 72 text
© 2024 Wantedly, Inc. Internal commands for YAML formatting newline! reserves indentation - foo: bar - baz - foo: bar bar: baz
Slide 73
Slide 73 text
© 2024 Wantedly, Inc. Indentation Adjust Sequence in mapping is special - - baz - bar: baz foo: - baz foo: bar: baz 4 4 2 4
Slide 74
Slide 74 text
© 2024 Wantedly, Inc. Generating bullet comments: hoisting map comments above bullets
  root:
    # foo
    - foo: 1
    - # bar
      bar: 2
Slide 75
Slide 75 text
© 2024 Wantedly, Inc. Generating bullet comments: hoisting map comments above bullets →
● Look ahead in the tree and generate comments
● Then avoid duplication via a queue
  ○ Note that we should not mutate the input
Slide 76
Slide 76 text
© 2024 Wantedly, Inc. Psych-comments generator: Reimplementation took only 300 LOC 💪💪💪 https://github.com/wantedly/psych-comments/blob/v0.1.1/lib/psych/comments/emitter.rb
Slide 77
Slide 77 text
© 2024 Wantedly, Inc. Our use-case
Slide 78
Slide 78 text
© 2024 Wantedly, Inc. config/locales I18n.t("our_new_service.try_it_out")
Slide 79
Slide 79 text
© 2024 Wantedly, Inc. config/locales
  en:
    our_new_service:
      try_if_out: "Try it out"
  ja:
    our_new_service:
      try_it_out: "試してみる"
Slide 80
Slide 80 text
© 2024 Wantedly, Inc. config/locales: translation mistakes. Problem 1: mistakes and oversight
  en:
    our_new_service:
      try_if_out: "Try it out"
  ja:
    our_new_service:
      try_it_out: "試してみる"
Slide 81
Slide 81 text
© 2024 Wantedly, Inc. config/locales: translation mistakes. Solution 1: check for correspondence
  Synchronizing locales...
  Generating en.our_new_service.try_it_out
  Generating ja.our_new_service.try_if_out
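A correspondence check of this kind can be sketched with plain Ruby. The helper names `flatten_keys` and `missing_keys` are hypothetical and the real tool surely does more, but comparing dotted key paths is enough to reproduce the log lines on this slide.

```ruby
require "yaml"

locales = YAML.load(<<~YAML)
  en:
    our_new_service:
      try_if_out: "Try it out"
  ja:
    our_new_service:
      try_it_out: "試してみる"
YAML

# Turns a nested Hash into dotted key paths such as "our_new_service.try_it_out".
def flatten_keys(tree, prefix = nil)
  tree.flat_map do |key, value|
    path = [prefix, key].compact.join(".")
    value.is_a?(Hash) ? flatten_keys(value, path) : [path]
  end
end

# For each locale, lists the paths that exist in some other locale but not here.
def missing_keys(locales)
  keys = locales.transform_values { |tree| flatten_keys(tree) }
  all = keys.values.flatten.uniq
  keys.transform_values { |present| all - present }
end

missing = missing_keys(locales)
puts "Synchronizing locales..."
missing.each do |locale, paths|
  paths.each { |path| puts "Generating #{locale}.#{path}" }
end
```

The typo `try_if_out` shows up as a key that `ja` lacks, and the correct `try_it_out` as one that `en` lacks, so the mistake becomes visible from both sides.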
Slide 82
Slide 82 text
© 2024 Wantedly, Inc. config/locales: translation mistakes. Solution 1: and generate boilerplate
  en:
    our_new_service:
      try_it_out: !todo "試してみる"
  ja:
    our_new_service:
      try_it_out: "試してみる"
Slide 83
Slide 83 text
© 2024 Wantedly, Inc. config/locales: intentional absence. Problem 2: intentionally limit languages to support
  en:
    # No data due to
    # translation
    # cost
  ja:
    japan_only:
      start: "開始"
Slide 84
Slide 84 text
© 2024 Wantedly, Inc. config/locales: intentional absence. Solution 2: (ab)use YAML tags
  en:
    # No data due to
    # translation
    # cost
  ja:
    japan_only:
      start: !only:ja "開始"
cf. https://github.com/creasty/i18n_flow
Slide 85
Slide 85 text
© 2024 Wantedly, Inc. config/locales: tag syntax error Problem 3: roundtrip failure !only:en,ja ! YAML 1.1 / libyaml 0.2.4 YAML 1.2 / libyaml 0.2.5 ✅ ✅ ✅ ❌ cf. https://github.com/yaml/libyaml/pull/179
Slide 86
Slide 86 text
© 2024 Wantedly, Inc. config/locales: tag syntax error Problem 3: roundtrip failure !only:en,ja ! YAML 1.1 / libyaml 0.2.4 YAML 1.2 / libyaml 0.2.5 ✅ ✅ ✅ ❌ Libyaml still generates this!
Slide 87
Slide 87 text
© 2024 Wantedly, Inc. config/locales: tag syntax error. Problem 3’s root cause …… tag abuse
Legitimate tagging examples:
  !ruby/symbol foo
  !!omap [foo: 1, bar: 2]
  !!set { foo, bar }
Slide 88
Slide 88 text
© 2024 Wantedly, Inc. config/locales: tag syntax error. Solution 3: use comments for tags 👍
  en:
    # No data due to
    # translation
    # cost
  ja:
    japan_only:
      # i18n:only:ja
      start: "開始"
Slide 90
Slide 90 text
© 2024 Wantedly, Inc. config/locales: resolution
Problem 4: Psych lacks comment support
→ Solution 4: implement it myself 💪💪💪💪
Slide 91
Slide 91 text
© 2024 Wantedly, Inc. Story on the library’s scope
Slide 92
Slide 92 text
© 2024 Wantedly, Inc. Comment position: Psych-comments (0.1.1) comment position
  # foo    ← leading comments
  foo
  # baz    ← trailing comments
Slide 93
Slide 93 text
© 2024 Wantedly, Inc. Comment position: Then line-end comments
  # foo        ← leading comments
  foo # bar    ← line-end comments
  # baz        ← trailing comments
Slide 94
Slide 94 text
© 2024 Wantedly, Inc. PR for line-end comments Thanks! https://github.com/wantedly/psych-comments/pull/2
Slide 95
Slide 95 text
© 2024 Wantedly, Inc. Comment positioning problem Comment positioning problem - 1 # foo - 12 # bar - 1 # foo - 12 # bar - 1 # foo - 12 # bar Psych:: Nodes:: Node
Slide 96
Slide 96 text
© 2024 Wantedly, Inc. Comment positioning problem Comment positioning problem
Slide 97
Slide 97 text
© 2024 Wantedly, Inc. Scopes There is no end to spacing details foo: - 1 - 2 foo: - 1 - 2 - 1 - 12 - 1 - 12 # foo - a: b # foo - a: b foo: - 1 - 2
Slide 98
Slide 98 text
© 2024 Wantedly, Inc. Scopes There is no end to feature requests https://prettier.io/docs/en/option-philosophy
Slide 99
Slide 99 text
© 2024 Wantedly, Inc. Recap: Three-fold YAML processing
● Presentation (YAML): "&a [1, true, *a]"
● Serialization (Event Tree): [...] containing "1", "true", *a
● Representation (Node Graph): !!seq [...] containing !!int "1", !!bool "true", and a link (*) back to the sequence itself
● Native (Ruby Object): [1, true].tap { |a| a << a }
Slide 100
Slide 100 text
© 2024 Wantedly, Inc. Levels of abstraction Presentation Serialization Representation Anchors & aliases Non-specific tags Scalar content formatting Directives Node style Comments Spacing Key ordering Tag style Escapes Tags Node links Scalar content
Slide 101
Slide 101 text
© 2024 Wantedly, Inc. Levels of abstraction – Psych Presentation Serialization Representation Anchors & aliases Non-specific tags Scalar content formatting Directives Node style Comments Spacing Key ordering Tag style Escapes Tags Node links Scalar content Psych Mid-level API
Slide 102
Slide 102 text
© 2024 Wantedly, Inc. Levels of abstraction – Psych Presentation Serialization Representation Anchors & aliases Non-specific tags Scalar content formatting Directives Node style Comments Spacing Key ordering Tag style Escapes Tags Node links Scalar content Psych-comments
Slide 103
Slide 103 text
© 2024 Wantedly, Inc. Recap: Psych-comments layer Presentation Serialization Representation Anchors & aliases Non-specific tags Scalar content formatting Directives Node style Comments Spacing Key ordering Tag style Escapes Tags Node links Scalar content Psych-comments
Slide 104
Slide 104 text
© 2024 Wantedly, Inc. Scopes Deciding and communicating
Slide 105
Slide 105 text
© 2024 Wantedly, Inc. Wrap up
Slide 106
Slide 106 text
© 2024 Wantedly, Inc. Wrap up
● I made the psych-comments gem.
● It processes YAML comments.
● It neatly solves your problem, partly reusing Psych’s own algorithms.
● Thankfully, people are interested in expanding it, but as a responsible maintainer, I’m going to limit its scope.