The Data-Reader's Guide to the Galaxy

miner
March 18, 2013

Presented at Clojure/West 2013.

Don't panic if you're unsure about picking up data-readers. It turns out that tagged literals are mostly harmless. Much as Douglas Adams produced "The Hitchhiker's Guide" material in multiple forms, a Clojure programmer can assign data-readers to process tagged literals with customized implementations. Clojure 1.5 adds a new feature that makes dealing with unknown tags simple and convenient. We'll also talk about the Extensible Data Notation (EDN), which aims to be the Babel fish of data transfer. Finally, we will explore a few unorthodox uses for data-readers.

video: http://www.infoq.com/presentations/Clojure-Data-Reader

Transcript

  1. The Data-Reader’s
    Guide to the Galaxy
    Steve Miner
    @miner

  2. The Data-Reader’s
    Guide to the Galaxy
    Steve Miner
    @miner
    #inst "2013-03-18T09:50-07:00"

  3. Douglas Adams

  5. The Guide
    • The standard repository for all knowledge and
    wisdom
    • Basics
    • New in Clojure 1.5
    • EDN
    • Unorthodox ideas

  6. Tagged Literals
    • #my.ns/tag literal-data
    • #inst "2013-03-18"
    • Self-describing data
    • Loosely coupled to implementation
    • Data transfer

  7. Extensible Reader
    • Add new data literals
    • Customized to application
    • Limited form of CL reader macros
    • Read-time vs. Compile time

  8. *data-readers*
    • data_readers.clj
    • literal map
    • tag symbols to var symbols
    • (binding [*data-readers* {...}] body)
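
As a sketch of the file-based option, a data_readers.clj at the root of the classpath holds a single map literal from tag symbols to fully qualified var symbols. The names my.ns/tag and my.ns/my-reader are placeholders, and the namespace that defines the reader function generally has to be loaded before the tag is read.

```clojure
;; data_readers.clj (at the classpath root, for example under src/)
;; Hypothetical names: the tag my.ns/tag is handled by the function my.ns/my-reader.
{my.ns/tag my.ns/my-reader}
```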

  9. (defn my-reader [[a b]]
      (+ (* 10 a) b))
    (binding [*data-readers*
              {'my.ns/tag #'my-reader}]
      (read-string "#my.ns/tag [4 2]"))
    ;=> 42

  10. default-data-readers
    • pre-defined by Clojure
    • #uuid
    • #inst

  11. Time is an illusion.
    Lunchtime doubly so.

  12. #inst
    • Instant in time per RFC 3339
    • #inst "2013-03-18"
    • #inst "2013-03-18T09:50-07:00"
    • #inst "2013-03-18T16:50:00.000-00:00"
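
The three spellings above can be checked at a REPL. This is a small sketch that assumes the stock #inst reader is in effect (no rebinding of *data-readers*), in which case each literal reads as a java.util.Date and omitted fields fall back to their defaults.

```clojure
;; With the default reader, #inst yields java.util.Date values.
(def utc   (read-string "#inst \"2013-03-18T16:50:00.000-00:00\""))
(def local (read-string "#inst \"2013-03-18T09:50-07:00\""))
(def day   (read-string "#inst \"2013-03-18\""))

(instance? java.util.Date utc)  ;=> true
(= utc local)                   ;=> true, same instant written with different offsets

;; A date-only literal is midnight UTC on that day.
(= day (read-string "#inst \"2013-03-18T00:00:00.000-00:00\""))  ;=> true
```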

  13. Date/Time
    • java.util.Date
    • java.util.Calendar
    • java.sql.Timestamp
    • Joda Time - clj-time
    • JSR 310 - java.time in JDK 8

  14. #uuid
    • #uuid
    "4e0f694b-1901-4886-9101-5479fc0c9720"
    • (java.util.UUID/randomUUID)
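
A short sketch of the round trip for #uuid, assuming the default readers: the literal reads as a java.util.UUID and prints back in the same tagged form.

```clojure
(def id (read-string "#uuid \"4e0f694b-1901-4886-9101-5479fc0c9720\""))

(instance? java.util.UUID id)
;=> true

(= id (java.util.UUID/fromString "4e0f694b-1901-4886-9101-5479fc0c9720"))
;=> true

(pr-str id)
;=> "#uuid \"4e0f694b-1901-4886-9101-5479fc0c9720\""
```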

  15. Examples
    • #x/url "http://clojurewest.org"
    • #x/base64 "ZnJvbSBvdGhlciBhbmltY"
    • #x/coords [45.5241, -122.6820]

  16. *default-data-reader-fn*
    • new in Clojure 1.5
    • defaults to nil, throws on unknown tag
    • fn receives a tag (symbol) and a value
    • should return a literal value

  17. Example *ddrf*
    • (defrecord TaggedValue [tag value])
    • print-method TaggedValue
    • #tag value
    • *default-data-reader-fn*
    • (->TaggedValue tag value)
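
The bullets above can be fleshed out roughly as follows. This is one possible reading of the slide, not necessarily the code shown in the talk: the helper read-preserving-tags and the tag unknown/tag are made up for illustration, and Clojure 1.5 or later is assumed.

```clojure
;; Holds an unrecognized tagged literal as plain data.
(defrecord TaggedValue [tag value])

;; Print a TaggedValue back out in tagged-literal form: #tag value
(defmethod print-method TaggedValue [tv ^java.io.Writer w]
  (.write w (str "#" (:tag tv) " "))
  (print-method (:value tv) w))

;; Hypothetical helper: read a string, wrapping any unknown tag
;; in a TaggedValue instead of throwing.
(defn read-preserving-tags [s]
  (binding [*default-data-reader-fn* ->TaggedValue]
    (read-string s)))

(def tv (read-preserving-tags "#unknown/tag [1 2 3]"))

(:tag tv)    ;=> unknown/tag
(:value tv)  ;=> [1 2 3]
(pr-str tv)  ;=> "#unknown/tag [1 2 3]"
```

Because the printed form matches the input, data with unknown tags can pass through a program that does not understand them.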

  18. Records vs. Tags
    • record: #my.ns.Rec{:a 42}
    • tagged: #my.ns/Rec {:a 42}
    • *default-data-reader-fn*
    • factory for tag: my.ns/map->Rec
    • maybe check the tag's namespace or a capitalized name

  19. (defn record-tag-factory [tag]
      (resolve (symbol (str (namespace tag) "/map->" (name tag)))))
    (defn default-reader [tag value]
      (if-let [factory (and (map? value)
                            (Character/isUpperCase (first (name tag)))
                            (record-tag-factory tag))]
        (factory value)
        (->TaggedValue tag value)))

  20. Library Authors
    • #my.ns/tag semantics
    • Base literal value
    • Implementation types
    • Data-reader functions
    • Print methods (maybe)
    • data_readers.clj (maybe)

  21. Printing
    • See instant.clj
    • Clojure prints using #inst for java.util.Date,
    java.util.Calendar and java.sql.Timestamp
    • *print-dup* ignored
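
A quick way to see the printing side, sketched for java.util.Date only and assuming the default print methods and readers: the printed form is an #inst literal in UTC, and reading it back gives an equal Date.

```clojure
(def epoch (java.util.Date. 0))

(pr-str epoch)
;=> "#inst \"1970-01-01T00:00:00.000-00:00\""

;; The printed form reads back as an equal Date.
(= epoch (read-string (pr-str epoch)))
;=> true
```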

  22. Gotchas
    • CLJ-1138 returning nil throws.
    • Return '(quote nil) instead.
    • CLJ-1100 period in tag throws.
    • Java Dates are mutable

  23. (= #inst "2013-03-18T16:50Z"
         #inst "2013-03-18T09:50-07:00")
    ;=> true
    (set! *data-readers*
          {'inst #'clojure.instant/read-instant-calendar})
    (= #inst "2013-03-18T16:50Z"
       #inst "2013-03-18T09:50-07:00")
    ;=> false

  24. *read-eval* Kerfuffle
    • Ruby on Rails vulnerability
    • Clojure *read-eval* true by default
    • #=(dangerously) can execute code
    • not safe for reading untrusted data
    • CLJ-1153 and CLJ-904
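
To make the #= risk concrete, here is a minimal sketch that assumes *read-eval* has its default value of true at the top level. The helper name read-untrusting is made up, and binding *read-eval* to false should not be taken as a complete sandbox for untrusted input.

```clojure
;; With *read-eval* true (the default), #= runs code while reading.
(def evaluated (read-string "#=(+ 1 2)"))
;; evaluated => 3

;; Hypothetical helper: read with *read-eval* bound to false and
;; report :rejected if the reader refuses the input.
(defn read-untrusting [s]
  (try
    (binding [*read-eval* false]
      (read-string s))
    (catch RuntimeException e
      :rejected)))

(read-untrusting "#=(+ 1 2)")  ;=> :rejected
(read-untrusting "[1 2 3]")    ;=> [1 2 3]
```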

  26. clojure.core/read
    • Designed by hyper-intelligent, pan-dimensional
    beings
    • read is for trusted input
    • binding *read-eval* false
    • (pre 1.5) allowed Java constructors
    • #my.ns.Rec[42]

  27. A common mistake that people make
    when trying to design something
    completely foolproof is to underestimate
    the ingenuity of complete fools.

  28. clojure.edn
    • new EdnReader in Clojure 1.5
    • no #=()
    • no Java constructors, no records
    • just safe EDN elements

  29. clojure.edn/read
    • arities - [ ] [stream] [opts stream]
    • opts map - :eof, :readers, :default
    • defaults to *in* and default-data-readers

  30. clojure.edn/read-string
    • arities - [string] [opts string]
    • opts map - :eof, :readers, :default
    • defaults to default-data-readers
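
The options map can be exercised as in the sketch below (Clojure 1.5 or later). The x/coords handler and the vector returned by the :default function are illustrative choices, not part of clojure.edn.

```clojure
(require '[clojure.edn :as edn])

;; :readers maps tag symbols to reader functions. The x/coords handler
;; is a made-up example that turns a [lat lon] vector into a map.
(def coords
  (edn/read-string {:readers {'x/coords (fn [[lat lon]] {:lat lat :lon lon})}}
                   "#x/coords [45.5241, -122.6820]"))
;; coords => {:lat 45.5241, :lon -122.682}

;; :default is called with the tag and the value when no reader is registered.
(def unknown
  (edn/read-string {:default (fn [tag value] [tag value])}
                   "#unknown/tag 42"))
;; unknown => [unknown/tag 42]

;; :eof supplies the value returned when the input is exhausted.
(def at-eof (edn/read-string {:eof :done} ""))
;; at-eof => :done
```

Passing readers through the options map keeps the configuration local to the call, with no dynamic binding involved.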

  31. clojure.tools.reader
    • Written in Clojure
    • A complete Clojure reader
    • An EDN-only reader
    • Works with Clojure 1.3+

  32. EDN
    • Extensible Data Notation
    • edn-format.org
    • Clojure style values: { } [ ] ( ) sym :kw
    • #tagged literals
    • Implementations for other languages

  34. The Guide on XML
    In the beginning XML was created. This has
    made a lot of people very angry and has been
    widely regarded as a bad move.

  35. XML
    • XML 1.0 (1998) - working group of 11
    experts and an interest group of 150 within
    W3C
    • The Encyclopedia Galactica of data formats
    • is (s-expr) with better marketing

  36. JSON
    • “The Fat-Free Alternative to XML”
    • “The good thing about reinventing the
    wheel is that you can get a round one.” -
    Douglas Crockford
    • not extensible

  37. EDN vs. JSON
    • extensible
    • more types
    • a little more syntax
    • conveyance of values, not objects
    • slightly cheaper

  38. Unorthodox Ideas
    • #x/roman
    • #x/rpn
    • spyscope
    • #feature/condf

  39. (defn roman-numeral-reader [s]
      (parse-roman (name s)))
    #x/roman "XLII"
    ;=> 42
    #x/roman XLII
    ;=> 42
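
The slide leaves parse-roman undefined. One way it might look is sketched below; this is an assumption, not the talk's implementation, and it does no validation of malformed numerals. Calling name on the argument is what lets the reader accept either a string or a bare symbol.

```clojure
(def roman-values {\I 1, \V 5, \X 10, \L 50, \C 100, \D 500, \M 1000})

;; Hypothetical parse-roman: a digit counts as negative when a larger
;; digit follows it (the X in XL), positive otherwise.
(defn parse-roman [s]
  (let [digits (map roman-values s)]
    (reduce + (map (fn [d following] (if (< d following) (- d) d))
                   digits
                   (concat (rest digits) [0])))))

(defn roman-numeral-reader [s]
  (parse-roman (name s)))

(def results
  (binding [*data-readers* {'x/roman #'roman-numeral-reader}]
    [(read-string "#x/roman \"XLII\"")
     (read-string "#x/roman XLII")]))
;; results => [42 42]
```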

  40. #x/rpn [3 4 + 8 2 - *]
    ; compiler sees: (* (+ 3 4) (- 8 2))
    ;=> 42
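
A reader along these lines could produce that form. The sketch below is an assumption about how such a reader might be written: it handles only the four binary operators listed in the hypothetical operators set and does not check for malformed input.

```clojure
;; Symbols treated as binary operators (an illustrative choice).
(def operators '#{+ - * /})

;; Fold the postfix tokens over a stack, replacing "a b op" with (op a b).
(defn rpn-reader [tokens]
  (first
   (reduce (fn [stack tok]
             (if (contains? operators tok)
               (let [[b a & more] stack]
                 (conj more (list tok a b)))
               (conj stack tok)))
           '()
           tokens)))

(def form
  (binding [*data-readers* {'x/rpn #'rpn-reader}]
    (read-string "#x/rpn [3 4 + 8 2 - *]")))
;; form        => (* (+ 3 4) (- 8 2))
;; (eval form) => 42
```

The reader returns code as data, so the rewrite happens at read time and the compiler only ever sees the prefix form.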

  41. Spyscope
    • github.com/dgrnbrg/spyscope
    • #spy/d (form)
    • better than println
    • minimally invasive

  42. ; Conditional feature reader
    ; github.com/miner/wilkins
    (println #feature/condf
             [(and jdk1.6+ clj1.5.*) "reducers OK"
              else "Don’t push that button"])

  43. Summary
    • #tagged literals are mostly harmless
    • clojure.edn is your safe place
    • data-readers are open to crazy ideas

  44. Towel Day
    • A towel is about the most massively useful
    thing an interstellar hitchhiker can have.
    • Tribute to Douglas Adams
    • towelday.org - 25th of May
    • There's a frood who really knows where his
    towel is.

  45. The Data-Reader’s
    Guide to the Galaxy
    Steve Miner
    @miner
    #inst "2013-03-18T10:30-07:00"
    So long and thanks for all the fish.

  47. Extras

  48. Google
    • The answer to life the universe and
    everything
    • Douglas Adams doodle

  49. History of every major
    Galactic Civilization
    • Phases: Survival, Inquiry and Sophistication
    • How can we eat?
    • Why do we eat?
    • Where shall we have lunch?

  50. Programming
    • Phases: Survival, Inquiry and Sophistication
    • How can we execute code?
    • Why do we write code?
    • Where shall we host our code?

  51. Marvin
    I’m fifty thousand times more intelligent than
    you and even I don’t know the answer. It gives
    me a headache just trying to think down to your
    level.

  52. Earth
    An utterly insignificant little blue green planet
    whose ape-descended life forms are so
    amazingly primitive that they still think XML is a
    pretty neat idea.