The Data-Reader's Guide to the Galaxy

miner
March 18, 2013

Presented at Clojure/West 2013.

Don't panic if you're unsure about picking up data-readers. It turns out that tagged literals are mostly harmless. Much as Douglas Adams produced "The Hitchhiker's Guide" material in multiple forms, a Clojure programmer can assign data-readers to process tagged literals with customized implementations. Clojure 1.5 adds a new feature that makes dealing with unknown tags simple and convenient. We'll also talk about the Extensible Data Notation (EDN), which aims to be the Babel fish of data transfer. Finally, we will explore a few unorthodox uses for data-readers.

video: http://www.infoq.com/presentations/Clojure-Data-Reader

Transcript

  1. The Data-Reader’s Guide to the Galaxy
     Steve Miner
     @miner
  2. The Data-Reader’s Guide to the Galaxy
     Steve Miner
     @miner
     #inst "2013-03-18T09:50-07:00"
  3. Douglas Adams
  4. 4

  5. The Guide
     • The standard repository for all knowledge and wisdom
     • Basics
     • New in Clojure 1.5
     • EDN
     • Unorthodox ideas
  6. Tagged Literals
     • #my.ns/tag literal-data
     • #inst "2013-03-18"
     • Self-describing data
     • Loosely coupled to implementation
     • Data transfer
  7. Extensible Reader
     • Add new data literals
     • Customized to application
     • Limited form of CL reader macros
     • Read-time vs. Compile time
  8. *data-readers*
     • data_readers.clj
     • literal map
     • tag symbols to var symbols
     • (binding [*data-readers* {...}] body)
  9. (defn my-reader [[a b]]
       (+ (* 10 a) b))
     (binding [*data-readers* {'my.ns/tag #'my-reader}]
       (read-string "#my.ns/tag [4 2]"))
     ;=> 42
  10. default-data-readers
      • pre-defined by Clojure
      • #uuid
      • #inst
  11. Time is an illusion. Lunchtime doubly so.
  12. #inst
      • Instant in time per RFC 3339
      • #inst "2013-03-18"
      • #inst "2013-03-18T09:50-07:00"
      • #inst "2013-03-18T16:50:00.000-00:00"
  13. Date/Time
      • java.util.Date
      • java.util.Calendar
      • java.sql.Timestamp
      • Joda Time - clj-time
      • JSR 310 - java.time in JDK 8
  14. #uuid
      • #uuid "4e0f694b-1901-4886-9101-5479fc0c9720"
      • (java.util.UUID/randomUUID)
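
The two built-in tags on the slides above can be tried at a REPL. This is a minimal sketch that assumes Clojure 1.4 or later with the default data readers in place; the var names d and u are illustrative only.

```clojure
;; Read the built-in tagged literals from strings (trusted input only).
(def d (read-string "#inst \"2013-03-18T09:50-07:00\""))
(def u (read-string "#uuid \"4e0f694b-1901-4886-9101-5479fc0c9720\""))

(class d) ;=> java.util.Date
(class u) ;=> java.util.UUID

;; The same instant written with a different offset reads as an equal Date.
(= d (read-string "#inst \"2013-03-18T16:50:00.000-00:00\"")) ;=> true
```
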

  15. Examples
      • #x/url "http://clojurewest.org"
      • #x/base64 "ZnJvbSBvdGhlciBhbmltY"
      • #x/coords [45.5241, -122.6820]
  16. *default-data-reader-fn*
      • new in Clojure 1.5
      • defaults to nil, throws on unknown tag
      • fn receives a tag (symbol) and a value
      • should return a literal value
  17. Example *ddrf*
      • (defrecord TaggedValue [tag value])
      • print-method TaggedValue
      • #tag value
      • *default-data-reader-fn*
      • (->TaggedValue tag value)
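
The pieces listed on slide 17 might fit together roughly as follows. This is a sketch, not the speaker's actual code: the print-method body and the var name unknown are assumptions, and it requires Clojure 1.5 or later for *default-data-reader-fn*.

```clojure
;; A generic holder for any tag that has no registered data reader.
(defrecord TaggedValue [tag value])

;; Print a TaggedValue back out in tagged-literal form: #tag value
(defmethod print-method TaggedValue [tv ^java.io.Writer w]
  (.write w (str "#" (:tag tv) " "))
  (print-method (:value tv) w))

;; With the default fn bound, an unknown tag no longer throws;
;; the reader calls (->TaggedValue tag value) instead.
(def unknown
  (binding [*default-data-reader-fn* ->TaggedValue]
    (read-string "#x/coords [45.5241 -122.682]")))

(:tag unknown)   ;=> x/coords
(pr-str unknown) ;=> "#x/coords [45.5241 -122.682]"
```

Because the value prints the same way it was read, unknown tags can be passed through a program untouched.
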
  18. Records vs. Tags
      • record: #my.ns.Rec{:a 42}
      • tagged: #my.ns/Rec {:a 42}
      • *default-data-reader-fn*
      • factory for tag: my.ns/map->Rec
      • maybe check tag ns or Capital
  19. (defn record-tag-factory [tag]
        (resolve (symbol (str (namespace tag) "/map->" (name tag)))))
      (defn default-reader [tag value]
        (if-let [factory (and (map? value)
                              (Character/isUpperCase (first (name tag)))
                              (record-tag-factory tag))]
          (factory value)
          (->TaggedValue tag value)))
  20. Library Authors
      • #my.ns/tag semantics
      • Base literal value
      • Implementation types
      • Data-reader functions
      • Print methods (maybe)
      • data_readers.clj (maybe)
  21. Printing
      • See instant.clj
      • Clojure prints using #inst for ju.Date, ju.Calendar and js.Timestamp
      • *print-dup* ignored
  22. Gotchas
      • CLJ-1138 returning nil throws.
      • Return '(quote nil) instead.
      • CLJ-1100 period in tag throws.
      • Java Dates are mutable
  23. (= #inst "2013-03-18T16:50Z" #inst "2013-03-18T09:50-07:00")
      ;=> true
      (set! *data-readers* {'inst #'clojure.instant/read-instant-calendar})
      (= #inst "2013-03-18T16:50Z" #inst "2013-03-18T09:50-07:00")
      ;=> false
  24. *read-eval* Kerfuffle
      • Ruby on Rails vulnerability
      • Clojure *read-eval* true by default
      • #=(dangerously) can execute code
      • not safe for reading untrusted data
      • CLJ-1153 and CLJ-904
  25. 25

  26. clojure.core/read
      • Designed by hyper intelligent, pan dimensional beings
      • read is for trusted input
      • binding *read-eval* false
      • (pre 1.5) allowed Java constructors
      • #my.ns.Rec[42]
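
The effect of *read-eval* described on slides 24 and 26 can be seen in a short sketch. It assumes Clojure 1.5 or later; read-without-eval is a hypothetical helper name, not part of Clojure. Binding *read-eval* to false only blocks #=(), so the core reader is still meant for trusted input.

```clojure
;; With the default *read-eval* of true, #=() runs code at read time.
(read-string "#=(+ 1 2)") ;=> 3

;; Hypothetical helper: read a string with read-time evaluation disabled.
(defn read-without-eval [s]
  (binding [*read-eval* false]
    (read-string s)))

(read-without-eval "[1 2 3]") ;=> [1 2 3]

;; The same #=() form is now rejected with an exception.
(try
  (read-without-eval "#=(+ 1 2)")
  (catch Exception e :rejected))
;=> :rejected
```
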
  27. A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.
  28. clojure.edn
      • new EdnReader in Clojure 1.5
      • no #=()
      • no Java constructors, no records
      • just safe EDN elements
  29. clojure.edn/read
      • arities - [ ] [stream] [opts stream]
      • opts map - :eof, :readers, :default
      • defaults to *in* and default-data-readers
  30. clojure.edn/read-string
      • arities - [string] [opts string]
      • opts map - :eof, :readers, :default
      • defaults to default-data-readers
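
The opts map for clojure.edn/read-string can be sketched as below. The x/coords tag comes from the earlier Examples slide; the reader functions and the shapes they return are assumptions made for illustration. Requires Clojure 1.5 or later.

```clojure
(require '[clojure.edn :as edn])

;; Plain EDN needs no options.
(edn/read-string "{:a [1 2 3]}") ;=> {:a [1 2 3]}

;; :readers maps tag symbols to reader functions.
(edn/read-string {:readers {'x/coords (fn [[lat lon]] {:lat lat :lon lon})}}
                 "#x/coords [45.5241 -122.682]")
;=> {:lat 45.5241, :lon -122.682}

;; :default is called with the tag and value of any unregistered tag.
(edn/read-string {:default (fn [tag value] [tag value])}
                 "#x/unknown 42")
;=> [x/unknown 42]

;; #=() is not part of EDN, so it throws instead of evaluating.
(try
  (edn/read-string "#=(+ 1 2)")
  (catch Exception e :rejected))
;=> :rejected
```
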
  31. clojure.tools.reader
      • Written in Clojure
      • A complete Clojure reader
      • An EDN-only reader
      • Works with Clojure 1.3+
  32. EDN
      • Extensible Data Notation
      • edn-format.org
      • Clojure style values: { } [ ] ( ) sym :kw
      • #tagged literals
      • Implementations for other languages
  33. 33

  34. The Guide on XML
      In the beginning XML was created. This has made a lot of people very angry and has been widely regarded as a bad move.
  35. XML
      • XML 1.0 (1998) - working group of 11 experts and an interest group of 150 within W3C
      • The Encyclopedia Galactica of data formats
      • <xml /> is (s-expr) with better marketing
  36. JSON
      • “The Fat-Free Alternative to XML”
      • “The good thing about reinventing the wheel is that you can get a round one.” - Douglas Crockford
      • not extensible
  37. EDN vs. JSON
      • extensible
      • more types
      • a little more syntax
      • conveyance of values, not objects
      • slightly cheaper
  38. Unorthodox Ideas
      • #x/roman
      • #x/rpn
      • spyscope
      • #feature/condf
  39. (defn roman-numeral-reader [s]
        (parse-roman (name s)))
      #x/roman "XLII" ;=> 42
      #x/roman XLII ;=> 42
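
Slide 39 calls parse-roman without showing it. One possible definition is sketched here under stated assumptions: the helper below is hypothetical, accepts only well-formed uppercase numerals, and does no validation. The reader function is repeated from the slide so the example is self-contained.

```clojure
;; Value of each Roman digit.
(def roman-values {\I 1 \V 5 \X 10 \L 50 \C 100 \D 500 \M 1000})

;; Hypothetical parse-roman: a digit smaller than its right-hand
;; neighbor is subtracted, every other digit is added.
(defn parse-roman [s]
  (let [nums (map roman-values s)]
    (reduce + (map (fn [n nxt] (if (< n nxt) (- n) n))
                   nums
                   (concat (rest nums) [0])))))

;; From the slide: name accepts both strings and symbols.
(defn roman-numeral-reader [s]
  (parse-roman (name s)))

(binding [*data-readers* {'x/roman roman-numeral-reader}]
  [(read-string "#x/roman \"XLII\"")
   (read-string "#x/roman XLII")])
;=> [42 42]
```
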
  40. #x/rpn [3 4 + 8 2 - *]
      ; compiler sees: (* (+ 3 4) (- 8 2))
      ;=> 42
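
The talk transcript does not show the reader behind #x/rpn. A minimal sketch of how such a reader could work is given below: rpn-reader is a hypothetical name, it handles only the four binary arithmetic operators, and it does no error checking. It returns a form at read time, which the compiler then evaluates.

```clojure
;; Hypothetical reader: fold the RPN tokens over a stack (a list),
;; replacing the top two entries with a prefix form at each operator.
(defn rpn-reader [tokens]
  (first
   (reduce (fn [stack tok]
             (if (contains? '#{+ - * /} tok)
               (let [[b a & more] stack]
                 (conj more (list tok a b)))
               (conj stack tok)))
           '()
           tokens)))

(def form
  (binding [*data-readers* {'x/rpn rpn-reader}]
    (read-string "#x/rpn [3 4 + 8 2 - *]")))

form        ;=> (* (+ 3 4) (- 8 2))
(eval form) ;=> 42
```
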
  41. Spyscope
      • github.com/dgrnbrg/spyscope
      • #spy/d (form)
      • better than println
      • minimally invasive
  42. ; Conditional feature reader
      ; github.com/miner/wilkins
      (println #feature/condf [(and jdk1.6+ clj1.5.*) "reducers OK"
                               else "Don't push that button"])
  43. Summary
      • #tagged literals are mostly harmless
      • clojure.edn is your safe place
      • data-readers are open to crazy ideas
  44. Towel Day
      • A towel is about the most massively useful thing an interstellar hitchhiker can have.
      • Tribute to Douglas Adams
      • towelday.org - 25th of May
      • There's a frood who really knows where his towel is.
  45. The Data-Reader’s Guide to the Galaxy
      Steve Miner
      @miner
      #inst "2013-03-18T10:30-07:00"
      So long and thanks for all the fish.
  46. 46

  47. Extras
  48. Google
      • The answer to life the universe and everything
      • Douglas Adams doodle
  49. History of every major Galactic Civilization
      • Phases: Survival, Inquiry and Sophistication
      • How can we eat?
      • Why do we eat?
      • Where shall we have lunch?
  50. Programming
      • Phases: Survival, Inquiry and Sophistication
      • How can we execute code?
      • Why do we write code?
      • Where shall we host our code?
  51. Marvin
      I’m fifty thousand times more intelligent than you and even I don’t know the answer. It gives me a headache just trying to think down to your level.
  52. Earth
      An utterly insignificant little blue green planet whose ape-descended life forms are so amazingly primitive that they still think XML is a pretty neat idea.