Slide 1

Slide 1 text

Lazy Sequences - Why are they so lazy?
By Ramsharan Gorur Jayaraman, Director of Engineering, Quintype

- Introduction.
- Pardon me if I am not good at public speaking: the last time I did this, I was speaking at a Ruby conference about moving to Golang, and it did not go well.
- I joined QT; here I talk about my first task.

Slide 2

Slide 2 text

First Experience

(let [batches (->> items (partition-all batch-size))]
  (doall (for [batch-of-items batches]
           (result-set-fn-wrapper batch-of-items))))

- I am a big-time Emacs user and I am used to pressing a lot of keys.
- The first time I did this, I pressed C-c C-k 100 times, saved the file 100 times, and hot-patched it in prod.
- OOM kill.

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

First Experience

(let [batches (->> items (partition-all batch-size))]
  (doall (for [batch-of-items batches]
           (result-set-fn-wrapper batch-of-items))))

(let [batches (->> items (partition-all batch-size))]
  (doseq [batch-of-items batches]
    (result-set-fn-wrapper batch-of-items)))

- OOM kill.
- With my Emacs fingers, I went to one of my colleagues (Foobar).
- Then I replaced the doall with doseq, ran it on prod, emacsed for some time, and it worked like a charm.
- I started googling the differences between doseq and doall. Then I realised that both are operations on something called a lazy sequence.
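The difference between the two versions can be sketched in a few lines. This is only an illustration under assumptions: `process-batch` is a hypothetical stand-in for `result-set-fn-wrapper`, and the memory explanation in the docstrings is one plausible reading of the OOM kill, not something the slides confirm.

```clojure
;; Hypothetical stand-in for result-set-fn-wrapper: pretend to process a batch.
(defn process-batch [batch]
  (reduce + batch))

(defn run-with-doall
  "Realises every result and returns the whole sequence, so all results
  stay reachable for as long as the caller holds on to the return value."
  [items batch-size]
  (doall (for [batch (partition-all batch-size items)]
           (process-batch batch))))

(defn run-with-doseq
  "Runs the same work purely for side effects and returns nil, so each
  result can be garbage-collected once its batch is done."
  [items batch-size]
  (doseq [batch (partition-all batch-size items)]
    (process-batch batch)))
```

If the results are never used, `doseq` (or `dorun` over an existing lazy seq) is usually the safer choice, because nothing forces the full result list to exist at once.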

Slide 5

Slide 5 text

What is a Lazy Seq?

- I went to Foobar and asked him what a lazy seq was.
- He gave me the snarkiest look of his whole life and left the office, thinking people like me should not exist in this world.
- Then I missed my gym session and started googling.

Slide 6

Slide 6 text

Enter Lisp Cons Cells

Lisp cons cells take me back to when I was studying computer science, trying to become the useless programmer Foobar made me out to be. Linked lists.

Slide 7

Slide 7 text

(setf x '(b))

- CAR is the value of the cons cell.
- CDR is the pointer to the next cons cell, or nil if this is the last cell. In a one-element list such as '(b), the first cell is also the last, so its CDR is nil.

Slide 8

Slide 8 text

(setf x '(a b))
(setf y (cons 'd x))
(setf (car x) '(r t))

A list of cons cells can contain another list of cons cells.
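The same ideas can be tried at a Clojure REPL, as a rough sketch. Clojure lists are immutable, so there is no direct equivalent of `(setf (car x) ...)`; the example builds a new list instead, and the name `z` is introduced here only for illustration.

```clojure
;; x is a two-element list: a -> b -> end of list.
(def x '(a b))

;; cons puts a new cell in front of x; x itself is shared, not copied.
(def y (cons 'd x))

;; No in-place update in Clojure: build a new list whose first element
;; is itself a list, reusing the tail of x.
(def z (cons '(r t) (rest x)))

;; (first y)    is the value held by the first cell (CAR)
;; (rest y)     is everything after the first cell (CDR)
;; (next '(b))  is nil, because a one-element list has no following cell
```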

Slide 9

Slide 9 text

Lazy Seqs are not data structures.

- Coming back to lazy seqs: they are not data structures.
- Just because they return a list when realised, we should not look at them as data structures.
- In my opinion, they are a sequence of algorithms.

Slide 10

Slide 10 text

They are very simple cells, like cons cells, with each element consisting of a value and a function.

“first” calls a function which returns a value and another function, and then returns the value from the called function.

“rest” calls the same kind of function, but returns the other function, which in turn produces the remainder of the sequence when it is called.
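The cell model described above can be imitated with plain closures. This is a toy sketch for intuition only, not how Clojure's `lazy-seq` is implemented; the names `numbers-from`, `my-first`, `my-rest`, and `my-take` are made up for the example.

```clojure
;; A "cell" is a function of no arguments. Calling it returns a pair:
;; [value, function-that-produces-the-next-cell's-pair].
(defn numbers-from
  "Toy infinite sequence n, n+1, n+2, ... built from closures."
  [n]
  (fn [] [n (numbers-from (inc n))]))

(defn my-first
  "Call the cell and return the value half of the pair."
  [cell]
  (first (cell)))

(defn my-rest
  "Call the cell and return the function half of the pair."
  [cell]
  (second (cell)))

(defn my-take
  "Walk k cells, collecting their values into a vector."
  [k cell]
  (loop [k k, cell cell, acc []]
    (if (zero? k)
      acc
      (recur (dec k) (my-rest cell) (conj acc (my-first cell))))))

;; (my-take 3 (numbers-from 10)) ;; => [10 11 12]
```

Nothing beyond the cells that `my-take` actually visits is ever computed, which is why the sequence can be conceptually infinite.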

Slide 11

Slide 11 text

Why is it efficient?

Slide 12

Slide 12 text

A lazy seq is an algorithmically generated logical sequence that does not have to reside in memory.

user> (time (def partition-large-no-set (partition 3 (range 1000000))))
;; "Elapsed time: 0.165415 msecs"
user> (time (def partition-large-no-set (doall (partition 3 (range 1000000)))))
;; "Elapsed time: 3289.272597 msecs"
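A small experiment makes the "does not have to reside in memory" claim visible. The sketch below counts how often the mapping function runs; the names `calls` and `squares` are invented for the example, and `iterate` is used as the source so that chunking does not blur the count.

```clojure
;; Counts how many times the mapping function has actually run.
(def calls (atom 0))

;; An infinite lazy seq of squares. Defining it performs no work:
;; @calls is still 0 at this point.
(def squares
  (map (fn [n]
         (swap! calls inc)
         (* n n))
       (iterate inc 0)))

;; Work happens only when elements are consumed, for example:
;; (take 3 squares) ;; => (0 1 4)
;; After that, @calls reflects only the few elements that were needed.
;; Do not print `squares` whole at the REPL: it never ends.
```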

Slide 13

Slide 13 text

“Lazy Seqs make the impossible possible.” (Anonymous)

Slide 14

Slide 14 text

Lazy Seqs at Quintype

Slide 15

Slide 15 text

• We designed one of our products around the basic behaviour of lazy seqs.
• Quintype imports content from other platforms for client acquisition.
• We built our file structure as small files with one JSON object per line, so that we can read the data lazily and import it.
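Reading a one-object-per-line file lazily might look roughly like the sketch below. It is an assumption-laden illustration, not Quintype's importer: `import-lines!` and `handle-line!` are hypothetical names, and JSON parsing is left to whatever function the caller passes in.

```clojure
(require '[clojure.java.io :as io])

(defn import-lines!
  "Reads `source` one line at a time and passes each line to `handle-line!`.
  line-seq is lazy and doseq does not keep the lines it has already
  handled, so only a small part of the file needs to be in memory."
  [source handle-line!]
  (with-open [rdr (io/reader source)]
    (doseq [line (line-seq rdr)]
      (handle-line! line))))

;; Hypothetical usage, with a file name made up for the example:
;; (import-lines! "stories-0001.txt" println)
```

Keeping the `doseq` inside `with-open` matters: the lines must be consumed before the reader is closed.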

Slide 16

Slide 16 text

Input

{"external-id": "external-id-4","headline": "Attractive headline","subheadline": "Attractive subheadline","slug": "slug","body": "Some Body"}\n
{"external-id": "external-id-4","headline": "Attractive headline","subheadline": "Attractive subheadline","slug": "slug","body": "Some Body 1"}\n

Slide 17

Slide 17 text

Processing

(defn- push-content-to-channel
  "Push content to channel"
  [channel]
  (doseq [file (get-list-of-files)]
    (log/info {:message (str "[READING-FILE] " file)})
    (s3/read-file file #(async/>!! channel %)))
  (async/close! channel))

(def filter-and-decorate
  "Filter & decorate"
  (comp (filter verify-content?)
        (map decorate)))

(defn import-content
  "Import content"
  []
  (let [parallel 5
        batch 100
        input-channel (async/chan 5)
        output-channel (async/chan parallel (partition-all (* parallel batch)))]
    (future (push-content-to-channel input-channel))
    (async/pipeline 1 output-channel filter-and-decorate input-channel)
    (let [stats (async/
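The transducer part of the slide can be tried without channels. In the sketch below, `verify-content?` and `decorate` are hypothetical stand-ins (the slide does not show their bodies), and `batches` is a made-up helper that applies the same filter, map, and `partition-all` steps to an ordinary collection.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in: keep only items with a non-blank headline.
(defn verify-content? [item]
  (not (str/blank? (:headline item))))

;; Hypothetical stand-in: mark an item as imported.
(defn decorate [item]
  (assoc item :imported? true))

;; Same shape as on the slide: a transducer, independent of its source.
(def filter-and-decorate
  (comp (filter verify-content?)
        (map decorate)))

(defn batches
  "Filters, decorates, and groups items into vectors of up to batch-size."
  [batch-size items]
  (into [] (comp filter-and-decorate (partition-all batch-size)) items))
```

Because a transducer does not care where its input comes from, the same `filter-and-decorate` value can be handed to `into`, `sequence`, or a core.async channel or pipeline.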

Slide 18

Slide 18 text

Mr Foobar’s recent commit

Slide 19

Slide 19 text

References

• http://theatticlight.net/posts/Lazy-Sequences-in-Clojure/
• https://cs.gmu.edu/~sean/lisp/cons/

Slide 20

Slide 20 text

Thank You. Questions?