Practical Elixir Streams — To Infinity and Beyond

Practical Elixir Streams: To Infinity and Beyond
Evadne Wu, Occasional Fixer-Upper
[email protected] · twitter.com/evadne

Myself
Another Elixir Programmer
Inquiries: [email protected]
Arguments: twitter.com/evadne

1 Basics

Enumeration
- Enum: reduce, each, map, etc.
- for … in
- for … in, into:
- for … in, reduce:
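As a rough illustration of the three comprehension forms listed above, the sketch below uses small ranges as input. The variable names are chosen for this example only.

```elixir
# Plain comprehension: collects the results into a list.
squares = for n <- 1..5, do: n * n

# `into:` collects into any Collectable, here a map.
square_map = for n <- 1..3, into: %{}, do: {n, n * n}

# `reduce:` threads an accumulator through the comprehension (Elixir 1.8+).
total =
  for n <- 1..5, reduce: 0 do
    acc -> acc + n
  end
```

All three forms enumerate their generators through the same Enumerable protocol that the Enum functions use.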

Finite Enumerables
    [1, 2, 3]
    %{a: 1, b: 2}
    [a: 1, b: 2]
    1..10

Finite Enumerables
- Lists, Maps, Keywords, etc. are finite Enumerables, with contents generated ahead of time
- All of them can be enumerated in the same manner (in linear or constant time, depending on the structure) because they all implement the Enumerable protocol

Lazy Enumerables
- Any data type which implements the Enumerable protocol can be enumerated
- This property can in turn be used to create lazily evaluated Enumerables, which produce each item only when it is requested

Lazy Enumerables
- Lazy Enumerables can act as placeholders for actual content, which minimises blocking
- Computation is spread throughout the lifetime of the enumerable, and can be halted if the results are no longer required

Lazy Enumerables
    Stream.map(1..3, &munge/1)
    File.stream!(path)
    Task.async_stream(paths, &{&1, File.read(&1)})

Using Lazy Enumerables
- Enum.to_list/1: to return a List
- Stream.run/1: to consume the Stream
- NB: Calling Enum functions on Streams forces evaluation
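To make the point about forced evaluation concrete, here is a minimal sketch. It assumes nothing beyond the standard library; the `{:computed, n}` message is a device invented for this example to observe when the mapping function actually runs.

```elixir
# Build a lazy pipeline. The function is not called yet.
stream =
  Stream.map(1..3, fn n ->
    send(self(), {:computed, n})
    n * 2
  end)

# Check the mailbox without blocking: nothing has been computed so far.
before =
  receive do
    {:computed, _} -> :evaluated
  after
    0 -> :not_evaluated
  end

# Enum.to_list/1 forces evaluation and returns the results.
doubled = Enum.to_list(stream)

# Stream.run/1 also forces evaluation, but discards the results.
run_result = Stream.run(stream)
```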

Anything is Enumerable
> Since enumerables can have different shapes (structs, anonymous functions, and so on), the functions in this module may return any of those shapes and this may change at any time.

2 Reduction

Enum.reduce/3
- The core concept which drives all enumeration
- Can be used to implement all other Enum functions
- Can be replaced with optimised paths for specific data types
- Calls Enumerable.reduce/3
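The claim that other Enum functions can be built on reduce can be sketched as follows. `ViaReduce` is a hypothetical module written for this illustration; it is not how the standard library implements these functions.

```elixir
defmodule ViaReduce do
  # Hypothetical module: re-implementations for illustration only.

  # map/2: apply fun to each item, building the list in reverse, then flip it.
  def map(enumerable, fun) do
    enumerable
    |> Enum.reduce([], fn item, acc -> [fun.(item) | acc] end)
    |> Enum.reverse()
  end

  # filter/2: keep only the items for which pred returns a truthy value.
  def filter(enumerable, pred) do
    enumerable
    |> Enum.reduce([], fn item, acc -> if pred.(item), do: [item | acc], else: acc end)
    |> Enum.reverse()
  end

  # count/1: ignore the items and increment the accumulator.
  def count(enumerable) do
    Enum.reduce(enumerable, 0, fn _item, acc -> acc + 1 end)
  end
end
```

For example, `ViaReduce.map(1..3, &(&1 * 2))` would walk the range once and return a list of the doubled values.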

Enumerable Protocol
- Defines a series of functions to be implemented by any data type that wishes to become enumerable
- Includes reduce/3, count/1, member?/2, slice/1, with reduce/3 providing core functionality

Enumerable.reduce/3
- The core building block of enumeration, which is implemented for all enumerable types
- Continuable, i.e. can be suspended and resumed later

Enumerable.reduce/3
    @type acc :: {:cont | :halt | :suspend, term}
    @type reducer :: (term, term -> acc)
    @type result :: {:done, term} | {:halted, term} | {:suspended, term, continuation}
    @spec reduce(t, acc, reducer) :: result
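One way to see how these types fit together is to implement the protocol for a small struct. The sketch below is an assumption-laden example: `Countdown` is a hypothetical struct invented here, and the three optional callbacks simply return `{:error, __MODULE__}` so that Enum falls back to reduce/3.

```elixir
defmodule Countdown do
  # Hypothetical struct for illustration: yields from, from - 1, ..., 1.
  defstruct [:from]
end

defimpl Enumerable, for: Countdown do
  # Returning {:error, __MODULE__} asks Enum to fall back to reduce/3.
  def count(_countdown), do: {:error, __MODULE__}
  def member?(_countdown, _value), do: {:error, __MODULE__}
  def slice(_countdown), do: {:error, __MODULE__}

  # The caller asked to stop early.
  def reduce(_countdown, {:halt, acc}, _fun), do: {:halted, acc}

  # The caller asked to pause: return a continuation that resumes from here.
  def reduce(countdown, {:suspend, acc}, fun) do
    {:suspended, acc, &reduce(countdown, &1, fun)}
  end

  # Nothing left to emit.
  def reduce(%Countdown{from: n}, {:cont, acc}, _fun) when n <= 0, do: {:done, acc}

  # Emit the current value, then continue with the remainder.
  def reduce(%Countdown{from: n}, {:cont, acc}, fun) do
    reduce(%Countdown{from: n - 1}, fun.(n, acc), fun)
  end
end

countdown_items = Enum.to_list(%Countdown{from: 3})
countdown_head = Enum.take(%Countdown{from: 5}, 2)
```

Because only reduce/3 does real work here, every Enum and Stream function operates on the struct, at the cost of linear-time count and membership checks.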

Infinite Enumerables
- Any data type that implements Enumerable, and always returns new items in its reduce/3 implementation, is infinite
- The Stream module provides convenience functions to construct such Enumerables!

Infinite Enumerables
    Stream.cycle([1, 2, 3])
    Stream.repeatedly(fn -> :hi end)

Infinite Enumerables
    Stream.unfold({0, 0}, fn
      {0, 0} -> {1, {1, 0}}
      {1, 0} -> {1, {1, 1}}
      {c, p} -> {c + p, {c + p, c}}
    end)
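Infinite Enumerables are only useful if the consumer bounds them. The following sketch shows a few ways to do so. Stream.iterate/2 and the simplified two-element Fibonacci accumulator are choices made for this example, not taken from the slides.

```elixir
# Stream.cycle/1 repeats its input forever; Enum.take/2 stops after five items.
cycled = Stream.cycle([1, 2, 3]) |> Enum.take(5)

# Stream.iterate/2 is another infinite constructor; take_while bounds it lazily.
powers =
  Stream.iterate(1, &(&1 * 2))
  |> Stream.take_while(&(&1 < 100))
  |> Enum.to_list()

# A Fibonacci sequence via Stream.unfold/2, using {previous, current} as state.
fib =
  Stream.unfold({0, 1}, fn {a, b} -> {b, {b, a + b}} end)
  |> Enum.take(6)
```

Calling Enum.to_list/1 directly on any of the unbounded streams would never return.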

Stream Module = Convenience
    iex> Stream.map([1, 2, 3], & &1)
    #Stream<[
      enum: [1, 2, 3],
      funs: [#Function<48.68317796/1 in Stream.map/2>]
    ]>

Stream Module = Convenience
    iex> Stream.cycle([1, 2, 3])
    #Function<64.68317796/2 in Stream.unfold/2>

Incremental Peeling
- Abuse of Enumerable.reduce/3 allows incremental consumption of any Enumerable, including Streams!
- Trick: Start the enumeration in a suspended state, accumulate into nil, but keep returning items
- Note: Streams might be non-reentrant

Incremental Peeling
    reduce_fun = fn item, _ -> {:suspend, item} end
    acc = {:suspend, nil}
    result = Enumerable.reduce(enum, acc, reduce_fun)
    {:suspended, nil, next_fun} = result
    {:suspended, _, next_fun} = next_fun.({:cont, nil})
    …
    {:done, nil} = next_fun.({:cont, nil})
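The peeling trick can be wrapped in a small helper, sketched below. `Peel`, `start/1` and `next/1` are hypothetical names introduced for this example; the code assumes an enumerable whose reduce/3 honours `:suspend`, as lists and ranges do.

```elixir
defmodule Peel do
  # Hypothetical helper module, for illustration only.

  # Start the enumeration in a suspended state and return the continuation.
  def start(enumerable) do
    reducer = fn item, _acc -> {:suspend, item} end
    {:suspended, nil, continuation} = Enumerable.reduce(enumerable, {:suspend, nil}, reducer)
    continuation
  end

  # Pull one item: returns {:ok, item, next_continuation} or :done.
  def next(continuation) do
    case continuation.({:cont, nil}) do
      {:suspended, item, next_continuation} -> {:ok, item, next_continuation}
      {:done, _acc} -> :done
      {:halted, _acc} -> :done
    end
  end
end

cont = Peel.start([:a, :b])
{:ok, first, cont} = Peel.next(cont)
{:ok, second, cont} = Peel.next(cont)
finished = Peel.next(cont)
```

Each continuation should be called at most once, since a Stream backed by an external resource may not tolerate being resumed twice from the same point.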

Note
- Streams are pull-based. No computation is done unless new values are pulled from them.
- Collectables are push-based. They can be used to collect output (e.g. File.stream!/1)

3 Convenience

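Stream.transform/3
- Similar to flat_map, but threads an accumulator through the enumeration
- The function returns the items to emit together with the new accumulator, or {:halt, acc} to stop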

Stream.transform/3
    iex> enum = 1001..9999
    iex> n = 3
    iex> stream = Stream.transform(enum, 0, fn i, acc ->
    ...>   if acc < n, do: {[i], acc + 1}, else: {:halt, acc}
    ...> end)
    iex> Enum.to_list(stream)
    [1001, 1002, 1003]
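Two further uses of Stream.transform/3 may help show the shape of its callback: one that carries state between items, and one that emits a variable number of items per input. Both are illustrative sketches with made-up variable names.

```elixir
# Carry a running sum in the accumulator and emit it after every item.
running_totals =
  [1, 2, 3, 4]
  |> Stream.transform(0, fn item, sum ->
    new_sum = item + sum
    {[new_sum], new_sum}
  end)
  |> Enum.to_list()

# Emit each number n exactly n times; the accumulator is unused here.
expanded =
  1..3
  |> Stream.transform(nil, fn n, acc -> {List.duplicate(n, n), acc} end)
  |> Enum.to_list()
```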

Stream.resource/3
- Creates initial accumulator in start_fun
- Calls next_fun repeatedly until completion
- Calls after_fun at the end of enumeration
- Can be used for single-pass (streaming) file generation

Stream.resource/3
    Stream.resource(
      fn -> File.open!("sample") end,
      fn file ->
        case IO.read(file, :line) do
          data when is_binary(data) -> {[data], file}
          _ -> {:halt, file}
        end
      end,
      fn file -> File.close(file) end
    )
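The same three-function shape works without a file. In this sketch the "resource" is just an integer, and the `{:closed, last}` message is an invented device to show that after_fun runs once enumeration ends.

```elixir
parent = self()

countdown =
  Stream.resource(
    # start_fun: build the initial accumulator.
    fn -> 3 end,
    # next_fun: emit a list of items with the next accumulator, or halt.
    fn
      0 -> {:halt, 0}
      n -> {[n], n - 1}
    end,
    # after_fun: clean-up hook, called with the final accumulator.
    fn last -> send(parent, {:closed, last}) end
  )

items = Enum.to_list(countdown)

# Confirm that the clean-up function ran.
closed? =
  receive do
    {:closed, 0} -> true
  after
    0 -> false
  end
```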

Stream.into/2
- Used to redirect output of a stream into a Collectable
- Useful when the Collectable represents the outside world (e.g. a File Handle)

Stream.into/2
    File.stream!(path_input)
    |> Stream.map(&String.replace(&1, "#", "%"))
    |> Stream.into(File.stream!(path_output))
    |> Stream.run()
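A runnable variant of this pipeline can be sketched with temporary files. The file names below are placeholders chosen for this example, and the input content is made up.

```elixir
dir = System.tmp_dir!()
path_input = Path.join(dir, "streams_example_input.txt")
path_output = Path.join(dir, "streams_example_output.txt")

# Prepare a small input file.
File.write!(path_input, "# one\n# two\n")

# Read lazily line by line, rewrite each line, and push it into the output file.
File.stream!(path_input)
|> Stream.map(&String.replace(&1, "#", "%"))
|> Stream.into(File.stream!(path_output))
|> Stream.run()

output = File.read!(path_output)

# Remove the temporary files again.
File.rm!(path_input)
File.rm!(path_output)
```

Nothing is read or written until Stream.run/1 is called, and only one line needs to be held in memory at a time.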

Task.async_stream/3
- Emits a Stream which runs the given function once for each element in the Enumerable
- With options: max_concurrency, ordered
- Executed with Stream.run/1

Task.async_stream/3
    1..100
    |> Stream.take(10)
    |> Task.async_stream(&IO.puts(to_string(&1)))
    |> Stream.run()
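When the results matter rather than the side effects, the stream can be consumed with Enum instead of Stream.run/1. A minimal sketch, assuming every task succeeds (each successful result arrives wrapped as `{:ok, value}`):

```elixir
results =
  1..5
  # Run at most two tasks at a time, and keep results in input order.
  |> Task.async_stream(fn n -> n * n end, max_concurrency: 2, ordered: true)
  # Unwrap the {:ok, value} tuples emitted for successful tasks.
  |> Enum.map(fn {:ok, value} -> value end)
```

A task that exits or times out would not match the `{:ok, value}` pattern here, so production code would typically handle `{:exit, reason}` as well.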

Plug.Conn.chunk/2
- Sends the response to the client incrementally
- Requires a connection which has been configured with send_chunked/2
- Leaves enumeration to the programmer

Plug.Conn.chunk/2
    chunked_conn =
      conn
      |> put_resp_content_type(type)
      |> send_chunked(200)

Plug.Conn.chunk/2
    Enum.reduce_while(enum, conn, fn chunk, conn ->
      case Plug.Conn.chunk(conn, chunk) do
        {:ok, conn} -> {:cont, conn}
        {:error, :closed} -> {:halt, conn}
      end
    end)

Ecto.Repo.stream/2
- Returns all matching entries from the database
- Lazily enumerated, but may require a transaction

Ecto.Repo.stream/2
    import Ecto.Query, only: [from: 2]

    Repo.transaction(fn ->
      query = from p in Post, select: p.title
      stream = Repo.stream(query)
      Enum.to_list(stream)
    end)

4 Projection

StreamData
- Provides value generators in the form of Streams
- Can be used for data generation
- Very helpful for property-based testing

StreamData
    StreamData.integer()
    |> Stream.map(&abs/1)
    |> Enum.take(3)
    #=> [1, 0, 2]

StreamData
    use ExUnitProperties

    property "bin1 <> bin2 always starts with bin1" do
      check all bin1 <- binary(),
                bin2 <- binary() do
        assert String.starts_with?(bin1 <> bin2, bin1)
      end
    end

Packmatic
- Generates a Zip64 archive stream
- Comes with Plug integration to enable fast download
- Added refinements based on prior community work

Packmatic
    entries = [
      [source: {:file, "/tmp/foo.pdf"}, path: "foo/bar.pdf"],
      [source: {:url, "https://example.com/baz.pdf"}, path: "baz.pdf"]
    ]

    Packmatic.build_stream(entries)
    |> Packmatic.Conn.send_chunked(conn, "download.zip")

5 Conclusion

Why Use Streams?
- Minimise latency: Reduce time-to-first-byte and minimise jank risk (system saturation)
- Reduce resource usage: Eliminate peaks and avoid wasted work
- Reduce complexity: Leverage composition for shorter and more succinct programs
