The Perils of Large Files

Handling files that are hundreds of MB in size on the BEAM requires some extra care.

This talk covers techniques to implement, profile, and troubleshoot scenarios involving large files, with a focus on performance and reliability.

Claudio Ortolina

October 07, 2020

Transcript

  1. Context
    • Engineer at PSPDFKit
    • Working on PSPDFKit Server, an application that manipulates PDF documents
    Claudio Ortolina - ElixirConf EU 2020
  2. TL;DR on PDF files
    • PDFs can be variable in size (20KB to 1+GB)
    • Can be extremely complex/expensive to render
  3. Operations
    • File upload and retrieval (S3, PostgreSQL, HTTP)
    • SHA256 checksum
    • Filesystem cache and temp copies
  4. Danger!
    • Memory starvation
    • Memory leaks
    • CPU/IO starvation when operating on the file contents
  5. In scope for this talk
    • List possible issues
    • Show how to identify/monitor symptoms
    • Provide solutions
  6. Out of scope for this talk
    • A one-size-fits-all solution
    • A packaged library and/or large snippets
  7. 1. Reading files
    • We have a file-system based cache store
    • Read the file by name
    • Process controls filesystem location
  8. Code:

        defmodule Perils.Examples.Cache do
          use GenServer

          # start_link/1 and init/1 omitted for brevity

          def read(file_name) do
            GenServer.call(__MODULE__, {:read, file_name})
          end

          def handle_call({:read, file_name}, _from, base_path) do
            full_path = Path.join(base_path, file_name)
            contents = File.read!(full_path)
            {:reply, contents, base_path}
          end
        end
  9. Test
    1. Check total memory
    2. Read a file
    3. Check total memory
    4. Use :recon.bin_leak/1 to trigger GC and report delta
    5. See if cache process leaked
  10. Code:

        :erlang.memory(:total) |> IO.inspect(label: "Total before read")
        Perils.Examples.Cache.read("large.dat")
        :erlang.memory(:total) |> IO.inspect(label: "Total after read")

        :recon.bin_leak(10)
        |> Enum.filter(fn
          {_pid, _delta, [m, _f, _a]} -> m == Perils.Examples.Cache
          _other -> false
        end)
        |> IO.inspect(label: "GC delta")

        :erlang.memory(:total) |> IO.inspect(label: "Total after GC")
  11. Output:

        Total before read: 31581496
        Total after read: 166084696
        GC delta: [
          {#PID<0.174.0>, -2,
           [
             Perils.Examples.Cache,
             {:current_function, {:gen_server, :loop, 7}},
             {:initial_call, {:proc_lib, :init_p, 5}}
           ]}
        ]
        Total after GC: 30268152

    Notice that 2 references were dropped.
  12. Why
    • The binary is larger than 64 bytes, so it's stored off-heap and reference counted
    • The cache server reads the content (holding a reference) but not much else
    • The runtime doesn't trigger a GC in the cache process, so the reference is never released
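A minimal sketch of how a mostly idle process can keep a reference to a large binary until it is garbage collected. The module name `BinHold` and its functions are hypothetical and not part of the talk. It counts entries in `Process.info(pid, :binary)`, the same process information that recon inspects, before and after a forced GC. Whether the first count is non-zero depends on the runtime, so treat the numbers as indicative only.

```elixir
defmodule BinHold do
  # Hypothetical helper, not from the talk.
  # Returns {refs_before_gc, refs_after_gc} for a process that touched a large binary.
  def demo(size \\ 1_000_000) do
    parent = self()

    pid =
      spawn(fn ->
        # Binaries above 64 bytes live off-heap; the process only holds a reference.
        _bin = :binary.copy(<<0>>, size)
        send(parent, :made)

        # Sit idle, like a GenServer waiting for its next message.
        receive do
          :stop -> :ok
        end
      end)

    receive do
      :made -> :ok
    end

    before = binary_refs(pid)
    :erlang.garbage_collect(pid)
    after_gc = binary_refs(pid)
    send(pid, :stop)
    {before, after_gc}
  end

  # Number of off-heap binaries the process currently references.
  defp binary_refs(pid) do
    {:binary, bins} = Process.info(pid, :binary)
    length(bins)
  end
end
```

Calling `BinHold.demo()` returns the two counts; a drop between them corresponds to the negative delta reported by `:recon.bin_leak/1` above.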
  13. Possible fixes
    1. Hibernate the process with {:reply, contents, base_path, :hibernate}.
    2. Don't read in the process, read in the caller (e.g. returning the complete path instead of the file contents)
  14. 2. Processing files
    • Given a file, we want to calculate its sha256 sum
  15. Code:

        defmodule Perils.Examples.Sha do
          def sha256(file_name) do
            contents = File.read!(file_name)

            :sha256
            |> :crypto.hash(contents)
            |> Base.encode32(case: :lower, padding: false)
          end
        end
  16. Test
    1. Check total memory
    2. Calculate sha for a file
    3. Check total memory
    4. Use :recon.bin_leak/1 to trigger GC
    5. See memory dropping down
  17. Code:

        :erlang.memory(:total) |> IO.inspect(label: "Total before sha")

        file_name = Path.join(:code.priv_dir(:perils), "large.dat")

        file_name
        |> Perils.Examples.Sha.sha256()
        |> IO.inspect(label: "File sha")

        :erlang.memory(:total) |> IO.inspect(label: "Total after sha")
        :recon.bin_leak(10)
        :erlang.memory(:total) |> IO.inspect(label: "Total after GC")
  18. Output (reading the whole file):

        Total before sha: 31639536
        File sha: "evf4yp6e6jyxey3n6s7tfxu7cb7wedkvtmqnoyazpzcsxf2fhelq"
        Total after sha: 166487096
        Total after GC: 30668392
  19. Code:

        defmodule Perils.Examples.Sha do
          def sha256(file_name) do
            # 8MB
            line_or_bytes = 8_000_000
            stream = File.stream!(file_name, [], line_or_bytes)
            initial_digest = :crypto.hash_init(:sha256)

            stream
            |> Enum.reduce(initial_digest, fn chunk, digest ->
              :crypto.hash_update(digest, chunk)
            end)
            |> :crypto.hash_final()
            |> Base.encode32(case: :lower, padding: false)
          end
        end
  20. Output (streaming the file in chunks):

        Total before sha: 31361384
        File sha: "evf4yp6e6jyxey3n6s7tfxu7cb7wedkvtmqnoyazpzcsxf2fhelq"
        Total after sha: 32296312
        Total after GC: 30696640
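The two runs report the same digest because feeding chunks through `:crypto.hash_init/1`, `:crypto.hash_update/2` and `:crypto.hash_final/1` is equivalent to hashing the whole binary at once. A small sketch that checks this on in-memory data follows; the module `ShaCheck` and its functions are hypothetical and only serve the illustration.

```elixir
defmodule ShaCheck do
  # Hypothetical helper, not from the talk.

  # Hash the whole binary in a single call.
  def one_shot(data), do: :crypto.hash(:sha256, data)

  # Hash the same binary incrementally, chunk_size bytes at a time.
  def chunked(data, chunk_size) do
    data
    |> chunks(chunk_size)
    |> Enum.reduce(:crypto.hash_init(:sha256), fn chunk, digest ->
      :crypto.hash_update(digest, chunk)
    end)
    |> :crypto.hash_final()
  end

  # Split a binary into pieces of at most size bytes.
  defp chunks(data, size) when byte_size(data) <= size, do: [data]

  defp chunks(data, size) do
    <<chunk::binary-size(size), rest::binary>> = data
    [chunk | chunks(rest, size)]
  end
end
```

For any binary `data`, `ShaCheck.chunked(data, 4096) == ShaCheck.one_shot(data)` should hold, whatever chunk size is chosen. The chunk size only affects how much data is held in memory at once.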
  21. 3. Writing files
    • Given a URL, fetch it and write the response to a file
  22. Code:

        defmodule Perils.Examples.Store do
          def write(file_name, url) do
            with {:ok, data} <- get(url) do
              File.write!(file_name, data)
            end
          end

          defp get(url) do
            :httpc.request(:get, {String.to_charlist(url), []}, [], [])
            |> case do
              {:ok, result} ->
                {{_, 200, _}, _headers, body} = result
                {:ok, body}

              error ->
                error
            end
          end
        end
  23. Code:

        def stream_write(file_name, url) do
          {:ok, ref} =
            :httpc.request(:get, {String.to_charlist(url), []}, [],
              stream: :self,
              sync: false
            )

          stream_loop(ref, file_name)
        end
  24. Code:

        defp stream_loop(ref, file_name) do
          receive do
            {:http, {^ref, :stream_start, _headers}} ->
              file = File.open!(file_name, [:write, :raw])
              write_loop(ref, file)
          after
            5000 -> {:error, :timeout}
          end
        end

        defp write_loop(ref, file) do
          receive do
            {:http, {^ref, :stream, chunk}} ->
              IO.binwrite(file, chunk)
              write_loop(ref, file)

            {:http, {^ref, :stream_end, _headers}} ->
              File.close(file)
          after
            5000 ->
              File.close(file)
              {:error, :timeout}
          end
        end
  25. Why
    • The response is streamed and every chunk written to file
    • The entire file is never loaded in memory
  26. Advantages of raw mode
    • We can use :raw to skip the intermediary process that gives access to the file
    • If we perform a lot of write operations concurrently, raw mode can be faster
  27. Limitations of raw mode
    • Without the intermediary process, we need to manually close the IO device with File.close/1
    • Only works on the same node
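One way to make sure a raw file is always closed is to wrap the writes in `try`/`after`. The sketch below assumes the chunks are already available as an enumerable; the module `RawWrite` and the function `write_chunks/2` are hypothetical names, not part of the talk.

```elixir
defmodule RawWrite do
  # Hypothetical helper, not from the talk.
  # Writes each chunk to path in raw mode and always closes the device.
  def write_chunks(path, chunks) do
    file = File.open!(path, [:write, :raw])

    try do
      # Match on :ok so a failed write raises instead of passing silently.
      Enum.each(chunks, fn chunk -> :ok = IO.binwrite(file, chunk) end)
    after
      # Runs on success, on raise and on throw, so the descriptor is not leaked.
      File.close(file)
    end
  end
end
```

`RawWrite.write_chunks("out.dat", ["ab", "cd"])` returns `:ok` and leaves the file closed. Passing a function as the third argument of `File.open!/3` is an alternative that closes the device when the function returns.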
  28. Some general recommendations
    1. Collect metrics around file operations
    2. Control the number of processes that can use system resources (see duomark/epocxy for some patterns)
    3. Perform soak testing
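The first two recommendations can be approximated with the standard library alone. The sketch below uses `:timer.tc/1` for timing and `Task.async_stream/3` with `:max_concurrency` as a simple concurrency limit. It is a stand-in for illustration and does not reproduce the patterns in duomark/epocxy; the module `FileOps` and its functions are hypothetical names.

```elixir
defmodule FileOps do
  # Hypothetical helper, not from the talk.

  # Runs fun and returns {elapsed_microseconds, result}; the elapsed time
  # could then be reported to whatever metrics system is in use.
  def timed(fun) when is_function(fun, 0), do: :timer.tc(fun)

  # Applies fun to every item with at most max_concurrency running at once.
  # Results come back in the same order as the input.
  def bounded_map(items, max_concurrency, fun) do
    items
    |> Task.async_stream(fun, max_concurrency: max_concurrency, timeout: 60_000)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end
```

For example, `FileOps.bounded_map(paths, 4, &Perils.Examples.Sha.sha256/1)` would checksum a list of files with no more than four hashed at the same time, and `FileOps.timed(fn -> Perils.Examples.Sha.sha256(path) end)` would return the duration alongside the digest.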
  29. References
    • Erlang in Anger: a book to understand the pitfalls of the BEAM in production
    • recon: the diagnostics library used in the examples
    • telemetry packages: telemetry and related projects on Hex