How to Use In-Memory Streams

HayaoSuzuki
August 29, 2020

PyCon JP 2020

Transcript

  1. How to Use In-Memory Streams (インメモリーストリーム活用術)
    Hayao Suzuki
    PyCon JP 2020
    August 29, 2020

  2. About This Talk
    The materials are on GitHub
    › https://github.com/HayaoSuzuki/pyconjp2020
    Twitter hashtag
    › #pyconjp_1
    PyCon JP Fellow Slack
    › #jp-2020-track-1

  3. Who am I ?
    Who are you?
    Name Hayao Suzuki (鈴木 駿)
    Twitter @CardinalXaro
    Work Python Programmer at iRidge, Inc.

  4. Who am I ?
    Technical Reviewer
    › Effective Python, 2nd Edition (O’Reilly Japan)
    › 動かして学ぶ量子コンピュータプログラミング, the Japanese edition of Programming Quantum Computers (O’Reilly Japan)
    A full list is at https://xaro.hatenablog.jp/

  5. Who am I ?
    Selected Talks
    › Modernizing Legacy Django Applications (DjangoCongress JP 2018)
    › Symbolic Computation with SymPy (PyCon JP 2018)
    › Enjoying Elementary Number Theory with Python (PyCon mini Hiroshima 2019)
    › Do You Know cmath? (PyCon mini Shizuoka 2020)
    A full list is at https://xaro.hatenablog.jp/

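  6. Today's Goal
    The problem we want to solve
    › Fetch several GB of data over the internet and process it into a CSV file
    › Implement it as an addition to an existing system built in the cloud
    › Run it every day
    Cloud services are billed by usage
    We want the processing to finish as quickly as possible!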

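  7. Today's Goal
    Processing flow
    › Fetch several GB of data over the internet
    › Process the several GB of data into a CSV file
    › Compress the CSV file into a ZIP archive
    › Upload the ZIP archive to cloud storage
    Analysis
    › The data is large
    › The data processing itself is simple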

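  8. Today's Goal
    Where is the bottleneck?
    › ZIP compression is not that expensive
    › The data processing is simple
    › The bottleneck is most likely in I/O
    We want to make the I/O as fast as we possibly can!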

  9. Today’s Theme
    In-Memory Streams

  10. Stream?
    So what is a stream, anyway?
    A stream is a file object.

  11. File Object?
    What is a file object?
    › An object that has methods such as read() and write()
    › It can talk to files on disk, to storage located elsewhere, and to I/O devices
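
    To make the "object with read() and write()" idea concrete, here is a minimal sketch in which one helper works on both a disk-backed file and a string held in memory. The helper name `first_line` and the sample text are assumptions made for illustration; they are not from the talk.

```python
import io
import os
import tempfile


def first_line(f):
    """Return the first line of any text file object, without the newline."""
    return f.readline().rstrip("\n")


# A file object backed by a real file on disk (created in a temporary
# directory that is removed when the block exits).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\nworld\n")
    with open(path, "r", encoding="utf-8") as f:
        from_disk = first_line(f)

# A file object backed by a str held in memory.
from_memory = first_line(io.StringIO("hello\nworld\n"))
```

    The helper never asks where the data lives; it only relies on the file-object interface.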

  12. File Object?
    Kinds of file objects
    › Raw binary files
    › Buffered binary files
    › Text files
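
    The three kinds listed above correspond to abstract base classes in the `io` module. The following sketch checks which base class each `open()` mode yields; the temporary file and variable names are assumptions for illustration.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"data")

    # buffering=0 is only allowed in binary mode and yields a raw file.
    with open(path, "rb", buffering=0) as raw:
        is_raw = isinstance(raw, io.RawIOBase)
    # Binary mode with default buffering yields a buffered binary file.
    with open(path, "rb") as buffered:
        is_buffered = isinstance(buffered, io.BufferedIOBase)
    # Text mode yields a text file.
    with open(path, "r", encoding="utf-8") as text:
        is_text = isinstance(text, io.TextIOBase)

# The in-memory streams slot into the same hierarchy.
bytesio_is_buffered = isinstance(io.BytesIO(), io.BufferedIOBase)
stringio_is_text = isinstance(io.StringIO(), io.TextIOBase)
```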

  13. Usage
    Text file
    f = open("myfile.txt", "r")
    Buffered binary file
    f = open("myfile.jpg", "rb")

  14. Behind the open Function
    What does open actually do?
    It calls the OS system call API

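  15. Behind the open Function
    Example: writing data out as CSV
    import csv
    # newline="" is what the csv module recommends for files it writes
    with open("events.csv", "w", newline="") as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)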

  16. Behind the open Function
    Example: Windows
    › CreateFile (obtain access to the file)
    › QueryAllInformationFile (get file information)
    › WriteFile (write to the file)
    › CloseFile (close the file)
    Confirmed with Process Monitor.

  17. Behind the open Function
    Example: Ubuntu on WSL
    › openat (open the file)
    › fstat (get file information)
    › ioctl (device control)
    › lseek (seek within the file)
    › write (write to the file)
    › close (close the file)
    Confirmed with strace.

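  18. Who Laughs Last?
    Where does the final artifact end up?
    › Saving the file locally is not the goal
    › We want to put the file somewhere external, such as AWS S3
    We do not want to write the file to a local device!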

  19. Today’s Theme
    In-Memory Streams

  20. In-Memory Streams
    What is an in-memory stream?
    › It lets you handle str and bytes as if they were file objects
    › It is readable and writable, and supports random access
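
    As a small illustration of "readable, writable, random access", the sketch below seeks around a `BytesIO`, reads from the middle, overwrites the start, and appends at the end. The byte values are arbitrary sample data.

```python
import io

stream = io.BytesIO(b"0123456789")

# Random access: jump to byte offset 4 and read three bytes.
stream.seek(4)
middle = stream.read(3)

# Writing overwrites in place at the current position.
stream.seek(0)
stream.write(b"AB")

# Seek to the end to append.
stream.seek(0, io.SEEK_END)
stream.write(b"!")

# getvalue() returns the whole buffer regardless of the current position.
contents = stream.getvalue()
```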

  21. StringIO
    StringIO
    An in-memory stream for text files
    Example: handling CSV with StringIO
    import csv
    import io
    with io.StringIO() as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)
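
    The slide leaves `events` undefined and does not read the result back. Here is a self-contained sketch of the same idea, assuming a single made-up record as sample data; `getvalue()` must be called before the `with` block closes the stream.

```python
import csv
import io

# Hypothetical sample data standing in for the real API result.
events = [
    {"title": "Sample event", "started_at": "2020-08-28", "ended_at": "2020-08-29"},
]

with io.StringIO() as csv_file:
    fieldnames = ["title", "started_at", "ended_at"]
    writer = csv.DictWriter(csv_file, fieldnames)
    writer.writeheader()
    writer.writerows(events)
    # Take a copy of the text before the stream is closed.
    csv_text = csv_file.getvalue()

lines = csv_text.splitlines()
```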

  22. BytesIO
    BytesIO
    An in-memory stream for buffered binary files
    Example: handling a PNG with BytesIO
    import io
    with io.BytesIO(png_bytes) as f:
        png_header = f.read(8)
        print(png_header)  # b'\x89PNG\r\n\x1a\n'

  23. Review: Today's Goal
    Processing flow
    › Fetch several GB of data over the internet
    › Process the several GB of data into a CSV file
    › Compress the CSV file into a ZIP archive
    › Upload the ZIP archive to cloud storage

  24. Fetching the Data over the Internet
    Example: calling the connpass API
    import json
    import urllib.request
    with urllib.request.urlopen(url) as response:
        events = json.load(response)["events"]
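
    `json.load` only needs an object with a `read()` method, which is why the HTTP response can be passed to it directly. The sketch below shows the same call offline, using a `BytesIO` as a hypothetical stand-in for the response; the payload is invented sample data, not real connpass output.

```python
import io
import json

# Hypothetical stand-in for the HTTP response body (bytes, as on the wire).
fake_response = io.BytesIO(b'{"events": [{"title": "Sample event"}]}')

# json.load reads from any file object, including an in-memory one.
events = json.load(fake_response)["events"]
```

    The same substitution is handy in tests, where a network call is undesirable.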

  25. Processing the Data
    Example: turning the API result into CSV
    import csv
    import io
    with io.StringIO() as ts:
        header = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(ts, fieldnames=header)
        writer.writeheader()
        writer.writerows(events)

  26. Compressing & Uploading the Data
    Example: compress to ZIP and upload to AWS S3
    with io.BytesIO() as bs:
        with zipfile.ZipFile(bs, "w") as zf:
            zf.writestr("events.csv", ts.getvalue())
        bs.seek(0)  # rewinding the stream is the key point
        s3.upload_fileobj(bs, "bucket", "events.zip")
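
    The CSV and ZIP steps can be exercised end to end without touching disk or the network. This is a sketch under stated assumptions: the function name `build_zip` and the sample record are hypothetical, and the S3 upload is replaced by reading the archive back with `zipfile`, which consumes the stream from its current position just as an uploader would.

```python
import csv
import io
import zipfile


def build_zip(events):
    """Return a rewound BytesIO holding a ZIP archive with events.csv inside."""
    with io.StringIO() as ts:
        header = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(ts, fieldnames=header)
        writer.writeheader()
        writer.writerows(events)
        csv_text = ts.getvalue()

    bs = io.BytesIO()
    # Leaving the ZipFile block writes the central directory into bs.
    with zipfile.ZipFile(bs, "w") as zf:
        zf.writestr("events.csv", csv_text)
    bs.seek(0)  # rewind so the next reader starts at the first byte
    return bs


# Hypothetical sample data standing in for the real API result.
sample_events = [
    {"title": "Sample event", "started_at": "2020-08-28", "ended_at": "2020-08-29"},
]
archive = build_zip(sample_events)

# Stand-in for the upload step: read the archive back from memory.
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
    restored = zf.read("events.csv").decode("utf-8")
```

    Without the `seek(0)`, a consumer that reads from the current position would start at the end of the buffer and see no data.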

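  27. Conclusion
    Summary
    › The io module includes in-memory streams.
    › They let you handle str and bytes as if they were file objects.
    › Unlike an ordinary open, no system calls are made.
    › They are best suited to situations where you want to reduce disk I/O, or where disk I/O is not possible.
    Please add the io module to your toolbox!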