Slide 1

Slide 1 text

インメモリーストリーム活用術
How to Use In-Memory Streams
Hayao Suzuki
PyCon JP 2020
August 29, 2020

Slide 2

Slide 2 text

About this talk
The materials are on GitHub
› https://github.com/HayaoSuzuki/pyconjp2020
Twitter hashtag
› #pyconjp_1
PyCon JP Fellow Slack
› #jp-2020-track-1

Slide 3

Slide 3 text

Who am I?
Name: Hayao Suzuki (鈴木 駿)
Twitter: @CardinalXaro
Work: Python Programmer at iRidge, Inc.

Slide 4

Slide 4 text

Who am I?
Technical Reviewer
› Effective Python 第 2 版 (Effective Python, 2nd edition; O’Reilly Japan)
› 動かして学ぶ量子コンピュータプログラミング (O’Reilly Japan)
A full list is available at https://xaro.hatenablog.jp/.

Slide 5

Slide 5 text

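Who am I?
Selected Talks
› レガシー Django アプリケーションの現代化 (Modernizing a Legacy Django Application; DjangoCongress JP 2018)
› SymPy による数式処理 (Symbolic Computation with SymPy; PyCon JP 2018)
› Python と楽しむ初等整数論 (Enjoying Elementary Number Theory with Python; PyCon mini Hiroshima 2019)
› 君は cmath を知っているか (Do You Know cmath?; PyCon mini Shizuoka 2020)
A full list is available at https://xaro.hatenablog.jp/.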

Slide 6

Slide 6 text

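Today's goal
The problem to solve
› Fetch several GB of data over the internet and process it into a CSV file
› Implement it as an addition to an existing system built in the cloud
› Run it every day
Cloud services are billed by usage, so the job should finish as quickly as possible.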

Slide 7

Slide 7 text

Today's goal
Processing flow
› Fetch several GB of data over the internet
› Process the data into a CSV file
› Compress the CSV file as ZIP
› Upload the ZIP data to cloud storage
Analysis
› The data is large
› The data processing itself is simple

Slide 8

Slide 8 text

Today's goal
Where is the bottleneck?
› ZIP compression is not that expensive
› The data processing itself is simple
› The bottleneck is probably in I/O
We want the I/O to be as fast as possible.

Slide 9

Slide 9 text

Today’s Theme
In-Memory Streams

Slide 10

Slide 10 text

Stream?
What is a stream in the first place?
A stream is a file object.

Slide 11

Slide 11 text

File Object?
What is a file object?
› An object that has methods such as read() and write()
› It can exchange data with files on disk, storage located elsewhere, and I/O devices

Slide 12

Slide 12 text

File Object?
Kinds of file objects
› Raw binary files
› Buffered binary files
› Text files
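
The three kinds listed above can be observed by checking what `open` returns in different modes. This is a minimal sketch, not part of the talk: it creates a throwaway temporary file (an assumption made only so that there is something to open) and records the class name of each file object.

```python
import os
import tempfile

# Create a throwaway file so that open() has something to read.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name

kinds = {}
try:
    # buffering=0 is only allowed in binary mode and yields a raw binary file.
    with open(path, "rb", buffering=0) as raw:
        kinds["raw"] = type(raw).__name__
    # Binary mode with default buffering yields a buffered binary file.
    with open(path, "rb") as buffered:
        kinds["buffered"] = type(buffered).__name__
    # Text mode wraps a buffered binary file and decodes bytes to str.
    with open(path, "r") as text:
        kinds["text"] = type(text).__name__
finally:
    os.remove(path)

print(kinds)
```

In CPython the three class names are `FileIO`, `BufferedReader`, and `TextIOWrapper`, all defined in the `io` module.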

Slide 13

Slide 13 text

Usage
Text file
    f = open("myfile.txt", "r")
Buffered binary file
    f = open("myfile.jpg", "rb")

Slide 14

Slide 14 text

Behind the open function
What does open actually do?
It calls the OS system call API.

Slide 15

Slide 15 text

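Behind the open function
Example: writing data out as CSV
    import csv

    # newline="" is what the csv module recommends for files passed to a writer
    with open("events.csv", "w", newline="") as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)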

Slide 16

Slide 16 text

Behind the open function
Example: Windows
› CreateFile (obtain access to the file)
› QueryAllInformationFile (get file information)
› WriteFile (write to the file)
› CloseFile (close the file)
Checked with Process Monitor.

Slide 17

Slide 17 text

Behind the open function
Example: Ubuntu on WSL
› openat (open the file)
› fstat (get file information)
› ioctl (device control)
› lseek (seek within the file)
› write (write to the file)
› close (close the file)
Checked with strace.

Slide 18

Slide 18 text

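Who gets the last laugh?
Where does the final artifact go?
› Saving the file locally is not the goal
› The file should end up in external storage such as AWS S3
We do not want to write the file to a local device!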

Slide 19

Slide 19 text

Today’s Theme
In-Memory Streams

Slide 20

Slide 20 text

In-Memory Streams
What is an in-memory stream?
› It lets you treat str and bytes like file objects
› It is readable and writable, and supports random access
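
The two properties above (read/write and random access) can be sketched with the standard `io` classes. The contents written here are arbitrary sample values chosen for illustration.

```python
import io

# Text stream backed by a str held in memory.
text_stream = io.StringIO()
text_stream.write("hello, stream")
text_stream.seek(0)              # rewind to the beginning
first_word = text_stream.read(5)

# Binary stream backed by bytes held in memory.
byte_stream = io.BytesIO(b"0123456789")
byte_stream.seek(4)              # random access: jump to offset 4
middle = byte_stream.read(3)
byte_stream.seek(0)
byte_stream.write(b"AB")         # overwrite the first two bytes in place
whole = byte_stream.getvalue()   # full contents, regardless of position

print(first_word, middle, whole)
```

`getvalue()` returns the entire buffer no matter where the stream position is, whereas `read()` returns only what follows the current position.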

Slide 21

Slide 21 text

StringIO
StringIO: an in-memory stream for text files
Example: handling CSV with StringIO
    import csv
    import io

    with io.StringIO() as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)
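
A runnable variant of the StringIO example, as a sketch: the `events` list below is made-up sample data (the talk takes it from an API response), and the CSV text is pulled out with `getvalue()` before the `with` block closes the stream, because a closed StringIO discards its buffer.

```python
import csv
import io

# Hypothetical sample records standing in for the talk's `events`.
events = [
    {"title": "Example Meetup", "started_at": "2020-08-29T10:00", "ended_at": "2020-08-29T12:00"},
]

with io.StringIO(newline="") as csv_file:
    fieldnames = ["title", "started_at", "ended_at"]
    writer = csv.DictWriter(csv_file, fieldnames)
    writer.writeheader()
    writer.writerows(events)
    # Read the text out while the stream is still open.
    csv_text = csv_file.getvalue()

print(csv_text)
```

The csv module terminates rows with `\r\n` by default, so `csv_text` contains those line endings even though nothing was written to disk.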

Slide 22

Slide 22 text

BytesIO
BytesIO: an in-memory stream for buffered binary files
Example: handling a PNG with BytesIO
    import io

    # png_bytes holds the contents of a PNG file as bytes
    with io.BytesIO(png_bytes) as f:
        png_header = f.read(8)
        print(png_header)  # b'\x89PNG\r\n\x1a\n'
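
To make the BytesIO example self-contained, the sketch below builds a stand-in for `png_bytes` by hand: the 8-byte PNG signature followed by the length, type, and body of an IHDR chunk for a hypothetical 640x480 image. It is not a complete, decodable PNG (the CRC and the remaining chunks are omitted); it only shows sequential reads from an in-memory binary stream.

```python
import io
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Hand-built IHDR body: width, height, bit depth, color type,
# compression method, filter method, interlace method.
ihdr_body = struct.pack(">IIBBBBB", 640, 480, 8, 2, 0, 0, 0)
png_bytes = PNG_SIGNATURE + struct.pack(">I", len(ihdr_body)) + b"IHDR" + ihdr_body

with io.BytesIO(png_bytes) as f:
    signature = f.read(8)                                    # file signature
    length, chunk_type = struct.unpack(">I4s", f.read(8))    # chunk header
    width, height = struct.unpack(">II", f.read(8))          # first IHDR fields

print(signature, length, chunk_type, width, height)
```

Each `read` continues from where the previous one stopped, exactly as it would on a file opened with `"rb"`.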

Slide 23

Slide 23 text

Review: today's goal
Processing flow
› Fetch several GB of data over the internet
› Process the data into a CSV file
› Compress the CSV file as ZIP
› Upload the ZIP data to cloud storage

Slide 24

Slide 24 text

Fetching data over the internet
Example: calling the connpass API
    import json
    import urllib.request

    with urllib.request.urlopen(url) as response:
        events = json.load(response)["events"]
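
`json.load` only needs an object with a `read()` method, which is why the HTTP response can be passed to it directly. The sketch below substitutes a `BytesIO` for the response so it runs without network access; the JSON payload is made up for illustration and is not real API output.

```python
import io
import json

# Stand-in for the HTTP response body; this payload is invented sample data.
fake_response = io.BytesIO(
    b'{"events": [{"title": "Example Meetup", "started_at": "2020-08-29T10:00:00+09:00"}]}'
)

# Same call as in the slide, with the in-memory stream in place of the response.
events = json.load(fake_response)["events"]

print(events[0]["title"])
```

The same substitution is handy in tests: code written against a file object does not care whether the bytes come from a socket, a disk, or memory.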

Slide 25

Slide 25 text

Processing the data
Example: turning the API result into CSV
    with io.StringIO() as ts:
        header = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(ts, fieldnames=header)
        writer.writeheader()
        writer.writerows(events)

Slide 26

Slide 26 text

Compressing and uploading the data
Example: compress to ZIP and upload to AWS S3
    with io.BytesIO() as bs:
        with zipfile.ZipFile(bs, "w") as zf:
            # ts is the StringIO holding the CSV; it must still be open here
            zf.writestr("events.csv", ts.getvalue())
        bs.seek(0)  # rewinding the stream is the key point
        s3.upload_fileobj(bs, "bucket", "events.zip")
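
The CSV and ZIP steps can be combined into one runnable sketch that never touches the disk. The helper name `build_zip` and the sample `events` are hypothetical, and the upload itself is left out: here the archive is read back with `zipfile` to check its contents, at the point where the talk hands the stream to `s3.upload_fileobj`.

```python
import csv
import io
import zipfile

# Hypothetical sample records standing in for the API result.
events = [
    {"title": "Example Meetup", "started_at": "2020-08-29T10:00", "ended_at": "2020-08-29T12:00"},
]


def build_zip(records):
    """Return a rewound BytesIO containing events.csv inside a ZIP archive."""
    text_stream = io.StringIO(newline="")
    writer = csv.DictWriter(text_stream, fieldnames=["title", "started_at", "ended_at"])
    writer.writeheader()
    writer.writerows(records)

    byte_stream = io.BytesIO()
    with zipfile.ZipFile(byte_stream, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # writestr accepts str and encodes it as UTF-8.
        zf.writestr("events.csv", text_stream.getvalue())
    byte_stream.seek(0)  # rewind so the consumer reads from the start
    return byte_stream


archive = build_zip(events)

# Verification step in place of the upload: read the archive back from memory.
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
    csv_text = zf.read("events.csv").decode("utf-8")

print(names)
print(csv_text)
```

Without the `seek(0)`, a consumer that reads from the current position would start at the end of the buffer and see no data.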

Slide 27

Slide 27 text

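Conclusion
Summary
› The io module includes in-memory streams.
› They let you treat str and bytes like file objects.
› Unlike a regular open, no file I/O system calls are made.
› They are best suited to situations where you want to reduce disk I/O, or where disk I/O is not possible.
Add the io module to your toolbox!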