PyCon JP 2020
How to Use In-Memory Streams
Hayao Suzuki
PyCon JP 2020
August 29, 2020

About This Talk
The slides are available on GitHub
› https://github.com/HayaoSuzuki/pyconjp2020
Twitter hashtag
› #pyconjp_1
PyCon JP Fellow Slack
› #jp-2020-track-1

Who am I?
Name: Hayao Suzuki
Twitter: @CardinalXaro
Work: Python Programmer at iRidge, Inc.

Who am I?
Technical Reviewer
› Effective Python, 2nd Edition (O'Reilly Japan)
› Learn by Doing: Quantum Computer Programming (O'Reilly Japan)
A full list is available at https://xaro.hatenablog.jp/.

Who am I?
Selected Talks
› Modernizing a Legacy Django Application (DjangoCongress JP 2018)
› Symbolic Computation with SymPy (PyCon JP 2018)
› Enjoying Elementary Number Theory with Python (PyCon mini Hiroshima 2019)
› Do You Know cmath? (PyCon mini Shizuoka 2020)
A full list is available at https://xaro.hatenablog.jp/.

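Today's Goal
I want to solve a problem like this:
› Fetch gigabytes of data over the internet and turn it into a CSV file
› Implement it as an addition to an existing system built in the cloud
› Run it every day
Cloud services are billed by usage, so the job should finish as quickly as possible.
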
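Today's Goal
Processing flow
› Fetch gigabytes of data over the internet
› Convert the data into a CSV file
› Compress the CSV file into a ZIP archive
› Upload the ZIP archive to cloud storage
Analysis
› The data is large
› The data transformation itself is simple
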
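Today's Goal
Where is the bottleneck?
› ZIP compression is not that expensive
› The data transformation is simple
› The bottleneck is most likely in I/O
I want to make the I/O as fast as I can.
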
Today's Theme
In-Memory Streams

Stream?
What is a stream in the first place?
A stream is a file object.

File Object?
What is a file object?
› An object that has methods such as read() and write()
› It can talk to a file on disk, to storage located elsewhere, or to an I/O device

File Object?
Kinds of file objects
› Raw binary files
› Buffered binary files
› Text files

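The three kinds listed above can be observed directly from open(). The following is a minimal sketch, not part of the original slides: the helper name describe_open_modes and the scratch file are illustrative assumptions, while the class names are the ones CPython's io module uses.

```python
import os
import tempfile


def describe_open_modes(path):
    """Return the class name of the object open() yields in three modes."""
    names = {}
    # buffering=0 is only allowed in binary mode and yields a raw file
    with open(path, "rb", buffering=0) as raw:
        names["raw"] = type(raw).__name__
    # Binary mode with default buffering yields a buffered binary file
    with open(path, "rb") as buffered:
        names["buffered"] = type(buffered).__name__
    # Text mode wraps a buffered binary file with a decoder
    with open(path, "r") as text:
        names["text"] = type(text).__name__
    return names


# Create a scratch file so the sketch is self-contained
fd, scratch_path = tempfile.mkstemp()
os.close(fd)
try:
    kinds = describe_open_modes(scratch_path)
finally:
    os.remove(scratch_path)

print(kinds)
```

The raw, buffered, and text entries come back as FileIO, BufferedReader, and TextIOWrapper respectively, which are the concrete classes behind the three categories.
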
Usage
Text file:
    f = open("myfile.txt", "r")
Buffered binary file:
    f = open("myfile.jpg", "rb")

Behind the open() Function
What does open() actually do?
It calls the operating system's system call API.

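Behind the open() Function
Example: writing data out as CSV
    import csv

    # events is assumed to be a list of dicts keyed by the field names below
    # newline="" is what the csv module recommends for files it writes to
    with open("events.csv", "w", newline="") as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)
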
Behind the open() Function
Example: Windows
› CreateFile (obtain access to the file)
› QueryAllInformationFile (get file information)
› WriteFile (write to the file)
› CloseFile (close the file)
Observed with Process Monitor.

Behind the open() Function
Example: Ubuntu on WSL
› openat (open the file)
› fstat (get file information)
› ioctl (device control)
› lseek (seek within the file)
› write (write to the file)
› close (close the file)
Observed with strace.

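Who Laughs Last?
Where does the final artifact end up?
› Saving a file locally is not the goal
› The file should end up somewhere external, such as AWS S3
I don't want to write files to the local device at all.
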
Today's Theme
In-Memory Streams

In-Memory Streams
What is an in-memory stream?
› It lets you treat a str or bytes value like a file object
› It is readable, writable, and supports random access

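The two properties above can be tried out in a few lines. This is a minimal sketch with made-up sample data, not code from the talk; it uses only the standard io module.

```python
import io

# BytesIO: jump to an arbitrary offset, read, then overwrite in place
buffer = io.BytesIO(b"hello world")
buffer.seek(6)                 # random access: move to offset 6
tail = buffer.read()           # reads from offset 6 to the end
buffer.seek(0)                 # rewind to the beginning
buffer.write(b"HELLO")         # overwrites the first five bytes
position = buffer.tell()       # current offset after the write
whole = buffer.getvalue()      # entire contents, regardless of position

# StringIO: same interface, but for str instead of bytes
text = io.StringIO()
text.write("abc")
text.seek(0)
first = text.read(1)

print(tail, position, whole, first)
```

Note that getvalue() ignores the current position, whereas read() always starts from it. That difference matters later, when a stream is handed to code that calls read().
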
StringIO
StringIO is an in-memory stream for text files.
Example: handling CSV with StringIO
    import csv
    import io

    # events is assumed to be a list of dicts keyed by the field names below
    with io.StringIO() as csv_file:
        fieldnames = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(csv_file, fieldnames)
        writer.writeheader()
        writer.writerows(events)

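One practical detail when using StringIO with a with block: the contents have to be taken out before the block ends, because leaving the block closes the stream. The sketch below illustrates this with a single hypothetical event record; the sample data is an assumption for illustration only.

```python
import csv
import io

# Hypothetical sample record standing in for real API results
events = [
    {"title": "PyCon JP 2020", "started_at": "2020-08-28", "ended_at": "2020-08-29"},
]

with io.StringIO() as csv_file:
    fieldnames = ["title", "started_at", "ended_at"]
    writer = csv.DictWriter(csv_file, fieldnames)
    writer.writeheader()
    writer.writerows(events)
    # Grab the text while the stream is still open
    csv_text = csv_file.getvalue()

# After the with block the stream is closed and its buffer is gone
closed_error = None
try:
    csv_file.getvalue()
except ValueError as exc:
    closed_error = exc

print(csv_text.splitlines())
print(type(closed_error).__name__)
```

If the CSV text is needed in a later step, either keep that step inside the with block or copy the value out with getvalue() first, as done here.
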
BytesIO
BytesIO is an in-memory stream for buffered binary files.
Example: handling a PNG with BytesIO
    import io

    # png_bytes is assumed to hold the raw bytes of a PNG image
    with io.BytesIO(png_bytes) as f:
        png_header = f.read(8)
        print(png_header)  # b'\x89PNG\r\n\x1a\n'

Review: Today's Goal
Processing flow
› Fetch gigabytes of data over the internet
› Convert the data into a CSV file
› Compress the CSV file into a ZIP archive
› Upload the ZIP archive to cloud storage

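Fetching Data over the Internet
Example: calling the Connpass API
    import json
    import urllib.request

    # url is assumed to be a Connpass API endpoint
    with urllib.request.urlopen(url) as response:
        events = json.load(response)["events"]
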
Transforming the Data
Example: turning the API results into CSV
    import csv
    import io

    with io.StringIO() as ts:
        header = ["title", "started_at", "ended_at"]
        writer = csv.DictWriter(ts, fieldnames=header)
        writer.writeheader()
        writer.writerows(events)

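Compressing and Uploading the Data
Example: compressing to ZIP and uploading to AWS S3
    import io
    import zipfile

    # ts is the StringIO from the previous step and must still be open here;
    # s3 is assumed to be a boto3 S3 client
    with io.BytesIO() as bs:
        with zipfile.ZipFile(bs, "w") as zf:
            zf.writestr("events.csv", ts.getvalue())
        bs.seek(0)  # rewinding the stream is the key point
        s3.upload_fileobj(bs, "bucket", "events.zip")
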
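The CSV and ZIP steps can be exercised end to end without network or AWS access. The following is a rough sketch under stated assumptions: build_zip_in_memory and fake_upload are hypothetical names, fake_upload merely stands in for an uploader that calls read() on the stream, and the event record is made-up sample data.

```python
import csv
import io
import zipfile


def build_zip_in_memory(events):
    """Return a rewound BytesIO holding a ZIP archive with events.csv inside."""
    header = ["title", "started_at", "ended_at"]
    with io.StringIO() as ts:
        writer = csv.DictWriter(ts, fieldnames=header)
        writer.writeheader()
        writer.writerows(events)
        csv_text = ts.getvalue()  # copy the text out before ts is closed

    bs = io.BytesIO()
    with zipfile.ZipFile(bs, "w") as zf:
        zf.writestr("events.csv", csv_text)
    # The ZIP writer leaves the position at the end of the stream;
    # rewind so that a consumer calling read() sees the whole archive
    bs.seek(0)
    return bs


def fake_upload(fileobj):
    """Hypothetical stand-in for an uploader: reads from the current position."""
    return fileobj.read()


sample_events = [
    {"title": "PyCon JP 2020", "started_at": "2020-08-28", "ended_at": "2020-08-29"},
]
archive = build_zip_in_memory(sample_events)
uploaded = fake_upload(archive)

# Verify the "uploaded" bytes by opening them as a ZIP archive again
with zipfile.ZipFile(io.BytesIO(uploaded)) as zf:
    names = zf.namelist()
    restored = zf.read("events.csv").decode("utf-8")

print(names)
print(restored.splitlines())
```

Without the seek(0) call, fake_upload would read zero bytes, because the position sits at the end of the stream once the archive has been written. Also note that zipfile.ZipFile stores entries uncompressed by default; passing compression=zipfile.ZIP_DEFLATED enables actual compression when zlib is available.
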
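Conclusion
Summary
› The io module provides in-memory streams.
› They let you treat a str or bytes value like a file object.
› Unlike a regular open(), they do not issue file-related system calls.
› They are best suited to situations where you want to reduce disk I/O or cannot do disk I/O at all.
Please add the io module to your toolbox!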