Creating an Open Data API with FastAPI

Yuuki Shimizu
September 11, 2021

2021.9.11
Python Charity Talks in Japan 2021.09

Transcript

  1. 2021.9.11, Yuuki Shimizu
    [Python Charity Talks in Japan 2021.09] Creating an Open Data API with FastAPI
  2. Who are you? Yuuki Shimizu
    • Mobile app engineer
      ◦ Android/iOS
    • From Kofu City, Yamanashi Prefecture
    • Python experience is mostly limited to what I touch at Shingen.py
  3. Questions are being accepted on sli.do!

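  4. Introduction
    • The Yamanashi Prefecture open data site currently publishes more than 12,000 data items
    • For use in a mobile app, data offered through a Web API is welcome because it can be tried out easily
    • A Shingen.py study session gave me the chance to try FastAPI, so I looked into whether it could be put to use for open data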
  5. What is FastAPI?
    • A Python framework built on OpenAPI
    • Designed with ease of development in mind
    • "Quick, fast, easy"
      ◦ Quick to develop with
      ◦ Reasonably good performance (fast)
      ◦ Easy to build with

        from typing import Optional

        from fastapi import FastAPI

        app = FastAPI()


        @app.get("/")
        def read_root():
            return {"Hello": "World"}


        @app.get("/items/{item_id}")
        def read_item(item_id: int, q: Optional[str] = None):
            return {"item_id": item_id, "q": q}
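  6. As an example, we use FastAPI to build an API that returns hot spring (onsen) facilities in Yamanashi.
    By the way, Yamanashi has a wide variety of spring types and a great many hot springs with beautiful views!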

  7. Using the prefecture's open data

  8. PDF!!

  9. PDF is fine too! I implemented it so that the PDF is loaded as-is, converted, and then returned by FastAPI. Python is handy, isn't it?

  10. chezou/tabula-py
    • A library that converts tables inside PDF files into pandas DataFrame objects
      ◦ Can also convert them to CSV, TSV, and JSON files
    • Not an OCR tool
    • Requires Java 8 or later
  11. main.py - (1) Reading the PDF
    • get_data reads the PDF and returns a DataFrame object
    • check_columns compares the column names of tables that span multiple pages and decides whether they are the same table (it is called from get_data)

        import pandas as pd
        import tabula


        def check_columns(df, previous_df):
            # Column names present in one table but not in the other
            difference1 = set(df.keys()) - set(previous_df.keys())
            difference2 = set(previous_df.keys()) - set(df.keys())
            return len(difference1) == 0 and len(difference2) == 0


        def get_data(pdf_path):
            previous_df = pd.DataFrame()
            dfs = tabula.read_pdf(pdf_path, lattice=True, pages="all")
            for df in dfs:
                # Join tables that span multiple pages
                if check_columns(df, previous_df):
                    df = pd.concat([previous_df, df])
                previous_df = df
            return previous_df
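
The joining rule in get_data can be tried without a PDF. The following is a minimal sketch, assuming each page table is represented as a list of dict rows (a stand-in for the DataFrames that tabula-py returns). `same_columns`, `stitch_pages`, and the sample rows are hypothetical and not part of main.py.

```python
def same_columns(rows, previous_rows):
    """Return True when both tables expose exactly the same column names."""
    def columns(table):
        return set(table[0].keys()) if table else set()
    return columns(rows) == columns(previous_rows)


def stitch_pages(pages):
    """Mirror the loop in get_data: append a page to the running table when
    its columns match, otherwise start over from that page."""
    previous = []
    for rows in pages:
        if same_columns(rows, previous):
            rows = previous + rows
        previous = rows
    return previous


# Hypothetical sample data: two pages of one table with identical columns.
page1 = [{"name": "Onsen A", "city": "Kofu"}]
page2 = [{"name": "Onsen B", "city": "Fuefuki"}]
merged = stitch_pages([page1, page2])
print(len(merged))
```

As in the original loop, a page whose columns differ replaces whatever was accumulated so far, so the result is the last run of pages with matching columns.
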
  12. main.py - (2) Creating the API
    • [get] / : API that returns all records
    • [get] /area/{area} : API that returns only the records for the specified municipality

        import json

        from fastapi import FastAPI

        app = FastAPI()
        pdf_path = "h3012011.pdf"


        @app.get("/")
        def read_root():
            data = get_data(pdf_path)
            json_data = data.to_json(orient="records")
            return json.loads(json_data)


        @app.get("/area/{area}")
        def read_item(area: str):
            data = get_data(pdf_path)
            # "市町村名" is the municipality-name column in the source table
            df_mask = data["市町村名"] == area
            data = data[df_mask]
            json_data = data.to_json(orient="records")
            return json.loads(json_data)
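
Both handlers in main.py call get_data on every request, so the PDF is parsed again each time. One possible refinement, which the talk does not show, is to cache the parsed result. The following is a minimal sketch using `functools.lru_cache`. `load_records` is a hypothetical stand-in for get_data so the example runs without tabula-py or a PDF, and the sample rows are invented.

```python
from functools import lru_cache

parse_count = 0  # counts how often the (stand-in) parsing step runs


@lru_cache(maxsize=None)
def load_records(pdf_path):
    """Hypothetical stand-in for get_data(pdf_path): pretend to parse the PDF."""
    global parse_count
    parse_count += 1
    # Invented sample rows; a real implementation would return parsed PDF data.
    return (
        {"市町村名": "甲府市", "name": "Onsen A"},
        {"市町村名": "笛吹市", "name": "Onsen B"},
    )


def records_for_area(pdf_path, area):
    """Filter the cached rows by municipality, as /area/{area} does."""
    return [row for row in load_records(pdf_path) if row["市町村名"] == area]


first = records_for_area("h3012011.pdf", "甲府市")
second = records_for_area("h3012011.pdf", "笛吹市")
print(len(first), len(second), parse_count)
```

The trade-off is that cached data goes stale if the PDF file changes while the server is running, so a real deployment would need some way to clear the cache, for example `load_records.cache_clear()`.
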
  13. Using Docker
    • VPS: Nginx (reverse proxy), opendata.yamanashi.dev/api/onsen
    • Docker container: FastAPI (main.py), localhost:xxxxx
    • Data source: Yamanashi Prefecture website (CSV, PDF)
    • Image: tiangolo/uvicorn-gunicorn-fastapi:python3.8-alpine3.10, used as the base with openjdk11 installed on top
  14. DEMO: https://opendata.yamanashi.dev/api/onsen
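
When a client calls the API for a single municipality, the Japanese name has to be percent-encoded before it goes into the URL path. The following is a minimal sketch using only the standard library. It assumes the deployed API exposes the /area/{area} route from main.py under /api/onsen, which the slides do not confirm, and `build_area_url` is a hypothetical helper.

```python
from urllib.parse import quote, unquote

BASE_URL = "https://opendata.yamanashi.dev/api/onsen"  # URL from the demo slide


def build_area_url(area, base_url=BASE_URL):
    """Build the URL for one municipality (assumed route: <base>/area/<area>)."""
    # safe="" also encodes "/" so the name stays a single path segment.
    return f"{base_url}/area/{quote(area, safe='')}"


url = build_area_url("甲府市")
print(url)
# Decoding the last path segment gives back the original name.
print(unquote(url.rsplit("/", 1)[1]))
```

The resulting URL could then be fetched with `urllib.request.urlopen` and the body parsed with `json.load`.
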

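  15. Yamanashi Prefecture Open Data API Project
    • Project site: https://opendata.yamanashi.dev
      ◦ Provides, through an API, data published on the prefecture's open data site
    • GitHub
      ◦ The source code is public
      ◦ Updating the repository triggers automatic deployment
    • DockerHub
      ◦ Provides a Docker image that can run FastAPI and Tabula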
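  16. What this project aims for
    • Make it easy to roll out APIs for other open data
      ◦ Clone the repository and customize main.py, and that is all
    • Make it easy to launch an open data API server
      ◦ With a Docker environment, it can be started with a one-liner
    Use of the data must follow the terms of use of the Yamanashi Prefecture open data site.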
  17. In closing

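  18. Summary: "Creating an Open Data API with FastAPI"
    • FastAPI is well suited to building an API from open data with little effort
      ◦ Combined with Tabula, PDF files can also be served through FastAPI
    • Launched the Yamanashi Prefecture Open Data API Project
      ◦ The aim is to make creating APIs and launching API servers easy, and so to encourage wider use of open data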