
Creating an Open Data API with FastAPI

Yuuki Shimizu
September 11, 2021

2021.9.11
Python Charity Talks in Japan 2021.09

Transcript

  1. 2021.9.11, Yuuki Shimizu

    [Python Charity Talks in Japan 2021.09] Creating an Open Data API with FastAPI
  2. What is FastAPI?

    • A Python framework built on OpenAPI
    • Designed with ease of development in mind
    • "Quick, fast, simple"
      ◦ Quick to develop
      ◦ Reasonably good performance (fast)
      ◦ Simple to build

        from typing import Optional

        from fastapi import FastAPI

        app = FastAPI()

        @app.get("/")
        def read_root():
            return {"Hello": "World"}

        @app.get("/items/{item_id}")
        def read_item(item_id: int, q: Optional[str] = None):
            return {"item_id": item_id, "q": q}
  3. chezou/tabula-py

    • A library that converts tables inside PDF files into pandas DataFrame objects
      ◦ Can also convert them to CSV, TSV, or JSON files
    • Not an OCR tool
    • Requires Java 8 or later
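As a rough illustration of the file conversion mentioned above, the following sketch wraps tabula-py's `tabula.convert_into`. The wrapper name `export_tables`, the `SUPPORTED_FORMATS` set, and the output file name in the usage comment are assumptions made for this example and do not appear in the slides.

```python
# Formats the slide lists for file conversion.
SUPPORTED_FORMATS = {"csv", "tsv", "json"}


def export_tables(pdf_path, output_path, output_format="csv"):
    """Write the tables found in a PDF to a CSV, TSV, or JSON file.

    Hypothetical helper: a thin wrapper around tabula.convert_into.
    """
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {output_format!r}")

    # Imported lazily so the format check above still works on machines
    # where tabula-py (and the Java runtime it needs) is not installed.
    import tabula

    tabula.convert_into(
        pdf_path, output_path, output_format=output_format, pages="all"
    )


# Example call (needs tabula-py, Java 8 or later, and the PDF on disk):
# export_tables("h3012011.pdf", "onsen.csv")
```

Because tabula-py is not an OCR tool, a conversion like this should only be expected to work on PDFs whose tables are stored as text rather than as scanned images.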
  4. main.py - ① Reading the PDF

        # Imports assumed; they are not shown on the slide.
        import pandas as pd
        import tabula

        # Compare the column names of tables that span multiple pages to
        # decide whether they are the same table (called from get_data).
        def check_columns(df, previous_df):
            difference1 = set(df.keys()) - set(previous_df.keys())
            difference2 = set(previous_df.keys()) - set(df.keys())
            return len(difference1) == 0 and len(difference2) == 0

        # Read the PDF and return a DataFrame object.
        def get_data(pdf_path):
            previous_df = pd.DataFrame()
            dfs = tabula.read_pdf(pdf_path, lattice=True, pages='all')
            for df in dfs:
                # Concatenate tables that continue across pages
                if check_columns(df, previous_df):
                    df = pd.concat([previous_df, df])
                previous_df = df
            return previous_df
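To see how the page-merging loop behaves without a PDF or Java at hand, here is a minimal sketch that runs the same logic on in-memory DataFrames. The names `same_columns` and `merge_pages` and the sample data are assumptions made for this illustration; they stand in for `check_columns`, the loop in `get_data`, and the list that `tabula.read_pdf` would return.

```python
import pandas as pd


def same_columns(df, previous_df):
    """Return True when both frames have exactly the same column names."""
    return set(df.keys()) == set(previous_df.keys())


def merge_pages(dfs):
    """Fold per-page tables into one frame, mirroring the loop in get_data."""
    previous_df = pd.DataFrame()
    for df in dfs:
        if same_columns(df, previous_df):
            # Same header as the previous page: treat it as a continuation.
            df = pd.concat([previous_df, df])
        # Otherwise the new table replaces whatever was collected so far.
        previous_df = df
    return previous_df


# Hypothetical stand-ins for the tables extracted from two PDF pages.
page1 = pd.DataFrame({"name": ["A", "B"], "count": [1, 2]})
page2 = pd.DataFrame({"name": ["C"], "count": [3]})

merged = merge_pages([page1, page2])
print(merged["name"].tolist())  # ['A', 'B', 'C']
```

Two properties of this approach are worth noting: `pd.concat` keeps each page's original row index unless `ignore_index=True` is passed, and a page whose columns differ from the previous one discards the rows gathered so far, so the loop assumes the PDF holds a single logical table.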
  5. main.py - ② Creating the API

    • [GET] /: API that returns all records
    • [GET] /area/{area}: API that returns only the records for the specified municipality

        # Imports assumed; they are not shown on the slide.
        import json

        from fastapi import FastAPI

        app = FastAPI()
        pdf_path = "h3012011.pdf"

        @app.get("/")
        def read_root():
            data = get_data(pdf_path)
            json_data = data.to_json(orient='records')
            return json.loads(json_data)

        @app.get("/area/{area}")
        def read_item(area: str):
            data = get_data(pdf_path)
            # '市町村名' is the municipality-name column in the source table
            df_mask = data['市町村名'] == area
            data = data[df_mask]
            json_data = data.to_json(orient='records')
            return json.loads(json_data)
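The filter-and-serialize step used by `/area/{area}` can be tried on its own, without FastAPI or a PDF. The sketch below assumes a hypothetical helper `records_for_area` and made-up sample rows; only the `'市町村名'` column name and the `to_json` / `json.loads` round trip come from the slide.

```python
import json

import pandas as pd


def records_for_area(data, area):
    """Keep the rows for one municipality and return plain Python records."""
    mask = data["市町村名"] == area
    # to_json writes missing values as null; json.loads turns them into None,
    # so the result contains only JSON-friendly Python objects.
    json_data = data[mask].to_json(orient="records")
    return json.loads(json_data)


# Hypothetical sample rows; the real data comes from the parsed PDF.
sample = pd.DataFrame(
    {
        "市町村名": ["甲府市", "北杜市", "甲府市"],
        "value": [1, 2, None],
    }
)

rows = records_for_area(sample, "甲府市")
print(len(rows))  # 2
```

Going through `to_json` rather than returning the DataFrame directly means missing cells arrive as `None` instead of `NaN`, which keeps the response valid JSON.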
  6. Using Docker

    • VPS: Nginx (reverse proxy) serving opendata.yamanashi.dev/api/onsen
    • Docker container: FastAPI (localhost:xxxxx) running main.py
    • Data source: Yamanashi Prefecture website (CSV, PDF)
    • Image: based on tiangolo/uvicorn-gunicorn-fastapi:python3.8-alpine3.10, with openjdk11 installed on top
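From a client's point of view, a municipality name in the URL path has to be percent-encoded. The sketch below builds such a URL with the standard library. It assumes that the FastAPI routes are exposed directly under `/api/onsen` on the public host, which the slides suggest but do not state; `area_url` and `BASE_URL` are names made up for this example.

```python
from urllib.parse import quote

# Assumed public prefix for the FastAPI app behind the reverse proxy.
BASE_URL = "https://opendata.yamanashi.dev/api/onsen"


def area_url(area, base_url=BASE_URL):
    """Build the URL for the /area/{area} endpoint of one municipality."""
    # safe="" also encodes "/" so the name always stays one path segment.
    return f"{base_url}/area/{quote(area, safe='')}"


print(area_url("甲府市"))
# https://opendata.yamanashi.dev/api/onsen/area/%E7%94%B2%E5%BA%9C%E5%B8%82
```

Most HTTP clients perform this encoding automatically, so the helper mainly matters when URLs are assembled by hand or written into documentation.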
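  7. Yamanashi Prefecture Open Data API Project

    • Project site
      ◦ Provides, through an API, the data published on the prefecture's open data site
    • GitHub
      ◦ Source code is public
      ◦ Pushing an update triggers automatic deployment
    • DockerHub
      ◦ Provides a Docker image that can run FastAPI and Tabula

    https://opendata.yamanashi.dev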
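  8. What this project aims for

    • Make it easy to roll out APIs for other open data
      ◦ Clone the repository and customize main.py, and you are done
    • Make it easy to stand up an open data API server
      ◦ With a Docker environment, it can be started with a one-liner

    Use of the data must comply with the Yamanashi Prefecture Open Data Site terms of use.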
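  9. Summary: "Creating an Open Data API with FastAPI"

    • FastAPI is well suited to quickly building an API from open data
      ◦ Combined with Tabula, even PDF files can be served through FastAPI
    • Launched the Yamanashi Prefecture Open Data API Project
      ◦ The goal is to make creating APIs and standing up API servers easy, and so encourage wider use of open data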