Creating an Open Data API with FastAPI

Yuuki Shimizu
September 11, 2021

Python Charity Talks in Japan 2021.09

Transcript

 1. 2021.9.11 Yuuki Shimizu
  [Python Charity Talks in Japan 2021.09]
  Creating an Open Data API with FastAPI

 2. Who am I?
  Yuuki Shimizu
  ● Mobile app engineer
  ○ Android/iOS
  ● From Kofu City, Yamanashi Prefecture
  ● My Python experience is mostly limited to Shingen.py meetups

 3. Questions are being accepted on sli.do!

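 4. Introduction
  ● The Yamanashi Prefecture open data site currently publishes more than 12,000 data items
  ● For use in a mobile app, data offered through a Web API is much easier to try out
  ● A Shingen.py study session gave me the chance to try FastAPI, so I looked into whether it could help with using open data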

 5. What is FastAPI?
  ● A Python framework built on OpenAPI
  ● Designed with ease of development in mind
  ● "Quick, fast, easy"
  ○ Quick to develop with
  ○ Reasonably good performance (fast)
  ○ Easy to build with
  from typing import Optional
  from fastapi import FastAPI

  app = FastAPI()

  @app.get("/")
  def read_root():
      return {"Hello": "World"}

  @app.get("/items/{item_id}")
  def read_item(item_id: int, q: Optional[str] = None):
      # item_id comes from the path, q from the optional query string
      return {"item_id": item_id, "q": q}

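 6. As an example,
  we build an API with FastAPI
  that returns hot spring (onsen) facilities in Yamanashi
  By the way,
  Yamanashi has a wide variety of spring types and a great many hot springs with beautiful views!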

 7. Using the prefecture's open data

 8. PDFs work too!
  I implemented it so that the PDF is loaded as-is, converted, and then returned via FastAPI
  Python is handy, isn't it?

 9. chezou/tabula-py
  ● A library that converts tables inside PDF files into pandas DataFrame objects
  ○ Can also convert them to CSV, TSV, or JSON files
  ● Not an OCR tool
  ● Requires Java 8 or later

 10. main.py - ① Reading the PDF
  import pandas as pd
  import tabula

  def check_columns(df, previous_df):
      # True when both tables have exactly the same column names
      difference1 = set(df.keys()) - set(previous_df.keys())
      difference2 = set(previous_df.keys()) - set(df.keys())
      return len(difference1) == 0 and len(difference2) == 0

  def get_data(pdf_path):
      previous_df = pd.DataFrame()
      dfs = tabula.read_pdf(pdf_path, lattice=True, pages='all')
      for df in dfs:  # join tables that span multiple pages
          if check_columns(df, previous_df):
              df = pd.concat([previous_df, df])
          previous_df = df
      return previous_df
  get_data: reads the PDF and returns a DataFrame object
  check_columns: compares the column names of tables that span multiple pages to decide whether they are the same table (called from get_data)
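
The header-comparison idea on this slide can be tried without pandas, tabula-py, or Java. The following is only a minimal sketch under stated assumptions: each page table is modelled as a list of row dictionaries, and the names `column_names`, `merge_pages`, and the sample rows are hypothetical, not taken from the project's main.py. It mirrors the slide's loop: a page whose columns match the accumulated table is appended to it, and a page with different columns replaces it.

```python
from typing import Dict, List

Row = Dict[str, str]
Table = List[Row]


def column_names(table: Table) -> set:
    """Return the set of column names used by a table (empty set for no rows)."""
    return set(table[0].keys()) if table else set()


def merge_pages(pages: List[Table]) -> Table:
    """Join consecutive page tables that share the same column names."""
    merged: Table = []
    for page in pages:
        if column_names(page) == column_names(merged):
            # Same header: treat this page as a continuation of the previous one.
            page = merged + page
        # Different header: the new table replaces what was collected so far.
        merged = page
    return merged


# Hypothetical sample rows; the real PDF's column names are not shown here.
page1 = [{"name": "Onsen A", "city": "Kofu"}]
page2 = [{"name": "Onsen B", "city": "Fuefuki"}, {"name": "Onsen C", "city": "Kofu"}]
facilities = merge_pages([page1, page2])
```

As in the slide's code, a table with a different header discards the rows gathered before it, so the result is the last run of pages that share one header.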

 11. main.py - ② Creating the API
  import json
  from fastapi import FastAPI

  app = FastAPI()
  pdf_path = "h3012011.pdf"

  @app.get("/")
  def read_root():
      data = get_data(pdf_path)
      json_data = data.to_json(orient='records')
      return json.loads(json_data)

  @app.get("/area/{area}")
  def read_item(area: str):
      data = get_data(pdf_path)
      df_mask = data['市町村名'] == area  # '市町村名' is the municipality-name column
      data = data[df_mask]
      json_data = data.to_json(orient='records')
      return json.loads(json_data)
  [get] /
  API that returns all records
  [get] /area/{area}
  API that returns only the records for the specified municipality
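
Because municipality names such as 甲府市 are not ASCII, a client calling `/area/{area}` should percent-encode the name before putting it in the path. The sketch below shows one way to do that with the standard library. The base URL is the one shown in the demo slide; that the `/area/{area}` route is served directly beneath it is an assumption, and `build_area_url` and `fetch_area` are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL from the demo slide; serving /area/{area} beneath it is an assumption.
BASE_URL = "https://opendata.yamanashi.dev/api/onsen"


def build_area_url(area: str) -> str:
    """Percent-encode the municipality name so it is safe as a path segment."""
    return f"{BASE_URL}/area/{quote(area, safe='')}"


def fetch_area(area: str) -> list:
    """Fetch the records for one municipality (requires network access)."""
    with urlopen(build_area_url(area)) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_area_url("甲府市")
```

Only `build_area_url` runs when the module is loaded; `fetch_area` is left for the caller to invoke, since it depends on the server being reachable.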

 12. Using Docker
  VPS
  Nginx (reverse proxy): opendata.yamanashi.dev/api/onsen
  Docker container: FastAPI (main.py) at localhost:xxxxx
  Data source: Yamanashi Prefecture website (CSV, PDF)
  Image: tiangolo/uvicorn-gunicorn-fastapi:python3.8-alpine3.10
  The image above is used as the base, with openjdk11 installed on top

 13. DEMO
  https://opendata.yamanashi.dev/api/onsen

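 14. Yamanashi Prefecture Open Data API Project
  ● Project site
  ○ Provides data published on the prefecture's open data site through an API
  ● GitHub
  ○ Source code is public
  ○ Pushing updates triggers automatic deployment
  ● DockerHub
  ○ Provides a Docker image that can run FastAPI and Tabula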
  https://opendata.yamanashi.dev

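 15. What this project aims for
  ● Make it easy to roll out APIs for other open data
  ○ Just copy the repository and customize main.py
  ● Make it easy to stand up an open data API server
  ○ With a Docker environment, it can be started with a one-liner
  Use of the data must comply with the Yamanashi Prefecture Open Data Site Terms of Use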

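 16. Summary
  "Creating an Open Data API with FastAPI"
  ● FastAPI is well suited to easily creating APIs from open data
  ○ Combined with Tabula, even PDF files can be served through FastAPI
  ● Launched the Yamanashi Prefecture Open Data API Project
  ○ The aim is to make creating APIs and standing up API servers easy, and so promote the use of open data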
