Weather Data Scraping
Keiichiro
October 27, 2017
Slides presented at the "Python Scraping Study Group (Collecting and Using Data via APIs)" held on October 27, 2017.
Transcript
Playing Around with Scraping Weather Data in Japan
Kei YOSHIMURA
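Self-introduction
• Keiichiro Yoshimura (Keiichi, @9SQ)
• Builds disaster-prevention systems at a company called Gehirn
  → uses Java, Python, JavaScript, Node.js, and others
• Nominally specialized in electrical and electronic engineering; in practice a generalist
• Runs a monthly electronics and IoT study group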
Weather data as a scraping target
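Various information sources
• Japan Meteorological Agency (JMA) related
  ├ JMA website
  ├ JMA disaster-prevention information XML format telegrams
  ├ L-Alert (Public Information Commons)
  ├ Japan Meteorological Business Support Center
  etc.
• Yahoo! API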
Fetching precipitation (AMeDAS) data
Flow
• Fetch from the JMA "Latest Weather Data" CSV download
• Convert to GeoJSON
• Map it (QGIS, heat map)
Fetching precipitation
• About the JMA "Latest Weather Data" CSV download
• http://www.data.jma.go.jp/obd/stats/data/mdrr/docs/…_readme.html
All precipitation elements (latest)
http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/preall00_rct.csv
All precipitation elements (specific time)
http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/preall00_<timestamp>.csv
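As a rough sketch, the two download URL forms (latest and time-specified) could be assembled with a small helper. The function name and the sample timestamp are assumptions for illustration; only the base path, the `preall00_` prefix, and the `rct` suffix come from the script shown in this deck.

```python
# Base path taken from the script in this deck.
BASE_URL = "http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/"


def precipitation_csv_url(timestamp=None):
    """Return the CSV URL for the latest data, or for a given timestamp string."""
    # "rct" selects the most recent file; anything else is used verbatim.
    suffix = timestamp if timestamp else "rct"
    return f"{BASE_URL}preall00_{suffix}.csv"


latest_url = precipitation_csv_url()
# Hypothetical timestamp value: the exact format the server expects is an assumption.
dated_url = precipitation_csv_url("201710271930")
```

Keeping the URL logic in one place makes it easier to switch between the latest file and an archived one from the command line.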
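Station number,Prefecture,Station,International station number,Current time (year),Current time (month),Current time (day),Current time (hour),Current time (minute),1-hour precipitation record update,1-hour precipitation record update (under 10 years),3-hour precipitation record update,3-hour precipitation record update (under 10 years),24-hour precipitation record update,24-hour precipitation record update (under 10 years),48-hour precipitation record update,48-hour precipitation record update (under 10 years),72-hour precipitation record update,72-hour precipitation record update (under 10 years),1-hour precipitation current value (mm),1-hour precipitation current value quality information,1-hour precipitation today's maximum (mm),1-hour precipitation today's maximum quality information,3-hour precipitation current value (mm),3-hour precipitation current value quality information,3-hour precipitation today's maximum (mm),3-hour precipitation today's maximum quality information,24-hour precipitation current value (mm),24-hour precipitation current value quality information,24-hour precipitation today's maximum (mm),24-hour precipitation today's maximum quality information,48-hour precipitation current value (mm),48-hour precipitation current value quality information,48-hour precipitation today's maximum (mm),48-hour precipitation today's maximum quality information,72-hour precipitation current value (mm),72-hour precipitation current value quality information,72-hour precipitation today's maximum (mm),72-hour precipitation today's maximum quality information
11001,Hokkaido Soya region,Soya-misaki,,2017,10,27,19,30,,,,,,,,,,,0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4
11016,Hokkaido Soya region,Wakkanai,47401,2017,10,27,19,30,,,,,,,,,,,0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4
11046,Hokkaido Soya region,Rebun,,2017,10,27,19,30,,,,,,,,,,,0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4
11061,Hokkaido Soya region,Koetoi,,2017,10,27,19,30,,,,,,,,,,,,1,,1,,1,,1,,1,,1,,1,,1,,1,,1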
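The script later in this deck reads the "current value (mm)" fields by position (columns 19, 23, 27, 31 and 35) after decoding the file as Shift_JIS. The following is a minimal sketch of that step on a synthetic payload. The helper name, the placeholder header, and the station name for 11061 (inferred from its kana reading) are assumptions; the column positions and sample values follow the rows shown above.

```python
import csv

# Positions of the "current value (mm)" fields, as used by the script in this deck.
CURRENT_VALUE_COLUMNS = {"1h": 19, "3h": 23, "24h": 27, "48h": 31, "72h": 35}


def parse_precipitation(raw_bytes):
    """Decode a Shift_JIS CSV payload into {station_code: {period: mm or None}}."""
    lines = raw_bytes.decode("shift_jis").splitlines()
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    result = {}
    for row in reader:
        result[row[0]] = {
            # An empty field means no value was reported for that period.
            period: float(row[i]) if row[i] else None
            for period, i in CURRENT_VALUE_COLUMNS.items()
        }
    return result


# Synthetic payload shaped like the sample rows (39 columns per row).
header = ",".join(f"col{i}" for i in range(39))  # placeholder header names
wakkanai = ["11016", "北海道 宗谷地方", "稚内", "47401", "2017", "10", "27", "19", "30"]
wakkanai += [""] * 10 + ["0.0", "8", "0.0", "5"] + ["0.0", "8", "0.0", "4"] * 4
koetoi = ["11061", "北海道 宗谷地方", "声問", "", "2017", "10", "27", "19", "30"]
koetoi += [""] * 10 + ["", "1"] * 10  # no values reported, only quality flags
payload = "\n".join([header, ",".join(wakkanai), ",".join(koetoi)]).encode("shift_jis")

parsed = parse_precipitation(payload)
```

Mapping empty fields to `None` rather than `0.0` keeps "no observation" distinct from "no rain", which matters once the values are drawn as a heat map.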
• Not enough information to overlay on a map: the CSV has station codes and station names, but no coordinates.
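Preparing the station information
• Overview of the Automated Meteorological Data Acquisition System (AMeDAS)
• http://www.jma.go.jp/jma/kishou/know/amedas/kaisetsu.html
  CSV format: list of regional weather observation stations [ZIP compressed]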
Extract only the necessary information
Station list (excerpt), by region and station name; the full list also carries the kana reading and address of each station:
Soya: Soya-misaki, Wakkanai, Rebun, Koetoi, Hamaonishibetsu, Motodomari, Numakawa, Kutsugata, Toyotomi, Hamatonbetsu, Nakatonbetsu, Kitami-Esashi, Utanobori, Horonobe
Kamikawa: Nakagawa, Otoineppu, Oguruma, Bifuka, Nayoro, Nishi-Furen
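One way to picture the extracted station data: the script in this deck expects nine fields per station (code, pref, name, kana, address, lat_d, lat_m, lon_d, lon_m) and turns degrees plus minutes into decimal degrees. Below is a minimal sketch of that conversion. The helper name and the sample row, including its coordinate values, are made up for illustration and are not taken from the JMA station list.

```python
import csv
import io

# Field order expected by the script in this deck.
FIELDS = ("code", "pref", "name", "kana", "address", "lat_d", "lat_m", "lon_d", "lon_m")


def load_station_coords(csv_text):
    """Return {station_code: (lon, lat)} in decimal degrees."""
    coords = {}
    for rec in csv.DictReader(io.StringIO(csv_text), fieldnames=FIELDS):
        # Degrees and minutes are stored separately: degrees + minutes / 60.
        lat = int(rec["lat_d"]) + float(rec["lat_m"]) / 60
        lon = int(rec["lon_d"]) + float(rec["lon_m"]) / 60
        coords[rec["code"]] = (lon, lat)
    return coords


# Illustrative row only; the coordinate values are invented for this example.
sample = "11016,Soya,Wakkanai,ワッカナイ,Wakkanai City,45,24.9,141,40.7\n"
stations = load_station_coords(sample)
```

Returning `(lon, lat)` in that order matches the coordinate order GeoJSON uses for points.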
import sys, shutil
import csv, json
import urllib.request
import sqlite3

# Timestamp from the command line, or 'rct' for the latest data
datetime = sys.argv[1] if len(sys.argv) > 1 else 'rct'

# Load the AMeDAS station table into SQLite
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE amedas (code TEXT PRIMARY KEY, pref TEXT, name TEXT, kana TEXT, address TEXT, lat_d INTEGER, lat_m REAL, lon_d INTEGER, lon_m REAL);")
fields = ('code', 'pref', 'name', 'kana', 'address', 'lat_d', 'lat_m', 'lon_d', 'lon_m')
with open("amedas_point.csv", 'r') as fin:
    dr = csv.DictReader(fin, fieldnames=fields)
    to_db = [tuple(c[f] for f in fields) for c in dr]
cursor.executemany("INSERT INTO amedas(code, pref, name, kana, address, lat_d, lat_m, lon_d, lon_m) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?);", to_db)
connection.commit()

# Fetch the CSV from the JMA website
data_url = 'http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/preall00_' + datetime + '.csv'
data_req = urllib.request.Request(data_url)
with urllib.request.urlopen(data_req) as response:
    reader = csv.reader(response.read().decode('shift-jis').splitlines())
header = next(reader)  # skip the header row

features = []
for r in reader:
    # Look up the station coordinates (degrees and minutes) by station code
    cursor.execute("SELECT lat_d, lat_m, lon_d, lon_m FROM amedas WHERE code=?", (r[0],))
    row = cursor.fetchone()
    if row is not None:
        lat = float(row[0]) + float(row[1]) / 60
        lon = float(row[2]) + float(row[3]) / 60
        # Empty fields mean no value was reported
        value1h = float(r[19]) if r[19] else None
        value3h = float(r[23]) if r[23] else None
        value24h = float(r[27]) if r[27] else None
        value48h = float(r[31]) if r[31] else None
        value72h = float(r[35]) if r[35] else None
        feature = {
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "type": "Feature",
            "properties": {
                "code": r[0], "pref": r[1], "name": r[2],
                "value1h": value1h, "value3h": value3h, "value24h": value24h,
                "value48h": value48h, "value72h": value72h
            }
        }
        features.append(feature)

featurecollection = {"type": "FeatureCollection", "features": features}
with open(datetime + ".json", "w", encoding="utf-8") as f:
    f.write(json.dumps(featurecollection, ensure_ascii=False))
shutil.copy(datetime + ".json", "recent.json")
print("save to " + datetime + ".json")
Demo: actually fetch the data and check the JSON
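One possible way to check the generated file without opening it by hand is to load it back and list the stations that reported rain. The sketch below assumes the output shape produced by the script above (a FeatureCollection whose features carry `name` and `value1h` properties). The helper names and the sample values are invented for illustration.

```python
import json


def wet_stations(geojson_text, key="value1h"):
    """Return (name, value) pairs with a positive `key` property, largest first."""
    collection = json.loads(geojson_text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    hits = []
    for feature in collection["features"]:
        props = feature["properties"]
        value = props.get(key)
        # Skip stations with no report (None) and stations with no rain (0.0).
        if value is not None and value > 0:
            hits.append((props["name"], value))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


def make_feature(name, lon, lat, value1h):
    """Build a point feature in the same shape as the script's output."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name, "value1h": value1h},
    }


# Hand-written sample; station names and values are invented.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        make_feature("A", 141.0, 45.0, 0.0),
        make_feature("B", 141.5, 45.2, 2.5),
        make_feature("C", 142.0, 44.8, None),
        make_feature("D", 142.3, 44.5, 5.0),
    ],
})
result = wet_stations(sample)
```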
Rendering the GeoJSON
Demo: load the GeoJSON in QGIS and overlay it on a map
Fetching lightning strike information (LIDEN) data
Flow
• Fetch lightning strikes and their types from the High-Resolution Precipitation Nowcasts
• Convert to GeoJSON
• Map it (OpenLayers)
Fetching lightning strike information
• Data index
  http://www.jma.go.jp/jp/highresorad/highresorad_tile/tile_basetime.xml
• Individual data
  https://www.jma.go.jp/jp/highresorad/highresorad_tile/LIDEN/<timestamp>/<timestamp>/none/data.xml
import sys, shutil
import json
import urllib.request
import xml.etree.ElementTree as et

if len(sys.argv) > 1:
    datetime = sys.argv[1]
else:
    # No timestamp given: ask the server for the latest base time
    basetime_url = 'http://www.jma.go.jp/jp/highresorad/highresorad_tile/tile_basetime.xml'
    basetime_req = urllib.request.Request(basetime_url)
    with urllib.request.urlopen(basetime_req) as response:
        basetime_xml = response.read()
    basetime = et.fromstring(basetime_xml)
    datetime = basetime[0].text

# Fetch the lightning data for that timestamp
data_url = 'http://www.jma.go.jp/jp/highresorad/highresorad_tile/LIDEN/' + datetime + '/' + datetime + '/none/data.xml'
data_req = urllib.request.Request(data_url)
with urllib.request.urlopen(data_req) as response:
    data_xml = response.read()
data = et.fromstring(data_xml)

features = []
for i, child in enumerate(data):
    # The first child element is skipped; the rest are lightning points
    if i != 0:
        feature = {
            "geometry": {
                "type": "Point",
                "coordinates": [float(child.attrib["lon"]), float(child.attrib["lat"])]
            },
            "type": "Feature",
            "properties": {"type": int(child.attrib["type"])}
        }
        features.append(feature)

featurecollection = {"type": "FeatureCollection", "features": features}
with open(datetime + ".json", "w") as f:
    f.write(json.dumps(featurecollection))
shutil.copy(datetime + ".json", "recent.json")
print("save to " + datetime + ".json")
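The conversion step can also be tried offline. This sketch applies the same idea (skip the first child, then read `lon`, `lat` and `type` attributes) to an XML string. The sample document is hypothetical: its element names and values are invented, and only the three attribute names and the skipped first child come from the script above.

```python
import xml.etree.ElementTree as ET


def liden_to_geojson(xml_text):
    """Convert a LIDEN-style XML document into a GeoJSON FeatureCollection dict."""
    root = ET.fromstring(xml_text)
    features = []
    # As in the script above, the first child element is not a lightning point.
    for child in list(root)[1:]:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON coordinate order is [longitude, latitude].
                "coordinates": [float(child.attrib["lon"]), float(child.attrib["lat"])],
            },
            "properties": {"type": int(child.attrib["type"])},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample: element names and values are invented for illustration.
SAMPLE_XML = """<data>
  <meta basetime="201710271930"/>
  <point lon="139.70" lat="35.60" type="1"/>
  <point lon="140.10" lat="36.20" type="4"/>
</data>"""

collection = liden_to_geojson(SAMPLE_XML)
```

Separating the conversion from the download makes it possible to test the GeoJSON shape before pointing the script at the live service.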
Demo: actually fetch the data and check the JSON
Rendering the GeoJSON
Demo: render it in the browser using OpenLayers
Source code
• jma-liden-geojson
  https://github.com/9SQ/jma-liden-geojson
The End