Weather Data Scraping
Keiichiro
October 27, 2017
Programming
Slides presented at the "Python Scraping Study Group (Collecting and Using Data via APIs)" held on October 27, 2017.
Transcript
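Scraping Japan's weather data for fun
Kei YOSHIMURA

About me
- Kei Yoshimura (Keiichi, @9SQ)
- I build disaster-prevention systems at a company called Gehirn, using Java, Python, JavaScript, Node.js, and so on
- My background is nominally electrical and electronic engineering, but I do a bit of everything
- I run a monthly electronics and IoT study group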
Weather data as a scraping target
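Various information sources
- Japan Meteorological Agency (JMA) related
  - JMA website
  - JMA disaster prevention information XML format messages
  - L-Alert (Public Information Commons)
  - Japan Meteorological Business Support Center
  - etc.
- Yahoo! API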
Fetching precipitation (AMeDAS) data
Flow
- Fetch the data from the JMA "Latest Weather Data" CSV download
- Convert it to GeoJSON
- Map it (QGIS, heat map)
Fetching precipitation
- About the JMA "Latest Weather Data" CSV download
- http://www.data.jma.go.jp/obd/stats/data/mdrr/docs/csv_dl_readme.html
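Fetching precipitation
- Precipitation, all elements (latest)
  http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/preall00_rct.csv
- Precipitation, all elements (specific time)
  http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/preall00_<timestamp>.csv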
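Fetching precipitation

Station number,Prefecture,Station,International station number,Current time (year),Current time (month),Current time (day),Current time (hour),Current time (minute),1-hour precipitation record update,1-hour precipitation record update (under 10 years),3-hour precipitation record update,3-hour precipitation record update (under 10 years),24-hour precipitation record update,24-hour precipitation record update (under 10 years),48-hour precipitation record update,48-hour precipitation record update (under 10 years),72-hour precipitation record update,72-hour precipitation record update (under 10 years),1-hour precipitation current value (mm),1-hour precipitation current value quality information,1-hour precipitation today's maximum (mm),1-hour precipitation today's maximum quality information,3-hour precipitation current value (mm),3-hour precipitation current value quality information,3-hour precipitation today's maximum (mm),3-hour precipitation today's maximum quality information,24-hour precipitation current value (mm),24-hour precipitation current value quality information,24-hour precipitation today's maximum (mm),24-hour precipitation today's maximum quality information,48-hour precipitation current value (mm),48-hour precipitation current value quality information,48-hour precipitation today's maximum (mm),48-hour precipitation today's maximum quality information,72-hour precipitation current value (mm),72-hour precipitation current value quality information,72-hour precipitation today's maximum (mm),72-hour precipitation today's maximum quality information
11001,Hokkaido Soya region,Soya-misaki,,2017,10,27,19,30,,,,,,,,,,,0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4
11016,Hokkaido Soya region,Wakkanai,47401,2017,10,27,19,30,,,,,,,,,,,0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4
11046,Hokkaido Soya region,Rebun,,2017,10,27,19,30,,,,,,,,,,,0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4
11061,Hokkaido Soya region,Koetoi,,2017,10,27,19,30,,,,,,,,,,,,1,,1,,1,,1,,1,,1,,1,,1,,1,,1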
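The column layout above can be checked with a small sketch. It assumes the layout shown in the header: 9 identification and time columns, 10 record-update flags, then groups of four columns per period, so the current values sit at indexes 19, 23, 27, 31 and 35. The helper names `make_row` and `parse_precipitation_row` are hypothetical, and the sample rows are modelled on the rows shown above.

```python
# Column indexes of the "current value (mm)" fields, per the header layout.
CURRENT_VALUE_COLUMNS = {"1h": 19, "3h": 23, "24h": 27, "48h": 31, "72h": 35}


def make_row(code, name, measurements):
    """Build a sample row: 9 id/time columns, 10 empty flags, 20 measurement columns."""
    ident = [code, "Hokkaido Soya region", name, "", "2017", "10", "27", "19", "30"]
    return ident + [""] * 10 + measurements.split(",")


def parse_precipitation_row(row):
    """Return the station code and current precipitation per period (None when blank)."""
    values = {}
    for period, index in CURRENT_VALUE_COLUMNS.items():
        cell = row[index] if index < len(row) else ""
        values[period] = float(cell) if cell else None
    return {"code": row[0], "values": values}


rows = [
    make_row("11001", "Soya-misaki",
             "0.0,8,0.0,5,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4,0.0,8,0.0,4"),
    make_row("11061", "Koetoi", ",1,,1,,1,,1,,1,,1,,1,,1,,1,,1"),
]
parsed = [parse_precipitation_row(r) for r in rows]
for item in parsed:
    print(item)
```

Blank cells become `None` rather than 0.0, so a station that reported nothing is not mistaken for a dry one.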
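Fetching precipitation
- There is not enough information to overlay the data on a map: the CSV has station codes and station names, but no coordinates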
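- Overview of the Automated Meteorological Data Acquisition System (AMeDAS)
- http://www.jma.go.jp/jma/kishou/know/amedas/kaisetsu.html
  CSV format: list of regional weather stations [ZIP archive]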
Preparing the station information
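Extract only the information needed

Region    Station           Address
Soya      Soya-misaki       Wakkanai-shi Soya-misaki
Soya      Wakkanai          Wakkanai-shi Kaiun (Wakkanai Local Meteorological Office)
Soya      Rebun             Rebun-gun Rebun-cho Oaza Kafuka-mura Aza Tonnai
Soya      Koetoi            Wakkanai-shi Oaza Koetoi-mura Aza Koetoi (Wakkanai Aviation Weather Station)
Soya      Hama-onishibetsu  Soya-gun Sarufutsu-mura Hama-onishibetsu
Soya      Motodomari        Rishiri-gun Rishirifuji-cho Oshidomari Aza Motodomari (Rishiri Aviation Weather Station)
Soya      Numakawa          Wakkanai-shi Oaza Koetoi-mura Aza Numakawa
Soya      Kutsugata         Rishiri-gun Rishiri-cho Kutsugata Aza Izumi-cho
Soya      Toyotomi          Teshio-gun Toyotomi-cho Aza Kami-Sarobetsu
Soya      Hamatonbetsu      Esashi-gun Hamatonbetsu-cho, Lake Kutcharo shore
Soya      Nakatonbetsu      Esashi-gun Nakatonbetsu-cho Aza Kamikoma
Soya      Kitami-esashi     Esashi-gun Esashi-cho Honcho (Kitami-Esashi Special Regional Weather Station)
Soya      Utanobori         Esashi-gun Esashi-cho Utanobori Higashi-machi
Soya      Horonobe          Teshio-gun Horonobe-cho Aza Kami-Horonobe
Kamikawa  Nakagawa          Nakagawa-gun Nakagawa-cho Nakagawa
Kamikawa  Otoineppu         Nakagawa-gun Otoineppu-mura Otoineppu
Kamikawa  Oguruma           Nakagawa-gun Bifuka-cho Aza Oguruma
Kamikawa  Bifuka            Nakagawa-gun Bifuka-cho
Kamikawa  Nayoro            Nayoro-shi Ohashi
Kamikawa  Nishi-furen       Nayoro-shi Furen-cho Nishi-furen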
import sys, shutil
import csv, json
import urllib.request
import sqlite3

# Optional argument: timestamp suffix of the CSV file; defaults to 'rct' (latest)
argvs = sys.argv
argc = len(argvs)
if argc > 1:
    datetime = argvs[1]
else:
    datetime = 'rct'

# Load the AMeDAS station table into an in-memory SQLite database
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE amedas (code TEXT PRIMARY KEY, pref TEXT, name TEXT, kana TEXT, address TEXT, lat_d INTEGER, lat_m REAL, lon_d INTEGER, lon_m REAL);")
with open("amedas_point.csv", 'r') as fin:
    dr = csv.DictReader(fin, fieldnames=('code', 'pref', 'name', 'kana', 'address', 'lat_d', 'lat_m', 'lon_d', 'lon_m'))
    to_db = [(c['code'], c['pref'], c['name'], c['kana'], c['address'], c['lat_d'], c['lat_m'], c['lon_d'], c['lon_m']) for c in dr]
cursor.executemany("INSERT INTO amedas(code, pref, name, kana, address, lat_d, lat_m, lon_d, lon_m) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?);", to_db)
connection.commit()

# Fetch the CSV from the JMA website
data_url = 'http://www.data.jma.go.jp/obd/stats/data/mdrr/pre_rct/alltable/preall00_' + datetime + '.csv'
data_req = urllib.request.Request(data_url)
with urllib.request.urlopen(data_req) as response:
    reader = csv.reader(response.read().decode('shift-jis').splitlines())
    header = next(reader)  # skip the header row

features = []  # the transcript never initializes this list
for r in reader:
    # Look up the station coordinates; a bound parameter replaces string concatenation
    cursor.execute("SELECT lat_d, lat_m, lon_d, lon_m FROM amedas WHERE code=?", (r[0],))
    row = cursor.fetchone()
    if row is not None:
        # Degrees and minutes to decimal degrees
        lat = float(row[0]) + float(row[1]) / 60
        lon = float(row[2]) + float(row[3]) / 60
        # Current-value columns; blank cells become None
        value1h = float(r[19]) if r[19] else None
        value3h = float(r[23]) if r[23] else None
        value24h = float(r[27]) if r[27] else None
        value48h = float(r[31]) if r[31] else None
        value72h = float(r[35]) if r[35] else None
        feature = {
            "geometry": {
                "type": "Point",
                "coordinates": [float(lon), float(lat)]
            },
            "type": "Feature",
            "properties": {
                "code": r[0],
                "pref": r[1],
                "name": r[2],
                "value1h": value1h,
                "value3h": value3h,
                "value24h": value24h,
                "value48h": value48h,
                "value72h": value72h
            }
        }
        features.append(feature)

featurecollection = {"type": "FeatureCollection", "features": features}
f = open(datetime + ".json", "w")
f.write(json.dumps(featurecollection, ensure_ascii=False))
f.close()
shutil.copy(datetime + ".json", "recent.json")
print("save to " + datetime + ".json")
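The coordinate handling in the script above can be looked at in isolation. The sketch below converts degrees and minutes to decimal degrees and builds one GeoJSON feature, with a plain dictionary in place of the SQLite lookup. The names `dm_to_decimal`, `to_feature` and `STATIONS` are hypothetical, and the coordinates are made up for illustration, not taken from the JMA station list.

```python
def dm_to_decimal(degrees, minutes):
    """Convert degrees plus decimal minutes to decimal degrees."""
    return float(degrees) + float(minutes) / 60


def to_feature(code, value1h, stations):
    """Build a GeoJSON Point feature, or None when the station code is unknown."""
    station = stations.get(code)
    if station is None:
        return None
    lat = dm_to_decimal(station["lat_d"], station["lat_m"])
    lon = dm_to_decimal(station["lon_d"], station["lon_m"])
    return {
        "type": "Feature",
        # GeoJSON positions are ordered longitude first, then latitude
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"code": code, "value1h": value1h},
    }


# Hypothetical station table keyed by station code (coordinates are made up)
STATIONS = {"11001": {"lat_d": 45, "lat_m": 30.0, "lon_d": 141, "lon_m": 45.0}}

feature = to_feature("11001", 0.0, STATIONS)
missing = to_feature("99999", 0.0, STATIONS)
print(feature["geometry"]["coordinates"])
print(missing)
```

A dictionary is enough when the station list fits in memory. The SQLite table in the script does the same join and would also allow further SQL queries over the stations.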
Demo: fetch the data for real and inspect the JSON
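One way to inspect the output without a GIS tool is to summarize the FeatureCollection in Python. This is only a sketch: `summarize` is a hypothetical helper, and the sample collection is hand-written to mimic the properties the script emits (`name`, `value1h`), not real output.

```python
import json

# Hand-written sample in the shape the script produces (values are made up)
SAMPLE = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "geometry": {"type": "Point", "coordinates": [141.75, 45.5]},
   "properties": {"name": "station-a", "value1h": 0.0}},
  {"type": "Feature", "geometry": {"type": "Point", "coordinates": [141.5, 45.25]},
   "properties": {"name": "station-b", "value1h": 2.5}},
  {"type": "Feature", "geometry": {"type": "Point", "coordinates": [141.0, 45.0]},
   "properties": {"name": "station-c", "value1h": null}}
]}
""")


def summarize(collection, key="value1h"):
    """Count features, count those with a measurement, and name the wettest station."""
    features = collection["features"]
    measured = [f for f in features if f["properties"].get(key) is not None]
    wettest = max(measured, key=lambda f: f["properties"][key], default=None)
    return {
        "features": len(features),
        "measured": len(measured),
        "max_station": wettest["properties"]["name"] if wettest else None,
    }


summary = summarize(SAMPLE)
print(summary)
```

To run it on real output, load the file the script writes, for example with `json.load(open("recent.json", encoding="utf-8"))`, assuming the file was written as UTF-8.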
Rendering the GeoJSON
Demo: load the GeoJSON in QGIS and overlay it on a map
Fetching lightning strike (LIDEN) data
Flow
- Get lightning strike positions and types from the High-resolution Precipitation Nowcast
- Convert them to GeoJSON
- Map them (OpenLayers)
Fetching lightning strike information
- Data list
  http://www.jma.go.jp/jp/highresorad/highresorad_tile/tile_basetime.xml
- Individual data
  https://www.jma.go.jp/jp/highresorad/highresorad_tile/LIDEN/<datetime>/<datetime>/none/data.xml
import sys, shutil
import json
import urllib.request
import xml.etree.ElementTree as et

argvs = sys.argv
argc = len(argvs)
if argc > 1:
    datetime = argvs[1]
else:
    # No timestamp given: read the latest base time from the data list
    basetime_url = 'http://www.jma.go.jp/jp/highresorad/highresorad_tile/tile_basetime.xml'
    basetime_req = urllib.request.Request(basetime_url)
    with urllib.request.urlopen(basetime_req) as response:
        basetime_xml = response.read()
    basetime = et.fromstring(basetime_xml)
    datetime = basetime[0].text

# Fetch the lightning data for that time
data_url = 'http://www.jma.go.jp/jp/highresorad/highresorad_tile/LIDEN/' + datetime + '/' + datetime + '/none/data.xml'
data_req = urllib.request.Request(data_url)
with urllib.request.urlopen(data_req) as response:
    data_xml = response.read()
data = et.fromstring(data_xml)

features = []
for i, child in enumerate(data):
    # Skip the first child element; use != because "is not 0" compares identity, not value
    if i != 0:
        feature = {
            "geometry": {
                "type": "Point",
                "coordinates": [float(child.attrib["lon"]), float(child.attrib["lat"])]
            },
            "type": "Feature",
            "properties": {
                "type": int(child.attrib["type"])
            }
        }
        features.append(feature)

featurecollection = {"type": "FeatureCollection", "features": features}
f = open(datetime + ".json", "w")
f.write(json.dumps(featurecollection))
f.close()
shutil.copy(datetime + ".json", "recent.json")
print("save to " + datetime + ".json")
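The XML-to-GeoJSON step can be tried offline with a hand-written document. The sample below is an assumption: only the `lon`, `lat` and `type` attributes and the skipped first child come from the script above, while the element names (`data`, `info`, `point`) and the values are hypothetical. `lightning_features` is likewise a hypothetical helper name.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample; the real data.xml may use different element names
SAMPLE_XML = """<data>
  <info basetime="201710271930"/>
  <point lon="139.75" lat="35.5" type="1"/>
  <point lon="140.25" lat="36.0" type="2"/>
</data>"""


def lightning_features(xml_text):
    """Convert every child except the first into a GeoJSON Point feature."""
    root = ET.fromstring(xml_text)
    features = []
    for child in list(root)[1:]:  # the script also skips the first child
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [float(child.attrib["lon"]), float(child.attrib["lat"])],
            },
            "properties": {"type": int(child.attrib["type"])},
        })
    return {"type": "FeatureCollection", "features": features}


collection = lightning_features(SAMPLE_XML)
print(json.dumps(collection, indent=2))
```

Keeping the conversion in a function that takes the XML text makes it testable without network access. The download step stays a thin wrapper around `urllib.request.urlopen`.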
Demo: fetch the data for real and inspect the JSON
Rendering the GeoJSON
Demo: render it in the browser with OpenLayers
Source code
- jma-liden-geojson
  https://github.com/9SQ/jma-liden-geojson
The End