First Steps in Web Scraping: Let's Try bs4 and Selenium!
cha1ra
January 11, 2019
Programming
Transcript
$ curl -v https://cha1ra.github.io/scrayping-handson/index.html
> GET /scrayping-handson/index.html HTTP/1.1
> Host: cha1ra.github.io
> User-Agent: curl/7.60.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: GitHub.com
< Content-Type: text/html; charset=utf-8
<
<!DOCTYPE html>
<html lang="ja">
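To make the shape of the response in the curl output easier to see, here is a minimal sketch that splits the response head into its status line and headers using only the Python standard library. The `raw_head` string is copied from the `<` lines shown by curl; the variable names are illustrative and not taken from the slides.

```python
from email.parser import Parser

# Response head copied from the "<" lines of the curl -v output.
raw_head = (
    "HTTP/1.1 200 OK\r\n"
    "Server: GitHub.com\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "\r\n"
)

# The first line is the status line; everything after it is the header block.
status_line, _, header_block = raw_head.partition("\r\n")
version, status_code, reason = status_line.split(" ", 2)

# HTTP headers use the same "Name: value" layout as email headers,
# so the standard email parser can read them.
headers = Parser().parsestr(header_block, headersonly=True)

print(version, int(status_code), reason)
print(headers["Server"])
print(headers.get_content_charset())
```

The status code (200 here) tells a scraper whether the request succeeded, and the charset in `Content-Type` tells it how to decode the body that follows the blank line.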
$ pip3 install requests
$ pip3 install beautifulsoup4
C ' = P C > 'C > O E" = >C " " O ". 241 8 E0 . E E / D 0 #R .' E0" = <". 241 8 E0 . E E / D 0 #R .' E0" import requests r = requests.get('https://cha1ra.github.io/scrayping-handson/index.html’) print(r) -''> = P ' ' ' C'! =E
from bs4 import BeautifulSoup
soup = BeautifulSoup(r.content, 'html.parser')
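Once a `soup` object exists, elements can be pulled out of it by tag name or attribute. The following is a minimal sketch that runs without network access. The inline `html` string is a hypothetical stand-in, since the actual markup of the hands-on page is not shown in the transcript, and the tag names and `id` used here are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption); not the real hands-on page.
html = """
<html lang="ja">
  <head><title>Sample page</title></head>
  <body>
    <h1 id="top">Hello</h1>
    <ul>
      <li><a href="/a.html">First</a></li>
      <li><a href="/b.html">Second</a></li>
    </ul>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Access the <title> element directly and read its text.
title = soup.title.string

# find() returns the first matching element; here it is matched by tag and id.
heading = soup.find("h1", id="top").get_text()

# find_all() returns every match; collect the link text and href of each <a>.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(title)
print(heading)
print(links)
```

With a real response, passing `r.content` instead of the inline string should work the same way, although the tags and attributes to look for depend on the target page.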
$ pip3 install selenium
$ brew install chromedriver