Slide 1

Slide 1 text


Slide 2

Slide 2 text


Slide 3

Slide 3 text


Slide 4

Slide 4 text


Slide 5

Slide 5 text


Slide 6

Slide 6 text

$ curl -v https://cha1ra.github.io/scrayping-handson/index.html
> GET /scrayping-handson/index.html HTTP/1.1
> Host: cha1ra.github.io
> User-Agent: curl/7.60.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: GitHub.com
< Content-Type: text/html; charset=utf-8
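
The request half of the curl output (the lines starting with `>`) can be rebuilt by hand to see what an HTTP client actually sends. The following is a minimal sketch using only the standard library; the helper name `build_get_request` is hypothetical, and the header set simply mirrors the curl output shown on this slide.

```python
from urllib.parse import urlsplit


def build_get_request(url, user_agent='curl/7.60.0'):
    """Return the raw HTTP/1.1 GET request text for `url` (illustrative helper)."""
    parts = urlsplit(url)
    # An empty path means the root document.
    path = parts.path or '/'
    if parts.query:
        path += '?' + parts.query
    lines = [
        f'GET {path} HTTP/1.1',
        f'Host: {parts.netloc}',
        f'User-Agent: {user_agent}',
        'Accept: */*',
        '',  # blank line that ends the header section
        '',
    ]
    # HTTP uses CRLF line endings.
    return '\r\n'.join(lines)


request_text = build_get_request(
    'https://cha1ra.github.io/scrayping-handson/index.html'
)
print(request_text)
```

The server answers with the lines curl prefixes with `<`: a status line such as `HTTP/1.1 200 OK`, the response headers, and then the HTML body.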

Slide 7

Slide 7 text


Slide 8

Slide 8 text


Slide 9

Slide 9 text


Slide 10

Slide 10 text


Slide 11

Slide 11 text


Slide 12

Slide 12 text

$ pip3 install requests
$ pip3 install beautifulsoup4

Slide 13

Slide 13 text


Slide 14

Slide 14 text

import requests
r = requests.get('https://cha1ra.github.io/scrayping-handson/index.html')
print(r)
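
Printing the response object only shows its status (for example `<Response [200]>`). A slightly more defensive version, sketched below, adds a timeout and raises on HTTP error codes before returning the page text. The function name `fetch_html` and the 10-second timeout are assumptions made for illustration, not part of the original slides.

```python
import requests


def fetch_html(url, timeout=10.0):
    """Download `url` and return its body as text (illustrative helper)."""
    # Without a timeout, requests may wait indefinitely on a stalled server.
    r = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page.
    r.raise_for_status()
    return r.text
```

Calling `fetch_html('https://cha1ra.github.io/scrayping-handson/index.html')` should return the page's HTML as a string, provided the site is reachable.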

Slide 15

Slide 15 text


Slide 16

Slide 16 text

from bs4 import BeautifulSoup
soup = BeautifulSoup(r.content, 'html.parser')
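
Once the page is parsed, elements can be looked up by tag name. The sketch below runs against a small inline HTML string so that it works offline; the markup is a hypothetical stand-in for `r.content`, not the actual hands-on page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page body.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Heading</h1>
    <a href="/first">First link</a>
    <a href="/second">Second link</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

title = soup.title.string                          # text inside <title>
heading = soup.find('h1').get_text()               # first <h1> element
links = [a['href'] for a in soup.find_all('a')]    # href of every <a> element

print(title, heading, links)
```

`find` returns the first matching element (or `None` if there is no match), while `find_all` returns a list of every match.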

Slide 17

Slide 17 text


Slide 18

Slide 18 text


Slide 19

Slide 19 text

$ pip3 install selenium
$ brew install chromedriver

Slide 20

Slide 20 text

Slide 21

Slide 21 text
