Slide 1

Slide 1 text

Web scraping for data scientists. Irio Musskopf, Data Science Retreat

Slide 2

Slide 2 text

Finding data: not always easy

Slide 3

Slide 3 text

1. Downloadable dataset

Slide 4

Slide 4 text

2. APIs

Slide 5

Slide 5 text

3. Scraping

Slide 6

Slide 6 text

4. Talk with other companies

Slide 7

Slide 7 text

5. Produce it yourself

Slide 8

Slide 8 text

It doesn’t matter how complex the system is. It is possible.

Slide 9

Slide 9 text

It doesn’t matter how complex the system is. It is possible. Unless there’s a captcha.

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

DEMO

Slide 12

Slide 12 text

Selectors
Limitations
User agents
Proxies
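
The topics on this slide can be sketched with Python's standard library alone. This is a minimal illustration, not the code from the demo: the helper names (`extract_text`, `make_request`, `make_opener`), the sample HTML, the user agent string, and the proxy address are all assumptions made for the example. The "selector" here is a simple tag-plus-class match; a real scraper would more likely use a CSS or XPath selector library.

```python
from html.parser import HTMLParser
from urllib.request import ProxyHandler, Request, build_opener


class ClassTextExtractor(HTMLParser):
    """Collect the text of every <tag> element that carries a given class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            # Track nested elements with the same tag name.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth > 0 and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.results[-1] += data


def extract_text(html, tag, css_class):
    """Selector step: return the stripped text of all matching elements."""
    parser = ClassTextExtractor(tag, css_class)
    parser.feed(html)
    return [text.strip() for text in parser.results]


def make_request(url, user_agent):
    """User agent step: build a request that identifies the client."""
    return Request(url, headers={"User-Agent": user_agent})


def make_opener(proxy_url):
    """Proxy step: build an opener that routes HTTP(S) traffic via a proxy."""
    return build_opener(ProxyHandler({"http": proxy_url, "https": proxy_url}))


# Hypothetical page fragment, used only to show the selector step offline.
SAMPLE_HTML = """
<ul>
  <li class="price">10</li>
  <li>ignored</li>
  <li class="price big">20</li>
</ul>
"""

if __name__ == "__main__":
    print(extract_text(SAMPLE_HTML, "li", "price"))
    request = make_request("http://example.com", "my-scraper/0.1")
    opener = make_opener("http://127.0.0.1:8080")  # placeholder proxy address
    # opener.open(request) would perform the actual fetch through the proxy.
```

The fetch itself is left commented out so that the sketch runs without network access. On limitations: a parser like this only sees the HTML the server returns, so content rendered by JavaScript in the browser would not be visible to it.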

Slide 13

Slide 13 text

Irio Musskopf [email protected] Thanks