PyBay
August 21, 2016

2016 - Al Sweigart - Automating Your Browser and Desktop Apps

Description
There's a lot of data on the web and in your desktop apps, but accessing it can involve a lot of tedious typing and clicking. This talk is an introduction to the Selenium and PyAutoGUI modules, with live demos straight from the interactive shell. Al Sweigart explains web scraping techniques and programmatically controlling the keyboard and mouse to automate these tasks for you.

Abstract
The internet and personal computer are central tools in many jobs, including professions outside of engineering. This makes web scraping and GUI automation relevant not just to developers and QA testers, but also to academics, organizers, and office workers. This talk is an introduction to the Selenium and PyAutoGUI modules and to programmatically controlling your browser and desktop applications from Python.

Web scraping and GUI automation frameworks have an intimidating reputation for a steep learning curve. While they do have many sophisticated features, the basics that most folks will ever need can be covered in a single presentation.

This presentation has multiple live demos to showcase these modules straight from the interactive shell.

The content from this talk is derived from Automate the Boring Stuff with Python, a beginner's Python book freely available under a Creative Commons license at https://automatetheboringstuff.com

Bio
Al Sweigart is a software developer and the author of Automate the Boring Stuff with Python, Invent Your Own Computer Games with Python, Making Games with Python & Pygame, and Hacking Secret Ciphers with Python. These books are freely available under a Creative Commons license at https://inventwithpython.com. Al enjoys haunting coffee shops, writing educational materials, cat whispering, and making useful software. He lives in San Francisco.

https://youtu.be/dZLyfbSQPXI

Transcript

  1. Hi, I’m Al.
    • I like programming.
    • I wrote a programming book.
    • Creative Commons license.
    • AutomateTheBoringStuff.com
  2. Not Found
    The requested URL was not found on this server. Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
  3. Selenium
    >>> from selenium import webdriver
    >>> browser = webdriver.Firefox()
    >>> browser.get('https://automatetheboringstuff.com')
    >>> elem = browser.find_element_by_css_selector('.entry-content > ol:nth-child(15) > li:nth-child(1) > a:nth-child(1)')
    >>> elem.text
    >>> elem.click()
    >>> browser.quit()
  11. CSS Selector Syntax
    • browser.find_element_by_css_selector('.entry-content > ol:nth-child(15) > li:nth-child(1) > a:nth-child(1)')
    • Returns an element object.
    • (Also: find_elements_by_css_selector() which returns a list of element objects.)
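The selector on this slide chains three steps with ">" (the CSS child combinator) and picks a position at each step with ":nth-child(n)". Below is a minimal sketch of how such a selector is assembled and used. It assumes a recent Selenium 4 release, where the find_element_by_css_selector() call from the slides is no longer available and find_element(By.CSS_SELECTOR, ...) is used instead. The helper names nth_child_path and link_text are made up for illustration and are not part of Selenium.

```python
def nth_child_path(root, *steps):
    """Build 'root > tag:nth-child(n) > ...' from (tag, n) pairs."""
    parts = [root]
    for tag, n in steps:
        parts.append(f"{tag}:nth-child({n})")
    return " > ".join(parts)


# Rebuilds the selector shown on the slide.
SELECTOR = nth_child_path(".entry-content", ("ol", 15), ("li", 1), ("a", 1))


def link_text(url, selector=SELECTOR):
    """Open url in Firefox and return the text of the first matching element.

    Hypothetical helper. Selenium is imported here so that the selector
    helper above can be used without a browser installed.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Firefox()
    try:
        browser.get(url)
        # Selenium 4 spelling of find_element_by_css_selector(selector)
        return browser.find_element(By.CSS_SELECTOR, selector).text
    finally:
        browser.quit()  # always close the browser, even on errors
```

Position-based selectors like this one are easy to copy from the browser's developer tools, but they stop matching as soon as the page layout changes.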
  12. Clicking and Typing
    • “Flat is better than nested.”
    • from selenium.webdriver.common.keys import Keys
  13. Reading Data from the Web Page
    • elem.text
    • elem.get_attribute('href')
    • To get ALL attributes:
      – elem.get_attributes()
  14. Reading Data from the Web Page
    • elem.text
    • elem.get_attribute('href')
    • To get ALL attributes:
      – elem.get_attributes()
      – elem.get_attribute('outerHTML')
      – '<a href="" class="main-navigation-toggle"><i class="fa fa-bars"></i></a>'
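As far as I know, Selenium's element objects do not offer a get_attributes() method, which is presumably why the slide falls back to get_attribute('outerHTML'). The sketch below takes that one step further and turns the outerHTML string into a dictionary of attributes using only the standard library. The function names are hypothetical, and elem is assumed to be any object with a Selenium-style get_attribute() method.

```python
from html.parser import HTMLParser


class _FirstTagAttrs(HTMLParser):
    """Remember the attributes of the first start tag that is parsed."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if self.attrs is None:  # ignore nested tags such as <i>
            self.attrs = dict(attrs)


def attributes_from_outer_html(outer_html):
    """Return the attributes of the outermost tag as a dict."""
    parser = _FirstTagAttrs()
    parser.feed(outer_html)
    parser.close()
    return parser.attrs or {}


def all_attributes(elem):
    """Hypothetical helper: every attribute of a Selenium-style element."""
    return attributes_from_outer_html(elem.get_attribute("outerHTML"))
```

For the string shown on the slide, this gives an empty href and the class main-navigation-toggle.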
  15. Installing PyAutoGUI
    • pip install pyautogui
      – Works on Python 2 & 3
      – Works on Windows, Mac, & Linux
      – Simple API
    • https://pyautogui.readthedocs.org
  16. Mouse Control
    • click()
    • click([x, y])
    • doubleClick()
    • rightClick()
    • moveTo(x, y [, duration=seconds])
    • moveRel(x_offset, y_offset [, duration=seconds])
    • dragTo(x, y [, duration=seconds])
    • position() (returns (x, y) tuple)
    • size() (returns (width, height) tuple)
    • displayMousePosition()
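To show how the relative-movement call from this slide fits together with position(), here is a small sketch that moves the pointer around a square and ends where it started. The function names square_offsets and trace_square are invented for this example. PyAutoGUI is imported inside the function because it needs a desktop session to load.

```python
def square_offsets(side):
    """Relative (x, y) moves that trace a square and return to the start."""
    return [(side, 0), (0, side), (-side, 0), (0, -side)]


def trace_square(side=100, duration=0.25):
    """Move the mouse around a square, starting from where it is now.

    Hypothetical helper. Returns the starting position as reported by
    pyautogui.position().
    """
    import pyautogui  # imported lazily: requires a running desktop

    start = pyautogui.position()
    for dx, dy in square_offsets(side):
        # duration makes the movement visible instead of instantaneous
        pyautogui.moveRel(dx, dy, duration=duration)
    return start
```

Calling trace_square() from the interactive shell is a quick way to check that PyAutoGUI can control the pointer on a given machine.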
  17. Failsafe
    • Move the mouse to the top-left corner of the screen to raise the FailSafeException.
    • pyautogui.PAUSE is set to 0.1, adding a tenth-second delay after each call.
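The two settings on this slide can be combined into a small wrapper, sketched below under a few assumptions. run_with_failsafe is a hypothetical name, and its gui parameter exists only so the logic can be exercised without a desktop. When gui is omitted, the real pyautogui module is used.

```python
def run_with_failsafe(action, pause=0.1, gui=None):
    """Run action() and report whether it finished without a fail-safe abort.

    Returns True if action() completed, False if the user aborted it by
    moving the mouse to the screen corner.
    """
    if gui is None:
        import pyautogui as gui  # imported lazily: requires a running desktop

    gui.PAUSE = pause    # delay in seconds after each PyAutoGUI call
    gui.FAILSAFE = True  # keep the corner escape hatch switched on
    try:
        action()
    except gui.FailSafeException:
        return False
    return True
```

A longer pause gives more time to reach the corner if a script starts misbehaving.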
  18. Image Recognition
    • Linux: sudo apt-get install scrot
    • pixel(x, y) – returns RGB tuple
    • screenshot([filename]) – returns PIL/Pillow Image object [and saves to file]
    • locateOnScreen(imageFilename) – returns (left, top, width, height) tuple or None
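A common use of locateOnScreen() is to click the middle of whatever it finds. The sketch below works that out from the (left, top, width, height) tuple described on this slide. box_center and click_image are hypothetical names. Newer PyAutoGUI releases may raise ImageNotFoundException instead of returning None, so the sketch allows for both behaviours.

```python
def box_center(box):
    """Centre point of a (left, top, width, height) box, as integers."""
    left, top, width, height = box
    return (left + width // 2, top + height // 2)


def click_image(image_filename):
    """Click the centre of the first on-screen match of image_filename.

    Hypothetical helper. Returns True if the image was found and clicked,
    False otherwise.
    """
    import pyautogui  # imported lazily: requires a running desktop

    # Older releases return None when nothing matches. Newer ones may raise
    # ImageNotFoundException. The empty tuple means "catch nothing" when
    # that exception class does not exist.
    not_found = getattr(pyautogui, "ImageNotFoundException", ())
    try:
        box = pyautogui.locateOnScreen(image_filename)
    except not_found:
        box = None
    if box is None:
        return False
    pyautogui.click(*box_center(box))
    return True
```

The image file should be a screenshot of the button or icon to look for, taken on the same machine and at the same screen scaling.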
  19. What is GUI automation used for?
    • Automating tests for non-browser apps.
    • Automating non-HTML parts of browser apps.
    • Cheating at Flash games.
  20. Thanks!
    • bit.ly/automatetalk
    • pip install selenium
    • pip install pyautogui
    • AutomateTheBoringStuff.com
    • @AlSweigart
    • [email protected]