
Puppeteer Fetchbot


A library and shell command that provide a simple JSON API for performing human-like interactions and data extraction on any website. Built on top of Puppeteer.


Bernhard Behrendt

June 18, 2018


Transcript

  1. a Node library which provides an API to control Chrome and how we used it to automate UI interactions
  2. which are two shop systems
       1. an internal one // accessibility driven
       2. a public one // marketing driven
  3. const puppeteer = require('puppeteer');

     (async () => {
       const browser = await puppeteer.launch();
       const page = await browser.newPage();
     })();
  4. const puppeteer = require('puppeteer');

     (async () => {
       const browser = await puppeteer.launch();
       const page = await browser.newPage();
       await page.goto('https://aoe.com');
       await page.screenshot({path: 'aoe.png'});
       await browser.close();
     })();
  5. const puppeteer = require('puppeteer');

     const SELECTOR_GOOGLE_SEARCH_INPUT = 'input';
     const SELECTOR_GOOGLE_SEARCH_RESULT = '#rso > div > div > div:nth
     const SELECTOR_AOE_OSS_LINK = '#page-header > div.header-menus >

     (async () => {
       const browser = await puppeteer.launch({headless: false});
       const page = await browser.newPage();
       page.setViewport({ width: 1024, height: 768 });
  6. const puppeteer = require('puppeteer');

     const SELECTOR_GOOGLE_SEARCH_INPUT = 'input';

     (async () => {
       const browser = await puppeteer.launch({headless: false});
       const page = await browser.newPage();
       page.setViewport({ width: 1024, height: 768 });
       await page.goto('https://google.com');
  7. { results:
       [ 'AOE: Agile Softwareentwicklung | IT & Technologie Dienstlei
         '//SEIBERT/MEDIA, Wiesbaden - Agile Software-Entwicklung, At
         'Scholz & Volkmer',
         '– 3deluxe - Transdisciplinary Design –',
         'vitronic.de - VITRONIC - the machine vision people',
         'AbbVie: Startseite | Deutschland',
         'AOE – the open web company | LinkedIn',
         'Chemical Company and Manufacturer - The WeylChem Group',
         'Sweco.de | Sweco.de',
         'Braun + Company' ] }
  8. Element                    Amount
     Text inputs                43
     Selects (2 clicks each)    2-4
     Checkboxes                 12
     Radio Button Sets          10
     Datepicker                 1
     Buttons                    11
     Total                      min. 56 – max. ca. 90
  9. {
       "http://google.com": {
         "root": true,
         "type": [["input", "web companies in wiesbaden\n"]],
         "waitFor": [[3000]],
         "fetch": {
           "h3.r as headlines": [],
           "#rso > div > div > div > div > div > h3 > a as links": {
             "attr": "href",
             "type": []
           }
         }
       }
     }
  10. fetchbot --job fetch.json --agent "LONG_LONG_USER_AGENT_STRING"

      const FetchBot = require('fetchbot');

      (async () => {
        const fetchbot = new FetchBot('');
        let fetchBotData = await fetchbot.runAndExit('./fetch.json');
        console.log(fetchBotData);
      })();
  11. {
        "headlines": [
          "AOE: Agile Softwareentwicklung | IT & Technologie Dienst
          "//SEIBERT/MEDIA, Wiesbaden - Agile Software-Entwicklung,
          "Scholz & Volkmer",
          "[...]"
        ],
        "links": [
          "https://www.aoe.com/de/home.html",
          "https://www.seibert-media.net/",
          "https://www.s-v.de/",
          "[...]"
        ]
      }
  12. const fetchbot = new FetchBot('', {
        "attached": true,
        "slowmo": 250,
        "width": 1280,
        "height": 1024,
        "trust": true
      });
  13. Type casting
        Boolean
        Number
        String
        Array of String(s)
        Array of Number(s)
        Objects containing an additional attribute matching
  14. {
        "https://testserver.local/PostForms/webshopform.php": {
          "root": true,
          "click": "input[type='submit']",
          "waitFor": [[5000]]
        },
        "https://testserver.local/tarifuebersicht-kabel": {
          "click": "#sg-133 > div > div.cobra-product-footer > p > butt
          "waitFor": [[1000]]
        },
        "https://testserver.local/breitband/produktkonfiguration/": [
          {
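
The job file on slide 9 names each extraction target with a key of the form `<selector> as <name>`, and the output on slide 11 is keyed by those names (`headlines`, `links`). The slides do not show how Fetchbot reads these keys internally, so the following is only an illustrative sketch of the naming convention. `parseFetchKey` and `listFetchTargets` are hypothetical helpers, not part of Fetchbot's API, and splitting on the last ` as ` is an assumption.

```javascript
// Hypothetical helper (not Fetchbot API): split a fetch key such as
// "h3.r as headlines" into its CSS selector and its result name.
// Assumption: the result name follows the LAST " as " in the key.
function parseFetchKey(key) {
  const marker = ' as ';
  const index = key.lastIndexOf(marker);
  if (index === -1) {
    throw new Error(`Fetch key has no " as <name>" suffix: ${key}`);
  }
  return {
    selector: key.slice(0, index).trim(),
    name: key.slice(index + marker.length).trim(),
  };
}

// The job from slide 9, written as a JavaScript object.
const job = {
  'http://google.com': {
    root: true,
    type: [['input', 'web companies in wiesbaden\n']],
    waitFor: [[3000]],
    fetch: {
      'h3.r as headlines': [],
      '#rso > div > div > div > div > div > h3 > a as links': {
        attr: 'href',
        type: [],
      },
    },
  },
};

// Hypothetical helper: list every { url, selector, name } a job would extract.
// Slide 14 shows a URL mapped to an array, so both shapes are accepted here.
function listFetchTargets(jobDefinition) {
  const targets = [];
  for (const [url, entry] of Object.entries(jobDefinition)) {
    const steps = Array.isArray(entry) ? entry : [entry];
    for (const step of steps) {
      for (const key of Object.keys(step.fetch || {})) {
        targets.push({ url, ...parseFetchKey(key) });
      }
    }
  }
  return targets;
}

console.log(listFetchTargets(job));
```

Running this against the slide 9 job lists two targets, one named `headlines` and one named `links`, which matches the two keys in the result object on slide 11.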