Slide 1

Slide 1 text

WebDriver Internals SeleniumCamp 2012

Slide 2

Slide 2 text

Who am I? • Senior Test Engineer at FINN.no • Engineering Productivity team • Norway’s largest online marketplace • 900 million page views / month • 4 million unique users / month

Slide 3

Slide 3 text

Who am I? @jarib http://github.com/jarib jari@finn.no

Slide 4

Slide 4 text

Who am I?

Slide 5

Slide 5 text

Browser automation (timeline, 2002 to 2011)

Slide 6

Slide 6 text

Browser automation (timeline, 2002 to 2011): Watir

Slide 7

Slide 7 text

Watir
Pros (+):
• Nice API
• Native control over IE (using COM)
Cons (-):
• Ruby only
• IE only (thus Windows only)
• Internals not pretty

Slide 8

Slide 8 text

Watir

Slide 9

Slide 9 text

Browser automation (timeline, 2002 to 2011): Watir

Slide 10

Slide 10 text

Browser automation (timeline, 2002 to 2011): Watir, Selenium

Slide 11

Slide 11 text

Selenium • Jason Huggins @ Thoughtworks • Needed to test in-house Time and Expenses app • Open source release in 2004

Slide 12

Slide 12 text

Selenium Table-based syntax, inspired by FIT

Slide 13

Slide 13 text

Selenium Table-based syntax, inspired by FIT "Selenium Core"

Slide 14

Slide 14 text

Selenium: Pure JavaScript == Cross-domain policy prevents moving from one domain to the other

Slide 15

Slide 15 text

Selenium Solution: HTTP server

Slide 16

Slide 16 text

Selenium "Selenium Remote Control"

Slide 17

Slide 17 text

Selenium
Pros (+):
• Languages: Java, Python, C#, Ruby ++
• Core written in JavaScript
• Supports "all" browsers
Cons (-):
• Restricted by JS sandbox
• Bloated, procedural API
• Complicated architecture

Slide 18

Slide 18 text

Selenium

    public void testNew() throws Exception {
        selenium.open("/");
        selenium.type("q", "selenium rc");
        selenium.click("btnG");
        selenium.waitForPageToLoad("30000");
        assertTrue(selenium.isTextPresent("Results * for selenium rc"));
    }

Slide 19

Slide 19 text

Selenium

Slide 20

Slide 20 text

Selenium WTF!?

Slide 21

Slide 21 text

Selenium

    selenium.open "/"
    selenium.type 'q', 'selenium'
    puts selenium.title

    POST /selenium-server/driver/?cmd=open&1=%2F&sessionId=24f8bcc
    POST /selenium-server/driver/?cmd=type&1=q&2=selenium&sessionId=24f8bcc
    POST /selenium-server/driver/?cmd=getTitle&sessionId=24f8bcc

Slide 22

Slide 22 text

Selenium

    selenium.type 'q', 'selenium'
    POST /selenium-server/driver/?cmd=type&1=q&2=selenium

Command name: cmd=type

Slide 23

Slide 23 text

Selenium

    selenium.type 'q', 'selenium'
    POST /selenium-server/driver/?cmd=type&1=q&2=selenium

Argument 1: 1=q

Slide 24

Slide 24 text

Selenium

    selenium.type 'q', 'selenium'
    POST /selenium-server/driver/?cmd=type&1=q&2=selenium

Argument 2: 2=selenium
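The mapping from a Selenium RC client call to its HTTP request can be sketched in a few lines of Ruby. This is only an illustration of the URL layout shown on the slides (command name in `cmd`, arguments numbered from 1, optional `sessionId`); the helper name `rc_request_path` is hypothetical and not part of any Selenium client library.

```ruby
require 'uri'

# Hypothetical helper: builds the request path for one Selenium RC command.
# The layout (cmd, 1, 2, ..., sessionId) follows the requests on the slides.
def rc_request_path(command, args, session_id = nil)
  params = [['cmd', command]]
  # Arguments are positional and numbered starting at 1.
  args.each_with_index { |arg, index| params << [(index + 1).to_s, arg] }
  params << ['sessionId', session_id] if session_id
  "/selenium-server/driver/?#{URI.encode_www_form(params)}"
end

puts rc_request_path('open', ['/'], '24f8bcc')
puts rc_request_path('type', ['q', 'selenium'], '24f8bcc')
puts rc_request_path('getTitle', [], '24f8bcc')
```

Every command, whatever it does, goes through the same endpoint with the same flat parameter list, which is part of why the RC API stayed procedural.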

Slide 25

Slide 25 text

Browser automation (timeline, 2002 to 2011): Watir, Selenium

Slide 26

Slide 26 text

Browser automation (timeline, 2002 to 2011): Watir, Selenium, WebDriver

Slide 27

Slide 27 text

WebDriver
• Benefit of hindsight
  • design goal: small, object-oriented API
  • native code when needed, JavaScript elsewhere
  • still leverage HTTP as a transport
• Language support (Java, Ruby, Python, C#)
• Browser support
  • Firefox, Chrome, Opera, IE, iPhone, Android

Slide 28

Slide 28 text

WebDriver

Slide 29

Slide 29 text

Selenium

Slide 30

Slide 30 text

Browser automation (timeline, 2002 to 2011): Watir, Selenium, WebDriver

Slide 31

Slide 31 text

Browser automation (timeline, 2002 to 2011): Watir, Selenium, WebDriver

Slide 32

Slide 32 text

Browser automation (timeline, 2002 to 2011): Watir, Selenium, WebDriver, Selenium 2, Watir-Webdriver

Slide 33

Slide 33 text

Architectural themes • Emulate the user • Prove that the drivers work • You shouldn't need to understand everything • Every API call is an RPC

Slide 34

Slide 34 text

Architectural themes Emulate the user

Slide 35

Slide 35 text

Architectural themes: Emulate the user
• Designed to accurately emulate user interaction with a web application
• Use native events where possible...
• ...but make it easy to do cross-browser
• API should match user interactions
  • e.g., no fireEvent()

Slide 36

Slide 36 text

Architectural themes Prove the drivers work

Slide 37

Slide 37 text

Architectural themes: Prove the drivers work
• Extensive automated test suite
• Mostly integration tests
• Strong culture of adding tests when fixing bugs
• Challenging CI setup
  • 6 browsers on 3 OSes with 4 languages
  • 72 builds per commit

Slide 38

Slide 38 text

Architectural themes You shouldn't need to understand how everything works

Slide 39

Slide 39 text

Architectural themes: You shouldn't need to understand how everything works
• Lots of languages / technologies in use
• Architecture should allow developers to focus their talents where they'll be most productive

Slide 40

Slide 40 text

Architectural themes Every API call is an RPC

Slide 41

Slide 41 text

Architectural themes: Every API call is an RPC
• Need to communicate with external browser process
• Performance at the mercy of network latency
• Introduces tension in the API design
  • Coarseness (== improved performance) vs expressiveness and ease of use
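A rough way to see the coarseness trade-off is to count round trips. The sketch below uses a hypothetical `CountingRemote` class that stands in for the browser process and does no real I/O; the command names and the 50 ms latency figure are assumptions chosen for illustration, not WebDriver commands or measurements.

```ruby
# Hypothetical stand-in for a remote browser: every call is one round trip.
class CountingRemote
  attr_reader :round_trips

  def initialize
    @round_trips = 0
    @value = ''
  end

  def call(command, params = {})
    @round_trips += 1
    case command
    when :clear     then @value = ''
    when :send_keys then @value += params[:text]
    when :set_value then @value = params[:text] # assumed coarse command
    when :get_value then @value
    end
  end
end

# Fine-grained, expressive API: two calls to replace a field's contents.
fine = CountingRemote.new
fine.call(:clear)
fine.call(:send_keys, text: 'webdriver')

# Coarse API: one call does both, at the cost of a less user-like command.
coarse = CountingRemote.new
coarse.call(:set_value, text: 'webdriver')

latency_ms = 50 # assumed network latency per round trip
puts "fine-grained: #{fine.round_trips * latency_ms} ms"
puts "coarse: #{coarse.round_trips * latency_ms} ms"
```

The fine-grained version mirrors what a user does (clear, then type), which is the expressiveness side of the tension; the coarse version halves the latency cost.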

Slide 42

Slide 42 text

WebDriver Internals (diagram layers): WebDriver API, WebDriver SPI, JSON wire protocol, Browser

Slide 43

Slide 43 text

API: user facing

Slide 44

Slide 44 text

API: user facing
SPI: implementor facing

Slide 45

Slide 45 text

API

    driver.findElement(By.name("q")).sendKeys("webdriver");

SPI

    command  -> { command : "findElement", parameters: { using: "name", value: "q" } }
    response <- { status: 0, value : }
    command  -> { command: "sendKeys", parameters: { element: , value: ["webdriver"] } }
    response <- { status: 0 }
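On the client side, each response has to be unwrapped: a `status` of 0 means success and the `value` is handed back, anything else becomes an error. The sketch below assumes that convention; `WireError` and `unwrap_response` are hypothetical names, and the status table is a small subset recalled from the JSON wire protocol wiki rather than a complete list.

```ruby
require 'json'

# Subset of status codes from the JSON wire protocol; 0 is the only success code.
STATUS_NAMES = { 0 => 'Success', 7 => 'NoSuchElement', 13 => 'UnknownError' }.freeze

# Hypothetical error class for non-zero statuses.
class WireError < StandardError; end

# Hypothetical helper: returns the response value, or raises on failure.
def unwrap_response(body)
  response = JSON.parse(body)
  status = response['status']
  return response['value'] if status == 0

  raise WireError, STATUS_NAMES.fetch(status, "status #{status}")
end

puts unwrap_response('{"status":0,"value":"webdriver"}')
begin
  unwrap_response('{"status":7,"value":{"message":"no such element"}}')
rescue WireError => e
  puts "failed: #{e.message}"
end
```

A response without a `value`, such as the one for `sendKeys`, simply unwraps to `nil`.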

Slide 48

Slide 48 text

API: object oriented
SPI: procedural

Slide 49

Slide 49 text

API: object oriented

    WebDriver driver = new FirefoxDriver();
    WebElement input = driver.findElement(By.name("q"));
    input.sendKeys("webdriver");
    input.getAttribute("value");

SPI: procedural

    newSession(browser: 'firefox')
    findElement(using: 'name', value: 'q')
    sendKeysToElement(id: '00cbe31e', value: ['webdriver'])
    getElementAttribute(id: '00cbe31e', name: 'value')
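The relationship between the two layers can be sketched as a thin object-oriented wrapper that only ever emits procedural commands. All class names here (`RecordingSpi`, `Driver`, `Element`) are hypothetical, and the SPI is a fake that records calls and returns canned values taken from the slide; real bindings are structured differently.

```ruby
# Fake SPI: records each procedural command instead of driving a browser.
class RecordingSpi
  attr_reader :log

  def initialize
    @log = []
  end

  def execute(command, params = {})
    @log << [command, params]
    case command
    when :findElement         then '00cbe31e' # canned element id from the slide
    when :getElementAttribute then 'webdriver' # canned attribute value
    end
  end
end

# User-facing element object: remembers its id so the caller never sees it.
class Element
  def initialize(spi, id)
    @spi = spi
    @id = id
  end

  def send_keys(*keys)
    @spi.execute(:sendKeysToElement, id: @id, value: keys)
    nil
  end

  def attribute(name)
    @spi.execute(:getElementAttribute, id: @id, name: name)
  end
end

# User-facing driver object: starts a session and hands out Element objects.
class Driver
  def initialize(spi, browser)
    @spi = spi
    @spi.execute(:newSession, browser: browser)
  end

  def find_element(using, value)
    Element.new(@spi, @spi.execute(:findElement, using: using, value: value))
  end
end

spi = RecordingSpi.new
driver = Driver.new(spi, 'firefox')
input = driver.find_element('name', 'q')
input.send_keys('webdriver')
input.attribute('value')
spi.log.each { |command, params| puts "#{command} #{params}" }
```

The objects carry the state (session, element id) that the procedural layer needs as explicit arguments, which is the whole difference between the two columns on the slide.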

Slide 50

Slide 50 text

API: object oriented, user facing
SPI: procedural, implementor facing

Slide 51

Slide 51 text

Layers: API, SPI, wire protocol, browser automation (atoms, native code)

Slide 52

Slide 52 text

Layers: API, SPI, wire protocol, browser automation (atoms, native code)
Labels: browser automation standard; reference implementation

Slide 53

Slide 53 text

Wire protocol • Reference implementation of the Service Provider Interface (SPI) • In spec terms, not a requirement for implementors • Though comes with some benefits • e.g. existing clients in multiple languages

Slide 54

Slide 54 text

Wire protocol: language binding (Java, Ruby, C#, Python) <-> browser (Firefox, Opera, Chrome, IE)

Slide 55

Slide 55 text

Wire protocol: language binding (Java, Ruby, C#, Python) <-> JSON over HTTP <-> browser (Firefox, Opera, Chrome, IE)

Slide 56

Slide 56 text

Wire protocol: language binding (Java, Ruby, C#, Python) as client <-> JSON over HTTP <-> browser (Firefox, Opera, Chrome, IE) as server

Slide 57

Slide 57 text

Wire protocol, on Machine A: language binding (Java, Ruby, C#, Python) <-> JSON over HTTP <-> browser (Firefox, Opera, Chrome, IE)

Slide 58

Slide 58 text

Wire protocol
Machine A: language bindings (Java, Ruby, Python, C#)
  <-> JSON over HTTP <->
Machine B: Remote WebDriver Server
  <-> JSON over HTTP <->
Browser (Firefox, Opera, Chrome, IE)

Slide 59

Slide 59 text

Wire protocol
CI server: language bindings (Java, Ruby, Python, C#)
  <-> JSON over HTTP <->
Selenium 2 Grid Hub
  <-> JSON over HTTP <->
Grid Node, Grid Node, Grid Node

Slide 60

Slide 60 text

Wire protocol
Maps SPI commands to RESTish HTTP resources

    newSession(browser: 'firefox')
      POST /session
    findElement(sessionId: 'c688f8e4', using: 'name', value: 'q')
      POST /session/:sessionId/element
    sendKeysToElement(id: '00cbe31e', value: ['webdriver'])
      POST /session/:sessionId/element/:id/value
    getElementAttribute(id: '00cbe31e', name: 'value')
      GET /session/:sessionId/element/:id/attribute/:name

Slide 61

Slide 61 text

Wire protocol
Maps SPI commands to RESTish HTTP resources

    newSession(browser: 'firefox')
      POST /session
    findElement(sessionId: 'c688f8e4', using: 'name', value: 'q')
      POST /session/c688f8e4/element
    sendKeysToElement(id: '00cbe31e', value: ['webdriver'])
      POST /session/c688f8e4/element/00cbe31e/value
    getElementAttribute(id: '00cbe31e', name: 'value')
      GET /session/c688f8e4/element/00cbe31e/attribute/value
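The command-to-resource mapping amounts to a lookup table plus template substitution. The sketch below covers only the four commands from the slide and uses their simplified parameter names; the `COMMANDS` table and `to_request` helper are hypothetical, and no HTTP request is actually sent.

```ruby
# The four commands from the slide: HTTP verb plus path template.
COMMANDS = {
  newSession:          [:post, '/session'],
  findElement:         [:post, '/session/:sessionId/element'],
  sendKeysToElement:   [:post, '/session/:sessionId/element/:id/value'],
  getElementAttribute: [:get,  '/session/:sessionId/element/:id/attribute/:name']
}.freeze

# Hypothetical helper: resolves a command to [verb, path, body parameters].
# Parameters named in the path template are moved into the URL; whatever
# is left over would be sent as the JSON body.
def to_request(command, params)
  verb, template = COMMANDS.fetch(command)
  remaining = params.dup
  path = template.gsub(/:(\w+)/) do
    key = Regexp.last_match(1).to_sym
    remaining.fetch(key) # raises KeyError if a path parameter is missing
    remaining.delete(key).to_s
  end
  [verb, path, remaining]
end

p to_request(:newSession, browser: 'firefox')
p to_request(:findElement, sessionId: 'c688f8e4', using: 'name', value: 'q')
p to_request(:sendKeysToElement, sessionId: 'c688f8e4', id: '00cbe31e', value: ['webdriver'])
p to_request(:getElementAttribute, sessionId: 'c688f8e4', id: '00cbe31e', name: 'value')
```

Note that the trailing `/value` in the send-keys path is a literal segment, so the `value` parameter stays in the body, while `:name` in the attribute path consumes the `name` parameter.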

Slide 62

Slide 62 text

Wire protocol http://code.google.com/p/selenium/wiki/JsonWireProtocol

Slide 63

Slide 63 text

Automation Atoms • Shared JavaScript library • Google Closure Compiler / Library • Advanced compilation

Slide 64

Slide 64 text

Automation Atoms
JavaScript Atoms (built on Google Closure Library) -> Google Closure Compiler ->
• Firefox extension (atoms.js)
• ChromeDriver (atoms.h)
• IE driver (atoms.h)
• Opera driver (OperaAtoms.java)

Slide 65

Slide 65 text

Automation Atoms

    /**
     * @param {!Element} elem The element to consider.
     * @return {string} visible text.
     */
    bot.dom.getVisibleText = function(elem) {
      var lines = [];
      bot.dom.appendVisibleTextLinesFromElement_(elem, lines);
      lines = goog.array.map(
          lines,
          bot.dom.trimExcludingNonBreakingSpaceCharacters_);
      var joined = lines.join('\n');
      var trimmed = bot.dom.trimExcludingNonBreakingSpaceCharacters_(joined);

      // Replace non-breakable spaces with regular ones.
      return trimmed.replace(/\xa0/g, ' ');
    };

Slide 66

Slide 66 text

Drivers: client, JS HTTPD, wire protocol, Firefox, Dispatcher / Command Processor, JS atoms, XPCOM

Slide 67

Slide 67 text

Drivers: client, wire protocol, chromedriver, mongoose.c HTTPD, JS atoms, Chrome Automation Proxy, IPC

Slide 68

Slide 68 text

Drivers: client, wire protocol, IEDriver.dll, mongoose.c HTTPD, JS atoms, IE COM APIs, COM

Slide 69

Slide 69 text

Drivers: client, wire protocol, OperaDriver, Jetty HTTPD, JS atoms, Opera Scope (protobuf)

Slide 70

Slide 70 text

Demo
Goal: Implement the ability to maximize the browser window
Step 1: Ruby + Firefox
Step 2: Java client + Remote Server

    driver.manage().window().maximize();

Slide 72

Slide 72 text

Getting involved • Ask for help and input early • Keep the change small • Write tests

Slide 73

Slide 73 text

Getting involved • selenium-developers mailing list • #selenium on irc.freenode.net • http://code.google.com/p/selenium/ • http://code.google.com/p/selenium/issues/list • Look for the GettingInvolved label

Slide 74

Slide 74 text

Questions?