Slide 1

Slide 1 text

Create Go WebDriver client from scratch
2021.11.13, Go Conference 2021 Autumn
Kazuki Higashiguchi (@hgsgtk)
© 2012-2019 BASE, Inc. © 2012-2021 BASE BANK, Inc.

Slide 2

Slide 2 text

Goal of the talk
1. See demo Go code to gain in-depth knowledge of the relevant technologies.
2. Learn about WebDriver. You’ll be able to imagine how browser automation is implemented.
3. You’ll want to create something from scratch with Go’s nice standard libraries.

Slide 3

Slide 3 text

Kazuki Higashiguchi
● Engineering Manager @ BASE BANK, a subsidiary of BASE
● Gopher for 4+ years
● Twitter: @hgsgtk
● GitHub: @hgsgtk

Slide 4

Slide 4 text

Overview of WebDriver

Slide 5

Slide 5 text

Selenium WebDriver
Selenium WebDriver is an API that allows us to write automated tests for web applications.
● Provides interfaces to discover and manipulate DOM elements and to control user agents
● Supports a range of browsers through drivers (ChromeDriver, geckodriver, Microsoft Edge Driver, etc.)
● Can be used from many programming languages (JavaScript, Python, Ruby, Go, etc.)
  ○ Well-known Go libraries: tebeka/selenium, sclevine/agouti

Slide 6

Slide 6 text

The specification of WebDriver
The WebDriver specification is published as a W3C Recommendation.

Slide 7

Slide 7 text

Communication protocol
● Provides an HTTP-compliant wire protocol
● Consists of a local end and a remote end *1
[Diagram: the Local end talks over HTTP to the Remote end (ChromeDriver), which controls the User Agent]
*1 There are two node types of remote end: intermediary node and endpoint node. See the W3C document for details.

Slide 8

Slide 8 text

WebDriver protocol is organised into commands
Each command is a single HTTP request whose method and URL are defined in the W3C specification, and each command produces a single HTTP response.

Slide 9

Slide 9 text

Commands and information structure
Commands
● New session: POST /session
● Find element: POST /session/{session id}/element
● Element click: POST /session/{session id}/element/{element id}/click
● Take screenshot: GET /session/{session id}/screenshot
● New window (tab): POST /session/{session id}/window/new
● Check status: GET /status
● etc.
[Diagram: information structure. Session: session id, element, timeouts, screenshot, cookie, frame, window, etc. Element: element id, text, click, etc. Status.]
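
The command list above maps naturally onto a small table of method and URL template pairs. The following is a minimal sketch, not gowd's actual code: the `command` type, the variable names and the `endpoint` helper are all hypothetical, and only the methods and paths come from the slide.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// command pairs an HTTP method with a URL template, written the way the
// slide writes it. The type is hypothetical, for illustration only.
type command struct {
	Method string
	Path   string
}

// A few of the commands listed on the slide.
var (
	newSession     = command{http.MethodPost, "/session"}
	findElement    = command{http.MethodPost, "/session/{session id}/element"}
	elementClick   = command{http.MethodPost, "/session/{session id}/element/{element id}/click"}
	takeScreenshot = command{http.MethodGet, "/session/{session id}/screenshot"}
	checkStatus    = command{http.MethodGet, "/status"}
)

// endpoint fills the placeholders and returns "METHOD /path".
// Pass "" for an id the command does not use.
func (c command) endpoint(sessionID, elementID string) string {
	r := strings.NewReplacer("{session id}", sessionID, "{element id}", elementID)
	return c.Method + " " + r.Replace(c.Path)
}

func main() {
	// The ids below are made-up sample values.
	for _, c := range []command{newSession, findElement, elementClick, takeScreenshot, checkStatus} {
		fmt.Println(c.endpoint("0123abcd", "e1"))
	}
}
```

Keeping the templates as data makes it easy to see that every command is just a method plus a path with at most two ids filled in.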

Slide 10

Slide 10 text

Remote end listens for incoming HTTP requests
For instance, ChromeDriver launched locally starts listening on port 9515.
[Diagram: Local end, Remote end (ChromeDriver, listening on port 9515), User Agent]
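
A quick way to confirm that a remote end is listening is the Check status command (GET /status). Here is a minimal sketch using only the standard library. It assumes ChromeDriver is running locally on port 9515 and that the response carries `ready` and `message` fields inside `value`, as the W3C specification describes. The function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// statusResponse mirrors the {"value": {...}} envelope of a GET /status reply.
type statusResponse struct {
	Value struct {
		Ready   bool   `json:"ready"`
		Message string `json:"message"`
	} `json:"value"`
}

// parseStatus extracts the ready flag and message from a response body.
func parseStatus(body []byte) (bool, string, error) {
	var s statusResponse
	if err := json.Unmarshal(body, &s); err != nil {
		return false, "", fmt.Errorf("decode status: %w", err)
	}
	return s.Value.Ready, s.Value.Message, nil
}

// fetchStatus sends GET /status to the remote end at baseURL.
func fetchStatus(baseURL string) (bool, string, error) {
	resp, err := http.Get(baseURL + "/status")
	if err != nil {
		return false, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, "", err
	}
	return parseStatus(body)
}

func main() {
	// Assumption: ChromeDriver was started locally on port 9515.
	ready, msg, err := fetchStatus("http://localhost:9515")
	if err != nil {
		fmt.Println("remote end not reachable:", err)
		return
	}
	fmt.Println("ready:", ready, "message:", msg)
}
```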

Slide 11

Slide 11 text

WebDriver client created from scratch

Slide 12

Slide 12 text

gowd: self-made WebDriver client
gowd: Go WebDriver client, implementing the local end of the protocol (https://github.com/hgsgtk/gowd) *1
[Diagram: Local end (gowd), HTTP, Remote end (ChromeDriver), User Agent]
*1 This is just for the presentation at Go Conference 2021 Autumn.

Slide 13

Slide 13 text

Features of gowd
Feature list
○ Open browser / Close browser
○ Navigate to page / Get current URL
○ Find element / Get element text
○ Click element
○ New window (tab)
○ Take screenshot
Uses only the Go standard libraries

Slide 14

Slide 14 text

Deep dive into WebDriver client

Slide 15

Slide 15 text

Common browser automation code steps
Open browser -> Navigate to page -> Find element -> Click element -> Take screenshot -> Close browser

Slide 16

Slide 16 text

Deep dive > Sub agenda
1. Open browser
2. Close browser
3. Navigate to page
4. Find element
5. Click element
6. Take screenshot

Slide 17

Slide 17 text

Deep dive > 1. Open browser

Slide 18

Slide 18 text

1. Open browser
Common interface of WebDriver client libraries:
● Create a WebDriver (e.g. NewWebDriver(), NewChromeDriver(), ...)
● Open a browser (e.g. driver.New(), ...)

Slide 19

Slide 19 text

Open browser via curl
1. Send an HTTP request
2. A JSON response is returned from ChromeDriver
3. The browser is opened

Slide 20

Slide 20 text

Internal implementation to open browser
1. Send a POST /session request
   a. This opens a new browser
2. Decode the JSON response body from the remote end
3. Keep the session id to use afterwards
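
The three steps above can be sketched as follows. This is an illustrative sketch under assumptions, not gowd's actual code. The `Browser` type and function names here are hypothetical, and the request body (`capabilities` / `alwaysMatch` with `browserName`) and the `value.sessionId` response field follow the New Session command as described in the W3C specification.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// Browser keeps what later commands need: the remote end's address and
// the session id. Hypothetical type for this sketch.
type Browser struct {
	BaseURL   string
	SessionID string
}

// newSessionBody builds the JSON payload for POST /session.
func newSessionBody() ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"capabilities": map[string]interface{}{
			"alwaysMatch": map[string]interface{}{"browserName": "chrome"},
		},
	})
}

// decodeSessionID pulls value.sessionId out of the response body (step 2).
func decodeSessionID(body []byte) (string, error) {
	var res struct {
		Value struct {
			SessionID string `json:"sessionId"`
		} `json:"value"`
	}
	if err := json.Unmarshal(body, &res); err != nil {
		return "", err
	}
	if res.Value.SessionID == "" {
		return "", fmt.Errorf("no session id in response: %s", body)
	}
	return res.Value.SessionID, nil
}

// OpenBrowser sends the request (step 1) and keeps the session id (step 3).
func OpenBrowser(baseURL string) (*Browser, error) {
	payload, err := newSessionBody()
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(baseURL+"/session", "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	id, err := decodeSessionID(body)
	if err != nil {
		return nil, err
	}
	return &Browser{BaseURL: baseURL, SessionID: id}, nil
}

func main() {
	// Decode a made-up sample response instead of opening a real browser.
	sample := []byte(`{"value":{"sessionId":"0123abcd","capabilities":{"browserName":"chrome"}}}`)
	id, err := decodeSessionID(sample)
	fmt.Println(id, err)
}
```

With ChromeDriver running locally, `OpenBrowser("http://localhost:9515")` would be the call that actually opens a browser window.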

Slide 21

Slide 21 text

One session represents one browser
[Diagram: information structure, as on slide 9]

Slide 22

Slide 22 text

Deep dive > 2. Close browser

Slide 23

Slide 23 text

2. Close browser
Browser.Close() closes the browser.
Send a DELETE /session/{session id} request. The browser identified by the session id will close.
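
`net/http` has no shortcut for DELETE like `http.Get` or `http.Post`, so the request is built with `http.NewRequest`. Here is a minimal sketch with hypothetical function names. It is not gowd's actual Browser.Close().

```go
package main

import (
	"fmt"
	"net/http"
)

// closeRequest builds the DELETE /session/{session id} request.
func closeRequest(baseURL, sessionID string) (*http.Request, error) {
	return http.NewRequest(http.MethodDelete, baseURL+"/session/"+sessionID, nil)
}

// closeBrowser sends the request and treats any non-200 status as a failure.
func closeBrowser(baseURL, sessionID string) error {
	req, err := closeRequest(baseURL, sessionID)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("delete session: unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	// Only build and print the request; the session id is a made-up sample.
	req, err := closeRequest("http://localhost:9515", "0123abcd")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```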

Slide 24

Slide 24 text

Deep dive > 3. Navigate to page

Slide 25

Slide 25 text

3. Navigate to page
Browser.NavigateTo() navigates the browser to a given URL.
● Send a POST /session/{session id}/url request
  ○ Specify the URL in the request body
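
A minimal sketch of this command is shown below. It assumes the request body is a JSON object with a single `url` field, as in the W3C Navigate To command. The function names are hypothetical and do not reflect gowd's internals.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// navigateBody builds the JSON payload {"url": "..."} for the command.
func navigateBody(target string) ([]byte, error) {
	return json.Marshal(map[string]string{"url": target})
}

// navigateTo sends POST /session/{session id}/url to the remote end.
func navigateTo(baseURL, sessionID, target string) error {
	body, err := navigateBody(target)
	if err != nil {
		return err
	}
	endpoint := baseURL + "/session/" + sessionID + "/url"
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("navigate to %s: unexpected status %s", target, resp.Status)
	}
	return nil
}

func main() {
	// Print the payload only; no request is sent here.
	body, err := navigateBody("https://example.com")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(body))
}
```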

Slide 26

Slide 26 text

Deep dive > 4. Find element

Slide 27

Slide 27 text

4. Find element
Browser.FindElement() finds an element in the current browser by using a locator strategy (“link text”). The code example looks for the element whose link text is “More information…”.

Slide 28

Slide 28 text

Locator strategy
An element locator strategy is an enumerated attribute that decides how to search for elements in the current browser.
● css selector: “div > a”
● link text: “More information…”
● partial link text: “Mor”
● tag name: “”
● xpath: “//div/a”

Slide 29

Slide 29 text

Internal implementation to find element
1. Send a POST /session/{session id}/element request
   a. Specify the locator strategy and search value in the request body
2. Decode the JSON response body from the remote end
3. Keep the element id to use afterwards
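
The request body for Find element could be built as in the sketch below. It assumes the `using` and `value` field names from the W3C Find Element command, and the constants and the `findElementBody` helper are hypothetical names. The payload is sent with POST /session/{session id}/element, matching the command list earlier in the deck.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Locator strategies, using the strings from the locator strategy slide.
const (
	CSSSelector     = "css selector"
	LinkText        = "link text"
	PartialLinkText = "partial link text"
	TagName         = "tag name"
	XPath           = "xpath"
)

// findElementBody builds the payload for POST /session/{session id}/element
// and rejects strategies that are not in the enumeration.
func findElementBody(using, value string) ([]byte, error) {
	switch using {
	case CSSSelector, LinkText, PartialLinkText, TagName, XPath:
	default:
		return nil, fmt.Errorf("unknown locator strategy %q", using)
	}
	return json.Marshal(struct {
		Using string `json:"using"`
		Value string `json:"value"`
	}{Using: using, Value: value})
}

func main() {
	body, err := findElementBody(LinkText, "More information...")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(body))
}
```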

Slide 30

Slide 30 text

Internal implementation to get element id
Get the element id from the JSON response. The key element-6066-11e4-a52e-4f735466cecf is a string constant called the web element identifier. *1
*1 The old WebDriver JSON protocol uses the `ELEMENT` key.
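
Reading the element id then comes down to a map lookup with the web element identifier as the key. A minimal sketch follows, with a fallback to the legacy `ELEMENT` key. The function name and the sample response are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	// webElementIdentifier is the key defined by the W3C specification.
	webElementIdentifier = "element-6066-11e4-a52e-4f735466cecf"
	// legacyElementKey is the key used by the old JSON wire protocol.
	legacyElementKey = "ELEMENT"
)

// decodeElementID returns the element id found in a Find element response.
func decodeElementID(body []byte) (string, error) {
	var res struct {
		Value map[string]interface{} `json:"value"`
	}
	if err := json.Unmarshal(body, &res); err != nil {
		return "", err
	}
	for _, key := range []string{webElementIdentifier, legacyElementKey} {
		if id, ok := res.Value[key].(string); ok && id != "" {
			return id, nil
		}
	}
	return "", fmt.Errorf("no element id in response: %s", body)
}

func main() {
	sample := []byte(`{"value":{"element-6066-11e4-a52e-4f735466cecf":"e1-sample"}}`)
	id, err := decodeElementID(sample)
	fmt.Println(id, err)
}
```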

Slide 31

Slide 31 text

Element represents an element
[Diagram: information structure, as on slide 9]

Slide 32

Slide 32 text

Deep dive > 5. Click element

Slide 33

Slide 33 text

5. Click element
Element.Click() clicks the element identified by the element id.
Send a POST /session/{session id}/element/{element id}/click request.
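
Click takes no parameters, but as far as I understand the W3C specification, the body of a POST command still has to parse as a JSON object, so this sketch sends an empty object `{}`. Treat that as an assumption to verify against your driver. The function name is hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// clickRequest builds POST /session/{session id}/element/{element id}/click
// with an empty JSON object as the body.
func clickRequest(baseURL, sessionID, elementID string) (*http.Request, error) {
	endpoint := fmt.Sprintf("%s/session/%s/element/%s/click", baseURL, sessionID, elementID)
	req, err := http.NewRequest(http.MethodPost, endpoint, strings.NewReader("{}"))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Only build and print the request; the ids are made-up samples.
	req, err := clickRequest("http://localhost:9515", "0123abcd", "e1-sample")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

Sending the request is then a matter of `http.DefaultClient.Do(req)` followed by a status check.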

Slide 34

Slide 34 text

Deep dive > 6. Take screenshot

Slide 35

Slide 35 text

6. Take screenshot
Browser.TakeScreenshot() takes a screenshot of the current browser.
[Image: example.com.png]

Slide 36

Slide 36 text

Internal implementation to take screenshot
1. Send GET /session/{session id}/screenshot
2. Decode the JSON response body from the remote end
3. Decode the Base64-encoded screenshot image
[Diagram: the Remote end (ChromeDriver) returns a Base64-encoded string to the Local end]
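
Steps 2 and 3 can be sketched with `encoding/json` and `encoding/base64`. The sketch assumes the response has the shape `{"value": "<Base64 string>"}` and uses the standard (not URL-safe) alphabet. The function names are hypothetical, and the sample payload contains only the 8-byte PNG signature rather than a real image.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// pngSignature is the fixed 8-byte header that starts every PNG file.
var pngSignature = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

// decodeScreenshot decodes the JSON body, then the Base64 string inside it.
func decodeScreenshot(body []byte) ([]byte, error) {
	var res struct {
		Value string `json:"value"`
	}
	if err := json.Unmarshal(body, &res); err != nil {
		return nil, err
	}
	img, err := base64.StdEncoding.DecodeString(res.Value)
	if err != nil {
		return nil, fmt.Errorf("decode base64 screenshot: %w", err)
	}
	return img, nil
}

// isPNG reports whether data starts with the PNG signature.
func isPNG(data []byte) bool {
	return bytes.HasPrefix(data, pngSignature)
}

func main() {
	// Sample body whose value is the Base64 form of the PNG signature only.
	sample := []byte(`{"value":"iVBORw0KGgo="}`)
	img, err := decodeScreenshot(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(img), isPNG(img))
}
```

The decoded bytes can be written straight to a file, for example with `os.WriteFile("example.com.png", img, 0o644)`.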

Slide 37

Slide 37 text

Remote end returns base64-encoded PNG image
The W3C Recommendation describes the specification of screenshots.
● Dumps a snapshot as a lossless PNG image *1 *2
● The PNG image is returned as a Base64-encoded string
ChromeDriver / web_view.h: https://source.chromium.org/chromium/chromium/src/+/master:chrome/test/chromedriver/chrome/web_view.h;l=232
*1 PNG was created as a lossless image format. It is supposed to preserve all details of an image exactly.
*2 Lossy formats (like JPEG) produce much smaller files because they discard details judged unnecessary.

Slide 38

Slide 38 text

Deep dive into Base64 encoding
There are two types of “Base64 encoding”, defined in RFC 4648 and its predecessor RFC 3548.
● Base64: Base 64 Encoding
● Base64url: Base 64 Encoding with URL and Filename Safe Alphabet
src/encoding/base64/base64.go: https://pkg.go.dev/encoding/base64#pkg-variables

Slide 39

Slide 39 text

Difference between Base64 and Base64url
In Base64url, two characters of the alphabet are replaced:
● + -> -
● / -> _
src/encoding/base64/base64.go: https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/encoding/base64/base64.go;l=35
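
The difference is easy to see with input bytes that happen to hit the last two characters of the alphabet. Here is a small sketch using `base64.StdEncoding` and `base64.URLEncoding` from the standard library. The helper names are hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeBoth returns the Base64 and Base64url encodings of the same bytes.
func encodeBoth(data []byte) (std, urlSafe string) {
	return base64.StdEncoding.EncodeToString(data), base64.URLEncoding.EncodeToString(data)
}

// decodesAsStd reports whether s is valid input for the standard alphabet.
func decodesAsStd(s string) bool {
	_, err := base64.StdEncoding.DecodeString(s)
	return err == nil
}

func main() {
	// 0xfb 0xff maps to alphabet indexes 62 and 63, the two that differ.
	std, urlSafe := encodeBoth([]byte{0xfb, 0xff})
	fmt.Println(std, urlSafe)

	// A Base64url string containing '-' or '_' is rejected by StdEncoding.
	fmt.Println(decodesAsStd(std), decodesAsStd(urlSafe))
}
```

This is why a screenshot payload has to be decoded with the same alphabet the remote end used to encode it.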

Slide 40

Slide 40 text

More Base N data encodings
Standard | Definition and use case | Go implementation
Base16 | RFC4648 |
Base32 | RFC4648 | encoding/base32
Base36 | Used by URL shortening services (e.g. TinyURL) *1 |
Base45 | Draft IETF Specification | Dasio/base45
Base58 | Used by bitcoin addresses | itchyny/base58-go
Base64 | RFC4648 | encoding/base64
Base85 (Ascii85) | RFC1924; used by Adobe’s PostScript and PDF |
Base91 (basE91) | Developed by Joachim Henke |
Base92, 94, 95 | - |
*1 https://en.wikipedia.org/wiki/Binary-to-text_encoding#Base58

Slide 41

Slide 41 text

Summary
1. A WebDriver client communicates with the remote end (e.g. ChromeDriver) over HTTP.
2. WebDriver provides interfaces to discover and manipulate DOM elements and to control user agents.
3. A WebDriver client can be built with only the Go standard libraries (e.g. net/http, encoding/json, encoding/base64).

Slide 42

Slide 42 text

More BASE! #basebank-code-reading-ja
BASE BANK holds a Go code reading party on a regular basis. The next session will be on 2021.11.25 (Thu).
● #basebank-code-reading-ja (in the Gophers Slack workspace)
● basebank/gophers-code-reading-party (GitHub repository)
https://github.com/basebank/gophers-code-reading-party/issues/16