Slide 1

Slide 1 text

How WebKit Works
Adam Barth (abarth)
October 30, 2012

Slide 2

Slide 2 text

What is WebKit?
● WebKit is a rendering engine for web content
● WebKit is not a browser, a science project, or the solution to every problem
[Diagram: HTML, JavaScript, and CSS go into WebKit, which produces the rendering of a web page]

Slide 3

Slide 3 text

Major Components
● WebCore (HTML, CSS, DOM, etc, etc) [labeled "This talk"]
● WTF (Data structures, Threading primitives)
● Platform (Network, Storage, Graphics)
● JavaScriptCore (JavaScript Virtual Machine)
● Bindings (JavaScript API, Objective-C API)
● WebKit and WebKit2 (Embedding API)

Slide 4

Slide 4 text

Life of Web Page
[Diagram: Network → Loader → HTML Parser → DOM → Render Tree → Graphics Context; Script connects to the DOM, and CSS feeds the Render Tree]

Slide 5

Slide 5 text

Pages, Frames, and Documents
[Diagram: a Page holds a Main Frame; the Main Frame and each of the four other Frames in the tree holds its own Document]
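
The ownership shown on this slide can be sketched as a small tree of objects. This is a minimal illustration, not WebCore's actual classes: the field names, the `walk` helper, the document labels, and the choice of which frame holds the nested child are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Document:
    url: str  # hypothetical label; stands in for the loaded content


@dataclass
class Frame:
    document: Document
    children: List["Frame"] = field(default_factory=list)

    def walk(self) -> Iterator["Frame"]:
        """Yield this frame, then every frame below it, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


@dataclass
class Page:
    main_frame: Frame


# One page, a main frame, three child frames, and one nested frame
# (the nesting position is assumed, not taken from the slide).
page = Page(Frame(Document("main"), [
    Frame(Document("a")),
    Frame(Document("b"), [Frame(Document("nested"))]),
    Frame(Document("c")),
]))
urls = [frame.document.url for frame in page.main_frame.walk()]
```

Walking from the main frame visits five frames and five documents, the same counts as the diagram.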

Slide 6

Slide 6 text

Lifecycle of a Frame
[Diagram, frame states: Uninitialized, Initial Document, Provisional, Ready to Commit, Committed, Checking Policy]
● Committed is the quiescent state

Slide 7

Slide 7 text

How the Loader Works (Idealized)
[Diagram: MemoryCache, CachedResourceLoader, CachedResource, ResourceRequest, ResourceLoader, ResourceHandle, CachedResourceRequest; part of the diagram is labeled "Platform-specific code"]
The Loader is actually very messy and complicated, but we have a long-term project to clean up its nuttiness.

Slide 8

Slide 8 text

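How the HTML Parser Works
Bytes → Characters → Tokens → Nodes → DOM
(The Tokenizer produces the Tokens; the TreeBuilder produces the Nodes.)
● Bytes: 3C 62 6F 64 79 3E 48 65 6C 6C 6F 2C 20 3C 73 70 61 6E 3E 77 6F 72 6C 64 21 3C 2F 73 70 61 6E 3E 3C 2F 62 6F 64 79 3E
● Characters: <body>Hello, <span>world!</span></body>
● Tokens: StartTag: body, "Hello, ", StartTag: span, "world!", EndTag: span
● Nodes: body, "Hello, ", span, "world!"
● DOM: body contains "Hello, " and span; span contains "world!"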
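
The Bytes → Characters → Tokens → DOM stages can be approximated in a few lines. The sketch below uses Python's standard `html.parser` module as a stand-in tokenizer and a hand-written stack as a stand-in tree builder; it is not WebKit's HTMLTokenizer or HTMLTreeBuilder. The names `TokenCollector`, `tokenize`, and `build_tree` are invented for the example, and the ASCII decoding is an assumption.

```python
from html.parser import HTMLParser

# The byte sequence shown on the slide.
RAW = bytes.fromhex(
    "3C 62 6F 64 79 3E 48 65 6C 6C 6F 2C 20 3C 73 70 61 6E 3E "
    "77 6F 72 6C 64 21 3C 2F 73 70 61 6E 3E 3C 2F 62 6F 64 79 3E"
)


class TokenCollector(HTMLParser):
    """Records start tags, end tags, and character data as simple tuples."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("StartTag", tag))

    def handle_endtag(self, tag):
        self.tokens.append(("EndTag", tag))

    def handle_data(self, data):
        self.tokens.append(("Character", data))


def tokenize(characters):
    collector = TokenCollector()
    collector.feed(characters)
    collector.close()
    return collector.tokens


def build_tree(tokens):
    """Turn a token list into nested dicts using a stack of open elements."""
    root = {"name": "#document", "children": []}
    open_elements = [root]
    for kind, value in tokens:
        if kind == "StartTag":
            node = {"name": value, "children": []}
            open_elements[-1]["children"].append(node)
            open_elements.append(node)
        elif kind == "EndTag":
            # Only pop when the end tag matches the innermost open element.
            if len(open_elements) > 1 and open_elements[-1]["name"] == value:
                open_elements.pop()
        else:
            open_elements[-1]["children"].append(value)
    return root


characters = RAW.decode("ascii")   # Bytes -> Characters
tokens = tokenize(characters)      # Characters -> Tokens
dom = build_tree(tokens)           # Tokens -> Nodes -> DOM
```

A real tree builder also handles misnested tags, implied elements, and insertion modes; the stack here covers only well-formed input like the slide's example.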

Slide 9

Slide 9 text

Preload Scanning for Fun and Profit
[Diagram: the input stream "Mary had a little lamb" feeds the Tokenizer and TreeBuilder; a script calls document.write("");]
Script execution can change the input stream
Preload scanner tokenizes ahead
● When parser is blocked on external scripts
● Starts resource loads earlier
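
The idea of tokenizing ahead while the main parser is blocked can be sketched as a second, lightweight pass over the not-yet-parsed input that only collects resource URLs. This is an illustration under stated assumptions, not WebKit's preload scanner: `PreloadScanner`, `scan_ahead`, the tag-to-attribute table, and the sample markup are all invented for the example.

```python
from html.parser import HTMLParser


class PreloadScanner(HTMLParser):
    """Collects URLs from tags that usually trigger a resource load."""

    # Which attribute carries the URL for each tag this sketch looks at.
    URL_ATTRIBUTE = {"img": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRIBUTE.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def scan_ahead(unparsed_markup):
    """Return URLs found in markup the main parser has not reached yet."""
    scanner = PreloadScanner()
    scanner.feed(unparsed_markup)
    scanner.close()
    return scanner.urls


markup = (
    '<script src="a.js"></script>'
    "<p>Mary had a little lamb</p>"
    '<img src="lamb.png">'
    '<link rel="stylesheet" href="style.css">'
)
# Pretend the main parser is blocked waiting for a.js to download.
blocked_at = markup.index("</script>") + len("</script>")
preloads = scan_ahead(markup[blocked_at:])
```

Because the script may call document.write and change the input stream, results like `preloads` are only hints: the loads can start early, but the main parser still has to parse the real stream once it resumes.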

Slide 10

Slide 10 text

XSSAuditor
[Diagram: HTTP Request, HTTP Response, Tokenizer, XSSAuditor, TreeBuilder]
XSSAuditor examines token stream
Looks for scripts that were also in the request
● Assumes those scripts were reflected XSS
● Blocks them
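
The heuristic on this slide (a script that also appears in the request is assumed to be reflected) can be sketched as a substring check against the decoded request URL. This is a deliberately simplified illustration, not the XSSAuditor's actual matching logic; `is_reflected`, `filter_scripts`, and the example URL are assumptions made for the example.

```python
from urllib.parse import unquote_plus


def is_reflected(script_source, request_url):
    """True if the script text also appears in the URL-decoded request."""
    snippet = script_source.strip()
    if not snippet:
        return False
    return snippet in unquote_plus(request_url)


def filter_scripts(script_sources, request_url):
    """Drop scripts that look reflected; keep the rest in order."""
    return [s for s in script_sources if not is_reflected(s, request_url)]


request = "http://example.com/search?q=%3Cscript%3Ealert(1)%3C/script%3E"
allowed = filter_scripts(["alert(1)", "render()"], request)
```

A plain substring test like this is easy to evade and can also produce false positives; it only shows the shape of the comparison between the token stream and the request.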

Slide 11

Slide 11 text

DOM + CSS → Render Tree
[Diagram, DOM: html, head, title ("Greeting"), body, "Hello, ", span, "world!", img]
CSS:
#footer { position: fixed; bottom: 0; left: 0 }
body > span { font-weight: bold; }
[Diagram, render tree, labeled "Layout": Render Block, Render Inline, Render Text, Render Image, Render Text (bold), Render Block (fixed)]

Slide 12

Slide 12 text

Anonymous RenderObjects
[Diagram, DOM: a div containing "Hello, " and a div containing "world!"]
[Diagram, render tree: Render Block, Render Block, Render Text, Render Block, Render Text; one Render Block is labeled "Anonymous"]
● Not every RenderObject has a DOM Node
● Every RenderBlock either:
  ○ Has all inline children
  ○ Has no inline children
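
The all-inline-or-no-inline rule can be enforced by wrapping each run of inline children in an anonymous block whenever a block has mixed children. The sketch below shows that idea on a toy data model; `RenderObject`, its fields, and `wrap_inline_runs` are invented for the example and do not mirror WebCore's classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RenderObject:
    kind: str                    # "block" or "inline"
    node: Optional[str] = None   # DOM node label; None means anonymous
    children: List["RenderObject"] = field(default_factory=list)


def wrap_inline_runs(block):
    """Wrap runs of inline children in anonymous blocks if children are mixed."""
    kinds = {child.kind for child in block.children}
    if kinds != {"block", "inline"}:
        return  # already all inline, all block, or empty
    result = []
    run = None  # the anonymous block currently collecting inline children
    for child in block.children:
        if child.kind == "inline":
            if run is None:
                run = RenderObject(kind="block")  # no DOM node: anonymous
                result.append(run)
            run.children.append(child)
        else:
            run = None
            result.append(child)
    block.children = result


# The slide's example: a div holding "Hello, " and a div holding "world!".
outer = RenderObject("block", "div", [
    RenderObject("inline", "Hello, "),
    RenderObject("block", "div", [RenderObject("inline", "world!")]),
])
wrap_inline_runs(outer)
```

After the call, the outer block has only block children, and the first of them has no DOM node, matching the anonymous block in the diagram.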

Slide 13

Slide 13 text

LayerTree
[Diagram: the render tree (Render Block, Render Inline, Render Text, Render Image, Render Text (bold), Render Block (fixed)) with two Render Layers]
● Sparse representation of RenderTree
● Enables accelerated compositing, scrolling

Slide 14

Slide 14 text

Yet Another Tree: LineBoxTree
Example text:
An old silent pond...
A frog jumps into the pond,
splash! Silence again.
[Diagram, render tree: Render Block, Render Text, Render Inline, Render Text (bold)]
[Diagram, line boxes: three RootInlineBoxes and four InlineTextBoxes]
● One RootInlineBox per line of text
● List of inline flow and inline text boxes
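
The relationship between text runs and line boxes can be sketched with two small classes. This is an illustration only: explicit "\n" characters stand in for real width-based line breaking, the class fields and `build_line_boxes` are invented for the example, and treating "Silence again." as the bold run is an assumption chosen to match the box counts in the diagram.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class InlineTextBox:
    text: str
    bold: bool


@dataclass
class RootInlineBox:
    boxes: List[InlineTextBox] = field(default_factory=list)


def build_line_boxes(runs: List[Tuple[str, bool]]) -> List[RootInlineBox]:
    """Split (text, bold) runs at newlines into one RootInlineBox per line."""
    lines = [RootInlineBox()]
    for text, bold in runs:
        for index, piece in enumerate(text.split("\n")):
            if index > 0:
                lines.append(RootInlineBox())  # a newline starts a new line
            if piece:
                lines[-1].boxes.append(InlineTextBox(piece, bold))
    return lines


runs = [
    ("An old silent pond...\nA frog jumps into the pond,\nsplash! ", False),
    ("Silence again.", True),  # assumed to be the bold run
]
lines = build_line_boxes(runs)
box_counts = [len(line.boxes) for line in lines]
```

One text run can contribute a box to several lines, and one line can hold boxes from several runs, which is why the line box tree is kept separate from the render tree.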

Slide 15

Slide 15 text

Conclusion
● WebCore's main processing pipeline:
  ○ Loader and Parser
  ○ CSS, DOM, and Script
  ○ RenderTree, LayerTree, and InlineBoxes
● Other major subsystems
  ○ Accessibility, Editing, Events, CSS, Web Inspector
  ○ Plugins, SVG, MathML, XSLT...
● Other components
  ○ WebKit, Bindings, Platform, JavaScriptCore, WTF
  ○ ... 1.5 MLOC of C++
● Learn more:
  ○ http://www.webkit.org/coding/technical-articles.html
