CodeCamp 2016 - How the web and browsers actually work

James Macfie
April 08, 2016


Transcript

  1. How the web and browsers actually work

    An in-depth look at how things are rendered to the screen
    @jamesmacfie #codecamp
  2. What happens when we enter a domain?

    And how is that information rendered to the screen?
  3. Before the browser does anything, it checks the URL’s validity

    If the hostname contains non-ASCII characters, the browser converts it to Punycode to make the URL valid.
    bücher.com  -> xn--bcher-kva.com/
    büücher.com -> xn--bcher-kvaa.com/
    münchen.com -> xn--mnchen-3ya.com/
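
The conversion on this slide can be reproduced with a short sketch. It uses Python's built-in `idna` codec as a stand-in; the helper name `to_ascii_hostname` is made up for illustration, and a browser's own implementation will differ in detail.

```python
# Sketch only: Python's "idna" codec stands in for the browser's conversion.

def to_ascii_hostname(hostname: str) -> str:
    """Encode every non-ASCII label of a hostname as an xn-- Punycode label."""
    return hostname.encode("idna").decode("ascii")


for name in ["bücher.com", "büücher.com", "münchen.com"]:
    print(name, "->", to_ascii_hostname(name))
```

Hostnames that are already plain ASCII pass through unchanged.
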
  4. Does the browser need to force the request to use HTTPS?

    There’s an internal cache of sites that have requested to only be communicated with over HTTPS, not HTTP: the HTTP Strict Transport Security (HSTS) list.
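
A minimal sketch of that check, assuming the list is a simple in-memory set of hostnames. `HSTS_HOSTS` and `apply_hsts` are hypothetical names; real browsers also track expiry times, subdomain flags and a preloaded list, none of which is modelled here.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical HSTS list: hosts that asked to be contacted over HTTPS only.
HSTS_HOSTS = {"example.com"}


def apply_hsts(url: str, hsts_hosts=HSTS_HOSTS) -> str:
    """Rewrite http:// to https:// when the host is on the HSTS list."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in hsts_hosts:
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(apply_hsts("http://example.com/a"))
```
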
  5. Now the browser can request the domain’s IP address

    It looks in these places:
    - Cache
    - Hosts file
    - Local network
    - DNS resolver
  6. Example for www.google.com

    The recursive resolver asks each server in turn: “Hey, can I have the IP address for www.google.com?”
    - Root name server: “Don’t have it - try xxx.xxx.xxx.xxx”
    - TLD name server: “Don’t have it - try xxx.xxx.xxx.xxx”
    - Domain name server: “Sure, it’s xxx.xxx.xxx.xxx”
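
The referral chain above can be sketched as a small simulation. The `SERVERS` table, the server names and the address `192.0.2.10` (a reserved documentation address) are all invented for illustration; no real DNS queries are made.

```python
# Hypothetical name-server data: each server either answers or refers onward.
SERVERS = {
    "root": {"referral": "tld"},
    "tld": {"referral": "authoritative"},
    "authoritative": {"answers": {"www.google.com": "192.0.2.10"}},
}


def resolve(name, servers=SERVERS, start="root"):
    """Follow referrals from the root until some server knows the answer."""
    server, path = start, []
    while server is not None:
        path.append(server)
        record = servers[server]
        if name in record.get("answers", {}):
            return record["answers"][name], path
        server = record.get("referral")  # "Don't have it - try ..."
    return None, path


print(resolve("www.google.com"))
```
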
  7. This is what a request and response look like

    Request:
        GET / HTTP/1.1
        Host: google.com
        Connection: close
        [other headers]

    Response:
        HTTP/1.1 200 OK
        Content-Length: 243
        Content-Type: text/html
        [response headers]

        <html>
        <head>
        <title>Howdy</title>
        </head>
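
As a rough illustration of the wire format, the sketch below builds the request bytes and splits a response into status, headers and body. `build_request` and `parse_response` are hypothetical helpers, and no network connection is made.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Serialise a minimal GET request; a blank line ends the headers."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return int(code), reason, headers, body


raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(build_request("google.com"))
print(parse_response(raw))
```
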
  8. Now, what’s this TCP/IP I’ve heard about?

    TCP/IP handles how data is transferred.
    (Diagram: the data is split into packets numbered 1 to 9, the packets arrive out of order as 6 1 7 2 9 5 3 4 8, and are put back into their original order.)
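
The reordering idea can be sketched with numbered segments. This is a simplification: real TCP numbers bytes rather than packets and also handles acknowledgements and retransmission. The function names are made up.

```python
def split_into_segments(data: bytes, size: int):
    """Cut data into (sequence number, chunk) pairs, numbered from 1."""
    offsets = range(0, len(data), size)
    return [(seq, data[i:i + size]) for seq, i in enumerate(offsets, start=1)]


def reassemble(segments):
    """Sort segments by sequence number and join the chunks."""
    return b"".join(chunk for _seq, chunk in sorted(segments))


segments = split_into_segments(b"abcdefghi", 1)
arrival_order = [6, 1, 7, 2, 9, 5, 3, 4, 8]  # the shuffled order on the slide
shuffled = [segments[n - 1] for n in arrival_order]
print(reassemble(shuffled))
```
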
  9. These are the different parts of a browser

    - User interface
    - Browser engine
    - Rendering engine
    - Networking
    - Data persistence
    - JavaScript
    - UI backend
  10. Rendering HTML basically does this

    1. Parse HTML and create the DOM tree
    2. Parse styles and create the render tree
    3. Lay out the render tree
    4. Paint the render tree
  11. For example, parsing 2 + 3 - 1 could return this

    Expression (-)
      Expression (+)
        Number 2
        Number 3
      Number 1
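
A minimal sketch of a parser that produces this shape of tree, assuming only integers joined by `+` and `-`. The tuple layout `("Expression", operator, left, right)` is an illustrative choice, not something from the talk.

```python
import re


def parse(expression: str):
    """Build a left-nested tree for integers joined by + and -."""
    tokens = re.findall(r"\d+|[+-]", expression)
    tree = ("Number", int(tokens[0]))
    # Pair each operator with the number that follows it.
    for op, num in zip(tokens[1::2], tokens[2::2]):
        tree = ("Expression", op, tree, ("Number", int(num)))
    return tree


print(parse("2 + 3 - 1"))
```
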
  12. XML is a context-free grammar

    <memo>
      <addressee>John</addressee>
      <sender>Carla</sender>
      <date>19980901</date>
      <title>New coffee maker</title>
      <body>
        The new coffee maker has been installed! Operation is simple: put a cup in the opening and press the red button.
      </body>
    </memo>
  13. HTML is not a context-free grammar, so we cannot use common parsing techniques
  14. There are two main reasons for this:

    - HTML parsers are extremely fault tolerant
    - The HTML can be modified as it is being parsed
  15. HTML’s forgiving nature is pretty nice really

    This input:
        <html>
        <div> <p> </div>
        </span>Crappy HTML </p>
        </html>
    is parsed as:
        <html>
        <head></head>
        <body>
        <div> <p></p> </div>
        Crappy HTML <p></p>
        </body>
        </html>
  16. HTML’s forgiving nature is pretty nice really

    These nested tables:
        <table>
          <table> <tr><td>inner table</td></tr> </table>
          <tr><td>outer table</td></tr>
        </table>
    are parsed as two sibling tables:
        <table> <tr><td>outer table</td></tr> </table>
        <table> <tr><td>inner table</td></tr> </table>
  17. HTML’s parsing process is reentrant

    Dynamic code can modify the HTML as it is being parsed, which can add extra tokens to the input. Think of a script tag in the middle of the input that is evaluated and contains a document.write call.
  18. The HTML parsing flow

    Network data -> Tokeniser -> DOM tree construction -> DOM
    Script execution can call document.write(), which feeds new input back to the tokeniser.
  19. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Data”
  20. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Tag open”
  21. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Tag name”
  22. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Data”
  23. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Token created: start-tag { name: html }
  24. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Data”
  25. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Token created: character { data: C }
  26. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Tag open”
  27. Let’s look at how the tokeniser would parse this

    <html> <body> Codecamp </body> </html>
    Current state: “Close tag open”
  28. We end up with twelve tokens

    start-tag { name: html }
    start-tag { name: body }
    character { data: C }
    character { data: o }
    character { data: d }
    character { data: e }
    character { data: c }
    character { data: a }
    character { data: m }
    character { data: p }
    end-tag { name: body }
    end-tag { name: html }
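
The state changes walked through on the previous slides can be sketched as a tiny state machine. It is far simpler than the tokeniser in the HTML spec: it has no attributes, comments or error handling, and it assumes whitespace between tags is skipped, as the slides do. The function name and the tuple token format are illustrative.

```python
def tokenise(html: str):
    """Turn a tiny HTML string into (kind, value) tokens."""
    tokens, state, name, is_end = [], "data", "", False
    for ch in html:
        if state == "data":
            if ch == "<":
                state = "tag open"
            elif not ch.isspace():  # assumption: whitespace is ignored
                tokens.append(("character", ch))
        elif state == "tag open":
            if ch == "/":
                is_end = True  # plays the role of the "close tag open" state
            else:
                state, name = "tag name", ch
        elif state == "tag name":
            if ch == ">":
                tokens.append(("end-tag" if is_end else "start-tag", name))
                state, name, is_end = "data", "", False
            else:
                name += ch
    return tokens


for token in tokenise("<html> <body> Codecamp </body> </html>"):
    print(token)
```
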
  29. Next the DOM tree is constructed from the tokens

    First, the root document node is created; all other nodes will be added to it. For each token, the spec defines which DOM element is relevant. These elements are added both to the DOM tree and to the stack of open elements.
  30. How does this get converted into the DOM tree?

    Tokens: start-tag { name: html }, start-tag { name: body }, character { data: C }, character { data: o }, character { data: d }, character { data: e }, character { data: c }, character { data: a }, character { data: m }, character { data: p }, end-tag { name: body }, end-tag { name: html }
    State: Initial
  31. How does this get converted into the DOM tree?

    (Same tokens.) State: Before html
    Nodes so far: HTMLHtmlElement
  32. How does this get converted into the DOM tree?

    (Same tokens.) State: Before head
    Nodes so far: HTMLHtmlElement, HTMLHeadElement
  33. How does this get converted into the DOM tree?

    (Same tokens.) State: In head
    Nodes so far: HTMLHtmlElement, HTMLHeadElement
  34. How does this get converted into the DOM tree?

    (Same tokens.) State: After head
    Nodes so far: HTMLHtmlElement, HTMLHeadElement
  35. How does this get converted into the DOM tree?

    (Same tokens.) State: In body
    Nodes so far: HTMLHtmlElement, HTMLHeadElement, HTMLBodyElement
  36. How does this get converted into the DOM tree?

    (Same tokens.) State: In body
    Nodes so far: HTMLHtmlElement, HTMLHeadElement, HTMLBodyElement, Text
  37. How does this get converted into the DOM tree?

    (Same tokens.) State: In body
    Nodes so far: HTMLHtmlElement, HTMLHeadElement, HTMLBodyElement, Text
  38. How does this get converted into the DOM tree?

    (Same tokens.) State: After body
    Nodes so far: HTMLHtmlElement, HTMLHeadElement, HTMLBodyElement, Text
  39. How does this get converted into the DOM tree?

    (Same tokens.) State: After after body
    Nodes so far: HTMLHtmlElement, HTMLHeadElement, HTMLBodyElement, Text
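
The walkthrough above can be approximated with a stack of open elements. This is a sketch under heavy assumptions: the `Node` class and `build_tree` are invented names, the spec's insertion modes are not modelled, and the only "implied" element handled is a missing head before body.

```python
class Node:
    def __init__(self, name, text=""):
        self.name, self.text, self.children = name, text, []


def build_tree(tokens):
    """Build a DOM-like tree from (kind, value) tokens."""
    document = Node("#document")
    stack = [document]  # the stack of open elements
    for kind, value in tokens:
        parent = stack[-1]
        if kind == "start-tag":
            has_head = any(c.name == "head" for c in parent.children)
            if value == "body" and not has_head:
                parent.children.append(Node("head"))  # implied <head>
            node = Node(value)
            parent.children.append(node)
            stack.append(node)
        elif kind == "character":
            if parent.children and parent.children[-1].name == "#text":
                parent.children[-1].text += value  # extend the Text node
            else:
                parent.children.append(Node("#text", value))
        elif kind == "end-tag" and len(stack) > 1:
            stack.pop()  # simplification: assumes tags are properly nested
    return document


tokens = [("start-tag", "html"), ("start-tag", "body")]
tokens += [("character", c) for c in "Codecamp"]
tokens += [("end-tag", "body"), ("end-tag", "html")]
html = build_tree(tokens).children[0]
print([child.name for child in html.children])
```
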
  40. A slightly more complicated example

    <html>
    <body>
      <div> <p>Code</p> <p>camp</p> </div>
      <div> <p> <span>2016</span> </p> </div>
    </body>
    </html>

    HTMLHtmlElement
      HTMLHeadElement
      HTMLBodyElement
        HTMLDivElement
          HTMLParagraphElement
            Text
          HTMLParagraphElement
            Text
        HTMLDivElement
          HTMLParagraphElement
            HTMLSpanElement
              Text
  41. We can now execute ‘defer’ scripts

    (Diagram comparing <script>, <script async> and <script defer>, showing when each one parses HTML, fetches the script and executes the script.)
    Ref: https://html.spec.whatwg.org/
  42. Not all of these styles will block

    <link href="style.css" rel="stylesheet">  (blocks rendering)
    <link href="style.css" rel="stylesheet" media="all">  (blocks rendering)
    <link href="portrait.css" rel="stylesheet" media="(orientation: portrait)">  (blocks only while the device is in portrait orientation)
    <link href="print.css" rel="stylesheet" media="print">  (does not block rendering on screen)
  43. For example - DOM tree

    <head></head>
    <body>
      <p> Codecamp <span> 2016 </span> </p>
      <span> Wellington </span>
      <div> <img src="codecamp.jpg" /> </div>
    </body>

    head
    body
      p
        text
        span
          text
      span
        text
      div
        img
  44. For example - CSSOM

    body { font-size: 16px; }
    p { font-weight: bold; }
    p span { display: none; }
    span { color: red; }
    img { float: right; }

    Resulting styles per node:
    - body: font-size: 16px;
    - p: font-size: 16px; font-weight: bold;
    - span (inside p): font-size: 16px; font-weight: bold; display: none; color: red;
    - span: font-size: 16px; color: red;
    - div: font-size: 16px;
    - img: font-size: 16px; float: right;

    Note - this doesn’t include browser styles so is incomplete, but hopefully you get the idea.
  45. Not all DOM nodes are rendered to the screen

    The <head> tag, for example, has no visual element. Neither does anything with display: none.
  46. For example - render tree

    body (font-size: 16px;)
      p (font-size: 16px; font-weight: bold;)
        text
      span (font-size: 16px; color: red;)
        text
      div (font-size: 16px;)
        img (font-size: 16px; float: right;)
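
A rough sketch of how the render tree could be derived from the DOM tree plus computed styles. The `(name, style, children)` tuple layout is invented for illustration, and skipping head by name is an assumption of this sketch (real browsers hide it through their default stylesheet).

```python
def build_render_tree(node):
    """Return (name, rendered children), or None if the node is not rendered."""
    name, style, children = node
    if name == "head" or style.get("display") == "none":
        return None  # the whole subtree is dropped
    rendered = [r for r in map(build_render_tree, children) if r is not None]
    return (name, rendered)


dom = ("html", {}, [
    ("head", {}, []),
    ("body", {}, [
        ("p", {"font-weight": "bold"}, [
            ("text", {}, []),
            ("span", {"display": "none"}, [("text", {}, [])]),
        ]),
        ("span", {"color": "red"}, [("text", {}, [])]),
        ("div", {}, [("img", {"float": "right"}, [])]),
    ]),
])
print(build_render_tree(dom))
```
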
  47. BTW, some nodes have more than one element to render

    Take a select box, for example. It is rendered as an input, a button and a dropdown box.
    (Example select box: “Select an item”, with options Item number one, Item number two, Item number three, Item number four.)
  48. Some nodes are also present in a different position

    That is, the node is present both in the DOM tree and the render tree, but not at the same point. For example, an absolutely or fixed positioned element.
  49. Previous example with an absolutely positioned img

    body { position: relative; }
    img { position: absolute; }

    (Diagram: render tree with the nodes body, p, span, span, div and img.)
  50. Where does the visual property information come from?

    It comes from a few different places:
    - the browser’s defaults
    - user stylesheets
    - author stylesheets (those from the developer)
    - inline styles
  51. Not only does the browser have to keep track of the order, it also has to calculate the specificity
  52. Example - different sources of CSS

    /* Browser defaults */
    body { font-family: serif; }

    /* Author styles */
    body { font-family: sans-serif; }
    #header { font-family: Helvetica; }
    h1 { font-family: 'Comic Sans MS'; }

    <body>
      <main>
        <h1 id="header" style="font-family: Impact;">Hello!</h1>
      </main>
    </body>
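
To make the specificity idea concrete, here is a simplified sketch for the h1 in this example. It only counts ids, classes and tag names, lets an inline style win outright, and ignores origins, inheritance and `!important`. The helper names are hypothetical.

```python
import re


def specificity(selector: str):
    """Count (ids, classes, tags) in a simple selector."""
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    tags = len(re.findall(r"(?:^|[\s>+~])[a-zA-Z][\w-]*", selector))
    return (ids, classes, tags)


def winning_value(rules, inline=None):
    """rules: non-empty list of (selector, value) pairs in source order."""
    if inline is not None:
        return inline  # a style attribute beats the author selectors
    ranked = max(enumerate(rules), key=lambda item: (specificity(item[1][0]), item[0]))
    return ranked[1][1]


h1_rules = [("#header", "Helvetica"), ("h1", "Comic Sans MS")]
print(winning_value(h1_rules, inline="Impact"))
print(winning_value(h1_rules))
```

With the inline style present the h1 gets Impact; without it, the id selector outranks the tag selector.
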
  53. Gecko creates an extra couple of trees for styles

    - a rule tree
    - a style context tree
  54. Example - here’s some HTML and some CSS

    /* 1 */ div { display: block; text-indent: 1em; }
    /* 2 */ h1 { display: block; font-size: 3em; }
    /* 3 */ span { display: block; }
    /* 4 */ .un { text-decoration: underline; }

    <body>
      <div>
        <h1>Birdy bird</h1>
        <span class="un">Name: <strong>Spotted shag</strong></span>
        <span>Family: <em class="un">Phalacrocoracidae</em></span>
      </div>
    </body>
  55. Example - this would be the DOM tree

    body
      div
        h1
        span.un
          strong
        span
          em.un
  56. Example - this would be the rule tree

    /* 1 */ div { display: block; text-indent: 1em; }
    /* 2 */ h1 { display: block; font-size: 3em; }
    /* 3 */ span { display: block; }
    /* 4 */ .un { text-decoration: underline; }

    Rule tree nodes: A (null), B: 1, C: 2, D: 3, E: 4, F: 4
  57. Example - this would be the style context tree

    Elements: body, div, h1, span.un, span, em.un, strong
    Rule tree nodes they point to: A: null, B: 1, C: 2, D: 3, E: 4, F: 4
  58. There are also hash maps for quickly looking up styles

    Both Gecko and WebKit implement a few different hash maps for storing references to styles:
    - ids
    - classes
    - tags
    - general
  59. Determining layout positions is a recursive process

    To figure out the exact position of each node in the render tree, we start at the root and traverse it to compute the geometry of all nodes.

    <body>
      <main style="width:50%">
        <div style="width:50%"> Hello! </div>
      </main>
    </body>

    (Diagram: the viewport (size = device width) containing a 50% wide box, which contains another 50% wide box with the text “Hello!”.)
  60. Layout - briefly

    1. A node determines its own width
    2. For each of the node’s children:
       1. The child’s x and y positions are set
       2. Layout is called on the child if necessary
    3. The parent uses the children’s accumulated height to determine its own height
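
The steps above can be sketched as a recursive function. The `Box` class, percentage-only widths, vertical stacking of children and the 1000px viewport are all simplifying assumptions for illustration; real layout handles many more cases.

```python
class Box:
    def __init__(self, width_pct=100, own_height=0, children=()):
        self.width_pct = width_pct    # width as a percentage of the parent
        self.own_height = own_height  # hypothetical height of the box's own text
        self.children = list(children)
        self.x = self.y = self.width = self.height = 0


def layout(box, x, y, available_width):
    box.x, box.y = x, y
    box.width = available_width * box.width_pct / 100  # 1. own width
    child_y = y
    for child in box.children:                         # 2. each child in turn
        layout(child, x, child_y, box.width)
        child_y += child.height
    box.height = box.own_height + (child_y - y)        # 3. height from children
    return box


div = Box(width_pct=50, own_height=20)
main = Box(width_pct=50, children=[div])
body = layout(Box(children=[main]), 0, 0, 1000)
print(main.width, div.width, body.height)
```

With a 1000px viewport, the outer 50% box ends up 500px wide and the inner one 250px, matching the nesting on slide 59.
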
  61. Painting things happens in a certain order

    1. Background colour
    2. Background image (which includes gradients)
    3. Border
    4. Children
    5. Outline
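
As a sketch of that order, the function below walks a tree and records paint operations instead of drawing anything. The `(name, children)` tuples and the operation labels are illustrative only.

```python
def paint(node, ops=None):
    """Append (node name, step) pairs in the order listed on the slide."""
    ops = [] if ops is None else ops
    name, children = node
    for step in ("background colour", "background image", "border"):
        ops.append((name, step))
    for child in children:  # children are painted before the parent's outline
        paint(child, ops)
    ops.append((name, "outline"))
    return ops


for op in paint(("div", [("p", [])])):
    print(op)
```
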
  62. Places where this info came from

    - HTML5 Rocks article (from 2011, but very in depth): http://www.html5rocks.com/en/tutorials/internals/howbrowserswork
    - Google’s BlinkOn internal talks: https://www.youtube.com/channel/UCIfQb9u7ALnOE4ZmexRecDg
    - Google Developers: https://developers.google.com/web/fundamentals/performance
    - Mozilla - Gecko overview: https://wiki.mozilla.org/Gecko:Overview