Parsing and Traversing HTML with PHP 8.4 (2025)

• INTRO – about me
• WHY PARSE? – What’s it usually used for
• PHP’s new HTML parser
• NEW DOM API IN 8.4 – Using the new DOM classes to enable the new features
• Retrieving HTML elements with modern CSS selectors
• XPATH SELECTORS – When to use them
• Try out some code
• CLOSING AND Q&A – Useful tools, further reading, Q&A
• No API needed

MANIPULATE
• Add or remove elements
• Move elements around
• Sanitise untrusted content

CONVERT
• Produce plain text, PDF, or audio
HTML5-PHP
• PHP library
• Started in 2013
• Based on an older version of the W3C HTML5 standard
• Available via Composer

LEXBOR
• Fast, C-based parser
• Started in 2018
• Based on newer WHATWG standard
• Available and enabled on most installations since PHP 8.4

LIBXML
• Fast, C-based parser
• Started in 1999
• Based on HTML4 standard, with some HTML5 support added later
• Available and enabled on most PHP installations
• Fewer lines needed to turn HTML into DOM
• Top-level HTML elements as DOM properties
  – head, body, title
• innerHTML property
  – Set or get child nodes using HTML strings
• CSS selector support
  – querySelector() and querySelectorAll()
  – No need for a CSS-to-XPath converter (e.g. Symfony’s CSS Selector)
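The features listed above can be sketched in a few lines. This is a minimal illustration, assuming PHP 8.4 or later with the DOM extension enabled. The sample markup and variable names are made up for the example.

```php
<?php
// Assumes PHP 8.4+ with ext-dom. The markup below is invented for illustration.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head><title>Demo page</title></head>
<body><main><p class="intro">Hello</p></main></body>
</html>
HTML;

// A single static call turns an HTML string into a DOM.
// LIBXML_NOERROR suppresses warnings about parse errors in real-world markup.
$doc = Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);

// Top-level elements are exposed as properties on the document.
$title = $doc->title;                                   // "Demo page"
$firstInBody = $doc->body->firstElementChild->localName; // "main"

// innerHTML reads or replaces an element's children as an HTML string.
$main = $doc->querySelector('main');
$before = $main->innerHTML;
$main->innerHTML = '<p>Replaced <em>content</em></p>';
$after = $main->innerHTML;

echo $title, PHP_EOL;
echo $firstInBody, PHP_EOL;
echo $before, PHP_EOL;
echo $after, PHP_EOL;
```

Note that the new classes live in the `Dom` namespace (`Dom\HTMLDocument`), separate from the older `DOMDocument`, so existing code keeps its old behaviour until it opts in.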
DOM API
CSS SELECTORS
• querySelector($selectors)
  – “Returns the first descendant element that matches the CSS selectors”
• querySelectorAll($selectors)
  – “Returns a NodeList containing all descendant elements that match the CSS selectors”
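A short sketch of the difference between the two methods, assuming PHP 8.4 or later. The list markup is invented for the example.

```php
<?php
// Assumes PHP 8.4+ with ext-dom. The markup is a made-up example.
$doc = Dom\HTMLDocument::createFromString(
    '<!DOCTYPE html><html><body><ul>'
    . '<li class="a">One</li><li>Two</li><li class="a">Three</li>'
    . '</ul></body></html>',
    LIBXML_NOERROR
);

// querySelector() returns the first match, or null when nothing matches.
$first = $doc->querySelector('li.a');
$missing = $doc->querySelector('table');

// querySelectorAll() returns a Dom\NodeList, which can be iterated.
$all = $doc->querySelectorAll('li.a');

echo $first->textContent, PHP_EOL;   // One
echo $all->length, PHP_EOL;          // 2
var_dump($missing);                  // NULL

foreach ($all as $li) {
    echo $li->textContent, PHP_EOL;  // One, then Three
}
```

Both methods are also available on elements, so a query can be scoped to a subtree, for example `$doc->body->querySelectorAll('li')`.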
• Get all paragraphs in an article that have at least one link inside them
• Get h1 headings that are immediately followed by an h2 heading
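One way to express these two queries is with the `:has()` pseudo-class. The selectors below are a reconstruction, not necessarily the ones shown in the talk, and the sample markup and `id` values are invented for the example.

```php
<?php
// Assumes PHP 8.4+ with ext-dom. Markup and ids are hypothetical.
$html = <<<'HTML'
<!DOCTYPE html>
<html><body>
<article>
  <p id="p1">No link here.</p>
  <p id="p2">See <a href="/docs">the docs</a>.</p>
</article>
<p id="p3">Outside the article: <a href="/x">link</a></p>
<h1 id="first">Title</h1>
<h2>Subtitle</h2>
<h1 id="second">Other title</h1>
<p>Body text</p>
</body></html>
HTML;

$doc = Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);

// Paragraphs inside an article that contain at least one link.
$paragraphs = $doc->querySelectorAll('article p:has(a)');

// h1 headings whose next sibling element is an h2.
$headings = $doc->querySelectorAll('h1:has(+ h2)');

// Collect the ids of the matched elements for inspection.
$paragraphIds = [];
foreach ($paragraphs as $p) {
    $paragraphIds[] = $p->id;
}
$headingIds = [];
foreach ($headings as $h) {
    $headingIds[] = $h->id;
}

print_r($paragraphIds);
print_r($headingIds);
```

The intent is that only the linked paragraph inside the article and the first h1 are matched. The `+` combinator compares element siblings, so whitespace between the headings does not matter.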
• Get external links: URLs starting with “http” and not containing “example.com”, case insensitive
  – Match href attributes starting with “http”
  – Make the match case insensitive
  – Exclude elements that contain “example.com” anywhere in the href attribute
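The three parts described above map onto attribute selectors: `^=` for "starts with", `*=` for "contains", the `i` flag for case-insensitive matching, and `:not()` for the exclusion. The selector below is a reconstruction under those assumptions, and the links and `id` values are invented for the example.

```php
<?php
// Assumes PHP 8.4+ with ext-dom. Links and ids are hypothetical.
$html = <<<'HTML'
<!DOCTYPE html>
<html><body>
<a id="a1" href="https://example.com/page">Internal, absolute</a>
<a id="a2" href="/about">Internal, relative</a>
<a id="a3" href="HTTPS://PHP.NET/">External, upper case</a>
<a id="a4" href="https://www.EXAMPLE.com/x">Internal, mixed case</a>
<a id="a5" href="http://lexbor.com/">External</a>
</body></html>
HTML;

$doc = Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);

// href starts with "http" (any case) and does not contain "example.com" (any case).
$selector = 'a[href^="http" i]:not([href*="example.com" i])';

$externalIds = [];
foreach ($doc->querySelectorAll($selector) as $link) {
    $externalIds[] = $link->id;
}

print_r($externalIds);
```

A substring test like this is only a rough filter: a URL such as `https://other.org/?ref=example.com` would also be excluded, so stricter checks may need to parse the host in PHP.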
CSS SELECTORS ARE NOT ENOUGH
XPATH SELECTORS
• CSS more concise, but less expressive than XPath
• XPath can…
  – Select elements based on text content
  – Select attributes
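Both XPath capabilities can be sketched with `Dom\XPath`. This example assumes PHP 8.4 or later. The `h` prefix is an arbitrary name chosen here, and the markup is invented. Elements parsed by `Dom\HTMLDocument` are placed in the XHTML namespace, so the sketch registers a prefix for it before querying.

```php
<?php
// Assumes PHP 8.4+ with ext-dom. Markup is a made-up example.
$html = <<<'HTML'
<!DOCTYPE html>
<html><body>
<ul>
  <li><a href="/php">PHP manual</a></li>
  <li><a href="/css">CSS reference</a></li>
  <li>No link, mentions PHP</li>
</ul>
</body></html>
HTML;

$doc = Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);

$xpath = new Dom\XPath($doc);
// HTML elements live in the XHTML namespace; "h" is an arbitrary prefix.
$xpath->registerNamespace('h', 'http://www.w3.org/1999/xhtml');

// Select elements based on text content: list items mentioning "PHP".
$texts = [];
foreach ($xpath->query('//h:li[contains(., "PHP")]') as $li) {
    $texts[] = trim($li->textContent);
}

// Select attributes directly: every href attribute node on a link.
$urls = [];
foreach ($xpath->query('//h:a/@href') as $attr) {
    $urls[] = $attr->value;
}

print_r($texts); // PHP manual, No link, mentions PHP
print_r($urls);  // /php, /css
```

Attributes are not namespaced, which is why `@href` needs no prefix while element names do.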
USEFUL TOOLS
• execution – chrome --headless --dump-dom https://www.example.com
• Symfony’s HTML Sanitizer – Clean untrusted HTML for output

• Niels Dossche – Responsible for PHP 8.4’s DOM changes
• DOM Living Standard – https://dom.spec.whatwg.org