digital marketing and engineering expertise to solve complex challenges for enterprise companies. We act as an embedded team for our customers, providing transformative solutions that merge strategy, data, automation and technology. *We are hiring! More on that later…
context of the research
2. Explain why visual similarity is useful
3. How to define visual similarity
4. Implementation overview
5. Additional use cases
this project was already underway, with the team in the final stages of completing the text similarity component. Text similarity is a well-established problem with numerous documented approaches. In the following slides, we will turn our attention to visual similarity.
spend most of their time on other sites. This means that users prefer your site to work the same way as all the other sites they already know. Design for patterns for which users are accustomed.” - Jakob Nielsen Source: https://www.nngroup.com/videos/jakobs-law-internet-ux/
similar, the advantage of having multiple brands can be diminished. Users may perceive the websites as identical, which could negatively impact business metrics. By comparing multiple internal brands and competitors, and incorporating business metrics, we wanted to define the optimal threshold for visual similarity.
had already explored several machine learning approaches. However, implementing them at the scale we required turned out to be both slow and costly. We started looking at a different approach.
compare (internal brands and competitors)
- List of categories for each domain (HTML templates)
- List of web pages for each category
- Web crawler (utilising a headless browser to render web pages)
high-level API for controlling Chrome, abstracting the DevTools Protocol. For lower-level tasks, we can use the Chrome DevTools Protocol (CDP) directly, a protocol designed to automate actions on Chromium, Chrome, and other Blink-based browsers. Source: https://chromedevtools.github.io/devtools-protocol/
snapshot of all nodes (elements) rendered on the page, their content, positions, and dimensions. Source: https://chromedevtools.github.io/devtools-protocol/tot/DOMSnapshot/#method-captureSnapshot
TREE INFORMATION "X":868.390625,"Y":81,"width":22,"height":22 Parsing and normalising the DOMSnapshot output Source: https://chromedevtools.github.io/devtools-protocol/tot/DOMSnapshot/#method-captureSnapshot
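A minimal sketch of the parsing step, assuming the snapshot has already been fetched and decoded into a Python dict. It relies on the documented shape of the `DOMSnapshot.captureSnapshot` response (a shared `strings` table, plus `nodes.nodeName`, `layout.nodeIndex` and `layout.bounds` per document). The `Box` type, the `boxes_from_snapshot` helper, the rounding used as "normalisation", and the hand-made sample are illustrative assumptions, not the code used in the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """One rendered node: tag name plus position and size in CSS pixels."""
    tag: str
    x: float
    y: float
    width: float
    height: float


def boxes_from_snapshot(snapshot: dict) -> List[Box]:
    """Flatten a captureSnapshot-shaped dict into a list of boxes."""
    strings = snapshot["strings"]  # shared string table
    boxes = []
    for document in snapshot["documents"]:
        node_names = document["nodes"]["nodeName"]  # indexes into `strings`
        layout = document["layout"]
        # layout.nodeIndex[i] is the DOM node that layout.bounds[i] belongs to
        for node_index, bounds in zip(layout["nodeIndex"], layout["bounds"]):
            x, y, width, height = bounds
            boxes.append(Box(
                tag=strings[node_names[node_index]],
                # Assumed normalisation: round away sub-pixel noise
                x=round(x, 2), y=round(y, 2),
                width=round(width, 2), height=round(height, 2),
            ))
    return boxes


# Hand-made, heavily trimmed sample in the shape of a real response
sample = {
    "strings": ["DIV", "IMG"],
    "documents": [{
        "nodes": {"nodeName": [0, 1, 1]},
        "layout": {
            "nodeIndex": [1, 2],
            "bounds": [[832.375, 84.296875, 16, 16],
                       [868.390625, 81, 22, 22]],
        },
    }],
}
boxes = boxes_from_snapshot(sample)
```

Each resulting `Box` carries only what the later comparison needs, so the rest of the snapshot can be discarded after this step.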
* X%
BoxA = "X":832.375,"Y":84.296875,"width":16,"height":16
BoxB = "X":868.390625,"Y":81,"width":22,"height":22

if BoxB coordinates are included in BoxA coordinates + thresholds
if BoxB dimensions are included in BoxA dimension + thresholds
then
    Boxes are similar (add to the list of similar boxes)
else
    Boxes are not similar
… Continue comparing BoxA with …
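The comparison above could be sketched in Python as follows. How the thresholds are derived is not fully visible on the slide, so this sketch assumes the tolerance is a percentage of BoxA's own width and height; the `Box` type, the `boxes_similar` name and the 25% default are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float


def within(value: float, reference: float, tolerance: float) -> bool:
    """True when `value` lies inside reference +/- tolerance."""
    return abs(value - reference) <= tolerance


def boxes_similar(a: Box, b: Box, pct: float = 0.25) -> bool:
    """Compare position and size of `b` against `a`.

    Assumption: the tolerance is `pct` of BoxA's width (horizontal
    checks) and height (vertical checks).
    """
    tol_w = a.width * pct
    tol_h = a.height * pct
    position_ok = within(b.x, a.x, tol_w) and within(b.y, a.y, tol_h)
    size_ok = (within(b.width, a.width, tol_w)
               and within(b.height, a.height, tol_h))
    return position_ok and size_ok


# The two boxes from the slide: X differs by about 36px, far beyond
# the 4px tolerance derived from BoxA's 16px width
box_a = Box(832.375, 84.296875, 16, 16)
box_b = Box(868.390625, 81, 22, 22)
slide_boxes_similar = boxes_similar(box_a, box_b)

# A hypothetical box that sits almost on top of BoxA
near_a = Box(833.0, 85.0, 17, 15)
near_boxes_similar = boxes_similar(box_a, near_a)
```

Scaling the tolerance with the box size means small icons must line up closely while large containers are allowed more drift; a fixed pixel tolerance would be an equally plausible choice.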
we can use the Jaccard Index, J(A, B) = |A ∩ B| / |A ∪ B|, where A ∩ B (the intersection) is the set of similar elements on the two pages, and A ∪ B (the union) is the set of all unique elements from both pages combined.
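As a rough sketch, the Jaccard Index over two lists of boxes can be computed by counting matched pairs as the intersection and taking the union as |A| + |B| minus the matches. The greedy one-to-one matching, the tuple representation and the 2px `close` predicate below are assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def count_matches(page_a: Sequence[Box], page_b: Sequence[Box],
                  similar: Callable[[Box, Box], bool]) -> int:
    """Greedily pair each box of page A with one unused box of page B."""
    unmatched_b = list(page_b)
    matched = 0
    for box_a in page_a:
        for i, box_b in enumerate(unmatched_b):
            if similar(box_a, box_b):
                matched += 1
                del unmatched_b[i]  # each box may be matched only once
                break
    return matched


def jaccard_index(page_a: Sequence[Box], page_b: Sequence[Box],
                  similar: Callable[[Box, Box], bool]) -> float:
    """|A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|."""
    matched = count_matches(page_a, page_b, similar)
    union = len(page_a) + len(page_b) - matched
    return matched / union if union else 0.0


def close(a: Box, b: Box) -> bool:
    """Hypothetical predicate: every field within 2px."""
    return all(abs(p - q) <= 2 for p, q in zip(a, b))


page_a = [(0, 0, 100, 50), (0, 60, 100, 200), (120, 0, 30, 30)]
page_b = [(1, 0, 100, 51), (0, 61, 99, 200), (300, 300, 10, 10)]
score = jaccard_index(page_a, page_b, close)  # 2 matches / 4 unique = 0.5
```

A score of 0.5 corresponds to 50% visual similarity under this definition.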
adding multiple optimisations:
- Including only visible nodes
- Background colors
- Merging overlapping nodes
- Considering the z-index of nodes
- Using more performant data structures
…and many more.
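To illustrate the first optimisation in the list, here is a minimal sketch of a visibility filter. It assumes "visible" means a box with a positive area that overlaps the page rectangle; a real check would likely also consider computed styles. The function name and the page size are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def visible_boxes(boxes: Sequence[Box], page_width: float,
                  page_height: float) -> List[Box]:
    """Keep boxes with a positive area that overlap the page rectangle."""
    kept = []
    for x, y, width, height in boxes:
        if width <= 0 or height <= 0:
            continue  # collapsed node, nothing is painted
        if x + width <= 0 or y + height <= 0:
            continue  # entirely above or to the left of the page
        if x >= page_width or y >= page_height:
            continue  # entirely below or to the right of the page
        kept.append((x, y, width, height))
    return kept


candidates = [
    (0, 0, 100, 50),     # regular box, kept
    (10, 10, 0, 20),     # zero width, dropped
    (-200, 0, 50, 50),   # positioned off-screen to the left, dropped
    (1500, 0, 10, 10),   # beyond the right edge, dropped
]
kept = visible_boxes(candidates, page_width=1280, page_height=2000)
```

Filtering before the pairwise comparison also shrinks the number of box pairs to check, which matters at crawl scale.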
the company was highly satisfied with both the process and the outcomes. The resulting text and visual similarity metrics were integrated into the annual business goals as control metrics. The maximum threshold for visual similarity in this case was around 40%, but this may vary depending on the websites, the types of pages, and the goals.
I’ve found that a team of researchers from Harbin Institute of Technology, Harbin, China and Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen, China has come up with a similar method for detecting phishing websites. Source: https://www.researchgate.net/publication/336377602_Algorithm_of_web_page_similarity_comparison_based_on_visual_block
web pages have intrusive interstitials and dialogs that may interfere with search engines' understanding of the content. Image source: https://developers.google.com/search/docs/appearance/avoid-intrusive-interstitials
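One way the same box data might support this use case is to measure how much of the viewport a single element covers. The sketch below is an assumption, not a documented detection rule: the function name, the viewport size and the 50% cut-off are all hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def viewport_coverage(box: Box, viewport_width: float,
                      viewport_height: float) -> float:
    """Fraction of the viewport area covered by `box` (0.0 to 1.0)."""
    x, y, width, height = box
    # Clip the box to the viewport before measuring its area
    overlap_w = max(0.0, min(x + width, viewport_width) - max(x, 0.0))
    overlap_h = max(0.0, min(y + height, viewport_height) - max(y, 0.0))
    return (overlap_w * overlap_h) / (viewport_width * viewport_height)


# Hypothetical cut-off: flag any element covering more than half the viewport
COVERAGE_LIMIT = 0.5

full_overlay = viewport_coverage((0, 0, 1280, 720), 1280, 720)    # 1.0
half_overlay = viewport_coverage((0, 0, 640, 720), 1280, 720)     # 0.5
small_banner = viewport_coverage((0, 684, 1280, 36), 1280, 720)   # 0.05
is_intrusive = full_overlay > COVERAGE_LIMIT
```

Combined with stacking information such as the z-index, a coverage value like this could help separate a full-screen dialog from an ordinary page container.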
that web pages rendered by search engines or other systems successfully handle and position specific elements as intended, detecting misalignments or unexpected behaviours.