Slide 1

Slide 1 text

Advanced on-page SEO
Rubén Martínez
Paradigma | CAMON Madrid, Nov 18th 2013

Slide 2

Slide 2 text

User search flow on the WWW
SEO deals with the bottlenecks in the information flow
Understand
Optimize

Slide 3

Slide 3 text

What is On-page SEO?
Technical or on-page SEO is everything that helps a website generate more revenue from search engines and that webmasters have full control over.
Off-page SEO
Technical SEO

Slide 4

Slide 4 text

Why does technical SEO matter?
It helps close the gap between web servers, search engines and human beings.
Source: http://knowledgeoman.com

Slide 5

Slide 5 text

Content inventory
The search operator “site:” can be used to get a rough estimate of the number of pages of a given website that Google has indexed.
Compare the count of indexed pages of close competitors for the same target audience.

Root domain    # pages indexed by Google.es
Orange.es      10,300,000
Movistar.es     1,810,000
Ono.es            960,000
Vodafone.es       922,000
Yoigo.com           4,030
Simyo.es              541

Table populated by querying Google for the count of indexed pages. E.g.: http://www.google.es/search?q=site%3Aorange.es
Count your content, its conversion rates and the rate of publication and obsolescence.
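As a minimal sketch of how the query URLs behind the table could be built, the snippet below percent-encodes a "site:" query for each root domain. The function name `site_query_url` and the default host are illustrative assumptions; the snippet only builds URLs and does not fetch or parse any result counts.

```python
from urllib.parse import urlencode

# Root domains taken from the table on this slide.
DOMAINS = ["orange.es", "movistar.es", "ono.es", "vodafone.es", "yoigo.com", "simyo.es"]


def site_query_url(domain: str, google_host: str = "www.google.es") -> str:
    """Build the search URL for a `site:` query (hypothetical helper)."""
    # urlencode percent-encodes the colon, giving q=site%3Aorange.es
    return f"http://{google_host}/search?" + urlencode({"q": f"site:{domain}"})


if __name__ == "__main__":
    for domain in DOMAINS:
        print(site_query_url(domain))
```

The counts themselves still have to be read from the results page by hand, and they are rough estimates rather than exact figures.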

Slide 6

Slide 6 text

Organization of the information

Slide 7

Slide 7 text

Organization of the information

Slide 8

Slide 8 text

Links as proxies to importance – PageRank algorithm

Slide 9

Slide 9 text

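Value of a linked webpage
PR(pi) = (1 - d)/N + d * Σ PR(pj)/L(pj), summed over every pj in M(pi)
Where p1, p2, …, pN are the pages whose value we are determining, M(pi) is the set of pages that link to pi, L(pj) is the number of outbound links on page pj, N is the total number of pages and d is the damping factor (commonly set to 0.85).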
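The definitions above can be turned into a small power-iteration sketch. This is an illustration under stated assumptions rather than Google's implementation: the damping factor of 0.85, the fixed number of iterations, the even redistribution of rank from pages without outbound links, and the three-page example site are all choices made here, not taken from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the (1 - d) / N base value.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes d * PR(pj) / L(pj) to each page it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: the home page links to two sections,
# and each section links back to the home page.
example_site = {"home": ["about", "shop"], "about": ["home"], "shop": ["home"]}
example_ranks = pagerank(example_site)

if __name__ == "__main__":
    for page, value in sorted(example_ranks.items(), key=lambda item: -item[1]):
        print(f"{page}: {value:.4f}")
```

In this toy graph the home page ends up with the highest value because both other pages link to it, which is the sense in which links act as proxies for importance.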

Slide 10

Slide 10 text

PageRank for Larry Page
Larry Page before the algorithm

Slide 11

Slide 11 text

…not for web “page”
Larry Page before his algorithm
Larry Page after his algorithm
Source: http://www.google.com/press/images.html

Slide 12

Slide 12 text

Organization of the information

Slide 13

Slide 13 text

Simpler organization is more effective

Slide 14

Slide 14 text

Visualize the graph of your website
Crawl with Xenu’s Link Sleuth (desktop application for Windows)
Filter fields on a bash shell:
    $ head crawl.txt
    $ cut -f1,2 crawl.txt | sed -e 's/http:\/\/www\.{domain}\.{tld}//g' -e 's/\t/,/g' | grep -v '\.jpg\|http:\|\.css\|\.js' > filtered.csv
    $ head filtered.csv
Visualize the network and analyze with Gephi
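For readers without a bash shell, the same filtering step can be sketched in Python. It assumes, as the shell pipeline does, that the first two tab-separated fields of the crawl export hold the linking page and the linked page; the function name `filter_edges`, the `example.com` prefix and the sample rows are hypothetical.

```python
# Substrings that cause a row to be dropped, mirroring the grep -v step:
# images, stylesheets, scripts, and any URL still carrying "http:" after
# the site's own prefix has been stripped (that is, an external link).
SKIP_MARKERS = (".jpg", ".css", ".js", "http:")


def filter_edges(lines, site_prefix):
    """Return (source, target) pairs for internal page-to-page links."""
    edges = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed or empty rows
        source = fields[0].replace(site_prefix, "")
        target = fields[1].replace(site_prefix, "")
        if any(marker in source or marker in target for marker in SKIP_MARKERS):
            continue
        edges.append((source, target))
    return edges


# Hypothetical rows in the assumed export layout.
sample_rows = [
    "http://www.example.com/\thttp://www.example.com/about\t200\n",
    "http://www.example.com/\thttp://www.example.com/style.css\t200\n",
    "http://www.example.com/about\thttp://other.org/\t200\n",
]
sample_edges = filter_edges(sample_rows, "http://www.example.com")

if __name__ == "__main__":
    for source, target in sample_edges:
        print(f"{source},{target}")
```

Each remaining pair is one edge of the site graph, so the printed comma-separated lines could be saved as an edge list for a graph tool.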

Slide 15

Slide 15 text

Graph – Example 1
Website of an annual event

Slide 16

Slide 16 text

Graph – Example 2
A shopping website

Slide 17

Slide 17 text

The power of weak links
Thin connections tend to link the clusters, allowing information to move between them.
Source: Giles, Jim. Making the links. Nature, Aug 23rd 2012

Slide 18

Slide 18 text

Anatomy of a URL
Friendly URLs need to bear in mind the URL encoding, the presence of delimiting characters and the organization of the information of the website.
Googlebot ignores the fragment, the optional last part of a URL from the hash (#) onwards.
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
Example:
  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
   |   _____________________|__
  / \ /                        \
  urn:example:animal:ferret:nose
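The same decomposition can be checked with Python's standard library. This is a small sketch using `urllib.parse` on the example URL from the slide; the variable names are illustrative.

```python
from urllib.parse import urldefrag, urlsplit

EXAMPLE_URL = "foo://example.com:8042/over/there?name=ferret#nose"

# Split the URL into the five components named in the diagram.
parts = urlsplit(EXAMPLE_URL)

# Drop the fragment, which is the part a crawler that ignores
# everything from the hash onwards never takes into account.
url_without_fragment, fragment = urldefrag(EXAMPLE_URL)

if __name__ == "__main__":
    print("scheme:   ", parts.scheme)
    print("authority:", parts.netloc)
    print("path:     ", parts.path)
    print("query:    ", parts.query)
    print("fragment: ", parts.fragment)
    print("without fragment:", url_without_fragment)
```

Two URLs that differ only in their fragment reduce to the same string once the fragment is dropped, which is why state kept only after the hash is hard for a crawler to index as a separate document.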

Slide 19

Slide 19 text

Topology of on-page links
PageRank random surfer
PageRank reasonable surfer

Slide 20

Slide 20 text

Single Page Applications
Single Page Applications (SPA) free client browsers from querying web servers for every page. SPAs are now growing in use thanks to AJAX and frameworks like Backbone.js and AngularJS.
This is a major challenge for search engines because the fragments in the URLs prevent crawlers from scraping the content.
Google is asking webmasters to make their AJAX-based websites crawlable.

Slide 21

Slide 21 text

Single Page Applications

Slide 22

Slide 22 text

SEO for Single Page Applications
Modify the URL fragments for stateful AJAX pages:
    http://example.com/page?query#!state
Use a headless browser that outputs an HTML snapshot on your web server rather than on a client machine.
Allow search engine crawlers to access these URLs by escaping the state:
    http://example.com/page?query&_escaped_fragment_=state
Show the original URL to users in the search results.
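The mapping between the two URL forms on this slide can be sketched as a small function. The name `to_escaped_fragment_url` is hypothetical, and the choice of which characters to percent-encode (everything except `=` and the characters `urllib.parse.quote` always leaves alone) is an assumption made here; the exact escaping rules should be taken from the crawling scheme's own specification.

```python
from urllib.parse import quote


def to_escaped_fragment_url(url: str) -> str:
    """Rewrite a '#!state' URL into its '_escaped_fragment_=state' form."""
    base, marker, state = url.partition("#!")
    if not marker:
        # No hashbang: there is no AJAX state to expose to the crawler.
        return url
    # Append to an existing query string with '&', otherwise start one with '?'.
    separator = "&" if "?" in base else "?"
    # Assumption: '=' stays readable, while characters such as '&' are encoded
    # so they cannot be confused with the surrounding query string.
    escaped_state = quote(state, safe="=")
    return f"{base}{separator}_escaped_fragment_={escaped_state}"


if __name__ == "__main__":
    print(to_escaped_fragment_url("http://example.com/page?query#!state"))
    print(to_escaped_fragment_url("http://example.com/page#!state"))
```

A server that receives the rewritten URL can then return the HTML snapshot for that state, while users keep seeing the original `#!` URL.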

Slide 23

Slide 23 text

She is thinking keywords. Again.

Slide 24

Slide 24 text

A document's relevance given a user query
Example: Query “SEO” on Google.es
Match keyword:
    Search engine optimization - Wikipedia, the free encyclopedia
    en.wikipedia.org/wiki/Search_engine_optimization
    Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine's "natural" or un-paid ("organic") search
    SEO/BirdLife
    www.seo.org
    Se trata una federación de ámbito estatal de grupos territoriales, tiene como fines el estudio y la defensa de las aves y está integrada en la ONG mundial
    [English: It is a nationwide federation of regional groups whose aims are the study and protection of birds, and it is part of the worldwide NGO]
Co-occurrence of keywords:
    SEO <> search, search engine, website
    SEO <> aves (birds), ONG (NGO)
TF*IDF (Term Frequency x Inverse Document Frequency)
Topic modelling – Latent Dirichlet Allocation

Slide 25

Slide 25 text

TF*IDF
tf–idf is the product of two statistics, term frequency and inverse document frequency:
tfidf(t, d) = tf(t, d) x log(D / number of documents where the term t appears)
With
tf(t, d) the number of times that the term t occurs in document d
D the number of documents in the corpus
denominator: the number of documents where the term t appears
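A minimal sketch of the statistic, using raw term counts and a natural logarithm, which is one common variant among several. The function name `tf_idf` and the two tiny tokenised documents, loosely based on the two search results for the query "SEO", are illustrative assumptions.

```python
import math


def tf_idf(term, document, corpus):
    """Raw term frequency times log of inverse document frequency."""
    term_frequency = document.count(term)
    documents_with_term = sum(1 for doc in corpus if term in doc)
    if documents_with_term == 0:
        # A term that appears nowhere in the corpus carries no weight.
        return 0.0
    return term_frequency * math.log(len(corpus) / documents_with_term)


# Hypothetical corpus: each document is a list of lowercase tokens.
wikipedia_doc = ["seo", "search", "engine", "website", "search"]
birdlife_doc = ["seo", "aves", "ong"]
corpus = [wikipedia_doc, birdlife_doc]

if __name__ == "__main__":
    for term in ("seo", "search", "aves"):
        print(term, tf_idf(term, wikipedia_doc, corpus), tf_idf(term, birdlife_doc, corpus))
```

A term that appears in every document, such as "seo" here, scores zero in both documents, while "search" and "aves" each score only in the document that contains them. That is why the co-occurring terms, not the shared keyword, separate the two meanings of the query.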

Slide 26

Slide 26 text

Latent Dirichlet Allocation (LDA)
Source: http://moz.com/blog/lda-and-googles-rankings-well-correlated

Slide 27

Slide 27 text

Topic modelling - LDA
LDA-based feature selection is reliable and generally better than feature selection based on document frequency.
Source: http://mengjunxie.github.io/ae-lda/index.html

Slide 28

Slide 28 text

Structured data

Slide 31

Slide 31 text

Markup detected by Google
Example of the webpage of an event - Structured Data Testing Tool
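To make the idea of event markup concrete, here is a hedged sketch that generates a schema.org `Event` description as JSON-LD. The slides do not say which vocabulary or syntax the tested page used, so schema.org and JSON-LD are assumptions, as is the helper name `event_json_ld`; the sample values come from the title slide of this talk.

```python
import json


def event_json_ld(name, start_date, venue, city, performer):
    """Serialise a schema.org Event as a JSON-LD string (illustrative helper)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,  # ISO 8601 date
        "location": {
            "@type": "Place",
            "name": venue,
            "address": {"@type": "PostalAddress", "addressLocality": city},
        },
        "performer": {"@type": "Person", "name": performer},
    }
    # ensure_ascii=False keeps accented characters readable in the output.
    return json.dumps(data, indent=2, ensure_ascii=False)


talk_markup = event_json_ld(
    name="Advanced on-page SEO",
    start_date="2013-11-18",
    venue="CAMON Madrid",
    city="Madrid",
    performer="Rubén Martínez",
)

if __name__ == "__main__":
    print(talk_markup)
```

The resulting string would normally be embedded in the page inside a `script` element of type `application/ld+json`, and a structured data testing tool can then confirm which properties are detected.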

Slide 32

Slide 32 text

“Normal” organic results; the second enjoys sitelinks for higher visibility and CTR
Exceptionally visible organic result with itemised deep links thanks to Structured Data tagging
The section with a salmon-coloured background contains text links of Google AdWords

Slide 33

Slide 33 text

These are sitelinks – they are great for visibility and CTR but you do not have prior control over them
Note: Sitelinks can be removed via Google Webmaster Tools

Slide 34

Slide 34 text

Example of structured markup in retail
Prominent results at the top of Google SERP
Source: groovecommerce.com

Slide 35

Slide 35 text

Other examples of structured data
Classifieds, aggregators, online music stores…
Source: http://support.google.com

Slide 36

Slide 36 text

Engagement
Tools: Google Analytics and server logs
Usage metrics, e.g. conversions (goals), time per page, pages/session, social signals, etc. are now part of SEO.
Google Analytics lets you detect losses in the navigation flow of your website.

Slide 37

Slide 37 text

Thank you
If you enjoyed it, engage with us!
@tucamon @paradigmate @rubenmartinezs