Slide 1

Slide 1 text

Architecting 3D content: what video structuring can teach us about metaverse SEO [an experiment]

Slide 2

Slide 2 text

The experiment: Swiss raclette servings from different angles, in different setups, with different ingredients, combined in one video.

Slide 3

Slide 3 text

Let’s quickly list the top known video ranking factors together

Slide 4

Slide 4 text

Is your video search strategy resilient?

Slide 5

Slide 5 text

Research-backed & business-focused computer science engineer, specialising in content engineering & end-to-end SEO. What shaped me as an SEO: where I studied, worked, published in, and visited.

Slide 6

Slide 6 text

What shaped me as an SEO: worked in entertainment and learned creative approaches from award-winning journalists, video editors & producers, famous faces, and marketing directors since 2010, thanks to Macedonian Idol, Dragi Nedelchevski, Igor Tomeski, Samir Ljuma.

Slide 7

Slide 7 text

We need more open, scientific, actionable research in our industry. I practice what I preach and…

Slide 8

Slide 8 text

my imperative is that marketing is the GENEROUS act of helping other people achieve their goals and an opportunity to SERVE

Slide 9

Slide 9 text

Let’s see what Google patents hint at

Slide 10

Slide 10 text

Patents

Slide 11

Slide 11 text

Excerpt(s) Summary: more engagement means better ranking. Evidence: “...means for determining a user engagement value for the media item based on at least one user shares of the media item, user indications of interest in the media item, user comments on the media item…”

Slide 12

Slide 12 text

Excerpt(s) Summary: more engagement means better ranking. Evidence: “...A score for a media item is computed by determining a plurality of positive user actions associated with the media, combining a plurality of score contributions from the plurality of positive user actions to determine a value for the score, and applying an exponential decay to the value for the score. The media items are ranked based on the scores…”
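A minimal sketch of the scoring idea in the excerpt above: sum weighted positive user actions, then apply an exponential decay. The excerpt gives no numbers, so the action weights, the half-life, and the choice to decay by the item's age in days are all assumptions made for illustration. This is not Google's actual formula.

```python
import math

# Hypothetical weights per positive action; the patent excerpt gives no values.
ACTION_WEIGHTS = {"share": 3.0, "like": 1.0, "comment": 2.0}


def engagement_score(actions, age_days, half_life_days=7.0):
    """Combine weighted positive actions, then decay the sum exponentially.

    `actions` maps an action name to its count. Decaying by age with a
    7-day half-life is an assumption; the excerpt only says a decay is applied.
    """
    raw = sum(ACTION_WEIGHTS.get(name, 0.0) * count
              for name, count in actions.items())
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return raw * decay


def rank(items):
    """Order media items from highest to lowest decayed engagement score."""
    return sorted(
        items,
        key=lambda item: engagement_score(item["actions"], item["age_days"]),
        reverse=True,
    )
```

Under these assumptions, a fresh video with few likes can outrank an older video with more likes, because the older score has decayed.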

Slide 13

Slide 13 text

Patents

Slide 14

Slide 14 text

Excerpt(s) Summary: embedding videos can lead to better ranking. Evidence: “...A user may thereby be presented with relevant search results, ranked by or including content sharing data, for example how often the item of content has been shared, and possibly in conjunction with other ranking data (internal and/or external reference).”

Slide 15

Slide 15 text

What if I ignore classic SEO tips + these G-patents for video ranking optimization?

Slide 16

Slide 16 text

What could be so special about a video that has only 700+ views and 20+ likes on YouTube to date?

Slide 17

Slide 17 text

New idea! Cross-functionality: apply known computer science techniques to a new type of problem

Slide 18

Slide 18 text


Slide 19

Slide 19 text

Excerpt(s) Summary: demonstrated ability to detect and predict 3D shapes. Evidence: “...The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes…Thus our model is able to extract shapes without access to groundtruth shape information in the target dataset.”

Slide 20

Slide 20 text

Search & CTR: efforts for displaying products in 3D & augmented reality in SERPs

Slide 21

Slide 21 text

3D synergy: Search, Generative AI, Metaverse. 3D as a key for immersive experiences

Slide 22

Slide 22 text

Reinvent your SEO with 3D: dissecting everything step by step

Slide 23

Slide 23 text

My real video talk - verified by Yandex leak too! Lacking: no synonyms for Swiss raclette, no tags, no comments, no shares, no description, no ad or email campaign, no user signals, no social signals, no language settings, no captions, no start and end screens, undefined location/geography, no established niche YouTube channel fan base (core audience), no history of channel advertising of any kind, no embeds, no schema markup, no backlinks, no link text on YouTube, no link depth, URL length and slug were defined by Google, my channel & personal region did not overlap with Switzerland, no tagged products, not even in a playlist!

Slide 24

Slide 24 text

My real video talk. Implemented: short video (multiple public Instagram Swiss raclette videos, filtered and combined), verified host (Google), simple filename (raclette.mp4), good objects, quality title (a longer one combining topical entities = lemmas), no prohibited content, had a YT channel with some history already in place, I was nearby, in Germany, during the video’s lifetime. Approach: apply computer science knowledge to video.

Slide 25

Slide 25 text

Deep dive insights

Slide 26

Slide 26 text

Over 87% of the traffic came from content intelligence features!

Slide 27

Slide 27 text

Google Cloud Vision API + Video Intelligence API

Slide 28

Slide 28 text

Can we deconstruct videos by using computer vision? Yes, computer vision technology can be used to deconstruct videos. Computer vision algorithms can analyze video frames to extract and process visual information, such as object detection, image segmentation, optical flow, etc.

Slide 29

Slide 29 text

Popular computer vision algorithms for video deconstruction 1. Object Detection: YOLO, Faster R-CNN, RetinaNet. 2. Image Segmentation: Mask R-CNN, U-Net, DeepLabv3+. 3. Optical Flow: Farneback, Lucas-Kanade. 4. Action Recognition: Two-Stream Convolutional Networks, Temporal Segment Networks (TSN), 3D Convolutional Neural Networks (3D-CNN). 5. General Video Analysis: Keyframe Extraction, Video Summarization. 6. Object Tracking: KCF, Deep SORT, GOTURN.
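To make one item from the list concrete, here is a minimal sketch of keyframe extraction by simple frame differencing. It is a pure-Python illustration on toy grayscale frames (lists of pixel rows), not the pipeline used in the experiment. The function names and the threshold value are assumptions.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count if count else 0.0


def extract_keyframes(frames, threshold=30.0):
    """Return indices of frames that differ strongly from the last kept keyframe.

    The first frame is always kept. The threshold of 30 (on a 0-255 scale)
    is an arbitrary illustrative value.
    """
    if not frames:
        return []
    keyframes = [0]
    for index in range(1, len(frames)):
        last_kept = frames[keyframes[-1]]
        if mean_abs_diff(last_kept, frames[index]) > threshold:
            keyframes.append(index)
    return keyframes


# Toy 4x4 frames: two dark shots followed by two bright shots.
dark = [[10] * 4 for _ in range(4)]
slightly_lighter = [[12] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
sample_frames = [dark, slightly_lighter, bright, bright]
```

Calling `extract_keyframes(sample_frames)` keeps the first frame and the first bright frame, which is the scene change. In practice the frames would come from a video decoder, and an object detector would then run only on the kept frames.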

Slide 30

Slide 30 text

1. Optical character recognition or extracting written text from video (could be books, notes, shops’ names). 2. Object categorization or organizing objects by their look, shape, texture (items, people, animals, stuff). 3. Automatic speech recognition or what is said during the video. 4. Audio or other relevant sounds that can help grasp the topic of the video to match better (example: water, forest...), also in scene understanding. 5. Sentiment understanding or emotions during the video. 6. Safe search classification and which color scheme is used. 7. Even certain movements that people make, like the “what’s the time” gesture.

Slide 31

Slide 31 text

...a paradox! Assume that everything can be detected and identified vs. assume that everything can be misinterpreted or inappropriately tagged or classified. Google has a lot of data and engineering resources that we cannot get or implement, but we have a…

Slide 32

Slide 32 text

Are your data and story good enough?

Slide 33

Slide 33 text

3D schema markup: “The experiments we performed are clearly indicating that even when we use advanced algorithms like YOLO, there’s still a space for the objects to be incorrectly labeled in visual environments like videos and metaverse spaces (virtual reality and augmented reality platforms). Having this in mind, we need to find a more structured way of providing 3D information to search engines.” - Emilija Gjorgjevska, WordLift’s blog https://wordlift.io/blog/en/metaverse-seo/
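One possible way to provide 3D information in a structured form is JSON-LD using the schema.org `3DModel` type. The sketch below is a hedged illustration only: the helper name `build_3d_model_markup` and the example.com URL are placeholders, and it is not the markup used in the experiment.

```python
import json


def build_3d_model_markup(name, model_url, encoding_format="model/gltf-binary"):
    """Build a JSON-LD dict describing a 3D asset with schema.org's 3DModel type.

    `model/gltf-binary` is the registered media type for .glb files.
    The helper and its arguments are hypothetical.
    """
    return {
        "@context": "https://schema.org",
        "@type": "3DModel",
        "name": name,
        "encoding": [
            {
                "@type": "MediaObject",
                "contentUrl": model_url,
                "encodingFormat": encoding_format,
            }
        ],
    }


# Placeholder values, for illustration only.
markup = build_3d_model_markup(
    name="Swiss raclette serving",
    model_url="https://example.com/models/raclette.glb",
)
json_ld = json.dumps(markup, indent=2)
```

The resulting `json_ld` string could be placed in a `<script type="application/ld+json">` tag on the page that hosts the 3D asset. Whether and how a search engine uses it is not guaranteed.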

Slide 34

Slide 34 text

Apple. Seriously?

Slide 35

Slide 35 text

3D schema markup

Slide 36

Slide 36 text

Everything matters. Beware: sometimes the story behind the video won’t allow optimizing for objects etc. in the video. User signals like comments, likes, subscribers, and so on matter a lot. However, without a solid basis that focuses on how the video is created in the first place, the reach to other audiences is limited.

Slide 37

Slide 37 text

Scalable across platforms

Slide 38

Slide 38 text

Momentum: the creator era of AI-powered canvases

Slide 39

Slide 39 text

Connection to the metaverse: “You’ll own the things you create, build out and earn in the metaverse. Even more importantly, you will be able to monetise it. Creators will be highly incentivised to be present and create in this space.”

Slide 40

Slide 40 text

Objectverse: 3D prototyping

Slide 41

Slide 41 text

Connection to the metaverse: “We can now do generative AI for images. We can do it for videos. At the rate that it’s moving, you’ll do it for entire villages; 3D villages and landscapes and cities and so on. You’ll be able to assemble an example of an image and generate an entire 3D world.”

Slide 42

Slide 42 text

Key takeaway: 3D optimization works, even when you ignore other video SEO advice and guidelines

Slide 43

Slide 43 text

1. Harder to reverse engineer. 2. Harder to replicate. 3. Usually overlooked. 4. Usually lacking strategy.

Slide 44

Slide 44 text

We are at the festival of creativity, living in the most interesting time ever

Slide 45

Slide 45 text

To the people that lift us! Dominik Schwarz, Andrea Volpini, Astrid Kramer. Thanks. Me. Emi.

Slide 46

Slide 46 text

...questions? *All images were found on the Internet, except the YT channel one, and are not used for commercial purposes.