
Seed Selection for Genre Specific Search

This poster was presented at the IIIT-H R&D Showcase.

dharmeshkakadia

February 09, 2013

Transcript

  1. Seed Selection for Genre Specific Search
    Search and Information Extraction Lab
    P Nikhil Priyatam, Krish Perumal, Dharmesh Kakadia, Vasudeva Varma
    International Institute of Information Technology, Hyderabad
    AIM  
    •  This work aims to get a set of diverse seed URLs
    for genre specific search using Twitter data.
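    SYSTEM ARCHITECTURE [figure panel; not captured in transcript]
    PROPOSED ALGORITHM [figure panel; not captured in transcript]
    WORKING OF ALGORITHM [figure panel; not captured in transcript]
    EVALUATION ARCHITECTURE [figure panel; not captured in transcript]
    EXPERIMENTAL RESULTS [figure panel; not captured in transcript]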
    MOTIVATION  
    •  Coverage and diversity are crucial aspects of
    genre specific search engines. These depend
    largely on the initial set of seed URLs. There is
    no existing work that automates the process of
    seed URL selection with a focus on diversity.
    CONCLUSION  
    •  First work to automate the process of seed URL
    selection for genre specific search
    •  Addressed the issue of crawl diversity, which was
    hitherto neglected.
    DIVERSITY SCORES [figure panel; not captured in transcript]
    SIMILARITY  MEASURES  FOR  EDGES  
    •  Content overlap
    •  URL n-gram overlap
    •  Timestamp similarity
    •  Follower-followee relations or Retweets
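
The poster names the edge similarity measures but the transcript does not give their formulas. As a rough illustration only, the first three could be sketched as below. Everything here is an assumption rather than the authors' method: Jaccard overlap stands in for both content and URL n-gram overlap, an exponential decay stands in for timestamp similarity, and the function names, the n-gram size of 3, and the one-hour time scale are hypothetical choices. The follower-followee and retweet signal is left out because it needs social graph data.

```python
import math
import re


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_overlap(text_a, text_b):
    """Word-level Jaccard overlap between two tweet texts (assumed measure)."""
    def tokenize(text):
        return re.findall(r"\w+", text.lower())
    return jaccard(tokenize(text_a), tokenize(text_b))


def char_ngrams(s, n=3):
    """Set of character n-grams of a lowercased string."""
    s = s.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def url_ngram_overlap(url_a, url_b, n=3):
    """Jaccard overlap of character n-grams of two URLs (assumed measure)."""
    return jaccard(char_ngrams(url_a, n), char_ngrams(url_b, n))


def timestamp_similarity(t_a, t_b, scale_seconds=3600.0):
    """Exponential decay in the absolute gap between two Unix timestamps.

    scale_seconds is a hypothetical tuning parameter: a gap of one scale
    unit gives a similarity of exp(-1).
    """
    return math.exp(-abs(t_a - t_b) / scale_seconds)
```

Each function returns a value in [0, 1], so the scores could be used directly as edge weights or combined, for example by a weighted average. How the poster actually combines them is not stated in the transcript.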
