FOSS.IN/2010, Bengaluru, India

Semantic Content Repositories
Leveraging Drupal and other FOSS tools for a connected meaningful web
Pratul Kalia (lut4rp)
17th December, 2010
Semantic Web
• What is missing?
• Data processing by machines
• The most overused term today: Web 2.0
• Beyond the gloss, shine and reflections...
• Tim Berners-Lee and his Web 3.0
Agropedia: what is it?
• Semantically enabled agricultural knowledge (knowledge models)
• Read/write web (CMS built using Drupal)
• A “social networking” platform for agricultural knowledge exchange
Started at IITK with funding from NAIP (National Agricultural Innovation Project) and ICAR (Indian Council of Agricultural Research).
Now: 20 institutes and universities across India, with 100+ agricultural scientists.
Agrotagger
• What is it?
• Taxonomies
• Agrovoc
• Agrotags
• Document (pdf, ppt, odp, odt) conversion to text
• Stop-word removal and stemming of the converted text
• Intersection of the list of words with Agrovoc terms...
  • a bag of Agrovoc terms generated!
• Mapping of the bag of Agrovoc terms to Agrotags...
  • a set of Agrotags generated!
• Probability analysis of the generated Agrotags, based on...
  • Length of a phrase in words
  • Frequency of the words
  • Node degree of the candidate terms
  • Occurrence based on location of the terms
  • Appearance: binary variable to check the presence of the terms
• Extracted top “n” Agrotags as an output! Yikes!
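The Agrotagger steps above can be sketched roughly in Python. This is an illustrative toy, not the Agrotagger implementation: the vocabulary, the term-to-tag mapping, the suffix-stripping stemmer, and every function name are hypothetical stand-ins. The scoring uses word frequency only, whereas the slide lists five features (phrase length, frequency, node degree, location, appearance). It also assumes the document has already been converted to plain text.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "to", "for", "on", "are"}

# Toy stand-in for the Agrovoc vocabulary, stored in stemmed form (assumption).
AGROVOC_TERMS = {"rice", "wheat", "irrigat", "soil", "pest"}

# Toy stand-in for the Agrovoc-term-to-Agrotag mapping (assumption).
AGROVOC_TO_AGROTAG = {
    "rice": "cereals",
    "wheat": "cereals",
    "irrigat": "water management",
    "soil": "soil science",
    "pest": "plant protection",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(word):
    """Crude suffix stripping; a placeholder for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_agrotags(text, top_n=2):
    """Return the top_n tags for already-converted plain text."""
    # Stop-word removal and stemming.
    stems = [naive_stem(w) for w in tokenize(text) if w not in STOP_WORDS]
    # Intersection with the vocabulary gives a bag of Agrovoc terms.
    bag = Counter(s for s in stems if s in AGROVOC_TERMS)
    # Map each term to its tag, accumulating frequency as a simple score.
    tag_scores = Counter()
    for term, freq in bag.items():
        tag_scores[AGROVOC_TO_AGROTAG[term]] += freq
    # Keep the top "n" tags.
    return [tag for tag, _ in tag_scores.most_common(top_n)]


sample = ("Rice and wheat are irrigated in soils; "
          "irrigating rice fields controls pests.")
print(extract_agrotags(sample))
```

Swapping the frequency count for a weighted combination of the five listed features would only change the scoring loop; the rest of the pipeline stays the same.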