Semantic Content Repositories

Leveraging Drupal and other FOSS tools, for a connected meaningful web. Presented at FOSS.IN/2010, Bangalore.

Pratul Kalia

December 17, 2010

Transcript

  1. FOSS.IN/2010, Bengaluru, India
     Semantic Content Repositories
     Leveraging Drupal and other FOSS tools for a connected meaningful web
     Pratul Kalia (lut4rp), 17th December, 2010
  2. Semantic Web
     • What is missing?
     • Data processing by machines
     • The most overused term today - web 2.0
     • Beyond the gloss, shine and reflections...
     • Tim Berners-Lee and his Web 3.0
  3. The Boring Big Words
     • Metadata
     • Ontologies
     • RDF
     • Taxonomies
     • FOAF, DBpedia etc. etc.
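To make the RDF and FOAF terms above a little more concrete, here is a minimal sketch of what "metadata as triples" looks like. It is not from the talk: the subject IRI under example.org is a hypothetical placeholder, and the helper names (`iri`, `literal`, `to_ntriples`) are invented for illustration. Only the FOAF and RDF namespace IRIs are real.

```python
# Minimal sketch: describe a person with FOAF terms and print N-Triples.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples expects."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical subject IRI; a real site would mint its own.
speaker = iri("http://example.org/people/speaker#me")

triples = [
    (speaker, iri(RDF_TYPE), iri(FOAF + "Person")),
    (speaker, iri(FOAF + "name"), literal("Pratul Kalia")),
    (speaker, iri(FOAF + "nick"), literal("lut4rp")),
]

print(to_ntriples(triples))
```

Each line is one machine-readable statement (subject, predicate, object), which is the kind of data a program can process without scraping prose.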
  4. Enough of talk...
     • How do you actually *do* it?
     • Not in the future...
     • Today. Here. Now.
  5. Why Drupal?
     • Data abstraction
       • nodes - first class generic objects
     • Taxonomies
       • freeform, restricted, multilevel
     • RDFa support
  6. Why Drupal?
     • FOSS == community
     • Contributions:
       • faceted search
       • Apache Solr integration
     • It is already happening!
       • data.gov.uk - all Drupal!
  7. Semantics on the go
     • Why?
     • Cost, portability, ease of use
     • XML-RPC, REST
     • Android
     • Example - Drupal Services
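As a rough illustration of the XML-RPC option mentioned above, the sketch below builds the request body a mobile client might send to fetch one node. It is an assumption-laden example, not the talk's code: the method name `node.get` and the single node-id argument are assumed here, and the methods a given site exposes depend on its Services version and configuration. Only Python's standard `xmlrpc.client` module is used, and nothing is sent over the network.

```python
import xmlrpc.client

# Assumed method name; check what the target site actually exposes.
ASSUMED_METHOD = "node.get"


def build_node_request(nid):
    """Return the XML-RPC request body asking for the node with id `nid`."""
    return xmlrpc.client.dumps((nid,), methodname=ASSUMED_METHOD)


payload = build_node_request(42)
print(payload)

# Round-trip the payload to show what a server would decode from it.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

The payload is a small XML document, which is part of why the approach is cheap and portable: any client that can POST text over HTTP can talk to the site.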
  8. Agropedia: what is it?
     • Semantically enabled agricultural knowledge (knowledge models)
     • Read/write web (CMS built using Drupal)
     • A “social networking” platform for agricultural knowledge exchange
     Started at IITK with funding from NAIP (National Agricultural Innovation Project) and ICAR (Indian Council of Agricultural Research).
     Now, 20 institutes and universities across India with 100+ agricultural scientists.
  9. Agrotagger
     • What is it?
     • Taxonomies
       • Agrovoc
       • Agrotags
     • Document (pdf, ppt, odp, odt) conversion to text
     • Stop words removal and stemming of converted text
     • Intersection of list of words with Agrovoc terms...
       • a bag of Agrovoc terms generated!
     • Mapping of a bag of Agrovoc terms with Agrotags...
       • a set of Agrotags generated!
     • Probability analysis of Agrotags generated on...
       • Length of a phrase in words
       • Frequency of the words
       • Node Degree of the candidate terms
       • Occurrence based on location of the terms
       • Appearance: Binary Variable to check the presence of the terms.
     • Extracted top “n” Agrotags as an output! Yikes!
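The pipeline on this slide can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Agrotagger code: the stop-word list, the Agrovoc terms, and the Agrovoc-to-Agrotag mapping are tiny hypothetical stand-ins, the stemmer is a crude suffix stripper, and ranking uses word frequency only, leaving out the phrase-length, node-degree, location and appearance features listed above. Document conversion to text is assumed to have happened already.

```python
import re
from collections import Counter

# Hypothetical stand-ins; the real vocabularies are far larger.
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "for", "to", "are", "on"}
AGROVOC_TERMS = {"rice", "wheat", "irrigation", "soil", "pest"}
AGROVOC_TO_AGROTAG = {
    "rice": "cereals",
    "wheat": "cereals",
    "irrigation": "water management",
    "soil": "soil science",
    "pest": "plant protection",
}


def stem(word):
    """Crude suffix stripper; a placeholder for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def agrotags(text, n=3):
    """Return the top-n tags for already-converted plain text."""
    words = re.findall(r"[a-z]+", text.lower())
    # Stop-word removal, then stemming.
    stems = [stem(w) for w in words if w not in STOP_WORDS]
    # Intersection with the vocabulary gives a bag of Agrovoc terms.
    bag = Counter(s for s in stems if s in AGROVOC_TERMS)
    # Map the bag onto Agrotags, scoring each tag by term frequency only.
    scores = Counter()
    for term, freq in bag.items():
        scores[AGROVOC_TO_AGROTAG[term]] += freq
    return [tag for tag, _ in scores.most_common(n)]


sample = "Rice and wheat yields depend on irrigation. Pests damage rice in irrigated soils."
print(agrotags(sample, n=1))
```

Swapping the frequency count for a weighted score over the slide's five features would change only the scoring loop; the convert, filter, intersect and map steps stay the same.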
  10. Where is the code?
     • Efforts are on to open up the code
     • (edit: might be open before this talk actually happens)