
xml-motor ~ the what, why and how of this XML-parsing rubygem


rubygem xml-motor
~ What is it?
~ Why should you use it?
~ How do you use it?

Abhishek Kumar

March 24, 2012



  1. xml-motor. What it is: slide #2. Why you should use it: slides #3-6. How to use it: slides #7-12. AbhishekKr, http://www.twitter.com/aBionic, http://github.com/abhishekkr
  2. In late 2011, I started a new rubygem project for parsing XML and HTML. @Rubygems: http://rubygems.org/gems/xml-motor @GitHub: https://github.com/abhishekkr/rubygem_xml_motor

    I created it to test out my work on a compact, quick and easy XML-parsing algorithm, which you can see at @Slideshare: http://www.slideshare.net/AbhishekKr/xmlmotor

    Currently it is a non-native, completely independent library of fewer than 250 lines of Ruby, available as a simple rubygem. Once require-d, it lets you look up nodes in an easy freehand notation (like 'div.img') and match on one or more node attributes (like 'id="a1"' or ['type="color"', 'name="white"']).
  3. Current Features

    • A single method call parses the required XML nodes from content or from a file.
    • Use that single call only if you are going to parse the XML content once.
    • To search the same XML content more than once, follow the three-step approach shown in the examples on the closing slides.
    • It doesn't depend on the presence of any other system library; it is purely non-native.
    • It parses broken or corrupted XML/HTML content, working with whatever content is actually there.
    • It can search by node name, by node attributes, or by both.
  4. Uses freehand notation to retrieve XML nodes. If your XML looks like

        <library>...
          <book> <title>ABC</title> <author>CBA</author> </book>...
          <book> <title>XYZ</title> <authors> <author>XY</author><author>YZ</author> </authors> </book>...
        </library>

    and you look for 'book.author', then you'll get back ['CBA', 'XY', 'YZ']. That means the child node can be at any depth inside the parent node. By default the results come back without the tags; there is a switch to get the nodes themselves.
  5. To filter nodes on the basis of attributes, you can provide a single attribute or multiple attributes. These attribute searches can be combined with freehand node-name searches. Readme (a bit weird, have to loosen it up): https://raw.github.com/abhishekkr/rubygem_xml_motor/master/README
  6. Features To Come

    • Work on making it more performance-efficient.
    • A limit on the result nodes retrieved, counted from the start or end of the matching nodes.
    • A multi-node, attribute-based filter for a hierarchical node search.
    • More common CSS-selector style. The capability is already present through the attribute-based search; it just needs a mapping method.
  7. Say you have an XML file 'dummy.xml', with this data:

        <dummy>
          <ummy> <mmy class='sys'>non-native</mmy> </ummy>
          <ummy> <mmy class='sys'> <my class='sys' id='mem'>compact</my> </mmy> </ummy>
          <mmy type='user'> <my class='usage'>easy</my> </mmy>
        </dummy>
  8. It's available at rubygems.org; install it with

        $ gem install xml-motor

    Include it in your Ruby code:

        #!/usr/bin/env ruby
        require 'xml-motor'

    Get the XML filename and/or the XML data:

        fyl = File.join(File.expand_path(File.dirname(__FILE__)), 'dummy.xml')
        xml = File.open(fyl, 'r') { |fr| fr.read }
  9. One-time XML parsing directly from a file:

        XMLMotor.get_node_from_file(fyl, 'ummy.mmy', 'class="sys"')

    Result: ["non-native", "\n <my class=\"sys\" id=\"mem\">compact</my>\n "]

    One-time XML parsing directly from content:

        XMLMotor.get_node_from_content(xml, 'dummy.my', 'class="usage"')

    Result: ["easy"]
  10. The way to go when you run several node searches on the same XML: split and index it once, then reuse both.

        xsplit = XMLMotor.splitter(xml)
        xtags = XMLMotor.indexify(xsplit)

    • Plain node-name search in freehand notation:

        XMLMotor.xmldata(xsplit, xtags, 'dummy.my')

    Result: ["compact", "easy"]

    • Searching for values of the required nodes, filtered by attribute only:

        XMLMotor.xmldata(xsplit, xtags, nil, 'class="usage"')

    Result: ["easy"]
  11. • Searching for values of the required nodes, filtered by freehand tag-name notation and an attribute:

        XMLMotor.xmldata(xsplit, xtags, 'dummy.my', 'class="usage"')

    Result: ["easy"]

    • Searching for values of the required nodes, filtered by freehand tag-name notation and multiple attributes:

        XMLMotor.xmldata(xsplit, xtags, 'dummy.my', ['class="sys"', 'id="mem"'])

    Result: ["compact"]
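
To make the freehand matching on the slides a bit more concrete, here is a rough, self-contained sketch in plain Ruby. It is not xml-motor's code and makes no claim about how the gem works internally: `toy_split`, `toy_find` and `path_match?` are made-up names. It only illustrates two ideas from the slides, splitting the content once and reusing it for several searches, and matching a child at any depth under its parent with an optional attribute filter. The sample data is a trimmed version of the library and dummy.xml content shown earlier, and the sketch assumes reasonably well-formed input.

```ruby
# Toy sketch only: NOT xml-motor's implementation. All method names here are
# hypothetical and exist just for this illustration.

# Split raw XML into a flat list of tag tokens and text tokens.
def toy_split(xml)
  xml.scan(/<[^>]+>|[^<]+/)
end

# True when the innermost open node is the last selector name and the earlier
# selector names appear, in order, somewhere among its ancestors.
def path_match?(stack, wanted)
  return false unless stack.last == wanted.last
  from = 0
  wanted[0...-1].each do |name|
    pos = stack[from...-1].index(name)
    return false unless pos
    from += pos + 1
  end
  true
end

# Return the inner content of every node matching the selector (e.g. 'book.author')
# whose opening tag also contains all of the given attribute strings.
def toy_find(tokens, selector, attrs = [])
  wanted = selector.split('.')
  stack = []      # names of the nodes currently open
  capturing = []  # [depth, buffer] for each matching node not yet closed
  results = []
  tokens.each do |tok|
    if tok.start_with?('</')
      results << capturing.pop[1] if !capturing.empty? && capturing.last[0] == stack.size
      capturing.each { |_, buf| buf << tok }
      stack.pop
    elsif tok.start_with?('<') && !tok.end_with?('/>') && !tok.start_with?('<?', '<!')
      capturing.each { |_, buf| buf << tok }
      stack.push(tok[/\A<([^\s>]+)/, 1])
      normalized = tok.tr("'", '"') # treat single and double quotes alike
      if path_match?(stack, wanted) && Array(attrs).all? { |a| normalized.include?(a) }
        capturing.push([stack.size, String.new])
      end
    else
      capturing.each { |_, buf| buf << tok } # text, self-closing tags, comments
    end
  end
  results
end

LIBRARY_XML = '<library><book><title>ABC</title><author>CBA</author></book>' \
              '<book><title>XYZ</title><authors><author>XY</author>' \
              '<author>YZ</author></authors></book></library>'
DUMMY_XML = "<dummy><ummy><mmy class='sys'>non-native</mmy></ummy>" \
            "<ummy><mmy class='sys'><my class='sys' id='mem'>compact</my></mmy></ummy>" \
            "<mmy type='user'><my class='usage'>easy</my></mmy></dummy>"

# Split once, then run as many searches as needed on the same tokens.
lib_tokens = toy_split(LIBRARY_XML)
p toy_find(lib_tokens, 'book.author')                               # => ["CBA", "XY", "YZ"]

dummy_tokens = toy_split(DUMMY_XML)
p toy_find(dummy_tokens, 'dummy.my')                                # => ["compact", "easy"]
p toy_find(dummy_tokens, 'dummy.my', ['class="sys"', 'id="mem"'])   # => ["compact"]
```

The sketch keeps a stack of open node names, so 'book.author' matches an author directly under book as well as one nested inside authors. For real work, use the gem's own `XMLMotor` calls shown on the slides.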