
xml-motor ~ the what, why and how of this xml-parsing rubygem


http://rubygems.org/gems/xml-motor

rubygem xml-motor
~ What is it?
~ Why should you use it?
~ How do you use it?

Abhishek Kumar

March 24, 2012

Transcript

  1. xml-motor
    What it is : slide#2
    Why you should use it : slide#3-6
    How to use it : slide#7-12
    AbhishekKr
    http://www.twitter.com/aBionic
    http://github.com/abhishekkr


  2. In late 2011 I started a new rubygem project for parsing XML and HTML.
    @Rubygems: http://rubygems.org/gems/xml-motor
    @GitHub : https://github.com/abhishekkr/rubygem_xml_motor
    I created it to try out my compact, quick and easy
    XML-parsing algorithm, which you can see at
    @Slideshare: http://www.slideshare.net/AbhishekKr/xmlmotor
    Currently it is a non-native, completely independent gem of
    fewer than 250 lines of Ruby. Require it and query with an
    easy freehand notation (like 'div.img'), matching on one or
    several node attributes (like 'id="a1"' or
    ['type="color"', 'name="white"']).


  3. Current Features

    A single method call parses the required XML nodes from
    content or from a file.

    Use it only if you are going to parse that XML content once.

    To query the same XML content more than once, follow the
    3-step approach shown in the examples on the closing slides.

    It doesn't depend on any other system library; it is
    purely non-native.

    It parses broken or corrupted XML/HTML content correctly, for
    whatever content is present.

    It can look up results by node name, by node attributes,
    or by both.


  4. Uses freehand notation to retrieve XML nodes.
    If your XML holds the values
    '...
    ABC CBA ...
    XYZ
    XY YZ
    ...'
    and you look for 'book.author',
    then you'll get back ['CBA', 'XY', 'YZ'].
    That means the child node can be at any
    depth inside the parent node.
    By default the results come back without the tags; a
    switch returns the nodes with their tags.
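
The "any depth" rule can be illustrated with a small pure-Ruby sketch. It is not xml-motor's own algorithm, only a minimal stand-in that assumes well-formed XML. The sample document is hypothetical: the text values come from the slide, while the tag layout (`library`, `title`, `authors`) and the helper names `path_matches?` and `freehand_find` are assumptions made for illustration.

```ruby
# Hypothetical sample document: only the text values come from the slide,
# the tag layout (library, title, authors) is assumed for illustration.
SAMPLE_XML = <<~XML
  <library>
    <book><title>ABC</title><author>CBA</author></book>
    <book>
      <title>XYZ</title>
      <authors><author>XY</author><author>YZ</author></authors>
    </book>
  </library>
XML

# True when the open-tag stack ends in the last path step and contains the
# earlier steps, in order, anywhere above it.
def path_matches?(stack, steps)
  return false unless stack.last == steps.last

  matched = 0
  stack.each { |tag| matched += 1 if matched < steps.size && tag == steps[matched] }
  matched == steps.size
end

# Returns the inner text of every node reached by a dotted path such as
# 'book.author'. Assumes well-formed XML without self-closing tags,
# comments or CDATA.
def freehand_find(xml, path)
  steps = path.split('.')
  stack = []    # names of the tags currently open
  pending = []  # [depth, text so far] for matched nodes still open
  results = []

  xml.scan(%r{<(/?)([\w:-]+)[^>]*>|([^<]+)}) do |closing, name, text|
    if text
      pending.each { |hit| hit[1] << text }
    elsif closing.empty?
      stack.push(name)
      pending.push([stack.size, String.new]) if path_matches?(stack, steps)
    else
      results << pending.pop[1].strip if pending.any? && pending.last[0] == stack.size
      stack.pop
    end
  end
  results
end

p freehand_find(SAMPLE_XML, 'book.author')
```

The second book nests its authors one level deeper, yet both are found, because a path step only has to appear somewhere above the next one rather than directly above it.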


  5. To filter nodes by attribute, provide a single attribute
    or several.
    Attribute searches can be combined
    with freehand node-name searches.
    Readme (a bit weird, have to loosen it up):
    https://raw.github.com/abhishekkr/rubygem_xml_motor/master/README
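
As a rough illustration of attribute filtering, the sketch below checks whether one opening tag carries every requested attribute. It is an assumption about how such a filter could work, not the gem's implementation; `tag_attributes` and `attributes_match?` are hypothetical helper names, and the filter strings follow the 'name="value"' form used on these slides.

```ruby
# Parses the attributes of one opening tag into a Hash, e.g.
# '<my class="sys" id="mem">' becomes {"class"=>"sys", "id"=>"mem"}.
def tag_attributes(open_tag)
  open_tag.scan(/([\w:-]+)\s*=\s*(["'])(.*?)\2/).to_h { |name, _quote, value| [name, value] }
end

# True when the tag carries every requested attribute. `filters` is one
# 'name="value"' string, an array of them, or nil for "no filter".
def attributes_match?(open_tag, filters)
  present = tag_attributes(open_tag)
  Array(filters).all? do |filter|
    name, value = filter.split('=', 2)
    present[name.strip] == value.to_s.strip.gsub(/\A["']|["']\z/, '')
  end
end

tag = '<my class="sys" id="mem">'
puts attributes_match?(tag, ['class="sys"', 'id="mem"'])
puts attributes_match?(tag, 'class="usage"')
```

Wrapping the filter in `Array(...)` lets a single string and an array of strings share one code path, which mirrors the single-or-multiple choice described on the slide.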


  6. Features To Come
    Improve performance.
    Limit the result nodes retrieved, counting from the start or end
    of the matching nodes.
    Multi-node attribute-based filters for hierarchical
    node searches.
    Add common CSS-selector-style queries; the capability is
    already present through attribute-based search and only
    needs a mapping method.


  7. USAGE
    code we are going to try:
    https://github.com/abhishekkr/axml-motor/tree/master/ruby/examples


  8. Say you have an XML file 'dummy.xml' whose nodes hold the values

    non-native
    compact
    easy


  9. It's available at rubygems.org; install it with
    $ gem install xml-motor
    Include it in your Ruby code:
    #!/usr/bin/env ruby
    require 'xml-motor'
    Get the XML filename and/or the XML data:
    fyl = File.join(File.expand_path(File.dirname(__FILE__)), 'dummy.xml')
    xml = File.open(fyl, 'r') { |fr| fr.read }
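
On current Ruby versions the same file read can be written more compactly. This is a small sketch under the assumption that the XML file sits next to the script; `read_xml` is a hypothetical helper name, while `File.read` and `__dir__` are standard Ruby.

```ruby
# Reads a whole XML file that lives in `dir`; File.read opens, reads and
# closes the file in one call.
def read_xml(dir, name = 'dummy.xml')
  File.read(File.join(dir, name))
end

# In a script, __dir__ is the directory of the current file:
#   xml = read_xml(__dir__)
```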


  10. One-time XML parsing directly from a file
    XMLMotor.get_node_from_file(fyl, 'ummy.mmy', 'class="sys"')
    Result:
    ["non-native", "\n id=\"mem\">compact\n "]
    One-time XML parsing directly from content
    XMLMotor.get_node_from_content(xml, 'dummy.my', 'class="usage"')
    Result:
    ["easy"]


  11. The way to go when running several node searches on the same XML
    xsplit = XMLMotor.splitter xml
    xtags = XMLMotor.indexify xsplit
    [] plain freehand node-name search:
    XMLMotor.xmldata(xsplit, xtags, 'dummy.my')
    Result: ["compact", "easy"]
    [] searching for values of required nodes filtered
    by attribute:
    XMLMotor.xmldata(xsplit, xtags, nil, 'class="usage"')
    Result: ["easy"]


  12. [] searching for values of required nodes filtered by
    freehand tag-name notation & attribute:
    XMLMotor.xmldata(xsplit, xtags,
    'dummy.my', 'class="usage"')
    Result: ["easy"]
    [] searching for values of required nodes filtered by
    freehand tag-name notation & multiple attributes:
    XMLMotor.xmldata(xsplit, xtags,
    'dummy.my', ['class="sys"', 'id="mem"'])
    Result: ["compact"]
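
The split-once, index-once, query-many pattern from these slides could be wrapped in a small class. This is only a sketch: `ReusableXml` and its `find` method are hypothetical names, the only gem calls used are `splitter`, `indexify` and `xmldata` as shown on the slides, and it assumes that passing nil for an omitted argument behaves like leaving it out (as the nil tag name on the earlier slide suggests). The backend is injectable so the class can be exercised without the gem installed.

```ruby
# Splits and indexes the XML once, then answers any number of queries.
class ReusableXml
  # `motor` is any object that provides splitter/indexify/xmldata. When it
  # is omitted, the XMLMotor module from the xml-motor gem is loaded.
  def initialize(xml, motor = nil)
    @motor = motor || load_xml_motor
    @split = @motor.splitter(xml)
    @tags  = @motor.indexify(@split)
  end

  # path:       freehand notation such as 'dummy.my', or nil
  # attributes: one 'name="value"' string, an array of them, or nil
  def find(path = nil, attributes = nil)
    @motor.xmldata(@split, @tags, path, attributes)
  end

  private

  def load_xml_motor
    require 'xml-motor'
    XMLMotor
  end
end

# With the gem installed:
#   doc = ReusableXml.new(xml)
#   doc.find('dummy.my')
#   doc.find('dummy.my', ['class="sys"', 'id="mem"'])
```

Keeping the split data and the tag index in instance variables means the costly steps run once per document, however many searches follow.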
