NoSQL War Stories meetup talk on Hadoop and Neo4j for network analysis

Friso van Vollenhoven

April 13, 2012

Transcript

  1. [Diagram: Cascading flow graph, from [head] to [tail], with one source feeding two branches, 'nodes' and 'edges'.]

    Source: GlobHfs[/Users/friso/Downloads/bview/alltxt.txt], 14 fields: 'proto', 'time', 'type', 'peerip', 'peeras', 'prefix', 'path', 'origin', 'nexthop', 'localpref', 'MED', 'community', 'AAGG', 'aggregator'

    'nodes' branch: Each('nodes')[PathToNodes[decl:'id', 'name']], Each('nodes')[FilterPartialDuplicates[decl:'id', 'name']], GroupBy('nodes')[by:['id']], Every('nodes')[First[decl:'id', 'name']], sink Hfs['TextDelimited[['id', 'name']]']['/tmp/nodes']

    'edges' branch: Each('edges')[PathToEdges[decl:'from', 'to', 'updatecount']], GroupBy('edges')[by:['from', 'to']], Every('edges')[Sum[decl:'updatecount'][args:1]], sink Hfs['TextDelimited[['from', 'to', 'updatecount']]']['/tmp/edges']
  2. • No SQL was used throughout the entire codebase (even though it was tempting to use Hive at one point)
    • You can find code here: https://github.com/friso/graphs
    • You can find me on Twitter here: @fzk
    • You can find me on e-mail here: [email protected]
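
The 'edges' branch of the flow graph on slide 1 (PathToEdges, then GroupBy on 'from' and 'to', then Sum over 'updatecount') can be pictured with a small in-memory sketch. This is not the code from the talk or the linked repository. It assumes that the 'path' field is a space-separated list of AS numbers, that each pair of consecutive hops becomes one edge with a count of 1, and that repeated hops are skipped. The function names and the sample paths are hypothetical.

```python
from collections import Counter


def path_to_edges(path):
    """Yield (from, to) pairs for consecutive hops in a space-separated path.

    Assumption: consecutive identical hops are skipped so that a repeated
    hop does not produce a self-loop.
    """
    hops = path.split()
    for src, dst in zip(hops, hops[1:]):
        if src != dst:
            yield (src, dst)


def count_edges(paths):
    """Group edges by (from, to) and sum one count per occurrence.

    This mirrors the GroupBy + Sum step of the flow graph, in memory.
    """
    counts = Counter()
    for path in paths:
        counts.update(path_to_edges(path))
    return counts


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from the real input file.
    sample_paths = ["1 2 3", "1 2 2 4"]
    for (src, dst), updatecount in sorted(count_edges(sample_paths).items()):
        print(src, dst, updatecount)
```

On Hadoop the same grouping and summing runs as a distributed job over the full input file; the sketch only shows the shape of the 'from', 'to', 'updatecount' records that end up in the edges output.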