glusterfs-pmux

maebashi
November 13, 2013

pmux: a lightweight MapReduce framework using GlusterFS
Presentation slides for Gluster Cloud Night, 2013/11/13
(held at Red Hat K.K.)

Transcript

  1. © 2013 Internet Initiative Japan Inc.

    pmux: a lightweight MapReduce framework using GlusterFS
    Internet Initiative Japan Inc.
    maebashi@iij.ad.jp
  2. Self-introduction

    • Takahiro Maebashi
    • Internet Initiative Japan Inc. (IIJ)
    • ITpro: IT Verification Lab -- "GlusterFS, a distributed file system: what happens in cases like this?"
      http://itpro.nikkeibp.co.jp/article/COLUMN/20130104/447701/
  3. GlusterFS @ IIJ

    Tokyo, Osaka, Matsue
  4. Container unit "IZmo" (Matsue Data Center Park)

    IT module air-conditioning unit
  5. GlusterFS servers in IZmo

    • The racks are arranged diagonally
  6. Today's Talk

  7. glusterfs-hadoop

  8. What is MapReduce?

    Distributed processing in two stages, Map and Reduce:
    (1) Map: extraction, transformation
    (2) Reduce: aggregation, tallying
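
As a rough single-process illustration of the two stages (this is not pmux itself), the sketch below first maps each line to the field of interest and then reduces by counting. The names `map_stage` and `reduce_stage` and the sample lines are made up for this example.

```ruby
# Map: extract/transform. Here, take the 2nd whitespace-separated field of each line.
def map_stage(lines)
  lines.map { |line| line.split[1] }.compact
end

# Reduce: aggregate. Here, count how often each extracted value occurs.
def reduce_stage(values)
  values.each_with_object(Hash.new(0)) { |v, counts| counts[v] += 1 }
end

sample = ["GET 200", "GET 404", "POST 200"]  # made-up input lines
counts = reduce_stage(map_stage(sample))
puts counts.inspect
```

In a distributed setting the map stage runs independently per input file, which is what makes it easy to spread across nodes.
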
  9. What is GlusterFS?

  10. What is GlusterFS? (2)

    (Example: a distributed volume)
    Data is distributed as whole files, and placement depends on the file name
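
The idea of placing whole files according to their names can be sketched as follows. This is only an illustration: the brick list is a placeholder, and CRC32 with a simple modulo is a stand-in, not the hash or layout logic that GlusterFS actually uses.

```ruby
require 'zlib'

# Hypothetical brick list; the host names are placeholders.
BRICKS = %w[ex00.example.com ex01.example.com ex02.example.com].freeze

# Choose a brick from the file name alone. CRC32 is a stand-in hash for
# illustration; it is not the hash function GlusterFS uses.
def brick_for(filename, bricks = BRICKS)
  bricks[Zlib.crc32(File.basename(filename)) % bricks.size]
end

%w[access_log.20131020 access_log.20131021].each do |name|
  puts "#{name} -> #{brick_for(name)}"
end
```

Because the location is derived from the name, any client can work out where a file lives without asking a central metadata server.
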
  11. What is pmux? (1)

    • Named after "pipeline multiplexer"
    • Written in Ruby
    • https://github.com/iij/pmux
    • https://forge.gluster.org/pmux
  12. What is pmux? (2)

    • A file-based map/reduce tool
    • Uses Unix standard input/output as its interface

    Example: distributed grep (the *.log arguments are files on GlusterFS)
    $ pmux --mapper="grep PATTERN" *.log
  13. What is pmux? (3)

  14. Install

    $ gem install pmux

    $ gem install pmux
    $ gem install gflocator
    $ sudo gflocator
  15. Execution Overview (1)

    MapReduce without a reduce phase

  16. 1. Locate the target files

    The pmux command is run on this host
    Read trusted.glusterfs.pathinfo
  17. Extended file attributes (xattr)

    • A file system feature that lets users associate metadata with files (Wikipedia)
    • GlusterFS uses extended file attributes as a mechanism for exchanging information with the outside
  18. Extended file attributes (2)

    $ sudo getfattr -n trusted.glusterfs.pathinfo \
        access_log.20131020
    # file: access_log.20131020
    trusted.glusterfs.pathinfo="(<DISTRIBUTE:d2r2-dht> (<REPLICATE:d2r2-replicate-0> <POSIX(/glusterfs/brick/d2r2):ex01.example.com:/glusterfs/brick/d2r2/log/0000/access_log.20131020> <POSIX(/glusterfs/brick/d2r2):ex00.example.com:/glusterfs/brick/d2r2/log/0000/access_log.20131020>))"
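
To show how a host and brick path could be pulled out of such a value, here is a minimal sketch. The regular expression and the name `brick_locations` are assumptions based only on the sample output above; this is not taken from the pmux or gflocator source.

```ruby
# Extract [host, path] pairs from a trusted.glusterfs.pathinfo value.
# The regexp is an assumption derived from the sample output on the slide.
def brick_locations(pathinfo)
  pathinfo.scan(/<POSIX\([^)]*\):([^:>]+):([^>]+)>/)
end

pathinfo = '(<DISTRIBUTE:d2r2-dht> (<REPLICATE:d2r2-replicate-0> ' \
  '<POSIX(/glusterfs/brick/d2r2):ex01.example.com:/glusterfs/brick/d2r2/log/0000/access_log.20131020> ' \
  '<POSIX(/glusterfs/brick/d2r2):ex00.example.com:/glusterfs/brick/d2r2/log/0000/access_log.20131020>))'

brick_locations(pathinfo).each { |host, path| puts "#{host} #{path}" }
```

For a replicated file the result lists every replica, so a scheduler can choose any of those hosts to process the file locally.
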
  19. Extended file attributes (3) (glusterfs-hadoop GlusterFSXattr.java)

  20. 2. Start pmux on each node

    [Diagram labels: dispatcher, worker]

  21. 3. Assign map tasks to the nodes

    Tasks are assigned to nodes (workers) dynamically
    [Diagram labels: dispatcher, worker]
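
Dynamic assignment can be illustrated with a shared queue from which idle workers pull the next task. The sketch below uses local threads as stand-ins for remote worker nodes; the function name, task names, and worker names are hypothetical, and pmux's real dispatcher talks to other hosts rather than to threads.

```ruby
# Workers pull the next task from a shared queue as soon as they are free,
# so faster workers naturally end up processing more tasks.
def run_dynamically(tasks, worker_names)
  queue = Queue.new
  tasks.each { |t| queue << t }
  queue.close                      # pop returns nil once the queue is drained
  results = Queue.new

  threads = worker_names.map do |name|
    Thread.new do
      while (task = queue.pop)
        results << [name, task]    # stand-in for actually running a map task
      end
    end
  end
  threads.each(&:join)

  Array.new(results.size) { results.pop }
end

assignments = run_dynamically(%w[a.log b.log c.log d.log e.log], %w[node0 node1])
assignments.each { |worker, task| puts "#{task} -> #{worker}" }
```

Which worker receives which task varies from run to run; the only guarantee is that every task is handled exactly once.
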
  22. 4. popen (run the map task)

    [Diagram labels: dispatcher, worker]
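
A map task run through popen might look roughly like this: the mapper command receives data on standard input and its standard output is collected. `run_map_task` is a hypothetical helper written for this example, and how pmux itself wires up the pipes is not shown on the slide.

```ruby
# Feed input to a shell command through a pipe and collect its stdout.
def run_map_task(command, input)
  IO.popen(command, "r+") do |io|
    io.write(input)
    io.close_write     # signal EOF so the command can finish
    io.read
  end
end

out = run_map_task("grep ERROR", "ok\nERROR disk\nok\nERROR net\n")
puts out
```

Writing everything before reading is fine for small inputs like this one; for large files the input would need to be streamed (or the file opened directly as the command's stdin) to avoid filling the pipe buffer.
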
  23. 5. Return the results to the dispatcher

    [Diagram labels: dispatcher, worker]
  24. Execution Overview (2)

    With a reduce phase
  25. 4. popen (run the map task)

    [Diagram labels: dispatcher, worker]
  26. 5. The mapper creates temporary files

    The mapper writes temporary files containing the intermediate results
    [Diagram labels: dispatcher, worker]
  27. 6. shuffle

    [Diagram labels: dispatcher, worker]
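
The slide does not say how pmux partitions intermediate data, so the following is a generic sketch of a shuffle step under the common assumption of hash partitioning: each key is routed to a reducer chosen by hashing the key. The function name and the sample pairs are made up.

```ruby
require 'zlib'

# Route each key/value pair to a reducer chosen by hashing the key, so that
# all values for the same key end up at the same reducer.
def shuffle(pairs, num_reducers)
  partitions = Array.new(num_reducers) { [] }
  pairs.each do |key, value|
    partitions[Zlib.crc32(key) % num_reducers] << [key, value]
  end
  partitions
end

pairs = [["200", 1], ["404", 1], ["200", 1], ["500", 1]]
shuffle(pairs, 2).each_with_index do |part, i|
  puts "reducer #{i}: #{part.inspect}"
end
```
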

  28. 7. Assign reduce tasks to the nodes

    [Diagram labels: dispatcher, worker]
  29. 8. Return the results to the dispatcher

    [Diagram labels: dispatcher, worker]

  30. example(1): counting status codes

    Count the occurrences of each HTTP status code in Apache logs

    $ pmux --mapper='cut -d" " -f 9' \
        --reducer='sort|uniq -c' /mnt/glusterfs/*.log
    176331 200
    106360 206
       809 400
     21852 403
       533 404
        27 406
       805 416
        25 500
  31. example(2): word count

    command line:
    $ pmux --mapper=map.rb --reducer=reduce.rb \
        --file=map.rb --file=reduce.rb \
        /mnt/glusterfs/*.txt

    map.rb:
    #! /usr/bin/ruby -an
    # -n loops over input lines; -a splits each line into the array $F
    $F.each {|f| print "#{f}\t1\n"}

    reduce.rb:
    #! /usr/bin/ruby -an
    # Sum the counts per word, then print the totals at end of input
    BEGIN {$c = Hash.new 0}
    $c[$F[0]] += $F[1].to_i
    END {$c.each {|k, v| print "#{k} #{v}\n"}}
  32. Performance

    Input: packet capture logs (by tcpdump) like the following

    14:00:00.416011 IP 21.44.60.29.http > 170.73.162.175.58546: . 3523999974:3524001422(1448) ack 3401170238 win 1716 <nop,nop,timestamp 1070614671 1955062367>

    Task: extract the most frequent IP address in each file
    8344 files, 500K lines/file, about 4 billion lines in total
  33. map command

    --mapper='egrep -o "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"|
      sort|uniq -c|sort -nr|head -1'
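
For readers less used to shell pipelines, the same per-file computation can be written in Ruby roughly as follows. This is an equivalent sketch for illustration, not something the talk used; `most_frequent_ip` is a made-up name, and the sample lines are shortened, invented variants of the tcpdump line shown earlier.

```ruby
# Same pattern as the egrep expression: four dot-separated runs of digits.
IP_PATTERN = /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/

# Return [count, ip] for the IPv4-looking string that occurs most often.
def most_frequent_ip(text)
  counts = text.scan(IP_PATTERN).each_with_object(Hash.new(0)) { |ip, h| h[ip] += 1 }
  ip, count = counts.max_by { |_, c| c }
  [count, ip]
end

# Invented sample input for illustration only.
sample = <<~LOG
  14:00:00.416011 IP 21.44.60.29.http > 170.73.162.175.58546: . ack 1 win 1716
  14:00:00.416500 IP 21.44.60.29.http > 10.0.0.7.58547: . ack 1 win 1716
LOG

count, ip = most_frequent_ip(sample)
puts "#{count} #{ip}"
```

When several addresses share the highest count, this sketch and the `sort -nr | head -1` pipeline may pick different ones.
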
  34. Result

    8 hr 49 min 6 sec: 1 node, without pmux
  35. Result

    8 hr 49 min 6 sec: 1 node, without pmux
    1 min 45 sec: 60 nodes (8 cores per node)
    About 300x faster!
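
The 300x figure follows directly from the two timings; a quick check of the arithmetic:

```ruby
single_node = 8 * 3600 + 49 * 60 + 6   # 8 hr 49 min 6 sec, in seconds
with_pmux   = 1 * 60 + 45              # 1 min 45 sec, in seconds

speedup = single_node.to_f / with_pmux
puts format("speedup: %.1fx", speedup)
```
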