
Building a Real-Time Stream Processing Pipeline with Apache Flink #scalafukuoka /processing-platform-with-apache-flink


Lightning talk slides from Scala Fukuoka 2017.


Manabu Matsuzaki

July 29, 2017

Transcript

  1. Building a Real-Time Stream Processing Pipeline with Apache Flink. Scala Fukuoka 2017, 2017/07/29, @matsumana

  2. Self-introduction
     • Name: Manabu Matsuzaki
     • Affiliation: LINE Fukuoka Corp.
     • Role: SRE
     • Twitter: @matsumana
  3. Agenda
     • Overview of Flink
     • Demo: aggregating and visualizing Twitter data

  4. Overview of Flink

  5. Features
     • Streaming first
     • Fault tolerant
     • Scalable
     • Performance
  6. The logo is not cute

  7. Basic processing flow

  8. A variety of sources (input) and sinks (output) are available
     • Twitter (source)
     • Kafka (source/sink)
     • RabbitMQ (source/sink)
     • Apache NiFi (source/sink)
     • AWS Kinesis (source/sink)
  9. A variety of sources (input) and sinks (output) are available (continued)
     • HDFS (sink)
     • Elasticsearch (sink)
     • Cassandra (sink)
     • Redis (sink)
     • Flume (sink)
     • ActiveMQ (sink)
     • Third-Party Projects (e.g. Apache Zeppelin)
  10. Code examples

  11. You can write Flink jobs in either Java or Scala, but Scala is easier to write and easier to read

  12. // Word count in Scala
      // brings the DataSet API and its implicit type information into scope
      import org.apache.flink.api.scala._

      // set up the execution environment
      val env = ExecutionEnvironment.getExecutionEnvironment

      // get input data
      val text = env.fromElements(
        "To be, or not to be --that is the question:--",
        "Whether 'tis nobler in the mind to suffer",
        "The slings and arrows of outrageous fortune",
        "Or to take arms against a sea of troubles")

      // count: split lines into words, pair each word with 1, sum per word
      val counts = text
        .flatMap { _.toLowerCase.split("\\W+") }
        .map { (_, 1) }
        .groupBy(0)
        .sum(1)

      // emit result and print result
      counts.print()
  13. // Word count in Java

      // set up the execution environment
      // (omitted)

      // get input data
      // (omitted)

      // count
      // groupBy is a DataSet API method, so the result is typed as a DataSet
      DataSet<Tuple2<String, Integer>> counts = text
          .flatMap((String line, Collector<String> out) -> {
              String[] tokens = line.toLowerCase().split("\\W+");
              Arrays.stream(tokens)
                  .forEach(out::collect);
          })
          .map(s -> new Tuple2<>(s, 1))
          .groupBy(0)
          .sum(1);

      // emit result and print result
      // (omitted)
  14. The Flink DataStream API Programming Guide also has sample code: https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html

  15. Flink is also used at LINE

  16. This was presented at last year's conference, so please see that talk for details (the video and slides are publicly available).
     • LINE DEVELOPER DAY 2016
       https://engineering.linecorp.com/ja/blog/detail/87
     • Session B-6: New stream processing platform with Apache Flink
  17. For reference, the access log aggregation mentioned in that conference talk covers:
     • Count of each HTTP status code
     • Ratio of HTTP status codes (2xx, 3xx, 4xx, 5xx)
     • Distribution of response times
     • Response time percentiles (avg, min, 50, 90, 95, 98, 99)
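To make the metrics on this slide concrete, here is a minimal sketch in plain Java, without Flink, of how status-class counts, ratios, and response-time percentiles could be computed over a small batch of records. It is not the implementation used at LINE: the class name `AccessLogStats`, its method names, the sample data, and the choice of the nearest-rank percentile definition are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not taken from the talk.
class AccessLogStats {

    /** Counts requests per status class, e.g. 200 and 204 both count toward "2xx". */
    public static Map<String, Integer> countByStatusClass(List<Integer> statusCodes) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int code : statusCodes) {
            counts.merge((code / 100) + "xx", 1, Integer::sum);
        }
        return counts;
    }

    /** Share of one status class among all counted requests, from 0.0 to 1.0. */
    public static double ratio(Map<String, Integer> counts, String statusClass) {
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        return total == 0 ? 0.0 : counts.getOrDefault(statusClass, 0) / (double) total;
    }

    /** Nearest-rank percentile (an assumed definition): the smallest sample
     *  such that at least p percent of all samples are less than or equal to it. */
    public static long percentile(List<Long> responseTimesMs, double p) {
        if (responseTimesMs.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        List<Long> sorted = new ArrayList<>(responseTimesMs);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for parsed access log records.
        List<Integer> statuses = Arrays.asList(200, 200, 201, 302, 404, 500, 200, 204);
        List<Long> timesMs = Arrays.asList(12L, 15L, 9L, 30L, 250L, 18L, 22L, 40L, 11L, 14L);

        Map<String, Integer> counts = countByStatusClass(statuses);
        System.out.println("counts by class: " + counts);
        System.out.println("2xx ratio: " + ratio(counts, "2xx"));
        for (double p : new double[] {50, 90, 95, 98, 99}) {
            System.out.println("p" + (int) p + ": " + percentile(timesMs, p) + " ms");
        }
    }
}
```

In a streaming job the same calculations would presumably run per time window rather than over a fixed list, and exact percentiles over large windows are often replaced by approximations; neither aspect is shown here.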
  18. Next: a demo that pulls data from Twitter and visualizes it

  19. Overview diagram

  20. Thank you :)