Networking & Number Crunching with C++: from Incremental Statistical Computation to Online Machine Learning (CppCon 2015)

Our task is simple: Get some numbers, crunch them, get some more -- lather, rinse, repeat.

Caveats:

- the numbers arrive over the network,
- our number crunching may take time,
- combining numbers coming from multiple network feeds can be useful.

In this session we will examine multiple design points through several exploratory examples.

The questions we'd like to answer:

- networking & online numerics -- what are the choices and libraries?
- how do we test correctness?
- how do we measure performance -- and keep track of how much computation we can perform as the data arrives?
- task distribution & collection design -- how can task-based concurrency help?
- how does the choice of a network library (and the implied level of abstraction) influence the available concurrency design options?
- can the improvements to std::future coming in the Concurrency TS (composition, continuation) help -- and if so, how?

Matt P. Dziubinski

September 25, 2015

Transcript

  1. C++, Networking, and Numerics: From Incremental Statistics to Online Machine Learning

    Matt P. Dziubinski, CppCon 2015 // @matt_dz
    Department of Mathematical Sciences, Aalborg University
    CREATES (Center for Research in Econometric Analysis of Time Series)
  2. Goal(s) (slide 2)

    do {
      • Get some numbers,
      • crunch them,
      • get some more
    } while (more_numbers);
  3. Outline (slide 3)

    • Takeaways
    • What I set out to do
    • What I actually did
    • Why and how what I actually did was very, very wrong
  4. Outline (slide 3)

    • Takeaways
    • What I set out to do
    • What I actually did
    • Why and how what I actually did was very, very wrong
    • What I should have done instead
  5. Outline (slide 4)

    • Failures
    • What I set out to do
    • What I actually did
    • Why and how what I actually did was very, very wrong
    • What I should have done instead
  6. Outline (slide 5)

    • Lessons learned
    • What I set out to do
    • What I actually did
    • Why and how what I actually did was very, very wrong
    • What I should have done instead
  7. Intro: oh... (slide 14)

    -- Edsger W. Dijkstra, “How do we tell truths that might hurt?”, June 18, 1975
  8. Is "My Bio" Slide Obligatory? (slide 15)

    -- Edsger W. Dijkstra, “How do we tell truths that might hurt?”, June 18, 1975 (emphasis mine)
  9. Intro: Takeaways (slide 33)

    Yale N. Patt, Microprocessor Performance, Phase 2: Can We Harness the Transformation Hierarchy
    http://hps.ece.utexas.edu/videos.html
  10. Intro - Eggdrop src/main.c (slide 51)

    int main(int arg_c, char **arg_v) {
      // ...
      debug0("main: entering loop");
      while (1) {
        mainloop(1);
      }
    }
  11. Intro - Eggdrop src/main.c (slide 52)

    int mainloop(int toplevel) {
      // ...
      char buf[520];
      // ...
      xx = sockgets(buf, &i);
      // ...
      if (xx >= 0) { /* Non-error */
        int idx;
        for (idx = 0; idx < dcc_total; idx++)
          // ...
      }
      // ...
    }
  12. Intro: (Recurring) Takeaways (slide 54)

    Yale N. Patt at Yale Patt 75 Visions of the Future Computer Architecture Workshop:
    “‘Are you a software person or a hardware person?’
    I’m a person
    this pigeonholing has to go
    We must break the layers
    Abstractions are great - AFTER you understand what’s being abstracted”
    • Yale N. Patt, 2013 IEEE CS Harry H. Goode Award Recipient Interview: https://youtu.be/S7wXivUy-tk
    • Yale N. Patt at Yale Patt 75 Visions of the Future Computer Architecture Workshop: https://youtu.be/x4LH1cJCvxs
  13. Intro: (Recurring) Takeaways (slide 55)

    “Finally, it is also very fortunate to see from a researcher’s point of view that many open and fundamental questions will definitely appear and that these will stimulate and keep our lives busy, hopefully for the next 100 years.”
    Hardware/Software Codesign: The Past, the Present, and Predicting the Future
    http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6172642
  14. Task(s) (slide 57)

    do {
      • Get some numbers,
      • crunch them,
      • get some more
    } while (more_numbers);
  15. Task(s) (slide 58)

    do {
      • Get some numbers,
        • network
      • crunch them,
        • incremental, online
      • get some more
    } while (more_numbers);
  16. Pattern-Oriented Software Architecture (POSA) (slide 71)

    http://www.cs.wustl.edu/~schmidt/POSA/
    • Pattern-Oriented Software Architecture: A System of Patterns, Volume 1
    • Pattern-Oriented Software Architecture: Patterns for Concurrent and Networked Objects, Volume 2
  17. Excuses (slide 74)

    • Make It Work
    • Make It Right
    • Make It Fast
      http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
    • Do The Simplest Thing That Could Possibly Work
      http://c2.com/cgi/wiki?DoTheSimplestThingThatCouldPossiblyWork
  18. Excuses Design Principles (slide 75)

    • Make It Work
    • Make It Right
    • Make It Fast
      http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
    • Do The Simplest Thing That Could Possibly Work
      http://c2.com/cgi/wiki?DoTheSimplestThingThatCouldPossiblyWork
  19. Catch: C++ Automated Test Cases in Headers I (slide 81)

    #include "catch.hpp"
    (also shown on this slide: full list of dependencies and complete build instructions)
  20. Catch: C++ Automated Test Cases in Headers II (slide 82)

    SCENARIO("API symbol query constructed correctly", "[symbol][query][unit]") {
      GIVEN("Query date parameters") {
        const date start_date = ...#1;
        const date end_date = ...#2;
        WHEN("the symbol is set to X") {
          const symbol_type symbol = "X";
          const auto uri = api::path() + api::symbol_query(symbol, start_date,
          THEN("the built URI is correct") {
            REQUIRE(uri == ...X...#1...#2);
          }
        }
      }
    }
  21. Testing? (slide 83)

    These things are “easy mode” for tests. -- Ben Deane
    https://github.com/boostcon/cppnow_presentations_2015/raw/master/files/testing-battlenet.pdf
    https://cppcon2015.sched.org/event/ac2534ecb08510c5810e7df34cdddb94
  22. net::download_http -> net::download_socket (slide 92)

    • cf. http://www.sfml-dev.org/tutorials/2.3/network-socket.php#non-blocking-sockets
    • http://boost.org/libs/utility
      http://www.boost.org/doc/libs/master/libs/utility/doc/html/string_ref.html
      http://theboostcpplibraries.com/boost.utility

    // Constructs from a NULL-terminated string
    basic_string_ref(const charT* str);
    // Constructs from a pointer, length pair
    basic_string_ref(const charT* str, size_type len);
  23. Boost.StringRef -- std::string_view (slide 93)

    Marshall Clow: string_view - when to use it, and when not.
    http://www.boost.org/doc/libs/release/libs/utility/doc/html/string_re
    http://en.cppreference.com/w/cpp/experimental/basic_string_view
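
Both `basic_string_ref` and the later standardized `std::string_view` (C++17) refer to characters without owning or copying them. As a hedged illustration, not code from the talk, the sketch below splits one comma-separated record into fields that point into the original buffer. It assumes a C++17 compiler, and `split_fields` is a hypothetical helper.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Split `line` at each `sep`. The returned views alias `line`'s storage;
// no characters are copied.
std::vector<std::string_view> split_fields(std::string_view line, char sep = ',') {
    std::vector<std::string_view> fields;
    std::size_t begin = 0;
    while (true) {
        const std::size_t end = line.find(sep, begin);
        if (end == std::string_view::npos) {
            fields.push_back(line.substr(begin));  // last (or only) field
            break;
        }
        fields.push_back(line.substr(begin, end - begin));
        begin = end + 1;  // skip the separator
    }
    return fields;
}
```

Because each view aliases the input, the buffer has to outlive the returned fields.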
  24. Performance Numbers: Sync (Single-Issue Sequential) (slide 94)

    id,symbol,count,time
    1,AAPL,565449,1.59043
    2,AXP,731366,3.43745
    3,BA,867366,5.40218
    4,CAT,830327,7.08103
    5,CSCO,400440,8.49192
    6,CVX,687198,9.98761
    7,DD,910932,12.2254
    8,DIS,910430,14.058
    9,GE,871676,15.8333
    10,GS,280604,17.059
    11,HD,556611,18.2738
    12,IBM,860071,20.3876
    13,INTC,559127,21.9856
    14,JNJ,724724,25.5534
    15,JPM,500473,26.576
    16,KO,864903,28.5405
    17,MCD,717021,30.087
    18,MMM,698996,31.749
    19,MRK,733948,33.2642
    20,MSFT,475451,34.3134
    21,NKE,556344,36.4545
  25. Performance Numbers: Async Pipeline (slide 95)

    id,symbol,count,time
    1,AAPL,565449,2.00713
    2,AXP,731366,2.09158
    3,BA,867366,2.13468
    4,CAT,830327,2.19194
    5,CSCO,400440,2.19197
    6,CVX,687198,2.19198
    7,DD,910932,2.51895
    8,DIS,910430,2.51898
    9,GE,871676,2.51899
    10,GS,280604,2.519
    11,HD,556611,2.51901
    12,IBM,860071,2.51902
    13,INTC,559127,2.51902
    14,JNJ,724724,2.51903
    15,JPM,500473,2.51904
    16,KO,864903,2.51905
    17,MCD,717021,2.51906
    18,MMM,698996,2.51907
    19,MRK,733948,2.51908
    20,MSFT,475451,2.51908
    21,NKE,556344,2.51909
  26. Testing (slide 116)

    • Testing
    • Phil Nash: Test Driven C++ With Catch
    • http://www.levelofindirection.com/journal/2015/7/8/a-game-of-tag.html
    • https://www.snellman.net/blog/archive/2015-07-09-unit-testing-a-tcp-stack/
  27. Testing and Performance (slide 117)

    • Continuous Performance Management
    • Martin Thompson: “Designing for Performance” https://youtube.com/watch?v=fDGWWpHlzvw
    • “Performance test as part of Continuous Integration”
    • “Can your acceptance tests run as performance tests?”
    • “Build telemetry into production systems”
    • CPM for C++ http://baptiste-wicht.com/posts/2015/06/continuous-performance-management-with-cpm-for-cpp.html
    • Baseline(s) for CPM
    • Measure baseline overhead: NOP
    • Re-measure added overhead incrementally

    auto do_nothing = [](double price) {};
    auto process_price = do_nothing;
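
The baseline idea on this slide (time a no-op handler first, then re-measure as work is added) might be sketched as follows. This is an illustration under assumptions rather than the talk's benchmark: `time_pass_ns` and `price_sum` are hypothetical names, timing uses `std::chrono::steady_clock`, and an optimizing compiler may remove the no-op loop entirely, so a real harness would need to guard against that and repeat the measurement.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: call `handler` once per price and return the
// elapsed time in nanoseconds.
template <typename Handler>
long long time_pass_ns(const std::vector<double>& prices, Handler&& handler) {
    assert(!prices.empty());  // timing an empty feed tells us nothing
    const auto start = std::chrono::steady_clock::now();
    for (const double price : prices) handler(price);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Hypothetical first increment of real work: accumulate the prices.
struct price_sum {
    double total = 0.0;
    void operator()(double price) { total += price; }
};
```

Calling `time_pass_ns(prices, do_nothing)` gives the baseline; calling it again with a `price_sum` object gives the baseline plus the added work, and the difference is a rough estimate of what that step costs.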
  28. Testing and Performance (slide 118)

    • Bryce Adelstein-Lelbach: Benchmarking C++ Code
    • Repeat tests: Uncertainty!
    • Chandler: Tuning C++: Benchmarks, and Compilers, and CPUs! Oh My!
    • perf - More than just counters: https://perf.wiki.kernel.org/ (e.g., asm branches visualization)
    • more tools:
    • Linux Performance: http://www.brendangregg.com/linuxperf.html
    • gcc-explorer: https://github.com/mattgodbolt/gcc-explorer (e.g., asm <-> C++ code matching colorization)
    • Rx(Cpp) for backtesting timing w/ virtual time:
    • https://github.com/Reactive-Extensions/RxCpp
    • http://weareadaptive.com/blog/2015/07/16/historical-time-series-data-rx/
    • http://blogs.msdn.com/b/rxteam/archive/2012/06/14/testing-rx-queries-using-virtual-time-scheduling.aspx
  29. Performance and Latency (slide 119)

    Standard Deviation and application latency should never show up on the same page
    If you haven’t stated percentiles and a Max, you haven’t specified your requirements
    Measuring throughput without latency behavior is [usually] meaningless
    • http://www.azulsystems.com/presentations/qcon-ny-2015-how-not-to-measure-latency
    • http://www.azulsystems.com/presentations/qcon-london-2014-understanding-latency
    • http://psy-lob-saw.blogspot.com/2015/02/hdrhistogram-better-latency-capture.html
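
Stating percentiles and a maximum, as this slide asks, could look like the sketch below. It assumes the nearest-rank definition of a percentile (other definitions interpolate) and a sorted copy of the samples rather than a histogram such as HdrHistogram. The names `percentile` and `max_latency` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it. Requires 0 < p <= 100.
double percentile(std::vector<double> samples, double p) {
    assert(!samples.empty() && p > 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());  // sorts the local copy
    const double n = static_cast<double>(samples.size());
    const auto rank = static_cast<std::size_t>(std::ceil(p * n / 100.0));
    return samples[rank - 1];  // rank is 1-based
}

// Largest observed latency.
double max_latency(const std::vector<double>& samples) {
    assert(!samples.empty());
    return *std::max_element(samples.begin(), samples.end());
}
```

A report built from these would list, for example, the 50th, 99th, and 100th percentiles next to the maximum, instead of a mean and standard deviation.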
  30. Communication (slide 120)

    • Lock-free
    • e.g., http://moodycamel.com/blog
    • Preferable over MPMC: MPSC
    • Fedor: Live lock-free or deadlock (practical Lock-free programming)
    • Pedro: How to make your data structures wait-free for reads
    • bounded tail-latency (readers)
    • http://concurrencyfreaks.blogspot.com/
    • Michael: C++11/14/17 Atomics the Deep dive: the gory details, before the story consumes you!
  31. Communication (slide 121)

    • Artur: Concurrency TS: The Editor’s Report http://www.boost.org/doc/libs/release/doc/html/thread/synchronization.html
    • Pablo: Parallel Program Execution using Work Stealing
    • Gor: C++ Coroutines - a negative overhead abstraction
    • http://wg21.link/n4134 - Asynchronous I/O, tcp_reader
    • Paul: C++ Atomics: The Sad Story of memory_order_consume: A Happy Ending at Last?
    • Is Parallel Programming Hard, And, If So, What Can You Do About It?: https://www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.h
    • What is RCU, Fundamentally?: https://lwn.net/Articles/262464/
    • WIP, Feedback Wanted & Welcome!: https://github.com/MattPD/cpplinks/ atomics.lockfree.memory_model.md
  32. Physical Models: CPU Pipeline (slide 124)

    Lecture 7. Pipelining - Carnegie Mellon - Computer Architecture 2015 - Onur Mutlu
    https://youtu.be/dKXbONPqBNY