
Mahmoud S. Mahmoud. Auckland University of Technology, New Zealand


Multicore World

July 20, 2012


  1. Parallelization Middleware for Exascale Data Processing and Transport and the

    SKA Telescope. Mahmoud S. Mahmoud, PhD Student, Research Assistant
  2. Overview ➲  SKA's phased construction is a driver for greater

    software architecture scalability. ➲  ASKAP technical computing plans show a desire for service-oriented solutions (Humphreys, 2009). ➲  LOFAR computing also calls for dynamic and adaptive high-performance service architectures (Andresson, 2003).
  3. Parallelization Middleware ➲  Message Passing Interface (MPI) ➲  Common Object

    Request Broker Architecture (CORBA) ➲  Data Distribution Service (DDS) ➲  ProActive Parallel Suite ➲  Internet Communications Engine (Ice) ➲  IBM InfoSphere Streams (Streams) ➲  Caravella-GPU ➲  S4 (MapReduce-inspired) ➲  Comet (Microsoft)
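What these middlewares share is inter-process messaging. A minimal send/receive sketch using Python's standard multiprocessing module (purely illustrative; not the API of any middleware listed above):

```python
from multiprocessing import Process, Pipe

def worker(conn):
    """Receive a task over the pipe, compute, and send back the result,
    mimicking the point-to-point send/receive core of message-passing
    middleware such as MPI."""
    data = conn.recv()      # blocking receive
    conn.send(sum(data))    # reply to the parent process
    conn.close()

if __name__ == "__main__":
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send([1, 2, 3])  # point-to-point send to the worker
    print(parent.recv())    # -> 6
    p.join()
```

Real middleware adds what this sketch lacks: discovery, typed interfaces, fault tolerance, and scaling beyond one host.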
  4. Stream Computing Paradigm ➲  A shift from traditional data mining

    towards real-time analysis of data in motion (IBM, 2008).
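The shift can be sketched in a few lines of Python (illustrative only, not Streams code): a running statistic is updated as each tuple arrives, so results are available per tuple and the full dataset is never materialised, in contrast to batch data mining.

```python
def running_mean(stream):
    """Analyse data in motion: update the mean as each value arrives,
    without buffering the whole stream (contrast with batch mining)."""
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count  # an up-to-date result per incoming tuple

# Each arriving value immediately yields a refreshed answer.
print(list(running_mean([2.0, 4.0, 6.0])))  # -> [2.0, 3.0, 4.0]
```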
  5. Case Study: IBM InfoSphere Streams (Streams) ver. 2 ➲  Distributed

    run-time environment for dynamic task scheduling and load balancing. ➲  Asynchronous low-latency, high-performance messaging. ➲  Eclipse IDE plug-in. ➲  Offers a declarative language to facilitate problem definition and parallelization. ➲  Mainly executes on multi-core x86 architectures, but can also target other architectures.
  6. Stream Processing Language (SPL) (originating from SPADE in Streams ver.

    1) ➲  Relational Operators   Filter, Functor, Punctor, Sort, Join & Aggregate ➲  Adapter Operators   File/Directory, TCP/UDP & Import/Export ➲  Utility Operators   Barrier, Throttle, Delay ... ➲  Compat Operators ➲  Primitive Operators   Support Java and C++
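The relational operators above can be approximated as composable pipeline stages. A hypothetical Python sketch (not SPL syntax; operator names borrowed from the slide):

```python
def filter_op(stream, pred):
    """Like SPL's Filter: pass through only tuples satisfying a predicate."""
    return (t for t in stream if pred(t))

def functor_op(stream, fn):
    """Like SPL's Functor: transform each tuple as it flows past."""
    return (fn(t) for t in stream)

def aggregate_op(stream, fn):
    """Like SPL's Aggregate: reduce a window of tuples to one value;
    here the window is simply the whole finite stream."""
    return fn(list(stream))

# A tiny flow graph: source -> Filter -> Functor -> Aggregate
source = range(10)
flow = functor_op(filter_op(source, lambda x: x % 2 == 0), lambda x: x * x)
print(aggregate_op(flow, sum))  # squares of evens: 0+4+16+36+64 = 120
```

In SPL the equivalent graph is declared rather than coded, and the runtime decides where each operator executes.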
  7. Data Flow Graph Optimization ➲  Container generation. ➲  Operator fusing.

    ➲  Continuous profiling and optimization.
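Operator fusing can be illustrated schematically (Python sketch, assumptions mine): two per-tuple stages are composed into a single callable, so fused operators run in the same container and tuples pass by function call instead of through inter-container queues.

```python
def fuse(*ops):
    """Fuse a chain of per-tuple operators into one callable, so the
    fused stages share a 'container' with no queueing between them."""
    def fused(t):
        for op in ops:
            t = op(t)   # hand the tuple directly to the next stage
        return t
    return fused

double = lambda x: x * 2
increment = lambda x: x + 1

pipeline = fuse(double, increment)  # one processing element, not two
print([pipeline(x) for x in range(3)])  # -> [1, 3, 5]
```

The runtime's profiling step then weighs this call-overhead saving against the parallelism lost by co-locating the stages.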
  8. Considerations for Parallelization ➲  Tools that minimize development effort. ➲  Dynamic scheduling

    and load balancing. ➲  Scalability and adaptability. ➲  Measuring arithmetic intensity. ➲  Efficient power usage. ➲  Data provenance.
  9. Effective Data Provenance (adapted from Humphreys, 2009)

  10. Acknowledgments ➲  Erdödy Entrepreneurship for their kind sponsorship. ➲  Tertiary

    Education Commission (TEC) for PhD scholarship. ➲  IBM T.J. Watson Center & IBM NZ for guidance and equipment. ➲  IRASR and AUT University.