Mahmoud S. Mahmoud. Auckland University of Technology, New Zealand

Multicore World

July 20, 2012

Transcript

  1. Parallelization Middleware for Exascale Data Processing and Transport and the SKA Telescope

    Mahmoud S. Mahmoud, PhD Student, Research Assistant
  2. Overview

    ➲  SKA's phased construction is a driver for greater software architecture scalability.
    ➲  ASKAP technical computing plans show a desire for service-oriented solutions (Humphreys, 2009).
    ➲  LOFAR computing also calls for dynamic and adaptive high-performance service architectures (Andresson, 2003).
  3. Parallelization Middleware

    ➲  Message Passing Interface (MPI)
    ➲  Common Object Request Broker Architecture (CORBA)
    ➲  Data Distribution Service (DDS)
    ➲  ProActive Parallel Suite
    ➲  Internet Communications Engine (ICE)
    ➲  IBM InfoSphere Streams (Streams)
    ➲  Caravella-GPU
    ➲  S4 (Map Reduce Inspired)
    ➲  Comet (Microsoft)
  4. Stream Computing Paradigm

    ➲  A shift from traditional data mining towards real-time analysis of data in motion (IBM, 2008).
  5. Case Study: IBM InfoSphere Streams (Streams) ver. 2

    ➲  Distributed run-time environment for dynamic task scheduling and load balancing.
    ➲  Asynchronous, low-latency, high-performance messaging.
    ➲  Eclipse IDE plug-in.
    ➲  Offers a declarative language to facilitate problem definition and parallelization.
    ➲  Mainly executes on multi-core x86 architectures, but can support other architectures.
  6. Stream Processing Language (SPL) (originating from SPADE in Streams ver. 1)

    ➲  Relational Operators: Filter, Functor, Punctor, Sort, Join & Aggregate
    ➲  Adapter Operators: File/Directory, TCP/UDP & Import/Export
    ➲  Utility Operators: Barrier, Throttle, Delay ...
    ➲  Compat Operators
    ➲  Primitive Operators: Support Java and C++
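
To make the relational operators more concrete, here is a minimal sketch in plain Python (not SPL syntax, and not the Streams API) of how Filter, Functor and Aggregate might be chained over a stream of tuples. The function names, the `power` attribute and the count-based tumbling window are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, float]  # hypothetical tuple type: attribute name -> value


def filter_op(stream: Iterable[Record],
              predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Filter analogue: pass through only the tuples that satisfy the predicate."""
    for record in stream:
        if predicate(record):
            yield record


def functor_op(stream: Iterable[Record],
               transform: Callable[[Record], Record]) -> Iterator[Record]:
    """Functor analogue: emit one transformed tuple per input tuple."""
    for record in stream:
        yield transform(record)


def aggregate_op(stream: Iterable[Record], size: int, attr: str) -> Iterator[Record]:
    """Aggregate analogue: emit the mean of `attr` for each full window of `size` tuples.

    A count-based tumbling window is assumed; an incomplete final window is dropped.
    """
    window: List[float] = []
    for record in stream:
        window.append(record[attr])
        if len(window) == size:
            yield {"mean_" + attr: sum(window) / size}
            window = []


def run_pipeline(samples: Iterable[float]) -> List[Record]:
    """Wire source -> Filter -> Functor -> Aggregate and collect the results."""
    source = ({"power": p} for p in samples)
    kept = filter_op(source, lambda r: r["power"] >= 0.0)              # drop negatives
    scaled = functor_op(kept, lambda r: {"power": r["power"] * 2.0})   # rescale
    return list(aggregate_op(scaled, size=2, attr="power"))
```

Because every stage is a generator, each tuple flows through the whole chain as it arrives rather than being stored first, which is the "data in motion" idea at a very small scale.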
  7. Data Flow Graph Optimization

    ➲  Container generation.
    ➲  Operator fusing.
    ➲  Continuous profiling and optimization.
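
Operator fusing can be illustrated with a small Python sketch. It assumes, for illustration only, that fusing means combining adjacent per-tuple operators into one unit so that a tuple is handed from one operator to the next by a direct function call rather than through an intermediate hop. The helper names `fuse`, `run_unfused` and `run_fused` are hypothetical and do not correspond to anything in Streams.

```python
from typing import Callable, Iterable, List, Sequence

Op = Callable[[float], float]  # hypothetical per-tuple operator


def fuse(*ops: Op) -> Op:
    """Combine per-tuple operators into one function, applied left to right."""
    def fused(value: float) -> float:
        for op in ops:
            value = op(value)
        return value
    return fused


def run_unfused(values: Iterable[float], ops: Sequence[Op]) -> List[float]:
    """Each operator makes its own pass and materializes an intermediate list,
    standing in for a hand-over between separate containers."""
    current = list(values)
    for op in ops:
        current = [op(v) for v in current]
    return current


def run_fused(values: Iterable[float], ops: Sequence[Op]) -> List[float]:
    """All operators run back to back on each tuple, with no intermediate lists."""
    fused = fuse(*ops)
    return [fused(v) for v in values]
```

Both variants produce the same output; the fused form simply avoids the intermediate hand-overs, which is the kind of saving a graph optimizer would be looking for.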
  8. Considerations for Parallelization

    ➲  Development minimization tools.
    ➲  Dynamic scheduling and load balancing.
    ➲  Scalability and adaptability.
    ➲  Measuring arithmetic intensity.
    ➲  Efficient power usage.
    ➲  Data provenance.
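
As a worked illustration of the arithmetic intensity item above, the sketch below uses the common definition of floating-point operations per byte moved to or from memory, together with a simple roofline-style bound. The vector-add kernel and any peak or bandwidth figures passed in are hypothetical and are not taken from the talk.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Floating-point operations performed per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(intensity: float, peak_gflops: float, bandwidth_gbs: float) -> float:
    """Roofline-style bound: limited by either peak compute or memory bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gbs)


def vector_add_intensity(n: int, word_bytes: int = 8) -> float:
    """Intensity of c[i] = a[i] + b[i]: n additions, two vectors read, one written."""
    return arithmetic_intensity(flops=n, bytes_moved=3 * n * word_bytes)
```

With 8-byte words, a vector add performs one operation per 24 bytes moved, so under this model it is bandwidth-bound on any machine whose peak rate (in GFLOP/s) exceeds one twenty-fourth of its memory bandwidth (in GB/s).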
  9. Effective Data Provenance (adapted from Humphreys, 2009)

  10. Acknowledgments

    ➲  Erdödy Entrepreneurship for their kind sponsorship.
    ➲  Tertiary Education Commission (TEC) for PhD scholarship.
    ➲  IBM T.J. Watson Center & IBM NZ for guidance and equipment.
    ➲  IRASR and AUT University