Pycon ZA
October 03, 2013

PyConZA 2013: "Building the SKA" by Simon Ratcliffe

The Square Kilometer Array will be one of the prime scientific data generators of the next few decades.

Construction is scheduled to commence in late 2016 and last for the best part of a decade. Current estimates put data volume generation near 1 Exabyte per day with 2-3 ExaFLOPs of processing required to handle this data.

As a host country, South Africa is constructing a large precursor telescope known as MeerKAT. Once complete, this will be the most sensitive telescope of its kind in the world - until dwarfed by the SKA.

We make extensive use of Python, from the entire Monitor and Control system through to data handling and processing.

This talk looks at our current usage of Python, and our desire to see the entire high performance processing chain being able to call itself Pythonic.

We will discuss some of the challenges specific to the radio astronomy environment and how we believe Python can contribute, particularly when it comes to the trade-off between development time and performance.

Transcript

  1. Can I have fifty pounds to mend the shed? =

    Ewan McTeagle Johannes Hevelius 60ft - 1673
  2. 1 Jy = 10^-26 W m^-2 Hz^-1
    Eskom ?
    I canna do it, captain, ye canna alter the laws of physics
  3. Baddies
    10^5 Jy, 10^8 Jy, 0 Jy
    Sun @ 5 GHz, GSM Phone @ 1km, iOS 7
  4. Big Data Andrew Cooper
    1000000000000000000 B @ 1 bit per grain of standardised* sand
    * assumptions apply (10gpmm^3, 3kmx2kmx700m, only valid when calculated on paper napkin, just say no to assumptions)
  5. Big Iron (Slide Credit: Andy Faulkner)
    [Block diagram] Incoming Data from collectors, Switch, Buffer store, Switch, Buffer store, HPC, Bulk Store, Correlator, Beamformer, UV Processor, HPC science processing
    Imaging: Corner Turning, Coarse Delays, Fine F-step / Correlation, Visibility Steering, Observation Buffer, Gridding Visibilities, Imaging, Image Storage
    Non-Imaging: Corner Turning, Coarse Delays, Beamforming / De-dispersion, Beam Steering, Observation Buffer, Time-series Searching, Search analysis, Object/timing Storage
    Labels near "SKA Phase 1": up to 50 Tb/s, 15 PFlop/PMAC, 400 - 1000 Gb/s, 30 PFlop, 1 - 5 PFlop, Few PFlop
    Labels near "SKA Phase 2": up to 5 Pb/s, 1 EFlop/EMAC, up to 500 Tb/s, 0.2 - 2.5 EFlop, 10 - 100 PFlop, Many PFlop
    Software Complexity & Data*
    * All numbers made up for increased wow factor
  6. cooling Ground Loop Hive FERRO 50 TFLOPS 2 TB RAM

    8 TB Flash 20 Gbps 2.5 kW Shielded Rugged Cheap
  7. Big Storage
    MeerKAT has a primary archive of 10 PB
    SKA will need at least 1 EB in around 2023
  8. IO / Cache / FLOPS / kW / $
    V_{ij} = M_{ij} B_{ij} G_{ij} D_{ij} E_{ij} P_{ij} T_{ij} V_{ij}^{IDEAL}
    MAGIC
    SKA mid
  9. Philosophy (to combat the malaise that is big software)
    Deliver only what is actually needed
    Scale requires better ideas, not more code
    Simple is better than complex
    Optimise last
    Commodity is king
    Perfection is an illusion
  10. You hip fanboi - don't you know Python is too slow...
    regex_dna (http://shootout.alioth.debian.org/)
    agggtaaa|tttaccct 0
    [cgt]gggtaaa|tttaccc[acg] 3
    a[act]ggtaaa|tttacc[agt]t 9
    ag[act]gtaaa|tttac[agt]ct 8
    agg[act]taaa|ttta[agt]cct 10
    aggg[acg]aaa|ttt[cgt]ccct 3
    agggt[cgt]aa|tt[acg]accct 4
    agggta[cgt]a|t[acg]taccct 3
    agggtaa[cgt]|[acg]ttaccct 5
    C: 7.21 seconds
    Python: 31.7 seconds
    4.4 times slower !
  11. No really, it is so sloooooow...
    Laplace solver (http://www.scipy.org/PerformancePython)
    C: 0.97 seconds
    Python: 6345.6 seconds
    6,500 times slower
    Suck it Python !
  12. Cereal ? Totally serial: 6345.6 seconds
    Vectorised*: 2.57 seconds
    2469 times faster ! (only 2.6 times slower than C)
    *Like Totally
  13. Simple optimisation gives us close to C performance
    That's because numpy is just optimised C underneath
    True, but I didn't actually have to write any C
    Don't you get paid by the kloc ?
    Someone else can write code, I am off to the beach.
  14. Admittedly, we are still slower than C.
    But the language itself does not dictate performance*. The compiler / interpreter does.
    It's not Python, but CPython to blame.
    *yes, yes, I know. But outrageous claims are my forté, after all I invented the question mark.
  15. Optimised gridding ~ 1 ns per point per core (Xeon E5-2690)
    We need 64 Gbps to keep an 8-core CPU busy
    In reality we have an order of magnitude less
  16. If our IO is efficient we can sacrifice an order of magnitude in CPU performance without overall impact*
    *For our specific data intensive use cases. Terms and conditions apply. If you can read this you are too close.
  17. Is our IO efficient ? Yes*
    * Blah blah blah HDF5 blah blah Blosc blah blah C is the only real language blah blah SPEAD blah blah numpy
  18. Our biggest cost is development time, not hardware
    This is where using high-level languages, without excessive optimisation, really wins.
  19. Data Transport - SPEAD
    Self-describing, high-speed, low-overhead numpy transport connecting a wide range of potentially disparate devices.
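
Slide 10 quotes the regex_dna benchmark, which counts how often each of nine DNA pattern variants occurs in an input sequence. The following is a minimal sketch of that counting step only, not the benchmark program itself. The patterns are the ones listed on the slide; the function name `count_variants` and the short test sequence are made up for illustration.

```python
import re

# The nine variant patterns as they appear on slide 10.
VARIANTS = [
    "agggtaaa|tttaccct",
    "[cgt]gggtaaa|tttaccc[acg]",
    "a[act]ggtaaa|tttacc[agt]t",
    "ag[act]gtaaa|tttac[agt]ct",
    "agg[act]taaa|ttta[agt]cct",
    "aggg[acg]aaa|ttt[cgt]ccct",
    "agggt[cgt]aa|tt[acg]accct",
    "agggta[cgt]a|t[acg]taccct",
    "agggtaa[cgt]|[acg]ttaccct",
]


def count_variants(sequence):
    """Return (pattern, number of non-overlapping matches) for each variant."""
    return [(pattern, len(re.findall(pattern, sequence))) for pattern in VARIANTS]


if __name__ == "__main__":
    # Hypothetical input, far smaller than the real benchmark data.
    demo = "agggtaaacctttaccctcgggtaaa"
    for pattern, count in count_variants(demo):
        print(pattern, count)
```

The real benchmark also strips headers and performs substitutions on a much larger input, which this sketch leaves out.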
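
Slides 11 to 13 contrast a looped Python Laplace solver with a vectorised numpy version. The sketch below illustrates the general idea rather than reproducing the code that was timed: it assumes equal grid spacing in both directions and uses a Jacobi-style sweep (every update reads the previous grid) so that the two versions compute the same thing. The function names and the boundary condition are hypothetical.

```python
import numpy as np


def sweep_loops(u):
    """One relaxation sweep using explicit Python loops (the slow style)."""
    new = u.copy()
    rows, cols = u.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each interior point becomes the average of its four neighbours.
            new[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j] + u[i, j - 1] + u[i, j + 1])
    return new


def sweep_vectorised(u):
    """The same sweep expressed with numpy slices, so the loops run in C."""
    new = u.copy()
    new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:])
    return new


def make_grid(n=32):
    """Hypothetical test problem: zero everywhere, top edge held at 1."""
    u = np.zeros((n, n))
    u[0, :] = 1.0
    return u


if __name__ == "__main__":
    a = make_grid()
    b = make_grid()
    for _ in range(10):
        a = sweep_loops(a)
        b = sweep_vectorised(b)
    print(np.allclose(a, b))
```

Both functions perform the same arithmetic; the vectorised one simply hands the inner loops to numpy, which is the point slide 13 makes about not having to write any C.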
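
The figures on slides 15 and 16 can be checked with a little arithmetic. The slides do not state how many bits each gridded point consumes; the sketch below assumes 8 bits per point, which is the value that makes 1 ns per point on 8 cores come out at 64 Gbps. That assumption, and both function names, are not from the talk.

```python
def required_bandwidth_gbps(ns_per_point, cores, bits_per_point):
    """Input rate needed to keep every core busy, in gigabits per second."""
    points_per_second = cores * (1e9 / ns_per_point)
    return points_per_second * bits_per_point / 1e9


def affordable_ns_per_point(available_gbps, cores, bits_per_point):
    """Slowest per-point time (ns per core) that still keeps up with the IO."""
    return cores * bits_per_point / available_gbps


if __name__ == "__main__":
    # ASSUMPTION: 8 bits per point (not stated on the slides).
    print(required_bandwidth_gbps(1.0, 8, 8))   # 64.0
    # With an order of magnitude less IO, roughly 10 ns per point is enough.
    print(affordable_ns_per_point(6.4, 8, 8))
```

Under that assumption, an IO rate ten times lower than 64 Gbps means the CPU can be about ten times slower per point before it becomes the bottleneck, which matches the argument on slide 16.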
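
Slide 19 describes SPEAD as a self-describing numpy transport. The toy below only illustrates what "self-describing" can mean for an array message: the payload carries its own name, dtype and shape, so the receiver needs no prior schema. It is not the SPEAD wire format and does not use any SPEAD library; `pack_array`, `unpack_array` and the JSON header layout are invented for this sketch.

```python
import json
import struct

import numpy as np


def pack_array(name, array):
    """Serialise an array as: 4-byte header length, JSON descriptor, raw bytes."""
    array = np.ascontiguousarray(array)
    descriptor = {"name": name, "dtype": array.dtype.str, "shape": list(array.shape)}
    header = json.dumps(descriptor).encode("utf-8")
    return struct.pack("!I", len(header)) + header + array.tobytes()


def unpack_array(message):
    """Rebuild (name, array) using only information carried in the message."""
    (header_len,) = struct.unpack_from("!I", message, 0)
    descriptor = json.loads(message[4:4 + header_len].decode("utf-8"))
    data = np.frombuffer(message, dtype=np.dtype(descriptor["dtype"]), offset=4 + header_len)
    return descriptor["name"], data.reshape(descriptor["shape"])


if __name__ == "__main__":
    # Hypothetical item name and contents.
    original = np.arange(6, dtype=np.float32).reshape(2, 3)
    name, restored = unpack_array(pack_array("visibilities", original))
    print(name, restored.dtype, restored.shape)
```

A real protocol would also have to handle packetisation, loss and descriptor updates, none of which this sketch attempts.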