PyConZA 2012: "MeerKAT science with Python" by Simon Ratcliffe

Pycon ZA
October 04, 2012

This talk highlights the use of Python within the MeerKAT radio telescope project. It will introduce the audience to radio astronomy in general and to MeerKAT and the SKA in particular. The wide-ranging use of Python in the project will be shown, finishing with a discussion of our goal of making Python the primary language for the high-performance side of the telescope.

The MeerKAT radio telescope (www.ska.ac.za) is currently being constructed in the Karoo, and once complete will be the most sensitive telescope in the world in its frequency range. In addition, MeerKAT serves as a precursor instrument to the international SKA project (www.skatelescope.org), which has recently awarded hosting of the majority of the facility to South Africa. This talk will introduce radio astronomy and discuss the wide-ranging use of Python within the MeerKAT telescope. We will also discuss our attempts and future plans to make Python an integral part of the extremely challenging high-performance computing efforts that are central to modern radio facilities.


Transcript

  1. KAT-7. Christiansen, W. N., Frater, R. H., Watkinson, A., O'Sullivan, J. D., Goss, W. M., & Lockhart, I. A., 1977, Mon. Not. R. Astr. Soc., 181, 183
  2. KAT-7 Science. MOST image (McAdam 1991), Proceedings of the Astronomical Society of Australia, copyright 1991 CSIRO Publishing
  3. KAT-7 Science. Oct 20: ATel #3694; F. Hungwe (Rhodes U, HartRAO) et al., on behalf of the Fermi LAT. Oct 21: PKS 1510-089
  4. KAT-7 Science. Corroborating evidence: ATel #3713; Michael Gaylard (HartRAO), Marion West (HartRAO), Philip Edwards (CSIRO Astronomy and Space Science), Jamie Stevens (CSIRO Astronomy and Space Science), Roopesh Ojha (NASA/GSFC) and Faith Hungwe (Rhodes U., HartRAO)
  5. Big Iron. [Diagram, credit: Andy Faulkner. Incoming data from collectors passes through switches and buffer stores to the correlator/beamformer, UV processor, HPC science processing and bulk store.
    Imaging chain: Corner Turning, Coarse Delays, Fine F-step/Correlation, Visibility Steering, Observation Buffer, Gridding Visibilities, Imaging, Image Storage.
    Non-imaging chain: Corner Turning, Coarse Delays, Beamforming/De-dispersion, Beam Steering, Observation Buffer, Time-series Searching, Search analysis, Object/timing Storage.
    Figures shown on the diagram: 10 - 100 PFlop; many PFlop; up to 500 Tb/s; 0.2 - 2.5 EFlop; SKA Phase 2: up to 5 Pb/s, 1 EFlop/EMAC; 1 - 5 PFlop; few PFlop; Software Complexity & Data*: 400 - 1000 Gb/s; 30 PFlop; SKA Phase 1: up to 50 Tb/s, 15 PFlop/PMAC.
    * All numbers approximate (made up!)]
  6. Big Data. Andrew Cooper. 1000000000000000000 B @ 1 bit per grain of standardised* sand. * Assumptions apply (10gpmm^3, 3kmx2kmx700m, only valid when calculated on paper napkin)
  7. Medium Data. 1000000000000000 B @ 1 byte per grain of standardised* iDevice detritus. * Retina display has infinite resolution which may imply some spacetime discontinuity introduced by blend process for every teenager on the planet
  8. SP Philosophy. Develop in an iterative fashion. Common elements are lightweight and simple. Test as you go. Deploy intelligently. Solve today's problem, not tomorrow's.
  9. SP Philosophy. JSNTB – Just Say No To Bullets. Done is the engine of more. Simple is better than complex. Optimise last. Commodity is king. Perfection is an illusion.
  10. KAT-7 – Commensal Observing. Package katlive provides real-time access to visibility and metadata for end users. The system is aware of target types such as gain/bandpass cal and can apply these on the fly to the data. Any additional metadata (e.g. wind speed) can be added to the object via a simple query. No impact on the current observation – truly commensal. A large number of sessions is possible as the data is all multicast.
  11. Data Transport - SPEAD. Self-describing, high-speed, low-overhead numpy transport connecting a wide range of potentially disparate devices.
  12. Data Transport - SPEAD. Joint development between SKA South Africa and UC Berkeley as part of the CASPER collaboration. Designed to handle a wide variety of astronomical data including voltage, visibility, and sensor data. Standard output data format for ROACH-based correlators. Reference Python implementation available: https://github.com/ska-sa/PySPEAD
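To make the phrase "self-describing numpy transport" concrete, here is a minimal sketch of the general idea: each message carries a small descriptor (name, dtype, shape) ahead of the raw array bytes, so a receiver can rebuild the array without prior knowledge of its layout. This is an illustrative toy framing only. It is not the SPEAD wire format and does not use the PySPEAD API; the function names `pack_array` and `unpack_array` and the JSON descriptor are assumptions made for this example.

```python
import json
import struct

import numpy as np


def pack_array(name, array):
    """Toy framing (hypothetical): 4-byte descriptor length, JSON descriptor, raw bytes."""
    array = np.ascontiguousarray(array)
    descriptor = json.dumps({
        "name": name,
        "dtype": array.dtype.str,      # e.g. "<c8" for little-endian complex64
        "shape": list(array.shape),
    }).encode("utf-8")
    return struct.pack("!I", len(descriptor)) + descriptor + array.tobytes()


def unpack_array(message):
    """Rebuild (name, array) using only the information carried in the message."""
    (descriptor_len,) = struct.unpack("!I", message[:4])
    descriptor = json.loads(message[4:4 + descriptor_len].decode("utf-8"))
    payload = message[4 + descriptor_len:]
    array = np.frombuffer(payload, dtype=np.dtype(descriptor["dtype"]))
    return descriptor["name"], array.reshape(descriptor["shape"])


# Round trip of a small, made-up block of complex visibilities.
vis = (np.arange(12) + 1j * np.arange(12)).astype(np.complex64).reshape(3, 4)
name, received = unpack_array(pack_array("visibilities", vis))
```

The receiver needs no out-of-band schema, which is the property the slide attributes to SPEAD; a real transport would also have to deal with packetisation, loss and descriptor updates.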
  13. Data Storage - HDF5. Read case (whole dataset) for H5Py was much slower than raw disk speed (5 – 10x). Test suite was written to benchmark data access across a variety of usage scenarios:
    Raw – binary read of entire HDF5 file into memory (Python)
    Full load – entire dataset read into memory
    Spectrogram – Freq vs Time for a single baseline
    Scan – contiguous time range (200 samples) for all channels and baselines
    Flagging – all data for single timestamp
    Imaging – all data for single channel
    Test HW: Intel 990X / X58 / Intel 310 series SSD / 12 GB RAM
    Test SW: Ubuntu 11.04 64-bit / Python 2.7.1 / ext4 FS
    Test SW: HDF5 1.8.4 / H5Py 2.0.0 / PyTables 2.3
    Test File: 7.34 GiB HDF5 with real KAT-7 data
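The subselection scenarios above can be read as slicing patterns on a three-dimensional visibility array. The sketch below shows them on an in-memory NumPy array, assuming a (time, frequency, baseline) axis order; h5py datasets accept the same basic slicing syntax. The array sizes, the random data and the function names are illustrative assumptions, not the test suite from the talk.

```python
import numpy as np

# Small synthetic stand-in for a visibility dataset (sizes are illustrative).
n_time, n_chan, n_bl = 400, 64, 28
rng = np.random.default_rng(0)
vis = (rng.standard_normal((n_time, n_chan, n_bl))
       + 1j * rng.standard_normal((n_time, n_chan, n_bl))).astype(np.complex64)


def spectrogram(data, baseline):
    """Freq vs Time for a single baseline."""
    return data[:, :, baseline]


def scan(data, start, length=200):
    """Contiguous time range for all channels and baselines."""
    return data[start:start + length, :, :]


def flagging(data, timestamp):
    """All data for a single timestamp."""
    return data[timestamp, :, :]


def imaging(data, channel):
    """All data for a single channel."""
    return data[:, channel, :]
```

With this axis order the scan and flagging selections are contiguous in memory, while the spectrogram and imaging selections are strided. That is one reason the choice of data order affects the different scenarios differently, although on-disk behaviour also depends on HDF5 chunking and compression.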
  14. Data Storage - HDF5. System caches dropped prior to each run: os.system("sync; echo 1 > /proc/sys/vm/drop_caches")
    Testing parameters included:
    Atom type: Complex64 / Float32
    Data Order: Time,Frequency,Baseline / Time,Baseline,Frequency
    Compression: None / bzip2 / zlib / blosc0 / blosc9
    Each test result is the average of 5 runs with randomised indices for subselections.
    Eventual optimal (best balance across tests) combination as follows:
    Atom Type: Complex64
    Data Order: Time, Baseline, Frequency
    Compression: blosc9
    In other words, none of our original parameters appeared optimal! Blosc9 file is 21% smaller as well :)
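The measurement procedure described here (a hook run before each timing, a randomised index per run, and an average over 5 runs) could be sketched as follows. This is a hedged illustration rather than the benchmark code used for the talk: the `benchmark` function, its parameters and the toy data are assumptions, and it targets Python 3 (`time.perf_counter`), whereas the original tests ran on Python 2.7.1.

```python
import random
import statistics
import time


def benchmark(read_fn, index_range, runs=5, before_run=None, seed=None):
    """Return the mean wall-clock time of read_fn(index) over several runs.

    A fresh random index in [0, index_range) is drawn for every run.
    before_run, if given, is called before each timing; on Linux it could
    drop the page cache as on the slide, which requires root privileges.
    """
    rng = random.Random(seed)
    timings = []
    for _ in range(runs):
        if before_run is not None:
            before_run()
        index = rng.randrange(index_range)
        start = time.perf_counter()
        read_fn(index)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Toy usage on in-memory data; a real test would read from an HDF5 file.
rows = [[float(i + j) for j in range(1000)] for i in range(100)]
calls = []
mean_seconds = benchmark(lambda i: sum(rows[i]), len(rows),
                         before_run=lambda: calls.append(1), seed=42)
```

Dropping caches before every run matters because a second read of the same file would otherwise be served from RAM and measure memory bandwidth instead of storage and decompression speed.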
  15. Reduction - Compressed Sensing Toolkit. Solves: y = A x, where y is the M samples, A is a known MxN matrix and x is an unknown N-dim S-sparse signal, with S < M < N.
  16. Reduction - Compressed Sensing Toolkit. Pure Python. Greedy methods: OMP, OMP+. Convex relaxation: BP, BP+, QP, QP+,... Not released yet…
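As an illustration of the greedy family named above, here is a minimal textbook sketch of orthogonal matching pursuit (OMP) for y = A x with a real-valued A. The toolkit itself is described as unreleased, so this is not its code; the function name `omp`, the problem sizes and the random test signal are assumptions chosen for the example.

```python
import numpy as np


def omp(A, y, sparsity):
    """Greedy estimate of a `sparsity`-sparse x from y = A x (real-valued A)."""
    n = A.shape[1]
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual,
        # ignoring columns that were already selected.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0
        support.append(int(np.argmax(correlations)))
        # Least-squares fit of y on the selected columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x = np.zeros(n)
    x[support] = coeffs
    return x


# Synthetic problem with S < M < N (sizes are illustrative).
M, N, S = 60, 100, 3
rng = np.random.default_rng(1)
A = rng.standard_normal((M, N))
A /= np.linalg.norm(A, axis=0)          # unit-norm columns
x_true = np.zeros(N)
x_true[rng.choice(N, size=S, replace=False)] = [2.0, -1.5, 1.0]
y = A @ x_true
x_hat = omp(A, y, S)
```

The convex-relaxation methods on the slide (BP, QP) instead pose recovery as an l1-regularised optimisation problem; OMP trades some of their recovery guarantees for a much simpler iteration.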
  17. Reduction - Compressed Sensing Toolkit. [Images: ideal image of PKS 1610-60 radio galaxy based on 12-hour observation; initial image based on 10-minute observation; corrected image using compsense; corrected image using traditional package]