Slide 1

Slide 1 text

Issue/Revision: 1.0 Reference: PGCon’US 16 Status: Issued ESA UNCLASSIFIED - Releasable to the Public ESASky: New Window to the Universe Pilar de Teodoro on behalf of ESDC team 04/19/2016

Slide 2

Slide 2 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 2 ESA UNCLASSIFIED - Releasable to the Public European Space Agency

Slide 3

Slide 3 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 3 ESA UNCLASSIFIED - Releasable to the Public Some physics Different Wavelengths: Image credits: Radio:WSRT/R. Braun; Infrared:NASA/Spitzer/K. Gordon; Visible: Robert Gendler; Ultraviolet: NASA/GALEX; X-ray: ESA/XMM/W. Pietsch

Slide 4

Slide 4 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 4 ESA UNCLASSIFIED - Releasable to the Public ESA Satellites

Slide 5

Slide 5 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 5 ESA UNCLASSIFIED - Releasable to the Public ESASky Concept Goal for Beta release: •  Multi-wavelength exploration •  Single and multi- target search Data Ingredients list: •  All-Sky maps (aka HiPS) •  Individual ESA archives real data and footprints •  Catalogues

Slide 6

Slide 6 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 6 ESA UNCLASSIFIED - Releasable to the Public Ø  Data contents of beta release: •  13 years of INTEGRAL data, 2077 gamma-ray sources •  16 years of XMM-Newton data, 8721 observations, 565962 X-ray and 6 million optical/UV sources •  25 years of HST data, 588820 observations, 29 million optical sources •  4 years of Hipparcos data, 2.5 million optical sources •  2.3 year of ISO data, 47652 observations •  4 years of Herschel data, 16039 observations •  4 years of Planck data, 9 all-sky maps, 153142 radio sources ESA Sky beta In total more than 29.000.000 sources and 1 million observations!!!

Slide 7

Slide 7 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 7 ESA UNCLASSIFIED - Releasable to the Public DEMO or VIDEO DEMO

Slide 8

Slide 8 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 8 ESA UNCLASSIFIED - Releasable to the Public

Slide 9

Slide 9 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 9 ESA UNCLASSIFIED - Releasable to the Public Focusing on data: HiPS HiPS: Hierarchical Progressive Survey (CDS) HEALPix sky tessellation (Hierarchical Equal Area isoLatitude Pixelation) [ http://healpix.sourceforge.net ] HiPS is now IVOA note [ http://ivoa.net/documents/Notes/HiPS/index.html ]

Slide 10

Slide 10 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 10 ESA UNCLASSIFIED - Releasable to the Public HiPS: Hierarchical Progressive Survey HiPS Don’t Lie! CCD borders Duplicated sources Observations with different calibration or exposure time

Slide 11

Slide 11 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 11 ESA UNCLASSIFIED - Releasable to the Public Footprints in numbers Observation type #footprints (#observation) #polygons per observation #total polygons #total points XMM-Newton 8.375 From 2 up to 13 20.041 192.831 XMM-OM(UV) 9.211 2 18.422 38.462 XMM-OM(UVB) 3.379 2 6.758 76.925 HST 345.519 From 2 up to 10 777.376 2.361.085 ISO 6.898 From 2 up to 9 17.104 50.198 Herschel 23.172 2 46.344 1.076.667

Slide 12

Slide 12 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 12 ESA UNCLASSIFIED - Releasable to the Public Catalogues Catalogue #sources INTEGRAL 2.077 XMM Slew 41.423 3XMM EPIC 333.481 XMM OM 3.492.082 Tycho-2 2.539.913 HSC 29.164.309 PSZ 1.653 PGSS 13.242 PCSS 138.247 Effective display of catalogues: order and limit

Slide 13

Slide 13 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 13 ESA UNCLASSIFIED - Releasable to the Public Histograms

Slide 14

Slide 14 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 14 ESA UNCLASSIFIED - Releasable to the Public Yes! We use PostgreSQL!

Slide 15

Slide 15 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 15 ESA UNCLASSIFIED - Releasable to the Public Hardware architecture ESA Sky Science archives at ESAC ESA Hubble Science Archive Herschel Science Archive XMM-Newton Science Archive Frontend Backend Tap server HTTP load balancer DB DATA HiPS ISO Science Archive

Slide 16

Slide 16 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 16 ESA UNCLASSIFIED - Releasable to the Public Frontend Running on a Web Browser (HTML5/CSS3) Google Web Toolkit •  Aladin Lite wrapper (JSNI) •  Data Visualization (Highcharts) Usage of IVOA Protocols •  TAP accessing archive metadata •  STC-s describing complex FoVs Astronomical services access (Simbad) •  Target coordinates resolver •  Angular size resolver

Slide 17

Slide 17 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 17 ESA UNCLASSIFIED - Releasable to the Public Backend Data Access Apache HTTP Server •  Serves HiPS requests Java Servlet container (Tomcat) •  Serves TAP & Target Resolver requests •  Data and metadata download request Database •  PostgreSQL DB (FDW and MV) •  Spherical data types library (PgSphere)+q3c +PostGIS for ephemerides •  Footprints -> Spherical data types Usage of IVOA Protocols & Standards •  TAP requests •  ADQL translation to SQL + PgSphere •  Storage of STC-S footprint information

Slide 18

Slide 18 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 18 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL nodes: Main features Number of nodes: OPS 1 RAM 512GB Storage 2.9TB SSD Operating System RHEL 6 PostgreSQL 9.4.3 Number of nodes: DEV 1 RAM 8GB Storage 100GB SAS Operating System RHEL 6 PostgreSQL 9.4.3

Slide 19

Slide 19 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 19 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL extensions Spherical Indexing SW on PostgreSQL : Q3C 2006, Koposov, S.; Bartunov, O. – Quad Tree pixelization, B-Tree indexing PgSphere • 2008, Bartunov, O.; Sigaev, T.; Richter, J.; Chilingarian, I. – GIST indexing – Support for more spherical objects pg_healpix 2012, Koposov, S. -count strategy

Slide 20

Slide 20 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 20 ESA UNCLASSIFIED - Releasable to the Public Performance Optimization

Slide 21

Slide 21 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 21 ESA UNCLASSIFIED - Releasable to the Public FYI Ø  Expecting 1000 concurrent users navigating and 100 users downloading data Ø  No login Ø  Access quality depends on network Ø  Tested in Australia and very slow Ø  Possibility to replicate HIPS in different countries (CDN)

Slide 22

Slide 22 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 22 ESA UNCLASSIFIED - Releasable to the Public 1-Histograms: count strategy Density maps

Slide 23

Slide 23 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 23 ESA UNCLASSIFIED - Releasable to the Public Count strategy 1- Create the healpix id SELECT healpix_ang2ipix_nest(256::bigint, hst_observation_fdw.ra_deg::double precision, hst_observation_fdw.dec_deg::double precision) AS npix,….. 2- Create the index on the healpix id CREATE INDEX mv_hst_observation_fdw_npix ON public.mv_hst_observation_fdw USING btree (npix); 3- populate the table for querying the counts EXECUTE format('with aux_count as (select npix>>'||shift||' as npix,count(npix) as count from %s group by npix>>'||shift||') update count_table_h'||nside||' u set %s = count from aux_count where u.npix=aux_count.npix', table_to_count, count_column);

Slide 24

Slide 24 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 24 ESA UNCLASSIFIED - Releasable to the Public 2- PostgreSQL tuning How to make faster queries: 1.  Access to different catalogues: •  Postgresql Foreign Data Wrappers •  Materialized views •  Indices on materialized views CREATE MATERIALIZED VIEW public.mv_v_gcc_catalog_fdw AS SELECT v_gcc_catalog_fdw.name, v_gcc_catalog_fdw.ra, v_gcc_catalog_fdw."dec", v_gcc_catalog_fdw.snr, v_gcc_catalog_fdw.gau_major_axis, v_gcc_catalog_fdw.gau_major_axis_sig, v_gcc_catalog_fdw.gau_minor_axis, v_gcc_catalog_fdw.gau_minor_axis_sig, spoint(v_gcc_catalog_fdw.ra * pi() / 180.0::double precision, v_gcc_catalog_fdw."dec" * pi() / 180.0::double precision) AS pos, healpix_ang2ipix_nest(256::bigint, v_gcc_catalog_fdw.ra, v_gcc_catalog_fdw."dec") AS npix FROM v_gcc_catalog_fdw WITH DATA; CREATE INDEX gcc_pos_idx ON public.mv_v_gcc_catalog_fdw USING gist (pos); CREATE INDEX gcc_npix_idx ON public.mv_v_gcc_catalog_fdw USING btree (npix);

Slide 25

Slide 25 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 25 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL: some queries to resolve 1.  Typical queries: [2016-02-19 11:06:56 CET-ammiapp01.n1data.lan(45409)]LOG: duration: 125747.477 ms execute : SELECT COUNT(public.mv_hubble_sc_fdw."pos") AS COUNT FROM public.mv_hubble_sc_fdw WHERE '1' = (public.mv_hubble_sc_fdw."pos" @ spoly('{' || spoint(radians(11.433558137430037),radians(41.78051021988508)) || ',' || spoint(radians(10.685606987908637),radians(41.78291729012794)) || ',' || spoint(radians(9.937655789597498),radians(41.78052176464371)) || ',' || spoint(radians(9.943560916211448),radians(41.267043854985126)) || ',' || spoint(radians(9.949314821366702),radians(40.75356532335342)) || ',' || spoint(radians(10.685592956764461),radians(40.75592346189901)) || ',' || spoint(radians(11.42187104951722),radians(40.75355395876293)) || ',' || spoint(radians(11.427638800815389),radians(41.26703240193894)) || '}')) Limit 100000 [2016-02-19 11:12:18 CET-ammiapp01.n1data.lan(46369)]LOG: duration: 11716.126 ms execute : SELECT COUNT(public.mv_hst_observation_fdw."observation_id") AS COUNT FROM public.mv_hst_observation_fdw WHERE '1' = (public.mv_hst_observation_fdw."fov" && spoly('{' || spoint(radians(11.433558137430037),radians(41.78051021988508)) || ',' || spoint(radians(10.685606987908637),radians(41.78291729012794)) || ',' || spoint(radians(9.937655789597498),radians(41.78052176464371)) || ',' || spoint(radians(9.943560916211448),radians(41.267043854985126)) || ',' || spoint(radians(9.949314821366702),radians(40.75356532335342)) || ',' || spoint(radians(10.685592956764461),radians(40.75592346189901)) || ',' || spoint(radians(11.42187104951722),radians(40.75355395876293)) || ',' || spoint(radians(11.427638800815389),radians(41.26703240193894)) || '}'))

Slide 26

Slide 26 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 26 ESA UNCLASSIFIED - Releasable to the Public 3-Performance projects Ø  pgbench -- run a benchmark test on PostgreSQL Ø  Pgtune

Slide 27

Slide 27 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 27 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL Current nodes: Main features Number of nodes 2 RAM 512G Storage 2.9TB SSD Operating System RHEL 6 PostgreSQL release 9.5.2 Database Size 54G (50TB expected in 5 year with Gaia Data)

Slide 28

Slide 28 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 28 ESA UNCLASSIFIED - Releasable to the Public High Availability and load balancing-I http://www.postgresql.org/docs/current/interactive/high-availability.html Pgpool-II Replication: mmi1 mmi2 RO 9.5.2 RW 9.5.2 Load balancer pgpool-II

Slide 29

Slide 29 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 29 ESA UNCLASSIFIED - Releasable to the Public High Availability and load balancing-II Hot standby: 1-Ability to connect to the server and run read-only queries while the server is in archive recovery or standby mode. 2-If hot_standby is turned on in postgresql.conf and there is a recovery.conf file present, the server will run in Hot Standby mode http://www.postgresql.org/docs/current/interactive/hot-standby.html mmi1 mmi2 RO 9.5.2 RW 9.5.2

Slide 30

Slide 30 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 30 ESA UNCLASSIFIED - Releasable to the Public High Availability and load balancing-III mmi1 mmi2 RW 9.5.2 RW 9.5.2 Ingestion App Server1 App Server2

Slide 31

Slide 31 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 31 ESA UNCLASSIFIED - Releasable to the Public What else can be improved?

Slide 32

Slide 32 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 32 ESA UNCLASSIFIED - Releasable to the Public Release timeline 25th September: Tech-Talk @ ESAC and first internal release to ESAC for feedback 26th October: Focus demo @ ADASS2015 and first public beta release Mid 2016: final release Continuous releases

Slide 33

Slide 33 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 33 ESA UNCLASSIFIED - Releasable to the Public Technology roadmap Beta release @ ADASS2015 • Scientific validation of footprints and ESA all- sky HiPS • Download management • Multi-target summary table • Documentation • Integration with Simbad search engine • Helpdesk Support • Hardware scaling requirements First release (Mid 2016) • Interoperability with VO tools • SAMP integration • Refactoring of some components • DB High availability • DB load balancing • Video tutorials Continuous releases • Link to Vizier • Generation of detailed footprints (spectra) • Imaging and spectroscopic data • Online visualization of data • Time-series • Observation planning • Massive data visualization • Virtual Reality • Solar objects using PostGIS

Slide 34

Slide 34 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 34 ESA UNCLASSIFIED - Releasable to the Public Data integration roadmap Beta release @ ADASS2015 • All-sky HiPS mosaics • INTEGRAL • XMM-Newton • HST • ISO • AKARI • Herschel • Planck • Science ready data (imaging) • XMM-Newton • XMM OM(UV and UVB) • HST(core) • Herschel • ISOCAM • Catalogs • INTEGRAL • 3XMM EPIC • XMM Slew • XMM OM • Tycho-2 • Hubble • Planck First release (Mid 2016) • All-sky HiPS mosaics • EXOSAT • Science ready data (imaging) • EXOSAT • INTEGRAL • XMM-Newton • HST • Herschel • Suzaku(JAXA) • Catalogs • Hipparcos • AKARI • Hubble • Planck • 3XMM EPIC • XMM Slew • XMM OM Continuous releases • All-sky HiPS mosaics • Future HiPS • Science ready data (imaging and spectra) • EXOSAT • INTEGRAL • XMM-Newton • HST • ISOCAM • Herschel • Future missions • Catalogs • 3XMM EPIC • XMM Slew • XMM OM • Hubble • Planck • Gaia • Herschel • Future catalogues

Slide 35

Slide 35 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 35 ESA UNCLASSIFIED - Releasable to the Public Conclusion ESA Sky http://archives.esac.esa.int/esasky-beta •  Visualization web tool of all ESA (and others) space science data from one interface •  Built on top of the existing ESA science archives •  Access to final best data products from ESA space science missions, through VO protocols Using PostgreSQL

Slide 36

Slide 36 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 36 ESA UNCLASSIFIED - Releasable to the Public So we will continue…. All that heaven will allow

Slide 37

Slide 37 text

Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 | Slide 37 ESA UNCLASSIFIED - Releasable to the Public Questions? Team Fabrizio Giordano (Software Engineer and Key Person), Jesús Salgado (Technical astro lead), Bruno Merín (Product Owner), Deborah Baines, Belén López Martí (HiPS generation), María H. Sarmiento (Software Engineer), Elena Racero (HiPS and footprints), Raúl Gutiérrez (Software Engineer), Pilar de Teodoro (DB administrator), Sara Nieto (Sotware Engineer) Previous Collaborators (MMI prototype) Iñaki Ortiz, Ignacio León, Andy Pollock, Michael Rosa, Javier Castellanos Acknowledge Pierre Fernique and Thomas Boch from CDS, expert science and technical staff at ESAC Your feedback is really appreciated! https://support.cosmos.esa.int/esdc/ Or directly to: [email protected]