of observations.
- Observations in 6 optical bands.
- 200 000 images per year.
- 37 billion stars and galaxies.
- Accurate measurement of dark matter and dark energy.
- 20 TB of data per night.
- Requires 100 PB of storage.
year of operations.
- Provides time series in 6 bands and additional metadata (e.g. distance).
- 14 classes. Unknown interpretation.
- 7848 training objects, 3.5 million test objects.
- Part of the PLAsTiCC Kaggle challenge. Easy features!
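
The slides do not list the "easy features", so the following is only a minimal sketch of one plausible reading: per-band summary statistics of the flux. The function name `band_features`, the `(passband, flux)` row layout, and the feature names are assumptions made for illustration, not the author's actual feature set.

```python
from collections import defaultdict
from statistics import mean, pstdev


def band_features(rows):
    """Summarise one object's light curve per passband.

    rows: iterable of (passband, flux) pairs (assumed layout).
    Returns a flat dict of per-band summary statistics.
    """
    by_band = defaultdict(list)
    for passband, flux in rows:
        by_band[passband].append(flux)

    features = {}
    for band, fluxes in sorted(by_band.items()):
        features[f"flux_mean_{band}"] = mean(fluxes)
        features[f"flux_std_{band}"] = pstdev(fluxes)  # population std
        features[f"flux_max_{band}"] = max(fluxes)
        features[f"n_obs_{band}"] = len(fluxes)
    return features


# Toy light curve with two bands and unequal numbers of observations.
rows = [(0, 1.0), (0, 3.0), (1, 10.0), (1, 10.0), (1, 13.0)]
feats = band_features(rows)
```

Each object collapses to one fixed-length row regardless of how many observations it has, which is what makes such features convenient for a standard classifier.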
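scale Python.
- Allows running preprocessing of the test set, which is larger than memory.
- Automatically uses all CPU cores.

    import dask.dataframe as dd

    # path_in and path_out are defined elsewhere
    df = dd.read_csv(path_in)                # lazy, chunked read
    df = df.repartition(npartitions=1000)    # split into 1000 partitions
    df.to_parquet(path_out)                  # write the partitions as Parquet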
very important for understanding dark energy and dark matter, and for detecting supernovae and transients.
- Dask. Allows parallel out-of-core processing from Python.
- PyTorch. Flexible framework for deep learning.
- Irregular sparse time series. Annoyingly tricky.
- AI Saturday. Why not join?
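
One common way to make irregular, variable-length time series digestible for a deep learning framework is to pad them to a common length and keep a mask of the real observations. The sketch below illustrates that idea in plain Python. The helper `pad_with_mask` is hypothetical, and nothing in the slides confirms that this is the approach used in the talk.

```python
def pad_with_mask(series, pad_value=0.0):
    """Pad variable-length sequences to the longest one.

    Returns (padded, mask), where mask holds 1 for real observations
    and 0 for padding, so a model can ignore the padded positions.
    """
    max_len = max(len(s) for s in series)
    padded, mask = [], []
    for s in series:
        n_pad = max_len - len(s)
        padded.append(list(s) + [pad_value] * n_pad)
        mask.append([1] * len(s) + [0] * n_pad)
    return padded, mask


# Three toy flux sequences with different numbers of observations.
series = [[1.5, 2.0, 0.5], [4.0], [3.0, 1.0]]
padded, mask = pad_with_mask(series)
```

Both outputs are rectangular nested lists, so they can be turned into tensors (for example with `torch.tensor`) and batched together.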