Slide 1

Slide 1 text

Python's Future in Science
Juan Nunez-Iglesias
Victorian Life Sciences Computation Initiative (VLSCI), University of Melbourne
PyCon Australia 2016, Melbourne

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

http://bit.ly/prog-lang-in-astronomy

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

def simple_grep(pattern, file):
    for line in file:
        if pattern in line:
            yield line
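
A minimal usage sketch for the generator above. The in-memory file and its contents are hypothetical, chosen only to show that lines are filtered lazily and returned with their newlines intact.

```python
import io


def simple_grep(pattern, file):
    # Lazily yield each line of `file` that contains `pattern`.
    for line in file:
        if pattern in line:
            yield line


# Hypothetical in-memory "file"; any iterable of strings would work.
text = io.StringIO("numpy\nscipy\npandas\nscikit-image\n")
matches = list(simple_grep("sci", text))
print(matches)
```

Because `simple_grep` is a generator, nothing is read until the result is iterated, so the same function should work on files too large to hold in memory.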

Slide 8

Slide 8 text

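import numpy as np
from scipy.stats import mstats

def quantile_normalise(X):
    # Mean value at each rank, taken across the sorted columns
    quantiles = np.mean(np.sort(X, axis=0), axis=1)
    # rankdata is 1-based; subtract 1 to get valid indices into quantiles
    ranks = mstats.rankdata(X, axis=0).astype(int) - 1
    Xnorm = quantiles[ranks]
    return Xnorm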
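
A small worked example of quantile normalisation, as a sketch only. To stay NumPy-only it computes ranks with a double `argsort`, which is a swapped-in technique rather than the `mstats.rankdata` call on the slide, and it assumes there are no ties within a column. The function name `quantile_normalise_sketch` and the input matrix are hypothetical.

```python
import numpy as np


def quantile_normalise_sketch(X):
    # Mean value at each rank, taken across the sorted columns.
    quantiles = np.mean(np.sort(X, axis=0), axis=1)
    # Zero-based rank of every entry within its column (assumes no ties).
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    # Replace each entry by the mean value for its rank.
    return quantiles[ranks]


# Hypothetical data: 3 observations (rows) by 2 samples (columns).
X = np.array([[5.0, 40.0],
              [1.0, 60.0],
              [3.0, 20.0]])
Xnorm = quantile_normalise_sketch(X)
print(Xnorm)
```

After normalisation every column holds the same set of values, only in a different order, which is the defining property of the method.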

Slide 9

Slide 9 text

flights %>%
    group_by(year, month, day) %>%
    select(arr_delay) %>%
    summarise(arrival.delay = median(arr_delay, na.rm = TRUE),
              num.flights = n()) %>%
    arrange(year, month, day) %>%
    mutate(date = date_from_tup(year, month, day),
           weekday = weekday_from_tup(year, month, day)) %>%
    ggplot(aes(y=arrival.delay, x=date, colour=weekday)) +
        geom_point() +
        scale_color_brewer(type="qual")

Slide 10

Slide 10 text

[Plot: arrival.delay (y-axis, −25 to 50) against date (x-axis, Jan 2013 to Jan 2014), coloured by weekday (Monday to Sunday)]

Slide 11

Slide 11 text

fibonacci 0 = 0
fibonacci 1 = 1
fibonacci n = fibonacci (n - 1) + fibonacci (n - 2)

Slide 12

Slide 12 text

fibonacci 0 = 0
fibonacci 1 = 1
fibonacci n = fibonacci (n - 1) + fibonacci (n - 2)

fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)
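
For readers more at home in Python, the two Haskell definitions have rough analogues: a direct recursive function and a lazily evaluated infinite stream. This is only a sketch; the names `fibonacci_recursive` and `fibonacci_stream` are hypothetical, and a Python generator is an approximation of a Haskell lazy list, not an exact equivalent.

```python
from itertools import islice


def fibonacci_recursive(n):
    # Direct transcription of the three-equation definition.
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)


def fibonacci_stream():
    # Infinite, lazily evaluated sequence: values are produced on demand.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


first_ten = list(islice(fibonacci_stream(), 10))
print(first_ten)
```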

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

https://benchmarksgame.alioth.debian.org

Slide 15

Slide 15 text

a = 5

Slide 16

Slide 16 text

int a = 5;

a = 5

Slide 17

Slide 17 text

struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
...

int a = 5;

a = 5

Slide 18

Slide 18 text

https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/

struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
...

int a = 5;

a = 5
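
The gap between a bare C `int` and a full Python integer object can be checked from the interpreter. The sketch below assumes CPython, where `sys.getsizeof` is available; the exact byte counts depend on the platform and Python version, so none are stated here.

```python
import struct
import sys

# Size in bytes of a plain C int on this platform.
c_int_bytes = struct.calcsize("i")

# Size in bytes of the Python object that holds the integer 5,
# including its reference count, type pointer, and digit storage.
py_int_bytes = sys.getsizeof(5)

print(f"C int: {c_int_bytes} bytes, Python int object: {py_int_bytes} bytes")
```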

Slide 19

Slide 19 text

def add(a, b):
    return a + b
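
One way to see why such a small function is costly to execute: the same `add` has to work for any pair of objects that support `+`, so the interpreter must look up the right operation at run time on every call. The inputs below are arbitrary illustrations.

```python
def add(a, b):
    return a + b


# The same function handles ints, floats, strings, and lists;
# which "+" is meant is only known once the arguments arrive.
results = [
    add(1, 2),
    add(1.5, 2),
    add("py", "thon"),
    add([1], [2]),
]
print(results)
```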

Slide 20

Slide 20 text

sum(my_list)

np.sum(my_array)
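
A rough way to compare the two calls is to time them on the same data. This is a sketch only: the array length and repeat count are arbitrary assumptions, and the measured times will vary by machine, so no figures are quoted.

```python
import timeit

import numpy as np

n = 100_000
my_list = list(range(n))
# Fix the dtype so the sum cannot overflow on platforms with 32-bit defaults.
my_array = np.arange(n, dtype=np.int64)

# Time each summation over the same values.
t_list = timeit.timeit(lambda: sum(my_list), number=20)
t_array = timeit.timeit(lambda: np.sum(my_array), number=20)

list_total = sum(my_list)
array_total = int(np.sum(my_array))
print(f"sum(list): {t_list:.4f} s, np.sum(array): {t_array:.4f} s")
```

Both calls return the same total; the difference is that `np.sum` loops over a contiguous block of typed values in compiled code, while `sum` has to handle each element as a separate Python object.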

Slide 21

Slide 21 text

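import numpy as np
from scipy.stats import mstats

def quantile_normalise(X):
    # Mean value at each rank, taken across the sorted columns
    quantiles = np.mean(np.sort(X, axis=0), axis=1)
    # rankdata is 1-based; subtract 1 to get valid indices into quantiles
    ranks = mstats.rankdata(X, axis=0).astype(int) - 1
    Xnorm = quantiles[ranks]
    return Xnorm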

Slide 22

Slide 22 text

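import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigsh

# Symmetrise the adjacency matrix
Conn = (Adj + Adj.T) / 2
degs = np.ravel(Conn.sum(axis=0))
# D^(-1/2), used to normalise the Laplacian
Degs = sparse.diags(1 / np.sqrt(degs))
# Graph Laplacian: degree matrix minus connectivity
Lap = sparse.diags(degs) - Conn
Q = Degs @ Lap @ Degs
eigvals, eigvecs = eigsh(Q, k=3, which='SM')
eigvecs = eigvecs[:, np.argsort(eigvals)]
_, x, y = (Degs @ eigvecs).T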

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

def f(x):
    y = x**4 - 3*x
    return y

def integrate_f(a, b, n):
    dx = (b - a) / n
    dx2 = dx / 2
    s = f(a) * dx2
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s

http://bit.ly/aspp-cython
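
As a sanity check, the trapezoidal estimate from `integrate_f` can be compared with the exact integral, since x**4 - 3*x has the antiderivative x**5/5 - 3*x**2/2. The helper `exact_integral`, the interval [0, 2], and the step count are assumptions made for this sketch.

```python
def f(x):
    y = x**4 - 3*x
    return y


def integrate_f(a, b, n):
    # Trapezoidal rule: half weight on the endpoints, full weight inside.
    dx = (b - a) / n
    dx2 = dx / 2
    s = f(a) * dx2
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s


def exact_integral(a, b):
    # Hypothetical helper: evaluates the antiderivative at both ends.
    def antiderivative(x):
        return x**5 / 5 - 3 * x**2 / 2
    return antiderivative(b) - antiderivative(a)


approx = integrate_f(0.0, 2.0, 1000)
exact = exact_integral(0.0, 2.0)
print(approx, exact)
```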

Slide 26

Slide 26 text

def f(x):
    y = x**4 - 3*x
    return y

def integrate_f(a, b, n):
    dx = (b - a) / n
    dx2 = dx / 2
    s = f(a) * dx2
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s

cdef double f(double x):
    cdef double y = x**4 - 3*x
    return y

def integrate_f(double a, double b, int n):
    cdef:
        double dx = (b - a) / n
        double dx2 = dx / 2
        double s = f(a) * dx2
        int i = 0
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s

http://bit.ly/aspp-cython

Slide 27

Slide 27 text

https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/

struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};

Slide 28

Slide 28 text

function quadratic(a, sqr_term, b)
    return (-b + sqr_term) / 2a
end

function quadratic2(a, b, c)
    sqr_term = sqrt(b^2-4a*c)
    r1 = quadratic(a, sqr_term, b)
    r2 = quadratic(a, -sqr_term, b)
    return r1, r2
end

adapted from https://samuelcolvin.github.io/JuliaByExample

Slide 29

Slide 29 text

import numpy as np

def euclidean_distance(a, b):
    n = len(a)
    sqdist = 0
    for i in range(n):
        d = a[i] - b[i]
        sqdist += d * d
    return np.sqrt(sqdist)
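
For comparison, the explicit loop can be checked against a vectorised NumPy expression. This is a sketch with hypothetical input vectors; `np.linalg.norm` is used here as a reference result and is not something the slide itself shows.

```python
import numpy as np


def euclidean_distance(a, b):
    # Explicit element-by-element loop, as on the slide.
    n = len(a)
    sqdist = 0
    for i in range(n):
        d = a[i] - b[i]
        sqdist += d * d
    return np.sqrt(sqdist)


# Hypothetical inputs forming a 3-4-5 right triangle.
a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])

loop_result = euclidean_distance(a, b)
vector_result = np.linalg.norm(a - b)
print(loop_result, vector_result)
```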

Slide 30

Slide 30 text

def f(x):
    y = x**4 - 3*x
    return y

def integrate_f(a, b, n):
    dx = (b - a) / n
    dx2 = dx / 2
    s = f(a) * dx2
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s

Slide 31

Slide 31 text

def f(x):
    y = x**4 - 3*x
    return y

def integrate_f(a, b, n):
    dx = (b - a) / n
    dx2 = dx / 2
    s = f(a) * dx2
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s

import numba

@numba.jit(nopython=True)
def f(x):
    y = x**4 - 3*x
    return y

@numba.jit(nopython=True)
def integrate_f(a, b, n):
    dx = (b - a) / n
    dx2 = dx / 2
    s = f(a) * dx2
    for i in range(1, n):
        s += f(a + i * dx) * dx
    s += f(b) * dx2
    return s

Slide 32

Slide 32 text

import numpy as np

def euclidean_distance(a, b):
    n = len(a)
    sqdist = 0
    for i in range(n):
        d = a[i] - b[i]
        sqdist += d * d
    return np.sqrt(sqdist)

Slide 33

Slide 33 text

import numpy as np

def euclidean_distance(a, b):
    n = len(a)
    sqdist = 0
    for i in range(n):
        d = a[i] - b[i]
        sqdist += d * d
    return np.sqrt(sqdist)

https://jakevdp.github.io/blog/2012/09/20/why-python-is-the-last/

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Take-homes
Community
Elegance
Performance

Slide 37

Slide 37 text

Take-homes
Community
Elegance
Performance
BONUS: USE PYTHON 3.5!

Slide 38

Slide 38 text

Questions?
[email protected]
@jnuneziglesias (Twitter)
@jni (GitHub)