Slide 1

Slide 1 text

NumPy, SciPy, and matplotlib
Orlando Python User Group
May 22, 2014
Craig Finch

Slide 2

Slide 2 text

Background
● You need a working knowledge of Python
● Basic knowledge of Git to get the code
● Code can be found at https://github.com/westover/OrlandoPythonUserGroup
● These examples can be found in subdirectory Craig_Finch_NumPy_SciPy_matplotlib
● See README.txt for details

Slide 3

Slide 3 text

NumPy, SciPy, and matplotlib
● NumPy is the foundation
● SciPy is built upon NumPy, with some overlapping functionality
● matplotlib complements both
[Figure: diagram relating NumPy, SciPy, and matplotlib]

Slide 4

Slide 4 text

NumPy

Slide 5

Slide 5 text

NumPy
[Figure: a Python script (import numpy; a = numpy.array([...])) runs in the Python interpreter, which loads the NumPy C extension, which in turn uses the system linear algebra library]

Slide 6

Slide 6 text

NumPy Arrays
● Implemented in C for efficiency
● Python indexing and slicing
● Elements are strongly typed

NumPy Type                  Subtypes (bits)
Boolean
Unsigned Integer            C types, 8, 16, 32, 64
Integer                     C types, 8, 16, 32, 64
Floating Point              C types, 16, 32, 64, 96, 128
Complex Floating Point      2x32, 2x64, 2x96, 2x128
Python object references    Much like a Python list
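The points on this slide can be illustrated with a minimal sketch. The array contents and variable names below are made up for illustration and are not taken from the talk's repository.

```python
import numpy as np

# Every element of an array shares one fixed type (its dtype).
a = np.array([1, 2, 3, 4, 5, 6], dtype=np.int32)
b = a.astype(np.float64)          # explicit conversion to 64-bit floats

# Standard Python indexing and slicing work on arrays.
first_three = a[:3]               # elements at positions 0, 1, 2
every_other = a[::2]              # elements at positions 0, 2, 4

# An object array stores Python references, much like a list.
objs = np.array([1, "two", 3.0], dtype=object)

print(a.dtype, a.itemsize)        # itemsize is the element size in bytes
print(b.dtype, b.itemsize)
print(first_three, every_other, objs.dtype)
```

The fixed element size is part of what lets NumPy store the data in one contiguous C buffer instead of a list of separate Python objects.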

Slide 7

Slide 7 text

Taking advantage of NumPy
● Think in parallel!
● Replace loops with vector operations
● Demo: collision detection

"Premature optimization is the root of all evil (or at least most of it) in programming." (Donald Knuth)
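As a rough illustration of replacing a loop with a vector operation, the sketch below computes the same result both ways. The data and the arithmetic are arbitrary placeholders, not the collision-detection demo from the talk.

```python
import numpy as np

rng = np.random.RandomState(0)     # fixed seed so the run is repeatable
x = rng.rand(1000)
y = rng.rand(1000)

# Loop version: one Python-level operation per element.
loop_result = np.empty_like(x)
for i in range(len(x)):
    loop_result[i] = x[i] * y[i] + 1.0

# Vector version: the same arithmetic applied to whole arrays at once.
vector_result = x * y + 1.0

print(np.allclose(loop_result, vector_result))
```

Both versions give the same numbers; the vector form moves the per-element loop out of the interpreter and into compiled code.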

Slide 8

Slide 8 text

Example: finding pairwise distances between spheres
Start with a collection of spheres, randomly placed in space.

Slide 9

Slide 9 text

Distance between two circles
The same principle applies to distance between spheres in three dimensions.
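One possible vectorized form of the pairwise distance calculation is sketched below. The function name, the (n, 3) array layout, and the equal-radius assumption are hypothetical; this is not one of the find_distances functions from the talk's repository.

```python
import numpy as np

def pairwise_distances(centers):
    """Return an (n, n) array of center-to-center distances.

    centers is an (n, 3) array of sphere centers (assumed layout).
    """
    # Broadcasting: (n, 1, 3) - (1, n, 3) gives all (n, n, 3) differences.
    diff = centers[:, np.newaxis, :] - centers[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

centers = np.array([[0.0, 0.0, 0.0],
                    [3.0, 4.0, 0.0],
                    [0.0, 0.0, 1.0]])
radius = 1.0                        # assumed equal radius for every sphere

d = pairwise_distances(centers)
# Two spheres overlap when their centers are closer than 2 * radius.
# The identity mask excludes each sphere's zero distance to itself.
overlap = (d < 2 * radius) & ~np.eye(len(centers), dtype=bool)
print(d)
print(overlap)
```

This form trades memory for speed: it builds an (n, n, 3) intermediate array, which may become the limiting factor for large particle counts.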

Slide 10

Slide 10 text

Benchmarking
● Requires good experimental design
    ○ Experiments must be controlled
    ○ Results must be repeatable
    ○ Results must be statistically significant
● Tools
    ○ Python cProfile
    ○ Robert Kern's line profiler https://pypi.python.org/pypi/line_profiler/1.0b3
    ○ Linux profilers
        ■ oprofile
        ■ valgrind (specifically, callgrind)
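Of the tools listed, cProfile ships with Python, so it is the easiest to try. The sketch below profiles a deliberately slow placeholder function; the function name and workload are invented for illustration.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Deliberately loop-based workload to profile (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(100000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)                # show at most five entries
report = stream.getvalue()
print(report)
```

A whole script can also be profiled from the command line with `python -m cProfile -s cumulative script.py`.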

Slide 11

Slide 11 text

Optimizing numerical code
● Module NumPy/distance.py contains six functions that show how to optimize the calculation
    ○ Also contains a decorator that times each function
● Script NumPy/distance_benchmarking.py compares results
    ○ Not a very sophisticated benchmarking tool
● Script NumPy/distance_benchmarking.py ensures all functions return the same result
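A timing decorator of the kind mentioned above might look roughly like the following. This is a sketch written for illustration, not the decorator from NumPy/distance.py; the names `timed` and `sum_of_squares` are hypothetical, and the output format only imitates the results shown in the talk.

```python
import functools
import time

def timed(func):
    """Print how long each call to func takes (hypothetical helper)."""
    @functools.wraps(func)          # keep the wrapped function's name
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print("Function=%s, Time=%s sec" % (func.__name__, elapsed))
        return result
    return wrapper

@timed
def sum_of_squares(n):
    return sum(i * i for i in range(n))

value = sum_of_squares(1000)
```

A single timed call like this is noisy; repeating the measurement (for example with the standard `timeit` module) gives more trustworthy numbers.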

Slide 12

Slide 12 text

Results
● Results were presented as a live demo
● Results on an Intel Core i5-4460 3.2 GHz (using only one core)

Benchmarking with 1000 particles.
    Function=find_distances_2, Time=2.15743207932 sec
    Function=find_distances_3, Time=0.0198380947113 sec
Benchmarking with 3000 particles.
    Function=find_distances_3, Time=0.0985090732574 sec
    Function=find_distances_4, Time=0.102454900742 sec
    Function=find_distances_5, Time=0.0703949928284 sec
    Function=find_distances_6, Time=0.195425987244 sec

Slide 13

Slide 13 text

Take-Aways: NumPy
● Use arrays instead of lists for numerical data
● Use vector operations instead of loops whenever possible
● If it's still too slow...
    ○ Write C/C++/Fortran code
    ○ Parallelize (multi-core, multi-server)
● Understand NumPy, because it's the foundation for:
    ○ SciPy
    ○ Pandas
    ○ mpi4py

Slide 14

Slide 14 text

SciPy

Slide 15

Slide 15 text

SciPy is a big box of tools*
● Special functions (scipy.special)
● Integration (scipy.integrate)
● Optimization (scipy.optimize)
● Interpolation (scipy.interpolate)
● Fourier Transforms (scipy.fftpack)
● Signal Processing (scipy.signal)
Continued on next slide...
* or a candy store...
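To give a feel for two of these submodules, here is a small sketch using scipy.integrate and scipy.optimize. The integrand and the root-finding problem are arbitrary textbook choices, not examples from the talk.

```python
import numpy as np
from scipy import integrate, optimize

# Integration: the integral of sin(x) from 0 to pi is exactly 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Root finding: cos(x) changes sign on [0, 2], with its root at pi / 2.
root = optimize.brentq(np.cos, 0.0, 2.0)

print(area, abs_error)
print(root)
```

`quad` returns both the estimate and an error bound, which is typical of how SciPy wraps established numerical libraries.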

Slide 16

Slide 16 text

SciPy tools, continued
● Linear Algebra (scipy.linalg)
● Sparse Eigenvalue Problems with ARPACK
● Compressed Sparse Graph Routines (scipy.sparse.csgraph)
● Spatial data structures and algorithms (scipy.spatial)
● Statistics (scipy.stats)
● Multi-dimensional image processing (scipy.ndimage)
● File IO (scipy.io)
● Weave (scipy.weave)
    ○ Write compiled extensions by putting C/C++ code inline with your Python code!
    ○ f2py in NumPy for writing extensions in Fortran

Slide 17

Slide 17 text

A few thoughts on SciPy
● Contains linear algebra routines that overlap with NumPy
    ○ SciPy's linear algebra routines always run on the optimized system libraries (LAPACK, ATLAS, Intel Math Kernel Library, etc.)
● Sparse matrix support
● Extends NumPy's statistical capabilities
● Under active development
    ○ New toys added constantly!
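The overlap with NumPy and the sparse matrix support can both be seen in a short sketch like the one below. The 2x2 system is an arbitrary example chosen so the answer is easy to check by hand.

```python
import numpy as np
import scipy.linalg
import scipy.sparse
import scipy.sparse.linalg

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# The same dense solve is available from both packages.
x_numpy = np.linalg.solve(A, b)
x_scipy = scipy.linalg.solve(A, b)

# Sparse storage keeps only the nonzero entries; spsolve works on it directly.
A_sparse = scipy.sparse.csr_matrix(A)
x_sparse = scipy.sparse.linalg.spsolve(A_sparse, b)

print(x_numpy, x_scipy, x_sparse)
```

A matrix this small gains nothing from sparse storage; the sparse path pays off when most entries are zero and the matrix is large.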

Slide 18

Slide 18 text

SciPy example 1: Filter a signal
[Figure: block diagram, Signal -> Filter -> Filtered Signal]
Example code: SciPy/linear_filters.py

Slide 19

Slide 19 text

Input signal: a step function

Slide 20

Slide 20 text

Finite impulse response (FIR) filter

Slide 21

Slide 21 text

Filtered signal

Slide 22

Slide 22 text

Code snippets

    import scipy.signal

    # Filter parameters
    cutoff = 0.2      # cutoff frequency, as a fraction of the Nyquist frequency
    numtaps = 100     # number of filter coefficients
    # Define filter
    lpf = scipy.signal.firwin(numtaps, cutoff, window='hamming')
    # Frequency response
    lpf_freq, lpf_response = scipy.signal.freqz(lpf)
    # Filter signal, no initial conditions (x is the input signal)
    y1 = scipy.signal.lfilter(lpf, [1.0], x)
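The snippets above leave the input signal x undefined. A self-contained version might look like the following sketch, where the signal length and the sample at which the step switches on are assumptions made for illustration (the actual values in SciPy/linear_filters.py may differ).

```python
import numpy as np
import scipy.signal

# Input signal: a unit step that switches on at sample 50 (assumed shape).
x = np.zeros(400)
x[50:] = 1.0

cutoff = 0.2      # cutoff as a fraction of the Nyquist frequency
numtaps = 100     # filter length (number of coefficients)
lpf = scipy.signal.firwin(numtaps, cutoff, window='hamming')

# An FIR filter has no feedback terms, so the denominator is just [1.0].
y = scipy.signal.lfilter(lpf, [1.0], x)

print(lpf.sum())  # DC gain of the low-pass filter
print(y[-1])      # output long after the step, once the filter has settled
```

Because firwin normalizes a low-pass design to unit gain at zero frequency, the filtered step should settle at the same level as the input once all the taps overlap the step.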

Slide 23

Slide 23 text

SciPy Example 2: Model a linear system
A linear time-invariant (LTI) system is a generalization of the digital filter from the previous example:
[Figure: block diagram, Signal x(t) -> System h(t) -> Output y(t)]
Example code: SciPy/LTI_simulation.py

Slide 24

Slide 24 text

1st-order LTI system
● Model with SciPy class signal.lti
    ○ Methods for impulse and step response
● Create our own function
    ○ Define a discrete-time step function
    ○ Apply step function to input of system
    ○ Compare results to results from SciPy class
    ○ De-convolve to recover original signal
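The first three steps above can be sketched as follows. The transfer function H(s) = 1 / (tau*s + 1) and the value of tau are assumptions chosen as a typical first-order system; the system in SciPy/LTI_simulation.py may be defined differently, and the de-convolution step is not shown here.

```python
import numpy as np
import scipy.signal

tau = 1.0                                      # assumed time constant
system = scipy.signal.lti([1.0], [tau, 1.0])   # H(s) = 1 / (tau*s + 1)

t = np.linspace(0.0, 5.0, 501)
t_step, y_step = system.step(T=t)              # built-in step response
t_imp, y_imp = system.impulse(T=t)             # built-in impulse response

# Apply our own step function to the input and simulate the output.
u = np.ones_like(t)
t_out, y_lsim, _ = scipy.signal.lsim(system, U=u, T=t)

# Analytic responses of a first-order system, for comparison.
step_exact = 1.0 - np.exp(-t / tau)
impulse_exact = np.exp(-t / tau) / tau

print(np.abs(y_step - step_exact).max())
print(np.abs(y_lsim - y_step).max())
```

Comparing the simulated output against both the built-in step response and the closed-form expression is a quick sanity check on the model.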

Slide 25

Slide 25 text

Step Response

Slide 26

Slide 26 text

Impulse Response

Slide 27

Slide 27 text

Applying step function at input

Slide 28

Slide 28 text

matplotlib

Slide 29

Slide 29 text

matplotlib
● pyplot implements MATLAB-style plotting
● Object-oriented API for more advanced graphics

Slide 30

Slide 30 text

Basic plotting with matplotlib.pyplot

    import numpy as np
    import matplotlib.pyplot as plt

    # Sample both functions on [-2*pi, 2*pi) in steps of 0.1
    x = np.arange(-2 * np.pi, 2 * np.pi, 0.1)
    cos_x = np.cos(x)
    sin_x = np.sin(x)

    plt.figure()
    plt.plot(x, cos_x, label='cos(x)', linewidth=2)
    plt.plot(x, sin_x, label='sin(x)', linewidth=2)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.legend()
    plt.show()

Slide 31

Slide 31 text

Basic plotting with matplotlib.pyplot

Slide 32

Slide 32 text

How I made this figure with matplotlib

Slide 33

Slide 33 text

Advanced graphics with the API

    import matplotlib.pyplot as plt
    from matplotlib.patches import Circle
    ...
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ...
    ax.add_patch(Circle(center1, radius, edgecolor='blue', facecolor='lightgray'))
    ax.add_patch(Circle(center2, radius, edgecolor='red', facecolor='lightgray'))

Slide 34

Slide 34 text

Other matplotlib tricks
● Creating math with LaTeX
● Axis bounds and aspect ratio

    # Raw string (r"...") keeps the LaTeX backslashes intact
    plt.text(-1.25, 2.25, r"$d=\sqrt{\Delta x^2 + \Delta y^2}$", fontsize=18)
    plt.axis([-1.5, 3, -1.5, 3])
    ax.set_aspect(1.0)
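Putting the object-oriented API, the LaTeX label, and the axis settings together, a complete figure script could look like the sketch below. The circle centers and radius are invented values, and the off-screen "Agg" backend is used only so the script runs without a display; the figure in the talk may have been built differently.

```python
import matplotlib
matplotlib.use("Agg")              # off-screen backend; omit for interactive use
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

# Assumed geometry, chosen only so both circles fit in the view.
center1, center2, radius = (0.0, 0.0), (1.5, 1.5), 1.0

fig = plt.figure()
ax = fig.add_subplot(111)
ax.add_patch(Circle(center1, radius, edgecolor='blue', facecolor='lightgray'))
ax.add_patch(Circle(center2, radius, edgecolor='red', facecolor='lightgray'))

# Math label rendered by matplotlib's built-in LaTeX-style mathtext.
label = ax.text(-1.25, 2.25, r"$d=\sqrt{\Delta x^2 + \Delta y^2}$", fontsize=18)

# Patches do not rescale the axes automatically, so set the view by hand.
ax.axis([-1.5, 3, -1.5, 3])
ax.set_aspect(1.0)                 # equal scaling so circles look round
fig.canvas.draw()                  # render once to confirm everything lays out
```

With an interactive backend, `plt.show()` would display the figure in place of the final draw call.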

Slide 35

Slide 35 text

Summary
● NumPy is the foundation of scientific and numerical computing with Python
    ○ Learn it well!
● SciPy is a collection of mathematical and scientific tools
● matplotlib is a technical plotting package
    ○ Primarily 2D plotting
    ○ Basic 3D plots available with mplot3d

    import mpl_toolkits.mplot3d
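For the 3D case mentioned in the summary, a basic surface plot might be set up along these lines. The surface function and grid are arbitrary choices for illustration, and the "Agg" backend is again used only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")              # off-screen backend; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np
# Importing Axes3D registers the '3d' projection on older matplotlib versions.
from mpl_toolkits.mplot3d import Axes3D

# Sample a Gaussian bump on a 30 x 30 grid (arbitrary example surface).
x = np.linspace(-2, 2, 30)
X, Y = np.meshgrid(x, x)
Z = np.exp(-(X ** 2 + Y ** 2))

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
fig.canvas.draw()
```

The grid itself comes from NumPy's meshgrid, which is one more place where the three packages fit together.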