Slide 1

Slide 1 text

Analyzing tomography data with Python and the scikit-image library Emmanuelle Gouillart joint Unit CNRS/Saint-Gobain SVI @EGouillart scikit-image team

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

What is scikit-image? An open-source (BSD) generic image processing library for the Python language (and NumPy data arrays)

Slide 5

Slide 5 text

What is scikit-image? An open-source (BSD) generic image processing library for the Python language (and NumPy data arrays) for 2D & 3D images simple API & gentle learning curve

Slide 6

Slide 6 text

NumpPy: Python objects for numerical arrays Multi-dimensional numerical data container (based on compiled code) + utility functions to create/manipulate them >>> a = np.random. r a n d o m i n t e g e r s (0, 1, (2, 2, 2)) >>> a a r r a y ([[[0 , 1], [1, 0]], [[0, 0], [0, 1]]]) >>> a. shape , a. dtype ((2, 2, 2), dtype ( ’ int64 ’ )) x

Slide 7

Slide 7 text

NumpPy: Python objects for numerical arrays Multi-dimensional numerical data container (based on compiled code) + utility functions to create/manipulate them >>> a = np.random. r a n d o m i n t e g e r s (0, 1, (2, 2, 2)) >>> a a r r a y ([[[0 , 1], [1, 0]], [[0, 0], [0, 1]]]) >>> a. shape , a. dtype ((2, 2, 2), dtype ( ’ int64 ’ )) x Efficient and versatile data access indexing and slicing fancy indexing

Slide 8

Slide 8 text

First steps with scikit-image >>> from skimage import data , measure >>> im = data . b i n a r y b l o b s (128 , n dim =3, v o l u m e f r a c t i o n =0.2) >>> im. shape # 3-D array (128 , 128, 128) >>> l a b e l s = measure . l a b e l (im) >>> type(im), type( l a b e l s ) # images are numpy arrays (, ) >>> l a b e l s [0, 0, 0] # pixels are array elements 0 x Visualization with Mayavi [Ramachandran and Varoquaux, 2011] Images are NumPy arrays Pixels are array elements Simple API relying on functions new = function(image) Most functions work for 2D & 3D Optional parameters: keyword arguments

Slide 9

Slide 9 text

The Scientific Python ecosystem 1 75 ... 32 ... 10 3 ... 2 8 23 ... 36 ... 1 13 ... 0 12 43 ... 6 ... 66 ... 97 1 8 5 ... 73 ... 78 98 ... 9 4 10 ... 55 ... 10 3 ... 9 1 41 ... 98 ... 38 75 ... 6 ... visualization Development environment machine learning image properties object properties image file(s) array

Slide 10

Slide 10 text

Denoising tomography images In-situ imaging of phase separation in silicate melts [Bouttes et al., 2014, Bouttes et al., 2015] Experiments: ID19 & ID16, ESRF From basic (generic) to advanced (specific) filters

Slide 11

Slide 11 text

Denoising tomography images Histogram of pixel values From basic (generic) to advanced (specific) filters bilateral = restoration . denoise bilateral (dat) bilateral = restoration . denoise bilateral (dat, sigma range=2.5, sigma spatial=2) tv = restoration . denoise tv chambolle (dat, weight=0.5)

Slide 12

Slide 12 text

Segmentation: labelling regions skimage.segmentation

Slide 13

Slide 13 text

Example: segmentation of low-constrast regions In-situ imaging of glass batch reactive melting [Gouillart et al., 2012] Non-local means denoising to preserve texture Histogram-based markers extraction Random walker segmentation Non-local means: average similar patches [Buades et al., 2005] Random walker:anisotropic diffusion from markers [Grady, 2006] Random walker less sensitive to noise than watershed, but slower

Slide 14

Slide 14 text

Features a) flltering b) feature extraction c) segmentation d) measures non-local means denoising total variation denoising blob detection super-pixels skeleton & local diameter particle properties segmentation from markers ridge detection

Slide 15

Slide 15 text

Visualizations using Mayavi

Slide 16

Slide 16 text

New features (Almost) All functions are 3D New algorithms: 3D skeletonization, compact watershed, thresholding methods . . . 3D viz in Jupyter notebook Wish list graph cut segmentation trainable segmentation (weka) your suggestions?

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Getting started: finding documentation

Slide 19

Slide 19 text

Gallery of examples

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Datasheet Package statistics http://scikit-image.org/ Release 0.13 (1 - 2 release per year) 30000 unique visitors/month Development model Mature algorithms Only Python + Cython code for easier maintainability Focus on good practices: testing, documentation, version control Hosted on GitHub: thorough code reivew + continuous integration Core team of 5 − 10 persons (close to applications)

Slide 22

Slide 22 text

We code when(ever) we can 00:00 06:00 12:00 18:00 24:00 0 100 200 300 400 500 600 Coding hours 0 200 400 600 800 1000 1200 1400 Number of commits per day Sun Sat Fri Thu Wed Tue Mon

Slide 23

Slide 23 text

2 4 6 8 10 12 Individual Committer 0.0 0.2 0.4 0.6 0.8 1.0 Commit rate Normalized commit rates since 2014-06 matplotlib numpy pandas scikit-image scikit-learn scipy A small number of core contributors Fernando Perez & Aaron Meurer Gist 5843625

Slide 24

Slide 24 text

Challenges for the future

Slide 25

Slide 25 text

Massive data processing and parallelization Competitive environment: some other tools use GPUs, Spark, etc. scikit-image uses NumPy! I/O: large images might not fit into memory use memory mapping of different file formats (raw binary with NumPy, hdf5 with pytables). Divide into blocks: use util.view as blocks to iterate conveniently over blocks Parallel processing: use joblib or dask Better integration desirable

Slide 26

Slide 26 text

Chunking data + easy parallel computation >>> from skimage import u t i l , data >>> im = data . camera () >>> im. shape (512 , 512) >>> # chunks of size 134 with 8-pixel overlap >>> chunks = u t i l . view as windows (im , (134 , 134) , s t e p =126) >>> chunks . shape (4, 4, 134, 134) >>> from j o b l i b import P a r a l l e l , d e l a y e d >>> f i l t e r e d c h u n k s = P a r a l l e l ( n j o b s =4)( d e l a y e d ( f i l t e r s . g a u s s i a n )( chunks [ i , j ]) f o r i in range (4) f o r j in range (4)) >>> f i l t e r e d i m = u t i l . a p p l y p a r a l l e l ( f i l t e r s . gaussian , im , depth =8) x

Slide 27

Slide 27 text

A platform to build an ecosystem upon Tool for users, platform for other tools $ apt-cache rdepends python-matplotlib ... 96 Python packages & applications Specific applications that could build on scikit-image Imaging techniques; microscopy, tomography, ... Fields: cell biology, astronomy, ... Requirements: stable API, good docs

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Try it out! http://scikit-image.org/ Next tutorials/trainings Scipy, Austin TX July 10-16 Euroscipy, Erlangen August 23-27 Please cite the paper(s) Let’s talk about scikit-image @EGouillart

Slide 30

Slide 30 text

Bibliography I Bouttes, D., Gouillart, E., Boller, E., Dalmas, D., and Vandembroucq, D. (2014). Fragmentation and limits to dynamical scaling in viscous coarsening: An interrupted in situ x-ray tomographic study. Physical review letters, 112(24):245701. Bouttes, D., Lambert, O., Claireaux, C., Woelffel, W., Dalmas, D., Gouillart, E., Lhuissier, P., Salvo, L., Boller, E., and Vandembroucq, D. (2015). Hydrodynamic coarsening in phase-separated silicate melts. Acta Materialia, 92:233–242. Buades, A., Coll, B., and Morel, J. (2005). A non-local algorithm for image denoising. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005, pages 60–65.

Slide 31

Slide 31 text

Bibliography II Gouillart, E., Toplis, M. J., Grynberg, J., Chopinet, M.-H., Sondergard, E., Salvo, L., Su´ ery, M., Di Michiel, M., and Varoquaux, G. (2012). In situ synchrotron microtomography reveals multiple reaction pathways during soda-lime glass synthesis. Journal of the American Ceramic Society, 95(5):1504–1507. Grady, L. (2006). Random walks for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1768–1783. Ramachandran, P. and Varoquaux, G. (2011). Mayavi: 3d visualization of scientific data. Computing in Science & Engineering, 13(2):40–51.

Slide 32

Slide 32 text

Goodies from the ecosystem joblib: easy simple parallel computing + lazy re-evaluation >>> from j o b l i b import P a r a l l e l , d e l a y e d >>> from math import s q r t >>> P a r a l l e l ( n j o b s =1)( d e l a y e d ( s q r t )( i **2) f o r i in range (10)) [0.0 , 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0] x Familiar with this mess? from skimage import f i l t e r s # Comment to save some time # filter_im = filters.median(im) # binary_im = filters. threshold_otsu (filter_im) v a l u e s = np. unique (im) x

Slide 33

Slide 33 text

Goodies from the ecosystem joblib: easy simple parallel computing + lazy re-evaluation Familiar with this mess? from skimage import f i l t e r s # Comment to save some time # filter_im = filters.median(im) # binary_im = filters. threshold_otsu (filter_im) v a l u e s = np. unique (im) x >>> from j o b l i b import Memory >>> mem = Memory( c a c h e d i r = ’ /tmp/ j o b l i b ’ ) >>> square = mem. cache (np. square ) >>> b = square (a) [Memory] C a l l i n g square ... square ( a r r a y ([[ 0., 0., 1.], [ 1., 1., 1.], [ 4., 2., 1.]])) s q u a r e - 0... s , 0.0 min >>> c = square (a) >>> # The above call did not trigger an evaluation x

Slide 34

Slide 34 text

A few tricks for large images I/O: tomography images might not fit into memory use memory mapping of different file formats (raw binary with NumPy, hdf5 with pytables) for input data, and intermediate results Divide into blocks: use util.view as blocks to iterate conveniently over blocks Use NumPy wisely: slicing over an array makes no copy

Slide 35

Slide 35 text

Filtering: transforming image data skimage.filters, skimage.exposure, skimage.restoration