Upgrade to Pro — share decks privately, control downloads, hide ads and more …

ICTMS Lund 2017

ICTMS Lund 2017

Emmanuelle Gouillart

June 20, 2017
Tweet

More Decks by Emmanuelle Gouillart

Other Decks in Research

Transcript

  1. Analyzing tomography data with Python and the scikit-image library Emmanuelle

    Gouillart joint Unit CNRS/Saint-Gobain SVI @EGouillart scikit-image team
  2. What is scikit-image? An open-source (BSD) generic image processing library

    for the Python language (and NumPy data arrays)
  3. What is scikit-image? An open-source (BSD) generic image processing library

    for the Python language (and NumPy data arrays) for 2D & 3D images simple API & gentle learning curve
  4. NumpPy: Python objects for numerical arrays Multi-dimensional numerical data container

    (based on compiled code) + utility functions to create/manipulate them >>> a = np.random. r a n d o m i n t e g e r s (0, 1, (2, 2, 2)) >>> a a r r a y ([[[0 , 1], [1, 0]], [[0, 0], [0, 1]]]) >>> a. shape , a. dtype ((2, 2, 2), dtype ( ’ int64 ’ )) x
  5. NumpPy: Python objects for numerical arrays Multi-dimensional numerical data container

    (based on compiled code) + utility functions to create/manipulate them >>> a = np.random. r a n d o m i n t e g e r s (0, 1, (2, 2, 2)) >>> a a r r a y ([[[0 , 1], [1, 0]], [[0, 0], [0, 1]]]) >>> a. shape , a. dtype ((2, 2, 2), dtype ( ’ int64 ’ )) x Efficient and versatile data access indexing and slicing fancy indexing
  6. First steps with scikit-image >>> from skimage import data ,

    measure >>> im = data . b i n a r y b l o b s (128 , n dim =3, v o l u m e f r a c t i o n =0.2) >>> im. shape # 3-D array (128 , 128, 128) >>> l a b e l s = measure . l a b e l (im) >>> type(im), type( l a b e l s ) # images are numpy arrays (<type ’numpy . ndarray ’ >, <type ’numpy . ndarray ’ >) >>> l a b e l s [0, 0, 0] # pixels are array elements 0 x Visualization with Mayavi [Ramachandran and Varoquaux, 2011] Images are NumPy arrays Pixels are array elements Simple API relying on functions new = function(image) Most functions work for 2D & 3D Optional parameters: keyword arguments
  7. The Scientific Python ecosystem 1 75 ... 32 ... 10

    3 ... 2 8 23 ... 36 ... 1 13 ... 0 12 43 ... 6 ... 66 ... 97 1 8 5 ... 73 ... 78 98 ... 9 4 10 ... 55 ... 10 3 ... 9 1 41 ... 98 ... 38 75 ... 6 ... visualization Development environment machine learning image properties object properties image file(s) array
  8. Denoising tomography images In-situ imaging of phase separation in silicate

    melts [Bouttes et al., 2014, Bouttes et al., 2015] Experiments: ID19 & ID16, ESRF From basic (generic) to advanced (specific) filters
  9. Denoising tomography images Histogram of pixel values From basic (generic)

    to advanced (specific) filters bilateral = restoration . denoise bilateral (dat) bilateral = restoration . denoise bilateral (dat, sigma range=2.5, sigma spatial=2) tv = restoration . denoise tv chambolle (dat, weight=0.5)
  10. Example: segmentation of low-constrast regions In-situ imaging of glass batch

    reactive melting [Gouillart et al., 2012] Non-local means denoising to preserve texture Histogram-based markers extraction Random walker segmentation Non-local means: average similar patches [Buades et al., 2005] Random walker:anisotropic diffusion from markers [Grady, 2006] Random walker less sensitive to noise than watershed, but slower
  11. Features a) flltering b) feature extraction c) segmentation d) measures

    non-local means denoising total variation denoising blob detection super-pixels skeleton & local diameter particle properties segmentation from markers ridge detection
  12. New features (Almost) All functions are 3D New algorithms: 3D

    skeletonization, compact watershed, thresholding methods . . . 3D viz in Jupyter notebook Wish list graph cut segmentation trainable segmentation (weka) your suggestions?
  13. Datasheet Package statistics http://scikit-image.org/ Release 0.13 (1 - 2 release

    per year) 30000 unique visitors/month Development model Mature algorithms Only Python + Cython code for easier maintainability Focus on good practices: testing, documentation, version control Hosted on GitHub: thorough code reivew + continuous integration Core team of 5 − 10 persons (close to applications)
  14. We code when(ever) we can 00:00 06:00 12:00 18:00 24:00

    0 100 200 300 400 500 600 Coding hours 0 200 400 600 800 1000 1200 1400 Number of commits per day Sun Sat Fri Thu Wed Tue Mon
  15. 2 4 6 8 10 12 Individual Committer 0.0 0.2

    0.4 0.6 0.8 1.0 Commit rate Normalized commit rates since 2014-06 matplotlib numpy pandas scikit-image scikit-learn scipy A small number of core contributors Fernando Perez & Aaron Meurer Gist 5843625
  16. Massive data processing and parallelization Competitive environment: some other tools

    use GPUs, Spark, etc. scikit-image uses NumPy! I/O: large images might not fit into memory use memory mapping of different file formats (raw binary with NumPy, hdf5 with pytables). Divide into blocks: use util.view as blocks to iterate conveniently over blocks Parallel processing: use joblib or dask Better integration desirable
  17. Chunking data + easy parallel computation >>> from skimage import

    u t i l , data >>> im = data . camera () >>> im. shape (512 , 512) >>> # chunks of size 134 with 8-pixel overlap >>> chunks = u t i l . view as windows (im , (134 , 134) , s t e p =126) >>> chunks . shape (4, 4, 134, 134) >>> from j o b l i b import P a r a l l e l , d e l a y e d >>> f i l t e r e d c h u n k s = P a r a l l e l ( n j o b s =4)( d e l a y e d ( f i l t e r s . g a u s s i a n )( chunks [ i , j ]) f o r i in range (4) f o r j in range (4)) >>> f i l t e r e d i m = u t i l . a p p l y p a r a l l e l ( f i l t e r s . gaussian , im , depth =8) x
  18. A platform to build an ecosystem upon Tool for users,

    platform for other tools $ apt-cache rdepends python-matplotlib ... 96 Python packages & applications Specific applications that could build on scikit-image Imaging techniques; microscopy, tomography, ... Fields: cell biology, astronomy, ... Requirements: stable API, good docs
  19. Try it out! http://scikit-image.org/ Next tutorials/trainings Scipy, Austin TX July

    10-16 Euroscipy, Erlangen August 23-27 Please cite the paper(s) Let’s talk about scikit-image @EGouillart
  20. Bibliography I Bouttes, D., Gouillart, E., Boller, E., Dalmas, D.,

    and Vandembroucq, D. (2014). Fragmentation and limits to dynamical scaling in viscous coarsening: An interrupted in situ x-ray tomographic study. Physical review letters, 112(24):245701. Bouttes, D., Lambert, O., Claireaux, C., Woelffel, W., Dalmas, D., Gouillart, E., Lhuissier, P., Salvo, L., Boller, E., and Vandembroucq, D. (2015). Hydrodynamic coarsening in phase-separated silicate melts. Acta Materialia, 92:233–242. Buades, A., Coll, B., and Morel, J. (2005). A non-local algorithm for image denoising. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005, pages 60–65.
  21. Bibliography II Gouillart, E., Toplis, M. J., Grynberg, J., Chopinet,

    M.-H., Sondergard, E., Salvo, L., Su´ ery, M., Di Michiel, M., and Varoquaux, G. (2012). In situ synchrotron microtomography reveals multiple reaction pathways during soda-lime glass synthesis. Journal of the American Ceramic Society, 95(5):1504–1507. Grady, L. (2006). Random walks for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1768–1783. Ramachandran, P. and Varoquaux, G. (2011). Mayavi: 3d visualization of scientific data. Computing in Science & Engineering, 13(2):40–51.
  22. Goodies from the ecosystem joblib: easy simple parallel computing +

    lazy re-evaluation >>> from j o b l i b import P a r a l l e l , d e l a y e d >>> from math import s q r t >>> P a r a l l e l ( n j o b s =1)( d e l a y e d ( s q r t )( i **2) f o r i in range (10)) [0.0 , 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0] x Familiar with this mess? from skimage import f i l t e r s # Comment to save some time # filter_im = filters.median(im) # binary_im = filters. threshold_otsu (filter_im) v a l u e s = np. unique (im) x
  23. Goodies from the ecosystem joblib: easy simple parallel computing +

    lazy re-evaluation Familiar with this mess? from skimage import f i l t e r s # Comment to save some time # filter_im = filters.median(im) # binary_im = filters. threshold_otsu (filter_im) v a l u e s = np. unique (im) x >>> from j o b l i b import Memory >>> mem = Memory( c a c h e d i r = ’ /tmp/ j o b l i b ’ ) >>> square = mem. cache (np. square ) >>> b = square (a) [Memory] C a l l i n g square ... square ( a r r a y ([[ 0., 0., 1.], [ 1., 1., 1.], [ 4., 2., 1.]])) s q u a r e - 0... s , 0.0 min >>> c = square (a) >>> # The above call did not trigger an evaluation x
  24. A few tricks for large images I/O: tomography images might

    not fit into memory use memory mapping of different file formats (raw binary with NumPy, hdf5 with pytables) for input data, and intermediate results Divide into blocks: use util.view as blocks to iterate conveniently over blocks Use NumPy wisely: slicing over an array makes no copy