Clustering Lightning into Storms

datawookie
September 12, 2013

A short talk that I gave at the LIGHTS 2013 Conference (Johannesburg, 12 September 2013). The slides are a little short on text because I like the audience to hear the content rather than read it. The central message is that clustering lightning discharges into storms is not a trivial task. It is a worthwhile challenge, though, because it can lead to some very interesting science!

Transcript

  1. Clustering Lightning Andrew B. Collier andrew@exegetic.biz http://www.exegetic.biz/

  2. None
  3. None
  4. None
  5. Clustering & Complexity
    k-means • Time: O(nk) • Space: O(n+k)
    Hierarchical • Time: O(n² log n) • Space: O(n²)
    where n = number of points, k = number of clusters
  6. Hierarchical Clustering
    A method of cluster analysis which tries to build a hierarchy of clusters.
    Agglomerative: each observation starts in its own cluster, and pairs of clusters are merged.
    Divisive: all observations start in one cluster, and splits are performed recursively.
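
A minimal sketch of the agglomerative variant described on this slide, in plain Python. The single-linkage rule (distance between the closest members), the function name `agglomerate` and the toy points are assumptions made for illustration; the slides do not say which linkage was used in the talk.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerate(points, n_clusters):
    """Naive agglomerative clustering with single linkage (illustrative only)."""
    # Each observation starts in its own cluster (stored as lists of indices).
    clusters = [[i] for i in range(len(points))]
    merges = []  # (merge distance, members of the merged cluster)

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]
        merges.append((d, sorted(merged)))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


# Hypothetical data: three nearby points and two distant ones.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
clusters, merges = agglomerate(points, n_clusters=2)
result = sorted(sorted(c) for c in clusters)
print(result)
```

This brute-force version recomputes every linkage on each pass, so it is far slower than the O(n² log n) quoted on the previous slide; it is only meant to make the merge procedure concrete.
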
  7. Distance Matrix

              [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]    [,10]
     [1,]  0.00000  0.90660  0.98676  2.59730  1.21076  0.64162  1.37218  1.83724  1.33919  1.21728
     [2,]  0.90660  0.00000  1.54973  1.81014  0.51097  1.22292  0.76644  1.29365  0.46367  1.37455
     [3,]  0.98676  1.54973  0.00000  2.69411  2.01335  1.56353  1.51223  1.73953  1.80822  0.61954
     [4,]  2.59730  1.81014  2.69411  0.00000  1.97469  3.03009  1.24766  0.97643  1.35618  2.12959
     [5,]  1.21076  0.51097  2.01335  1.97469  0.00000  1.27080  1.18563  1.68454  0.69746  1.88420
     [6,]  0.64162  1.22292  1.56353  3.03009  1.27080  0.00000  1.88041  2.38623  1.68528  1.85847
     [7,]  1.37218  0.76644  1.51223  1.24766  1.18563  1.88041  0.00000  0.52860  0.53994  1.04872
     [8,]  1.83724  1.29365  1.73953  0.97643  1.68454  2.38623  0.52860  0.00000  0.99706  1.15659
     [9,]  1.33919  0.46367  1.80822  1.35618  0.69746  1.68528  0.53994  0.99706  0.00000  1.47596
    [10,]  1.21728  1.37455  0.61954  2.12959  1.88420  1.85847  1.04872  1.15659  1.47596  0.00000
    [11,] 11.70194 11.19894 11.09504  9.49063 11.45953 12.30475 10.45600  9.93351 10.79570 10.57239
    [12,] 11.01346 10.37439 10.57979  8.58636 10.55138 11.55993  9.67963  9.18187  9.93618  9.99942
    [13,] 10.15958  9.63720  9.59680  7.92908  9.89656 10.75400  8.89718  8.37645  9.23239  9.05385
    [14,]  8.65864  7.98640  8.31356  6.19131  8.15378  9.18362  7.30701  6.82147  7.54379  7.71149
    [15,] 11.16103 10.79475 10.43696  9.21898 11.12921 11.79252 10.03042  9.50193 10.43790  9.97271
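
A matrix like the one on this slide can be built from any set of points by evaluating a distance function for every pair. The sketch below assumes plain Euclidean distance and uses made-up coordinates; the points behind the slide's matrix are not given in the transcript.

```python
from math import dist  # Euclidean distance, Python 3.8+


def distance_matrix(points):
    """Full symmetric matrix of pairwise Euclidean distances."""
    return [[dist(p, q) for q in points] for p in points]


# Hypothetical points, chosen so the distances are easy to check by hand.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = distance_matrix(points)

for row in matrix:
    print(" ".join(f"{value:8.5f}" for value in row))
```

The matrix is symmetric with zeros on the diagonal, which is why hierarchical clustering needs O(n²) space: every pair of points has an entry.
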
  8. Distance Measures
    • Euclidean Distance (Pythagoras' Theorem)
    • Geographical Distance (“great circle”)
      – Cosine – Haversine – Vincenty Sphere – Vincenty Ellipsoid
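
Of the great-circle measures listed, the haversine formula is the simplest to write down. The sketch below assumes a spherical Earth with a mean radius of 6371 km (a value not taken from the slides), so it will differ slightly from an ellipsoidal method such as Vincenty's.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# One degree of longitude on the equator and at 60 degrees latitude.
equator_km = haversine(0.0, 0.0, 0.0, 1.0)
lat60_km = haversine(60.0, 0.0, 60.0, 1.0)
print(round(equator_km, 1), round(lat60_km, 1))
```

The same one-degree step in longitude covers roughly half the distance at 60° latitude that it does at the equator, which is why treating latitude and longitude as Euclidean coordinates distorts distances between lightning discharges.
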
  9. Big steps (SIGNIFICANT) Small steps (INSIGNIFICANT)
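
One possible reading of this slide, assuming it refers to the heights at which clusters merge in a dendrogram: small increases between successive merges join points that belong together, while a big jump signals that two genuinely separate groups are being forced together. The hypothetical helper below cuts the hierarchy just before the largest jump.

```python
def clusters_before_largest_step(merge_heights, n_points):
    """Number of clusters left if merging stops before the biggest jump in height."""
    heights = sorted(merge_heights)
    # Step size between each merge and the next one.
    steps = [b - a for a, b in zip(heights, heights[1:])]
    # Index of the last merge performed before the largest step.
    last_kept = max(range(len(steps)), key=steps.__getitem__)
    # Every merge reduces the number of clusters by one.
    return n_points - (last_kept + 1)


# Hypothetical merge heights for five points: three small steps, one big one.
n_storms = clusters_before_largest_step([1.0, 1.0, 1.0, 12.7], n_points=5)
print(n_storms)
```
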

  10. None
  11. • Minimum between-cluster distance • Maximum within-cluster distance
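
The two quantities on this slide can be computed directly for any candidate partition. The sketch below assumes Euclidean distance and hypothetical clusters; a good partition has a large minimum between-cluster distance relative to its maximum within-cluster distance (their ratio is known as the Dunn index).

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def min_between(clusters):
    """Smallest distance between points belonging to different clusters."""
    return min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )


def max_within(clusters):
    """Largest distance between two points of the same cluster."""
    return max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )


# Hypothetical partition into two well-separated clusters.
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
separation = min_between(clusters)
diameter = max_within(clusters)
print(separation, diameter, separation / diameter)
```
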

  12. None
  13. We could just take a statistical approach... but why ignore domain-specific knowledge?
  14. None
  15. None
  16. None
  17. None
  18. None
  19. None
  20. Conclusion • Isolated storms easily identified • Clustering not as easy as it looks • Need to use other information
  21. Strauss, C., Rosa, M. B., & Stephany, S. (2013). Spatio-temporal clustering and density estimation of lightning data for the tracking of convective events. Atmospheric Research, 134, 87–99. doi:10.1016/j.atmosres.2013.07.008.
  22. Kernel Density & Spatio-Temporal Clustering

  23. Why is this important? To gain a better understanding of the • spatial and • temporal distribution of lightning within a storm, we need to actually isolate individual storms. http://www.wallconvert.com/ Tracking convective events in countries that lack weather radar coverage.
  24. Strauss, C., Rosa, M. B., & Stephany, S. (2013). Spatio-temporal clustering and density estimation of lightning data for the tracking of convective events. Atmospheric Research, 134, 87–99. doi:10.1016/j.atmosres.2013.07.008. black = precipitation, grey = lightning
  25. Yair, Y. Y., Aviv, R., & Ravid, G. (2009). Clustering and synchronization of lightning flashes in adjacent thunderstorm cells from lightning location networks data. Journal of Geophysical Research, 114, D09210. doi:10.1029/2008JD010738.