Slide 1

Slide 1 text

Clustering Lightning
Andrew B. Collier
[email protected]
http://www.exegetic.biz/

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Clustering & Complexity

k-means
● Time: O(nk) per iteration
● Space: O(n+k)

Hierarchical
● Time: O(n^2 log n)
● Space: O(n^2)

where
n = number of points
k = number of clusters
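To make the scaling concrete, here is a rough sketch that counts distance evaluations: one k-means assignment pass compares every point with every centroid, while hierarchical clustering needs a distance for every pair of points. The function names and the choice of k = 10 are illustrative assumptions, not taken from the talk.

```python
def kmeans_distance_evaluations(n, k):
    """Distance evaluations in one k-means assignment pass (each point vs each centroid)."""
    return n * k


def pairwise_distance_entries(n):
    """Distinct entries in a symmetric n x n distance matrix (diagonal excluded)."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    k = 10  # hypothetical number of clusters
    for n in (1_000, 10_000, 100_000):
        print(n, kmeans_distance_evaluations(n, k), pairwise_distance_entries(n))
```

The pairwise count grows quadratically with n, which is why the distance matrix dominates both time and memory for large lightning data sets.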

Slide 6

Slide 6 text

Hierarchical Clustering
A method of cluster analysis which tries to build a hierarchy of clusters.
Agglomerative: each observation starts in its own cluster, and pairs of clusters are merged.
Divisive: all observations start in one cluster, and splits are performed recursively.
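The agglomerative variant can be sketched in a few lines. This is a naive illustration under stated assumptions: Euclidean distance, single linkage (distance between clusters is the distance between their closest members), and a hypothetical stopping rule of a fixed number of clusters. The talk does not say which linkage was used, and the divisive variant is not shown.

```python
from math import dist


def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering.

    Returns the final clusters (lists of point indices) and the
    distance at which each merge happened.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merge_heights = []
    while len(clusters) > n_clusters:
        best = None  # (distance, index a, index b) of the closest pair of clusters
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merge_heights.append(d)
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters, merge_heights


# Hypothetical points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters, merge_heights = agglomerative(points, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

This brute-force version recomputes cluster distances on every pass, so it is slower than the O(n^2 log n) quoted for efficient implementations; it is meant only to show the merge logic.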

Slide 7

Slide 7 text

Distance Matrix

          [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]    [,10]
 [1,]  0.00000  0.90660  0.98676  2.59730  1.21076  0.64162  1.37218  1.83724  1.33919  1.21728
 [2,]  0.90660  0.00000  1.54973  1.81014  0.51097  1.22292  0.76644  1.29365  0.46367  1.37455
 [3,]  0.98676  1.54973  0.00000  2.69411  2.01335  1.56353  1.51223  1.73953  1.80822  0.61954
 [4,]  2.59730  1.81014  2.69411  0.00000  1.97469  3.03009  1.24766  0.97643  1.35618  2.12959
 [5,]  1.21076  0.51097  2.01335  1.97469  0.00000  1.27080  1.18563  1.68454  0.69746  1.88420
 [6,]  0.64162  1.22292  1.56353  3.03009  1.27080  0.00000  1.88041  2.38623  1.68528  1.85847
 [7,]  1.37218  0.76644  1.51223  1.24766  1.18563  1.88041  0.00000  0.52860  0.53994  1.04872
 [8,]  1.83724  1.29365  1.73953  0.97643  1.68454  2.38623  0.52860  0.00000  0.99706  1.15659
 [9,]  1.33919  0.46367  1.80822  1.35618  0.69746  1.68528  0.53994  0.99706  0.00000  1.47596
[10,]  1.21728  1.37455  0.61954  2.12959  1.88420  1.85847  1.04872  1.15659  1.47596  0.00000
[11,] 11.70194 11.19894 11.09504  9.49063 11.45953 12.30475 10.45600  9.93351 10.79570 10.57239
[12,] 11.01346 10.37439 10.57979  8.58636 10.55138 11.55993  9.67963  9.18187  9.93618  9.99942
[13,] 10.15958  9.63720  9.59680  7.92908  9.89656 10.75400  8.89718  8.37645  9.23239  9.05385
[14,]  8.65864  7.98640  8.31356  6.19131  8.15378  9.18362  7.30701  6.82147  7.54379  7.71149
[15,] 11.16103 10.79475 10.43696  9.21898 11.12921 11.79252 10.03042  9.50193 10.43790  9.97271
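In the matrix on this slide, observations 1 to 10 lie within about 3 units of each other, while rows 11 to 15 are roughly 6 to 12 units away from them, which already hints at two groups. A matrix like this can be built as sketched below; the points are made up for illustration and Euclidean distance is an assumption, since the slide does not state which measure produced its values.

```python
from math import dist


def distance_matrix(points):
    """Symmetric matrix of pairwise Euclidean distances (zeros on the diagonal)."""
    n = len(points)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = dist(points[i], points[j])  # fill both triangles
    return D


# Hypothetical points: two close together, one far away.
sample = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]
D = distance_matrix(sample)
for row in D:
    print(" ".join(f"{value:9.5f}" for value in row))
```

Only the upper triangle is computed; the lower triangle is filled by symmetry, which halves the work but not the O(n^2) storage.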

Slide 8

Slide 8 text

Distance Measures
● Euclidean Distance (Pythagoras' Theorem)
● Geographical Distance (“great circle”)
  – Cosine
  – Haversine
  – Vincenty Sphere
  – Vincenty Ellipsoid
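Two of the great-circle measures listed here are short enough to sketch directly. The code below implements the haversine formula and the spherical law of cosines on a sphere; the mean Earth radius of 6371 km and the function names are assumptions for illustration, and the Vincenty variants are not shown.

```python
from math import radians, sin, cos, asin, acos, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance via the haversine formula (inputs in degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * radius * asin(min(1.0, sqrt(a)))  # clamp guards against rounding


def spherical_cosine(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance via the spherical law of cosines (inputs in degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dlam = radians(lon2 - lon1)
    c = sin(phi1) * sin(phi2) + cos(phi1) * cos(phi2) * cos(dlam)
    return radius * acos(max(-1.0, min(1.0, c)))  # clamp guards against rounding


# Hypothetical pair of locations (latitude, longitude in degrees).
print(haversine(-26.0, 28.0, -34.0, 18.5))
print(spherical_cosine(-26.0, 28.0, -34.0, 18.5))
```

Both functions agree on a sphere; the haversine form is usually preferred for very short separations because the law of cosines loses precision when the two points nearly coincide.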

Slide 9

Slide 9 text

Big steps (SIGNIFICANT)
Small steps (INSIGNIFICANT)
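One plausible reading of this slide is that large jumps between successive merge distances in the hierarchy mark meaningful separations, while small jumps do not. Under that assumption, a simple heuristic is to cut the hierarchy inside the largest jump. The function name and the merge distances below are hypothetical.

```python
def clusters_from_largest_gap(merge_heights):
    """Number of clusters obtained by cutting inside the largest jump.

    merge_heights: the n - 1 merge distances for n observations,
    in ascending order (at least two values are needed).
    """
    n = len(merge_heights) + 1
    gaps = [later - earlier
            for earlier, later in zip(merge_heights, merge_heights[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)  # position of the biggest step
    # Merges 0..i happen below the cut, so i + 1 merges have been applied.
    return n - (i + 1)


# Hypothetical merge distances: four small steps, then one big one.
heights = [0.5, 0.6, 0.7, 0.9, 6.2]
n_found = clusters_from_largest_gap(heights)
print(n_found)
```

This is only a heuristic: it ignores everything except the merge distances, which is where domain knowledge about storms can do better.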

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

● Minimum between-cluster distance
● Maximum within-cluster distance
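The two quantities on this slide can be computed directly from a labelled set of points. The sketch below assumes Euclidean distance and uses made-up points and labels; the ratio of the two values is known as the Dunn index, though the talk does not say whether that ratio was used.

```python
from itertools import combinations
from math import dist


def min_between_cluster_distance(points, labels):
    """Smallest distance between two points that belong to different clusters."""
    return min(dist(p, q)
               for (p, a), (q, b) in combinations(zip(points, labels), 2)
               if a != b)


def max_within_cluster_distance(points, labels):
    """Largest distance between two points that share a cluster (0.0 if none do)."""
    return max((dist(p, q)
                for (p, a), (q, b) in combinations(zip(points, labels), 2)
                if a == b),
               default=0.0)


# Hypothetical points and cluster labels.
pts = [(0, 0), (0, 1), (5, 0), (5, 1)]
lab = [0, 0, 1, 1]
between = min_between_cluster_distance(pts, lab)
within = max_within_cluster_distance(pts, lab)
print(between, within, between / within)
```

A large between-cluster distance combined with a small within-cluster distance indicates compact, well-separated clusters.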

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

We could just take a statistical approach...
... but why ignore domain-specific knowledge?

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

Conclusion
● Isolated storms are easily identified
● Clustering is not as easy as it looks
● Need to use other information

Slide 21

Slide 21 text

Strauss, C., Rosa, M. B., & Stephany, S. (2013). Spatio-temporal clustering and density estimation of lightning data for the tracking of convective events. Atmospheric Research, 134, 87–99. doi:10.1016/j.atmosres.2013.07.008.

Slide 22

Slide 22 text

Kernel Density & Spatio-Temporal Clustering

Slide 23

Slide 23 text

Why is this important?
To gain a better understanding of
● spatial and
● temporal
distribution of lightning within a storm we need to actually isolate individual storms.
Tracking convective events in countries that lack weather radar coverage.
http://www.wallconvert.com/

Slide 24

Slide 24 text

Strauss, C., Rosa, M. B., & Stephany, S. (2013). Spatio-temporal clustering and density estimation of lightning data for the tracking of convective events. Atmospheric Research, 134, 87–99. doi:10.1016/j.atmosres.2013.07.008.
black = precipitation
grey = lightning

Slide 25

Slide 25 text

Yair, Y. Y., Aviv, R., & Ravid, G. (2009). Clustering and synchronization of lightning flashes in adjacent thunderstorm cells from lightning location networks data. Journal of Geophysical Research, 114, D09210. doi:10.1029/2008JD010738.