PopChange and geostatistical ways of looking at segregation

Slide 1

Slide 1 text

PopChange and geostatistical ways of looking at segregation
Nick Bearman
Project team: Chris Lloyd, Gemma Catney and Paul Williamson, University of Liverpool, UK
Email: [email protected] / [email protected]
@nickbearmanuk #RMF18
ESRC Research Methods Festival, Bath, 4th July 2018

Slide 2

Slide 2 text

Outline
1. Population Change and Geographic Inequalities in the UK, 1971-2011: ESRC project outline
2. Creating population surfaces
3. Geostatistical analysis of deprivation

Slide 3

Slide 3 text

PopChange project outline
• Identification of comparable variables from the UK Censuses of 1971, 1981, 1991, 2001 and 2011
• Creation of population surfaces for Britain for all comparable variables (1km cells nationally and 100m cells for urban areas; in Northern Ireland grid square counts for 1971-2011 are already available)
• Provision of population surfaces, R code to manipulate the data, and an online atlas of population change

Slide 4

Slide 4 text

Creating population surfaces
• Select comparable variables for 1971, 1981, 1991, 2001 and 2011
• Create a 1km intensity grid using postcodes
• Overlay enumeration districts or output areas with the 1km postcode intensity grid
• Use areal weighting to estimate the population of each overlapping area, with postcode intensities as weights
• Aggregate counts within grid cells
• Smooth grid cells to make neighbouring cells similar
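The weighting and aggregation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code (which the slides say is written in R): the input layout, the function name `zones_to_grid` and the toy numbers are all assumptions, and the final smoothing step is left out.

```python
from collections import defaultdict


def zones_to_grid(zone_counts, overlaps):
    """Redistribute zone counts to grid cells using postcode weights.

    zone_counts: {zone_id: population count} for enumeration districts
                 or output areas.
    overlaps:    (zone_id, cell_id, postcode_weight) tuples, one per
                 zone/cell intersection (hypothetical input layout).
    """
    # Total postcode weight inside each zone, used to normalise shares.
    zone_weight = defaultdict(float)
    for zone_id, _cell_id, weight in overlaps:
        zone_weight[zone_id] += weight

    grid = defaultdict(float)
    for zone_id, cell_id, weight in overlaps:
        total = zone_weight[zone_id]
        if total == 0:
            continue  # zone without postcodes: not allocated in this sketch
        # Each overlap receives the zone count in proportion to its weight;
        # summing into grid[cell_id] is the aggregation step.
        grid[cell_id] += zone_counts[zone_id] * weight / total
    return dict(grid)


# Toy data: two zones overlapping three 1km cells.
zone_counts = {"ED1": 100.0, "ED2": 60.0}
overlaps = [
    ("ED1", "cellA", 3.0),
    ("ED1", "cellB", 1.0),
    ("ED2", "cellB", 2.0),
    ("ED2", "cellC", 2.0),
]
grid = zones_to_grid(zone_counts, overlaps)
```

In this toy case the total population is preserved: the 160 persons in the two zones are spread over the three cells.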

Slide 5

Slide 5 text

Gridded data
• Benefits are that all units are of the same size and shape, which makes it easier to assess scale effects without the need to account for zones whose size and shape differ
• With grid cells, there are holes where there are no people; this is conceptually more sensible than zones (e.g., output areas or wards) which cover the land area completely

Slide 6

Slide 6 text

Population change 1971-2011
• Population surfaces generated for 1971 and 2011 for
  - Total persons
  - Unemployed persons (% of employed and unemployed)
  - Non owner occupied households (%)
  - Households without access to a car or van (%)
  - Households with more than one person per room (%)
• From the latter four, z scores were derived ((percentage - mean) / standard deviation) and these were summed to derive a deprivation score (following Townsend)
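The score construction described above can be sketched as follows. This is an illustrative sketch only: the function names and the toy percentages are made up, and the use of the population standard deviation (`pstdev`) is an assumption, since the slide does not say which form was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """(percentage - mean) / standard deviation for each grid cell."""
    m = mean(values)
    s = pstdev(values)  # assumption: population standard deviation
    return [(v - m) / s for v in values]


def townsend_scores(unemployed, non_owner, no_car, overcrowded):
    """Sum the four z scores cell by cell to get a deprivation score."""
    columns = [z_scores(col) for col in (unemployed, non_owner, no_car, overcrowded)]
    return [sum(cell) for cell in zip(*columns)]


# Toy percentages for three grid cells (illustrative values only).
scores = townsend_scores(
    unemployed=[2.0, 4.0, 6.0],
    non_owner=[20.0, 30.0, 40.0],
    no_car=[10.0, 25.0, 40.0],
    overcrowded=[1.0, 2.0, 3.0],
)
```

Because each z score is centred on zero, the summed scores are also centred on zero: positive values indicate more deprived cells, negative values less deprived cells.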

Slide 7

Slide 7 text

https://popchange.liverpool.ac.uk

Slide 8

Slide 8 text

Population change 1971-2011
• Gridded counts and difference maps (2011-1971)
  - Total persons, Unemployed persons (%), Townsend score
• Analysis of population spatial distribution using index of dissimilarity and Moran’s I autocorrelation coefficient
• Correlations between counts/percentages/scores for 1971 and 2011
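As a rough illustration of the Moran's I coefficient mentioned above, the sketch below computes it for a small grid. The slides do not state which spatial weights were used, so the rook (edge-sharing) neighbours with weight 1 are an assumption, as are the function name and the two toy grids.

```python
def morans_i(grid):
    """Moran's I for a 2D list of values, rook neighbours, binary weights."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    m = sum(values) / n
    denom = sum((v - m) ** 2 for v in values)

    num = 0.0      # sum of w_ij * (x_i - mean) * (x_j - mean)
    w_total = 0.0  # sum of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += (grid[r][c] - m) * (grid[rr][cc] - m)
                    w_total += 1.0
    return (n / w_total) * (num / denom)


# Similar values side by side: positive autocorrelation.
clustered = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
# Every neighbour differs: negative autocorrelation.
checkerboard = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
i_clustered = morans_i(clustered)
i_checker = morans_i(checkerboard)
```

Values near +1 mean neighbouring cells are alike, values near -1 mean neighbouring cells differ, and values near 0 suggest no spatial pattern.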

Slide 9

Slide 9 text

Total persons in 2011
Total persons 2011-1971

Slide 10

Slide 10 text

Unemployed persons (%) in 2011
Unemployed persons 2011-1971

Slide 11

Slide 11 text

Townsend score in 2011
Townsend score 2011-1971

Slide 12

Slide 12 text

Unevenness: 1971-2011
Index of dissimilarity for Townsend input counts
Example: unevenness in owner occupation reduced from 1981 to 2011, partly a function of the ‘right to buy’ scheme, resulting in mixed tenures in areas formerly dominated by social housing

Index of dissimilarity, D (1 = uneven, 0 = % identical)

Year  Unemployed  Non owner occupied  No car or van access  Overcrowded
1971  0.22        0.40                0.29                  0.32
1981  0.26        0.41                0.30                  0.36
1991  0.25        0.36                0.31                  0.33
2001  0.25        0.33                0.31                  0.38
2011  0.22        0.32                0.32                  0.40

Cells with > 0.5 persons/HH for all 4 variables for specific year
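For readers unfamiliar with D, a minimal sketch of the standard two-group index of dissimilarity is given below. The function name and the toy counts are illustrative, and the pairing of each group with its complement (e.g. unemployed against employed) is an assumption about how the table was produced.

```python
def dissimilarity(group, rest):
    """Index of dissimilarity D between two sets of counts per grid cell.

    group: counts of the group of interest in each cell (e.g. unemployed)
    rest:  counts of the comparison group in each cell (e.g. employed)
    """
    total_group = sum(group)
    total_rest = sum(rest)
    return 0.5 * sum(
        abs(g / total_group - r / total_rest) for g, r in zip(group, rest)
    )


# Same share of the group in every cell: percentages identical, D = 0.
d_even = dissimilarity([10, 20, 30], [90, 180, 270])
# The two groups never share a cell: completely uneven, D = 1.
d_uneven = dissimilarity([50, 0], [0, 50])
```

D can be read as the proportion of one group that would have to move between cells for the two distributions to match.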

Slide 13

Slide 13 text

Variograms
Most multi-scale analyses of segregation are based on
• An a priori idea of the scales we are interested in
  - e.g., spatial segregation measures for bandwidths (neighbourhood) of 500m and 5km
• Nested hierarchy
  - e.g., output area > middle layer super output area > local authorities
Using variograms (part of geostatistics),
• the spatial scales of variation are determined from the data, and
• the range parameter(s) of a model fitted to the variogram provide information on the dominant scale(s) of spatial variation.
Variograms are a multi-scale measure of the clustering domain of segregation

Slide 14

Slide 14 text

Characterising spatial structure
z(x_i) is a percentage or score at location x_i
p(h) is the number of data pairs separated by the lag (distance and direction) h

$$\hat{\gamma}(\mathbf{h}) = \frac{1}{2p(\mathbf{h})} \sum_{i=1}^{p(\mathbf{h})} \left\{ z(\mathbf{x}_i) - z(\mathbf{x}_i + \mathbf{h}) \right\}^2$$

Variogram: spatial dependence at different spatial scales
1) For each pair of data points
   a) Store the squared difference in the value
   b) And the spatial distance between them
2) Group these value differences into distance bins
   a) all squared differences for pairs separated by 1-2km, 2-3km, ...
   b) compute half of the average of these differences
3) Plot (half) average differences (value) against distances
4) The plot shows how the difference between values changes as a function of distance
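The four steps above translate fairly directly into code. The sketch below is a minimal, unoptimised illustration: the function name, the `(x, y, z)` input layout and the toy points are assumptions, and it bins by distance only (an omnidirectional variogram), whereas the lag h on the slide also carries a direction.

```python
import math
from collections import defaultdict


def empirical_variogram(points, bin_width):
    """Return (mean distance, semivariance, pair count) per distance bin.

    points:    list of (x, y, z) tuples, z being the percentage or score
    bin_width: width of each distance bin, in the same units as x and y
    """
    sq_sum = defaultdict(float)    # sum of squared differences per bin
    dist_sum = defaultdict(float)  # sum of pair distances per bin
    n_pairs = defaultdict(int)     # p(h): number of pairs per bin

    # Step 1: squared difference and distance for every pair of points.
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            # Step 2a: bin b covers distances from b*width up to (b+1)*width.
            b = int(d // bin_width)
            sq_sum[b] += (zi - zj) ** 2
            dist_sum[b] += d
            n_pairs[b] += 1

    # Step 2b: half of the average squared difference in each bin.
    return [
        (dist_sum[b] / n_pairs[b], sq_sum[b] / (2 * n_pairs[b]), n_pairs[b])
        for b in sorted(n_pairs)
    ]


# Toy transect: four points 1 unit apart whose value rises steadily.
points = [(0, 0, 0.0), (1, 0, 1.0), (2, 0, 2.0), (3, 0, 3.0)]
variogram = empirical_variogram(points, bin_width=1.0)
```

Plotting the second element of each tuple against the first gives steps 3 and 4. In this toy transect the semivariance keeps growing with distance because the values follow a steady trend rather than levelling off at a sill.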

Slide 15

Slide 15 text

Provides a composite measure of clustering and polarisation: a small nugget indicates localised clustering; with a large sill this indicates polarisation
[Figure: variogram model. Axes: value differences against distance. Range(s) marked: intra-city and inter-city]

Slide 16

Slide 16 text

Variograms
Simulated surfaces: spherical model with range (a) = 2 and 40.
• Short range = marked differences in neighbouring areas (intra-city) (limiting long-term illness (LLTI), variation over small space)
• Long range = neighbouring areas are similar (inter-city) (ethnicity, variation over large spaces)
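To make the contrast between the two ranges concrete, the sketch below evaluates the standard spherical variogram model at a few lags. The unit sill, the absence of a nugget and the chosen lags are assumptions for illustration; only the ranges 2 and 40 come from the slide.

```python
def spherical(h, c, a):
    """Spherical model: structured component c, range a, lag distance h."""
    if h >= a:
        return c  # beyond the range the model stays at its sill
    ratio = h / a
    return c * (1.5 * ratio - 0.5 * ratio ** 3)


lags = [1, 2, 10, 40]
# Assumed unit sill (c = 1) for both simulated surfaces.
short_range = [spherical(h, 1.0, 2.0) for h in lags]
long_range = [spherical(h, 1.0, 40.0) for h in lags]
```

With a = 2 the model reaches its sill within two distance units, so values only a few cells apart are already as different as any pair. With a = 40 the semivariance at the same lags is still small, which is why neighbouring areas on that surface look similar.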

Slide 17

Slide 17 text

Variograms for Townsend score: 1971-2011
Decreased variation in 2011 but less spatial continuity in scores over regional scales (smaller ranges)

Year  Nugget (c0)  Str. comp. 1 (c1)  Range 1 (a1)  Model 1 type  Str. comp. 2 (c2)  Range 2 (a2)  Model 2 type  Nugget/Sill
1971  2.406        2.080              11299.7       sph.          1.154              67960.8       sph.          0.427
1981  2.559        1.527              10452.1       sph.          0.859              42066.4       sph.          0.517
1991  2.537        1.575              10309.6       sph.          1.127              46083.3       sph.          0.484
2001  2.156        1.428              9956.55       sph.          1.158              42036         sph.          0.455
2011  1.744        1.354              10066.1       sph.          1.271              44917.1       sph.          0.399
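The coefficients in the table can be combined into a nested model: a nugget plus two spherical components. The sketch below uses hypothetical function and variable names and assumes the standard spherical form. It evaluates that model and recomputes the Nugget/Sill column as c0 / (c0 + c1 + c2), which reproduces the tabulated ratios to three decimal places.

```python
def spherical(h, c, a):
    """Spherical component with structured variance c and range a."""
    if h >= a:
        return c
    ratio = h / a
    return c * (1.5 * ratio - 0.5 * ratio ** 3)


def fitted_model(h, c0, c1, a1, c2, a2):
    """Nugget plus two nested spherical components, evaluated at lag h."""
    if h == 0:
        return 0.0  # by definition the variogram is zero at zero lag
    return c0 + spherical(h, c1, a1) + spherical(h, c2, a2)


# (c0, c1, a1, c2, a2) for the Townsend score, copied from the table.
coefficients = {
    1971: (2.406, 2.080, 11299.7, 1.154, 67960.8),
    1981: (2.559, 1.527, 10452.1, 0.859, 42066.4),
    1991: (2.537, 1.575, 10309.6, 1.127, 46083.3),
    2001: (2.156, 1.428, 9956.55, 1.158, 42036.0),
    2011: (1.744, 1.354, 10066.1, 1.271, 44917.1),
}

total_sill = {}
nugget_ratio = {}
for year, (c0, c1, a1, c2, a2) in coefficients.items():
    total_sill[year] = c0 + c1 + c2
    nugget_ratio[year] = round(c0 / total_sill[year], 3)

# Beyond the longer range the model levels off at the total sill.
plateau_1971 = fitted_model(100000.0, *coefficients[1971])
```

The total sill falls from about 5.64 in 1971 to about 4.37 in 2011, which is the "decreased variation" noted above, while the second range drops from roughly 68,000 to roughly 45,000.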

Slide 18

Slide 18 text

Variograms for Townsend input variables: 1971 and 2011
Note the presence of two ranges highlighting major scales of spatial variation – these correspond to variation across (short range) and between (large range) urban areas
NB. Inputs are log-ratio transformed percentages

Slide 19

Slide 19 text

Variogram model coefficients for Townsend input variables: 1971 and 2011

Variable, year            Nugget (c0)  Str. comp. 1 (c1)  Range 1 (a1) m  Model 1 type  Str. comp. 2 (c2)  Range 2 (a2) m  Model 2 type  Nugget/Sill
Unemployment, 1971        0.130        0.069              11548.1         sph.          0.109              117367          sph.          0.422
Unemployment, 2011        0.154        0.041              11836.6         sph.          0.039              110811          sph.          0.658
Non owner occupied, 1971  0.179        0.115              10815           sph.          0.045              53527.2         sph.          0.528
Non owner occupied, 2011  0.109        0.061              10152.7         sph.          0.030              47993.5         sph.          0.545
No car or van, 1971       0.043        0.044              11427.2         sph.          0.031              56772.5         sph.          0.364
No car or van, 2011       0.112        0.036              9947.9          sph.          0.042              40713.8         sph.          0.589
Overcrowded, 1971         2.578        1.132              11623.9         sph.          1.303              110897          sph.          0.514
Overcrowded, 2011         0.410        0.145              11211.2         sph.          0.161              84762.8         sph.          0.573

Slide 20

Slide 20 text

Variogram model coefficients for Townsend input variables: 1971 and 2011

Unemployment: spatial distribution is similar, but the magnitude of variation was greater in 1971 than in 2011 – the places with large and small rates are similar, but the differences between places have reduced.

Tenure: again, spatial distribution is similar (slightly less spatially continuous), but the magnitude of variation was greater in 1971 than in 2011 – the places with large and small rates are similar, but the differences between places have reduced.

Car or van access: increased short range spatial variation – this suggests that there is more variation over relatively short distances, with more pronounced distinctions between places with small and large rates in the same regions.

Overcrowding: marked decrease in variation (note that differences between Scotland and England and Wales in 1971 make comparisons difficult). Reduction in the spatial scale of variation – overcrowding is ‘spreading out’ of urban areas (especially London).

Slide 21

Slide 21 text

Summary
• The PopChange resource enables geographically rich and attribute-rich analyses of population change in the UK and, specifically, of the ways in which the population has become more or less geographically unequal
• The variogram offers a powerful means of characterising the spatial distribution of population variables
• The spatial structure of socioeconomic variables is consistent across time, but the magnitude of variation has reduced for most variables – locations with large percentages of, for example, unemployment, are broadly the same but the differences between locations have reduced

Slide 22

Slide 22 text

Acknowledgements
Support from the ESRC is acknowledged gratefully (Grant Ref No ES/L014769/1).
The Office for National Statistics are thanked for provision of the data.
Office for National Statistics, 2011 Census: Digitised Boundary Data (England and Wales) [computer file]. ESRC/JISC Census Programme, Census Geography Data Unit (UKBORDERS), EDINA (University of Edinburgh)/Census Dissemination Unit.
Census output is Crown copyright and is reproduced with the permission of the Controller of HMSO and the Queen's Printer for Scotland.

Slide 23

Slide 23 text

Questions?
PopChange and geostatistical ways of looking at segregation
Nick Bearman
Project team: Chris Lloyd, Gemma Catney and Paul Williamson, University of Liverpool, UK
ESRC Research Methods Festival, Bath, 4th July 2018
Email: [email protected] / [email protected]
Lloyd, C. D., Catney, G., Williamson, P. and Bearman, N. (2017) Exploring the utility of grids for analysing long term population change. Computers, Environment and Urban Systems, 66, 1–12. doi:10.1016/j.compenvurbsys.2017.07.003, Open Access
@nickbearmanuk #RMF18

Slide 24

Slide 24 text

Additional Slides

Slide 25

Slide 25 text

Characterising spatial structure
z(x_i) is a percentage or score at location x_i
p(h) is the number of data pairs separated by the lag (distance and direction) h

$$\hat{\gamma}(\mathbf{h}) = \frac{1}{2p(\mathbf{h})} \sum_{i=1}^{p(\mathbf{h})} \left\{ z(\mathbf{x}_i) - z(\mathbf{x}_i + \mathbf{h}) \right\}^2$$

Variogram: spatial dependence at different spatial scales
1. Take each data value in turn, compute its squared difference from each of the other values in the data set, and store the distances between them
2. Group these differences into distance bins (e.g., all squared differences for pairs separated by 1 to 2 km) and compute half of the average of these differences
3. Plot these (half) average differences against distances
4. The plot shows how the difference between values changes as a function of distance

Slide 26

Slide 26 text

Bounded variogram model: nugget effect and spherical component.
Provides a composite measure of clustering and polarisation: a small nugget indicates localised clustering; with a large sill this indicates polarisation
[Figure: variogram model. Axes: value differences against distance. Range(s) marked: intra-city and inter-city]