Slide 1

Slide 1 text

Time series, principal component analysis and logistic regression: An application to the association between annoyance and air pollution
Prof. Milena MACHADO, Associate Professor, IFES, Brazil; Visiting researcher
March 2023

Slide 2

Slide 2 text

Summary
• Introduction
• Objective
• Methodology
• Results
• Conclusions
• Further work

Slide 3

Slide 3 text

[Map: Espírito Santo, Brazil]

Slide 4

Slide 4 text

The Vitoria region (Brazil)
• Metropolitan area: 1,689,714 inhabitants
• Density: 1,490 inhabitants/km²
• 3rd largest port system in Latin America
• Industrial sites: steel plant, iron ore pellet mill, stone quarrying, cement and food industries, asphalt plant, etc.
[Map annotation: prevailing wind direction]

Slide 5

Slide 5 text

Objective
The aim of this study is to investigate the annoyance caused by air pollution.

Slide 6

Slide 6 text

Contribution
ETE Barueri (SP)
• Regression models: only one pollutant
• Time series analysis + PCA + Logistic regression
• Regression models: more than one pollutant
• Relative risk

References for this talk:
• Melo M.M. et al. "Study of a spatial and temporal analysis for particulate matter" (Best oral presentation award, Dust Conference, Italy, 2014).
• Melo M.M., Santos J., Reisen V. et al. (2018) "A new methodology to derive settleable particulate matter guidelines to assist policy-makers on reducing public nuisance" (Journal of Atmospheric Environment).
• Machado M., Reisen V.A., Santos J.M., Reis N.C., Frère S., Bondon P., Ispány M., Cotta H.H.A. (2020) "Use of multivariate time series techniques to estimate the impact of particulate matter on the perceived annoyance" (Journal of Atmospheric Environment).

Slide 7

Slide 7 text

Methodology
Vitoria (Brazil)
• Face-to-face survey (n = 2638)
• Measurement of air pollutants
• Time series analysis, principal component analysis, logistic regression, relative risk
• Panel by phone (n = 519) from 2011 to 2014
• Settled Particles (SP = PM2.5, PM10 and TSP)

Slide 8

Slide 8 text

Time series analysis
[Figure: five monthly time series, x-axis in months from Jan 2011 to Feb 2015. Panels: settled particles flux (SP); PM10 (mean, 30 days); PM10 (max, 30 days); TSP (mean, 30 days); TSP (max, 30 days).]

Slide 9

Slide 9 text

RESULTS
Question: Thinking about this month, how annoyed do you feel by dust, on a scale from 1 to 10, where 1 is not annoyed and 10 is extremely annoyed? (1-2-3-4-5-6-7-8-9-10)
n = 2638

[Figure: percentage of respondents by level of annoyance]
Levels of annoyance    0 to 2   3 to 4   5 to 6   7 to 8   9 to 10
Respondents (%)        8        8        18       32       34

[Figure: probability of being annoyed (0% to 100%) against x from 1 to 21]
RMGV: -3.32, 0.449, 1.566; P(x = 5) = 25%, P(x = 10) = 76%, P(x = 14) = 95%
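The RMGV values on this slide (-3.32, 0.449, 1.566) are not labeled in the extracted text. A minimal sketch, assuming they are the intercept, the slope and the odds ratio exp(slope) of a one-variable logistic curve, reproduces the three probabilities shown for x = 5, 10 and 14. The function name and the labeling of the coefficients are assumptions made for illustration only.

```python
import math

# Assumed labeling (not stated on the slide): intercept and slope of a logistic curve.
INTERCEPT = -3.32
SLOPE = 0.449


def logistic_probability(x, intercept=INTERCEPT, slope=SLOPE):
    """Probability of being annoyed at level x under a one-variable logistic model."""
    z = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-z))


for x in (5, 10, 14):
    print(f"P(x={x}) = {logistic_probability(x):.0%}")

# Under the same assumption, exp(slope) is close to the third RMGV value (1.566).
print(f"exp(slope) = {math.exp(SLOPE):.3f}")
```

Under this reading, the rounded probabilities agree with the 25%, 76% and 95% reported on the slide.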

Slide 10

Slide 10 text

CORRELATION MATRIX FOR THE ORIGINAL VARIABLES (BEFORE TIME SERIES ANALYSIS, VAR(1))

Variables       SP        PM10 (mean)   TSP (mean)   PM10 (max)   TSP (max)
SP              1
PM10 (mean)     0.424**   1
TSP (mean)      0.278     0.764**       1
PM10 (max)      0.409**   0.681**       0.654**      1
TSP (max)       0.342*    0.701**       0.754**      0.772**      1
**p-value = 0.01; *p-value = 0.05

RESULTS
According to Zamprogno (2013), the PCA technique requires stationary time series that are not correlated in time (i.e., serially independent). Thus, to avoid spurious results, a vector autoregressive (VAR) model must be applied as a filter to eliminate the temporal correlation.
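The filtering idea can be sketched as follows: fit a VAR(1) model and keep its residuals as the filtered series. This is a minimal illustration using ordinary least squares on synthetic data; the estimation procedure actually used in the study is not given on the slide, and the function names and the simulated series are assumptions.

```python
import numpy as np


def var1_filter(series):
    """Fit X_t = c + A X_{t-1} + e_t by least squares and return the residuals e_t.

    series: array of shape (T, k), one column per pollutant metric.
    """
    x = np.asarray(series, dtype=float)
    y = x[1:]                                            # X_t
    z = np.column_stack([np.ones(len(x) - 1), x[:-1]])   # [1, X_{t-1}]
    coef, *_ = np.linalg.lstsq(z, y, rcond=None)         # shape (k + 1, k)
    return y - z @ coef


def lag1_autocorr(v):
    """Lag-1 sample autocorrelation of a 1-D series."""
    v = np.asarray(v, dtype=float) - np.mean(v)
    return float(v[1:] @ v[:-1] / (v @ v))


# Synthetic two-variable VAR(1) process (hypothetical data, not the study's series).
rng = np.random.default_rng(0)
T, k = 500, 2
A = np.array([[0.7, 0.1], [0.0, 0.5]])
x = np.zeros((T, k))
for t in range(1, T):
    x[t] = A @ x[t - 1] + rng.normal(size=k)

residuals = var1_filter(x)
print("lag-1 autocorrelation before:", round(lag1_autocorr(x[:, 0]), 2))
print("lag-1 autocorrelation after: ", round(lag1_autocorr(residuals[:, 0]), 2))
```

The residuals keep the contemporaneous correlation between variables, which is what PCA then works on, while most of the serial correlation is removed.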

Slide 11

Slide 11 text

Autocorrelation function (ACF) and partial autocorrelation function (PACF)
[Figure: for each series, a time plot (2011 to 2014) with its ACF and partial ACF against lag. Panels: SP (settled particles), deposition rate (g/m² 30 days); PM10, monthly mean (µg/m³); PM10, monthly maximum (µg/m³); TSP, monthly mean (µg/m³); TSP, monthly maximum (µg/m³).]
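For readers who want to reproduce this kind of diagnostic, the sample ACF can be sketched as below. The series is synthetic (48 simulated monthly values from an AR(1) process, a hypothetical stand-in for the pollutant data), and the ±1.96/√n band is the usual approximate 95% limit under white noise.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = float(x @ x)
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(float(x[k:] @ x[:-k]) / denom)
    return np.array(acf)


# Hypothetical serially correlated series: AR(1) with coefficient 0.8.
rng = np.random.default_rng(0)
n = 48
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()

acf = sample_acf(ar1, max_lag=12)
band = 1.96 / np.sqrt(n)  # approximate 95% limits for a white-noise series
outside = [k for k in range(1, 13) if abs(acf[k]) > band]
print("lags outside the white-noise band:", outside)
```

Lags that fall outside the band suggest serial correlation, which is the reason for filtering the series before applying PCA.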

Slide 12

Slide 12 text

After VAR(1) filter
[Figure: ACF and partial ACF against lag (0 to 30) for each filtered series. Panels: SP (settled particles, 30 days); PM10 (mean, 30 days); TSP (mean, 30 days); PM10 (max, 30 days); TSP (max, 30 days).]

Slide 13

Slide 13 text

CORRELATION MATRIX FOR THE VARIABLES AFTER APPLYING THE FILTERING MODEL

Variables       SP        PM10 (mean)   TSP (mean)   PM10 (max)   TSP (max)
SP              1.000
PM10 (mean)     0.214     1.000
TSP (mean)      -0.004    0.573**       1.000
PM10 (max)      0.234     0.428**       0.344*       1.000
TSP (max)       0.378*    0.533**       0.337*       0.685**      1.000
**p-value = 0.01; *p-value = 0.05

RESULTS
The PCA technique is then applied to the filtered series in order to avoid cross-correlation (multicollinearity) among the variables.

Slide 14

Slide 14 text

RESULTS OF FACTOR LOADINGS STATISTICS AND APPLICATION OF PCA

                       PC1      PC2      PC3      PC4      PC5
Eigenvalue             2.576    1.071    0.681    0.396    0.276
Variability (%)        51.528   21.426   13.622   7.913    5.510
Cumulative (%)         51.528   72.955   86.577   94.490   100.000
SP (monthly rate)      0.267    0.733*   -0.554   -0.269   -0.112
PM10 (monthly mean)    0.495*   -0.257   -0.365   0.674    -0.319
TSP (monthly mean)     0.400*   -0.583   -0.318   -0.607   0.172
PM10 (monthly max)     0.492*   0.104    0.611*   -0.254   -0.557
TSP (monthly max)      0.531*   0.214    0.293    0.200    0.739

RESULTS
The components PC1, PC2 and PC3 explain about 86.6% of the total variability of the original data.
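The variability percentages follow directly from the eigenvalues: each eigenvalue is divided by their sum (5, the number of standardized variables). A short sketch of that arithmetic, together with an illustrative helper for PCA on a correlation matrix; the helper name is an assumption, and the small differences from the slide come from the eigenvalues being rounded to three decimals.

```python
import numpy as np


def pca_eigenvalues(correlation_matrix):
    """Eigenvalues of a correlation matrix, sorted from largest to smallest."""
    values = np.linalg.eigvalsh(np.asarray(correlation_matrix, dtype=float))
    return values[::-1]


# Eigenvalues reported on the slide (rounded to three decimals).
eigenvalues = np.array([2.576, 1.071, 0.681, 0.396, 0.276])

variability = 100 * eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(variability)

for i, (v, c) in enumerate(zip(variability, cumulative), start=1):
    print(f"PC{i}: variability = {v:.2f}%, cumulative = {c:.2f}%")
```

Applied to a correlation matrix of the filtered series, `pca_eigenvalues` would give the first row of the table; that step is not run here.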

Slide 15

Slide 15 text

THE RELATIVE RISK ESTIMATED BY THE VAR-PCA-LOGISTIC REGRESSION MODEL

Pollutants             RR      CI (95%)          CI width
SP                     1.462   (1.070; 1.854)    0.784
PM10 (monthly mean)    1.649   (1.061; 2.237)    1.176
TSP (monthly mean)     2.181   (1.471; 2.891)    1.42
PM10 (monthly max)     2.411   (1.401; 3.421)    2.02
TSP (monthly max)      1.822   (0.592; 3.052)    2.46

RESULTS
The estimated relative risk corresponds to an increase in the probability of annoyance by a factor of about 1.5, considering an interquartile variation of 2 g/m² 30 days.

Parameters estimated by the multiple logistic model

             β        Standard error
PC1          0.053    0.202
PC2          0.058    0.309
PC3          -0.245   0.390
Intercept    0.204    0.320

$\widehat{RR}^{*}(x_i) \approx e^{x_i \hat{\beta}_i^{*}}$

The RR measures the association between exposure to a risk factor and the occurrence of an effect (here, annoyance).

Y = 1 if degree ≥ 7 (extremely annoyed); Y = 0 if degree < 7 (not/little annoyed)
X = SP, TSP, PM10

$P(Y = 1) = \pi(\mathbf{X}) = \dfrac{e^{\beta_0 + \dots + \beta_p x_p}}{1 + e^{\beta_0 + \dots + \beta_p x_p}}$
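As a rough check of how the pieces fit together, the sketch below assumes that the coefficient of an original pollutant is recovered as the sum of its PC1 to PC3 loadings (from the PCA table) times the logistic coefficients of PC1 to PC3, and then applies RR ≈ exp(x · β*) with x = 2 for SP. This back-transformation is an assumption made for illustration; the slide does not spell it out, and the function names are hypothetical.

```python
import math

# SP loadings on PC1..PC3 (PCA table) and logistic coefficients of PC1..PC3.
SP_LOADINGS = [0.267, 0.733, -0.554]
PC_COEFFICIENTS = [0.053, 0.058, -0.245]


def back_transformed_coefficient(loadings, coefficients):
    """Assumed back-transformation: beta* = sum_j loading_j * beta_PCj."""
    return sum(l * b for l, b in zip(loadings, coefficients))


def relative_risk(x, beta_star):
    """RR(x) ~= exp(x * beta*), as in the formula on the slide."""
    return math.exp(x * beta_star)


beta_sp = back_transformed_coefficient(SP_LOADINGS, PC_COEFFICIENTS)
rr_sp = relative_risk(2.0, beta_sp)  # x = interquartile variation of SP
print(f"beta* (SP) = {beta_sp:.3f}, RR (SP) = {rr_sp:.2f}")
```

With the rounded values from the slides this gives an RR of about 1.47 for SP, close to the reported 1.462; the interquartile variations of the other pollutants are not given, so they are not reproduced here.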

Slide 16

Slide 16 text

Conclusions
• The combination of the VAR, PCA and logistic regression (VAR-PCA-LOG) techniques is proposed as a useful tool for considering a group of pollutants in the same model.
• This study provides evidence of a significant correlation between particulate matter and perceived annoyance levels, indicating that, at least for particulate matter, perceived annoyance is related not to a single pollutant but to a group of pollutants.
• The estimated relative risks showed that, in general, an increase in air pollutant concentrations (i.e., the particulate matter metrics examined here: TSP, PM10 and SP) significantly contributes to increasing the probability of being annoyed.

Slide 17

Slide 17 text

Further work…
1. Use the bootstrap technique, or others, to estimate more accurate confidence intervals for the results.
2. Add other pollutants in a multiple model.

Papers published:
• MACHADO, M.; SANTOS, J. M.; FRERE, S.; CHAGNON, P.; REISEN, V. A.; BONDON, P.; ISPÁNY, M.; MAVROIDIS, I.; REIS JR, N. C. Deconstruction of annoyance due to air pollution by multiple correspondence analyses. Environmental Science and Pollution Research, v. 28, n. 35, p. 47904-47920, 2021.
• MACHADO, M.; SANTOS, J. M.; REISEN, V. A.; PEGO E SILVA, A. F.; REIS JUNIOR, N. C.; BONDON, P.; MAVROIDIS, I.; PREZOTTI FILHO, P. R.; FRERE, S.; LIMA, A. T. Parameters influencing population annoyance pertaining to air pollution. Journal of Environmental Management, v. 323, p. 115955, 2022.
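The bootstrap mentioned under further work could look like the following percentile-bootstrap sketch. It is only an illustration on synthetic data (simulated 0/1 annoyance indicators), not the procedure the authors intend to use. It resamples observations independently, so it would not be suitable as-is for serially correlated series, where a block bootstrap is usually preferred.

```python
import numpy as np


def bootstrap_percentile_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    estimates = np.array([
        statistic(data[rng.integers(0, n, size=n)])  # resample with replacement
        for _ in range(n_resamples)
    ])
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Hypothetical data: 500 simulated respondents, 1 = annoyed, 0 = not annoyed.
rng = np.random.default_rng(1)
annoyed = (rng.random(500) < 0.66).astype(float)

low, high = bootstrap_percentile_ci(annoyed, np.mean)
print(f"proportion annoyed = {annoyed.mean():.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The same function accepts any statistic that maps a resampled array to a number, for example a relative risk computed from resampled respondents.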

Slide 18

Slide 18 text

Contact information: Milena Machado de Melo ([email protected])
Thank you!