Sampling technique and sample size calculation

Sampling & sample size
Prof. Dr. Jamalludin Ab Rahman, B.Med.Sc., MD, MPH, FPHM, AMM
Department of Community Medicine, Kulliyyah of Medicine
International Islamic University Malaysia

Why do we sample?
- We can’t afford to select the entire population
- We do not need to select the entire population
11 November 2019, Research Methodology Workshop

Population vs. Sample
Must be random

Sample represents a population
- To represent a population at a certain location, e.g. adults in Malaysia, males in Pahang
- To represent a population with a specific condition, e.g. diabetics, pregnant women

Sampling with purpose
Descriptive study
- Measure burden of illness, e.g. prevalence study or survey; NHMS, NOHSA etc.
- Sampling by location
Analytical study
- Causality in mind, e.g. a new drug for diabetics, assuming diabetics in Malaysia are similar to diabetics elsewhere
- Sampling for a specific condition

Sampling statements
1. Target population
2. Study population
3. Sampling frame
4. Sampling unit
5. Observation unit

Example – NHMS III 2006
- Target population: All Malaysians
- Study population: Households up to strata 6
- Sampling frame: List of Enumeration Blocks & Living Quarters
- Sampling unit: Enumeration Block & Living Quarters
- Observation unit: All households in the selected Living Quarters

Random vs. non-random

Random
1. Simple random sampling (SRS)
2. Systematic
3. Stratified
4. Cluster
Non-random
1. Convenience
2. Purposive
3. Snowball
4. Volunteer
5. etc.

“If you are not able to determine the sampling frame, it is non-random sampling unless proven otherwise”

Random sampling
- Simple: The ideal method, e.g. randomly sample 10 students from a class of 50 students.
- Systematic: Follows a certain systematic pattern or order. Only the first sample is random.
- Stratified: Study population divided into strata. All strata are selected. A portion of the sample is drawn from each stratum.
- Cluster: Study population divided into clusters. Assumes all clusters are the same. Not all clusters are selected; only some will be sampled.
- Multistage: Several sampling techniques applied at different stages.
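
The four basic random techniques can be sketched in a few lines of Python, using the slide's class of 50 students as the sampling frame. This is a minimal illustration, not a survey tool: the function names, the two strata and the five clusters are hypothetical, and the cluster function assumes one-stage cluster sampling (every unit in a chosen cluster is taken).

```python
import random

def simple_random(frame, n, rng):
    """Simple random sampling: each unit in the frame is equally likely."""
    return rng.sample(frame, n)

def systematic(frame, n, rng):
    """Systematic sampling: random start, then every k-th unit in order."""
    k = len(frame) // n               # sampling interval
    start = rng.randrange(k)          # the only random choice
    return [frame[start + i * k] for i in range(n)]

def stratified(strata, fraction, rng):
    """Stratified sampling: every stratum is used, a portion of each is drawn."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

def cluster(clusters, n_clusters, rng):
    """Cluster sampling: pick some clusters, then take every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(2019)              # fixed seed, for repeatability only
students = list(range(1, 51))          # sampling frame: a class of 50

# Hypothetical strata and clusters, invented for this illustration.
strata = {"group_1": students[:20], "group_2": students[20:]}
clusters = {f"row_{i}": students[i * 10:(i + 1) * 10] for i in range(5)}

srs_sample = simple_random(students, 10, rng)
sys_sample = systematic(students, 10, rng)
strat_sample = stratified(strata, 0.2, rng)
cluster_sample = cluster(clusters, 2, rng)
print(srs_sample, sys_sample, strat_sample, cluster_sample, sep="\n")
```

Multistage sampling would chain these steps, for example choosing clusters first and then drawing a simple random sample inside each chosen cluster.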

Non-random sampling
- Convenience: Haphazard, grab sampling. No rules. Just get the sample most convenient to the researcher.
- Purposive: Samples that fulfil certain criteria, e.g. diabetic patients on single insulin therapy.
- Quota: Sampling stops after achieving a certain size or number.
- Volunteer: Participants self-select to become part of a study when asked, or in response to an advert.
- Snowball: Sample by reference or recommendation.

Example – NHMS III 2006
Two-stage stratified random sampling
- Target population: All Malaysians
- Study population: Households up to strata 6
- Strata: State & location (urban or rural)
- Clusters: Enumeration Block & Living Quarters
- Sampling frame: List of Enumeration Blocks & Living Quarters
- Sampling unit: Enumeration Block & Living Quarters
- Observation unit: All households in the selected Living Quarters
- Sample distribution: Proportionate to size
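
A sample distribution that is "proportionate to size" can be sketched as follows. The stratum names and household counts are hypothetical, invented only to show the arithmetic; they are not NHMS III figures, and `proportionate_allocation` is an illustrative helper, not part of any survey package.

```python
def proportionate_allocation(stratum_sizes, total_sample):
    """Share total_sample across strata in proportion to each stratum's size."""
    population = sum(stratum_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in stratum_sizes.items()}

# Hypothetical stratum sizes (number of households), NOT NHMS III figures.
stratum_sizes = {
    "State A, urban": 5000,
    "State A, rural": 3000,
    "State B, urban": 2000,
}
allocation = proportionate_allocation(stratum_sizes, total_sample=500)
print(allocation)
```

With other sizes the rounded allocations may not add up exactly to the requested total, so a real design would need a rule for distributing the remainder.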

Example – A clinical trial
- Target population: All diabetic patients
- Study population: Diabetics for at least 1 year on insulin therapy who attended MOPD from Jan-Dec 2014
- Sampling frame: N/A
- Sampling unit: Any diabetic who fulfilled the selection criteria
- Observation unit: Same as sampling unit

Sample size

Why do we calculate sample size?
1. To represent the target population: we can’t afford to sample everyone & everything
2. To have enough samples to detect a difference statistically
3. Logistic preparation: how much money, time & manpower are required

“Calculate sample size or power before conducting the study”

Justify sample size for
1. representing the target population, or
2. measuring treatment effect

“Sample size is always an estimate, never an exact value”

Sample size is an estimate
- Never an exact value, always an estimate
- Calculated for certain expected estimates (outcomes) at a certain degree of precision
- Expected values are estimated from previous studies or from an intelligent ‘guess’

Principle
- Calculated for certain expected estimates (outcomes) at a certain degree of precision
- Adjusted for design effect (based on type of sampling), alpha error, power, stratification and anticipated response rate

So, what do you need to calculate sample size?
- Objective determined
- Sampling method known
- Estimate of the outcome
- Precision required
- Statistical test to be used is known
- Power (1 − β) and confidence level (1 − α) set
- Anticipated non-response rate

Calculate power or margin of error instead
- If the sample size is pre-determined, e.g. you have a fixed budget, so the sample size depends on the budget. Sample size is then not a variable.
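
When n is fixed, the usual normal-approximation formulae for a single proportion and for comparing two proportions can be rearranged to give the margin of error or the power. The sketch below assumes those formulae; the function names and the inputs (n = 200, p = 0.40, and 360 per group for 30% vs. 40%) are illustrative choices, not values prescribed by the workshop.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def margin_of_error(n, p, confidence=0.95):
    """Precision d achieved by a fixed n: d = z * sqrt(p * (1 - p) / n)."""
    z = Z.inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)

def power_two_proportions(n_per_group, p1, p2, confidence=0.95):
    """Power achieved by a fixed n per group when comparing two proportions."""
    z_alpha_half = Z.inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    # Rearranged from n = (z_a/2 + z_b)^2 * variance_sum / (p1 - p2)^2
    z_beta = abs(p1 - p2) * sqrt(n_per_group / variance_sum) - z_alpha_half
    return Z.cdf(z_beta)

print(round(margin_of_error(n=200, p=0.40), 3))           # about 0.068
print(round(power_two_proportions(360, 0.30, 0.40), 2))   # about 0.81
```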

Common formulae
Depending on your objective, it could be
1. Calculate single proportion
2. Compare two proportions
3. Calculate single mean
4. Compare two means

Sample size to represent population

Representing population
- Represent a population, e.g. state, district, village, institution etc.
- Usually for a prevalence study, e.g. to measure prevalence of hypertension in Malaysia, or mean DMF among adults in Pahang

Single proportion
$$n = \frac{z_{\alpha/2}^2 \, p(1-p)}{d^2}$$
Where
- $z_{\alpha/2}$ is the value from the standard normal distribution corresponding to the desired confidence level ($z_{\alpha/2} = 1.96$ for 95% CI)
- $p$ is the expected true proportion
- $d$ is the desired precision
- For small populations of size $N$, $n$ can be adjusted so that
$$n' = \frac{nN}{n + N}$$

- 95% CI: $\alpha = 0.05$, $\alpha/2 = 0.025$, $z_{\alpha/2} = 1.96$
- Power 80% (0.8): $1 - \beta = 0.8$, so $\beta = 1 - 0.8 = 0.2$ and $z_\beta = 0.84$
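
These critical values can be checked with the Python standard library. This is a small sketch; the helper names `z_alpha_half` and `z_beta` are illustrative.

```python
from statistics import NormalDist

def z_alpha_half(confidence):
    """Two-sided critical value for a given confidence level, e.g. 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_beta(power):
    """Critical value for a given power (power = 1 - beta), e.g. 0.80."""
    return NormalDist().inv_cdf(power)

print(round(z_alpha_half(0.95), 2))  # 1.96
print(round(z_beta(0.80), 2))        # 0.84
```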

Example
- Study to measure prevalence of hypertension in a village of 1000 people
- Expected prevalence = 40%, precision = 5%, CI = 95%, expected non-response = 20%
$$n = \frac{1.96^2 \times 0.4(1 - 0.4)}{0.05^2} = 369 \cong 370$$
- Anticipating non-response, $n = 450$ (370 × 1.2 = 444, rounded up)
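
The hypertension example can be reproduced with a short Python sketch. The function names are illustrative. Inflating by 20% through multiplication is one reading of how the slide reaches 450, and the small-population adjustment is shown only as an option, since the worked example does not apply it.

```python
import math

def n_single_proportion(p, d, z=1.96):
    """Sample size for a single proportion: n = z^2 * p * (1 - p) / d^2."""
    return z ** 2 * p * (1 - p) / d ** 2

def adjust_small_population(n, N):
    """Small-population adjustment: n' = n * N / (n + N)."""
    return n * N / (n + N)

n = n_single_proportion(p=0.40, d=0.05)
print(math.ceil(n))            # 369, which the slide rounds up to 370
print(math.ceil(370 * 1.2))    # 444, i.e. 20% extra; the slide rounds this to 450

# Optional: adjustment for the village of 1000 people (not used on the slide).
print(adjust_small_population(n, N=1000))
```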

Single mean
- Study to measure one sample mean
$$n = \left(\frac{z_{\alpha/2} \, s}{d}\right)^2$$
where $s$ = expected standard deviation (SD) and $d$ = desired precision
- We use the smallest $d$ to get the largest $n$ possible

Example
- Study to measure average DMF among adults in a village of 2000 people
- Estimated DMF = 11 (SD 10), with a precision of 2 at 95% CI
$$n = \left(\frac{1.96 \times 10}{2}\right)^2 = 96.04 \cong 100$$
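
The same calculation as a minimal Python sketch, with `n_single_mean` as an illustrative name. The second call uses a hypothetical tighter precision of 1 to show why a smaller d gives a larger n.

```python
import math

def n_single_mean(s, d, z=1.96):
    """Sample size for a single mean: n = (z * s / d)^2."""
    return (z * s / d) ** 2

n = n_single_mean(s=10, d=2)
print(round(n, 2))    # 96.04
print(math.ceil(n))   # 97; the slide rounds further up to 100

# Halving d (hypothetical precision of 1) quadruples the required n.
print(round(n_single_mean(s=10, d=1), 2))   # 384.16
```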

Sample size to measure effect size

Hypothesis testing
- How one variable is different from the other
- E.g. different % of hypertension between males & females

Compare two proportions
$$n = \frac{(z_{\alpha/2} + z_\beta)^2 \, [\,p_1(1 - p_1) + p_2(1 - p_2)\,]}{(p_1 - p_2)^2}$$
- Total sample size = $2n$
- Where $p_1$ and $p_2$ are the expected sample proportions of the two groups; $z_{\alpha/2}$ is the critical value of the Normal distribution at $\alpha/2$ and $z_\beta$ is the critical value of the Normal distribution at $\beta$ (e.g. for a power of 80%, $\beta$ is 0.2 and the critical value is 0.84)

Example
- A study to compare prevalence of diabetes mellitus between males and females. Expected prevalences are 30% vs. 40% respectively. Power 80% at 95% confidence level.
$$n = \frac{(1.96 + 0.84)^2 \times [\,0.3(1 - 0.3) + 0.4(1 - 0.4)\,]}{(0.3 - 0.4)^2} \approx 7.9 \times 45 = 355.5 \cong 360$$
- Total sample size = 360 × 2 = 720
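
A short Python sketch of this example, with `n_two_proportions` as an illustrative name. Without intermediate rounding, (1.96 + 0.84)² is 7.84 rather than the slide's 7.9, so the code gives a slightly smaller n than the slide's 355.5; both round up to a similar working figure.

```python
import math

def n_two_proportions(p1, p2, z_alpha_half=1.96, z_beta=0.84):
    """Per-group sample size for comparing two proportions."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha_half + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2

n = n_two_proportions(0.30, 0.40)
print(round(n, 1))        # 352.8 per group (slide: 355.5, rounded to 360)
print(2 * math.ceil(n))   # 706 in total (slide: 720)
```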

Compare two means
$$n = \frac{2 \, (z_{\alpha/2} + z_\beta)^2 \, s^2}{d^2}$$
- Where $d$ = smallest difference in means and $s$ = standard deviation
- Total sample size = $2n$

Example
- A study to compare mean HbA1c between diabetics treated with a new drug versus the standard drug (control). Expected means of 10.5% vs. 11.5% (a difference of 1%), with an estimated SD of 5%
$$n = \frac{2 \times 5^2}{1^2} \times (1.96 + 0.84)^2 \approx 50 \times 7.9 = 395$$
- Total sample size = 790
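
The HbA1c example as a minimal Python sketch, with `n_two_means` as an illustrative name. Using (1.96 + 0.84)² = 7.84 without rounding gives 392 per group, slightly below the slide's 395.

```python
def n_two_means(s, d, z_alpha_half=1.96, z_beta=0.84):
    """Per-group sample size for comparing two means: 2 * (z_a/2 + z_b)^2 * s^2 / d^2."""
    return 2 * (z_alpha_half + z_beta) ** 2 * s ** 2 / d ** 2

n = n_two_means(s=5, d=11.5 - 10.5)
print(round(n, 1))       # 392.0 per group (slide: 395)
print(round(2 * n, 1))   # 784.0 in total (slide: 790)
```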

Sample size applications

Using applications
1. PS: Power and Sample Size Calculation http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize
2. Epi Info http://wwwn.cdc.gov/epiinfo/7/
3. OpenEpi http://www.openepi.com/oe2.3/menu/openepimenu.htm

Epi Info – Population survey

Epi Info – Cohort & Cross-sectional

OpenEpi – Compare two means

PS – Dichotomous

PS – Compare two means

Summary

Sampling & sample size
- Decide the best design to represent your study population: random vs. non-random
- Calculate sample size for a specific purpose
- Sample size is an estimate
