Sampling technique and sample size calculation

Jamalludin Ab Rahman

November 11, 2019

Transcript

  1. 1.

    Sampling & sample size
    Prof. Dr. Jamalludin Ab Rahman, B.Med.Sc., MD, MPH, FPHM, AMM
    Department of Community Medicine, Kulliyyah of Medicine, International Islamic University Malaysia
  2. 2.

    Why do we sample?
    - We can't afford to select the entire population
    - We do not need to select the entire population
    Research Methodology Workshop, 11 November 2019
  3. 4.

    A sample represents a population
    - To represent a population at a certain location, e.g. adults in Malaysia, males in Pahang
    - To represent a population with a specific condition, e.g. diabetics, pregnant women
  4. 5.

    Sampling with purpose
    Descriptive study
    - Measures burden of illness, e.g. prevalence study or survey (NHMS, NOHSA, etc.)
    - Sampling by location
    Analytical study
    - Causality in mind, e.g. a new drug for diabetics, assuming diabetics in Malaysia are similar to diabetics elsewhere
    - Sampling for a specific condition
  5. 6.

    Sampling statements
    1. Target population
    2. Study population
    3. Sampling frame
    4. Sampling unit
    5. Observation unit
  6. 7.

    Example: NHMS III 2006
    - Target population: all Malaysians
    - Study population: households up to strata 6
    - Sampling frame: list of Enumeration Blocks & Living Quarters
    - Sampling unit: Enumeration Blocks & Living Quarters
    - Observation unit: all households in the selected Living Quarters
  7. 9.

    Random
    1. Simple random sampling (SRS)
    2. Systematic
    3. Stratified
    4. Cluster
    Non-random
    1. Convenience
    2. Purposive
    3. Snowball
    4. Volunteer
    5. etc.
  8. 10.

    “If you are not able to determine the sampling frame, it is non-random sampling unless proven otherwise.”
  9. 11.

    Random sampling
    - Simple: the ideal method. Randomly sample 10 students from a class of 50 students.
    - Systematic: follows a certain systematic pattern or order. Only the first sample is random.
    - Stratified: study population divided into strata. All strata are selected, and a portion of each stratum is sampled.
    - Cluster: study population divided into clusters. All clusters are assumed to be the same. Not all clusters are selected; only some are sampled.
    - Multistage: several sampling techniques applied at different stages.
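The four single-stage techniques above can be sketched with Python's standard `random` module. This is a minimal illustration, not a survey-grade implementation: the class of 50 students comes from the slide, while the two groups `A` and `B`, the 20% sampling fraction, and all function names are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit in the sampling frame has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit; only the first pick is random."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """All strata are selected; a portion of each stratum is sampled."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Only some clusters are selected; every unit in a chosen cluster is taken."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(2019)          # fixed seed so the sketch is repeatable
students = list(range(1, 51))      # class of 50 students (from the slide)
groups = {"A": students[:20], "B": students[20:]}  # hypothetical strata/clusters

srs = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)
stratified = stratified_sample(groups, 0.2, rng)
clustered = cluster_sample(groups, 1, rng)
```

Multistage sampling would chain these, for example selecting clusters first and then drawing a simple random sample within each selected cluster.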
  10. 12.

    Non-random sampling
    - Convenience: haphazard or grab sampling. No rules; just take the sample most convenient to the researcher.
    - Purposive: samples that fulfil certain criteria, e.g. diabetic patients on single insulin therapy.
    - Quota: sampling stops after reaching a certain size or number.
    - Volunteer: participants self-select to become part of a study, either when asked or in response to an advert.
    - Snowball: sample by reference or recommendation.
  11. 13.

    Example: NHMS III 2006
    Two-stage stratified random sampling
    - Target population: all Malaysians
    - Study population: households up to strata 6
    - Strata: state & location (urban or rural)
    - Clusters: Enumeration Blocks & Living Quarters
    - Sampling frame: list of Enumeration Blocks & Living Quarters
    - Sampling unit: Enumeration Blocks & Living Quarters
    - Observation unit: all households in the selected Living Quarters
    - Sample distribution: proportionate to size
  12. 14.

    Example: a clinical trial
    - Target population: all diabetic patients
    - Study population: diabetics for at least 1 year on insulin therapy who attended MOPD from Jan-Dec 2014
    - Sampling frame: N/A
    - Sampling unit: any diabetic who fulfilled the selection criteria
    - Observation unit: same as the sampling unit
  13. 16.

    Why do we calculate sample size?
    1. To represent the target population: we can't afford to sample everyone and everything
    2. To have enough samples to detect a difference statistically
    3. For logistic preparation: how much money, time & manpower are required
  14. 17.

    “Calculate sample size or power before conducting the study.”
  15. 18.

    Justify sample size for
    1. representing the target population, or
    2. measuring a treatment effect
  16. 19.

    “Sample size is always an estimate, never an exact value.”
  17. 20.

    Sample size is an estimate
    - Never an exact value, always an estimate
    - Calculated for certain expected estimates (outcomes) at a certain degree of precision
    - Expected values are estimated from previous studies or from an intelligent ‘guess’
  18. 21.

    Principle
    - Calculated for certain expected estimates (outcomes) at a certain degree of precision
    - Adjusted for design effect (based on type of sampling), alpha error, power, stratification and anticipated response rate
  19. 22.

    So, what do you need to calculate sample size?
    - Objective determined
    - Sampling method known
    - Estimate of the outcome
    - Precision required
    - Statistical test to be used is known
    - Power ($1-\beta$) and confidence level ($1-\alpha$) set
    - Anticipated non-response rate
  20. 23.

    Calculate power or margin of error instead
    - If the sample size is pre-determined, e.g. you have a fixed budget, the sample size depends on the budget. Sample size is then fixed rather than a variable to solve for, so report the power or margin of error it achieves.
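When the sample size is fixed, the usual normal-approximation formulas can be rearranged to give the margin of error or the power instead. The sketch below assumes those approximations (margin of error $d = z_{\alpha/2}\sqrt{p(1-p)/n}$ for a single proportion, and power $= \Phi(z_\beta)$ for two equal-sized groups); the budgets of 200 and 360 participants and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Half-width of the confidence interval for a proportion at a fixed n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

def power_two_proportions(p1, p2, n_per_group, confidence=0.95):
    """Approximate power to detect p1 vs. p2 with n_per_group in each group."""
    z_alpha_half = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    z_beta = math.sqrt(n_per_group * (p1 - p2) ** 2 / variance_sum) - z_alpha_half
    return NormalDist().cdf(z_beta)

# Hypothetical budget: only 200 respondents can be surveyed.
moe = margin_of_error(p=0.4, n=200)
# Hypothetical budget: 360 participants per group.
power = power_two_proportions(p1=0.3, p2=0.4, n_per_group=360)
print(f"margin of error: {moe:.3f}, power: {power:.2f}")
```

Dedicated software may use exact or continuity-corrected methods, so its results can differ slightly from this approximation.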
  21. 24.

    Common formulae
    Depending on your objective, it could be:
    1. Calculate a single proportion
    2. Compare two proportions
    3. Calculate a single mean
    4. Compare two means
  22. 26.

    Representing population
    - Represent a population, e.g. state, district, village, institution, etc.
    - Usually for a prevalence study, e.g. to measure the prevalence of hypertension in Malaysia, or mean DMF among adults in Pahang
  23. 27.

    Single proportion

    $n = \dfrac{z_{\alpha/2}^2 \, p(1-p)}{d^2}$

    Where:
    - $z_{\alpha/2}$ = value from the standard normal distribution corresponding to the desired confidence level ($z_{\alpha/2} = 1.96$ for 95% CI)
    - $p$ = expected true proportion
    - $d$ = desired precision
    - For small populations of size $N$, $n$ can be adjusted so that $n' = \dfrac{Nn}{N+n}$
  24. 28.

    95% CI: $\alpha = 0.05$, $\alpha/2 = 0.025$, $z_{\alpha/2} = 1.96$
    Power 80% or 0.8: $1-\beta = 0.8$, $\beta = 1 - 0.8 = 0.2$, $z_\beta = 0.84$
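The two critical values above can be recovered from the standard normal distribution rather than read from a table. A minimal sketch using `statistics.NormalDist` from the Python standard library; the function names are illustrative only.

```python
from statistics import NormalDist

def z_alpha_half(confidence):
    """Two-sided critical value, e.g. confidence=0.95 gives alpha/2 = 0.025."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_beta(power):
    """Critical value for the chosen power, where power = 1 - beta."""
    return NormalDist().inv_cdf(power)

print(round(z_alpha_half(0.95), 2))  # 1.96
print(round(z_beta(0.80), 2))        # 0.84
```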
  25. 29.

    Example
    - Study to measure the prevalence of hypertension in a village of 1000 people
    - Expected prevalence = 40%, precision = 5%, CI = 95%, expected non-response = 20%
    - $n = \dfrac{1.96^2 \times 0.4\,(1-0.4)}{0.05^2} = 369 \approx 370$
    - Anticipating non-response, $n = 450$
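The hypertension example can be reproduced in a few lines. This sketch uses the slide's inputs (p = 0.4, d = 0.05, z = 1.96, village of 1000, 20% non-response); the function names are assumptions. Two points are interpretations rather than statements from the slide: the slide's 450 is consistent with 370 × 1.2 = 444 rounded up, and dividing by (1 − non-response rate) is shown as a common, slightly more conservative alternative.

```python
import math

def n_single_proportion(p, d, z=1.96):
    """Sample size to estimate a proportion p with precision d."""
    return z ** 2 * p * (1 - p) / d ** 2

def adjust_for_small_population(n, population):
    """Small-population adjustment: n' = N * n / (N + n)."""
    return population * n / (population + n)

def inflate_for_non_response(n, non_response_rate):
    """Divide by the expected response rate (an assumed convention)."""
    return n / (1 - non_response_rate)

n = n_single_proportion(p=0.4, d=0.05)
n_adjusted = adjust_for_small_population(n, population=1000)
n_final = inflate_for_non_response(370, non_response_rate=0.2)

print(math.ceil(n))           # unadjusted sample size, rounded up
print(math.ceil(n_adjusted))  # after the small-population adjustment
print(math.ceil(n_final))     # 370 inflated for 20% non-response
```

The slide's worked example does not apply the small-population adjustment, which is why its figure stays at 370 before non-response is considered.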
  26. 30.

    Single mean
    - Study to measure one sample mean
    - $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$, where $\sigma$ = expected standard deviation (SD) and $d$ = desired precision
    - Use the smallest $d$ to get the largest possible $n$
  27. 31.

    Example
    - Study to measure the average DMF among adults in a village of 2000 people
    - Estimated DMF = 11 (SD 10), with a precision of 2 at 95% CI
    - $n = \left(\dfrac{1.96 \times 10}{2}\right)^2 = 96.04 \approx 100$
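A short sketch of the single-mean calculation, using the slide's DMF inputs (SD 10, precision 2, z = 1.96). The second call, with a precision of 1, is a hypothetical value added only to show how a smaller precision drives the sample size up.

```python
import math

def n_single_mean(sd, d, z=1.96):
    """Sample size to estimate a mean with standard deviation sd and precision d."""
    return (z * sd / d) ** 2

n_slide = n_single_mean(sd=10, d=2)    # the slide's example
n_tighter = n_single_mean(sd=10, d=1)  # hypothetical tighter precision

print(math.ceil(n_slide))    # the slide rounds this further, to 100
print(math.ceil(n_tighter))  # halving d roughly quadruples n
```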
  28. 33.

    Hypothesis testing
    - How one variable differs from another
    - E.g. difference in the percentage of hypertension between males and females
  29. 34.

    Compare two proportions

    $n = \dfrac{(z_{\alpha/2} + z_\beta)^2 \, [\,p_1(1-p_1) + p_2(1-p_2)\,]}{(p_1 - p_2)^2}$

    - Total sample size = $2n$
    - Where $p_1$ and $p_2$ are the expected sample proportions of the two groups; $z_{\alpha/2}$ is the critical value of the normal distribution at $\alpha/2$ and $z_\beta$ is the critical value of the normal distribution at $\beta$ (e.g. for a power of 80%, $\beta$ is 0.2 and the critical value is 0.84)
  30. 35.

    Example
    - A study to compare the prevalence of diabetes mellitus between males and females. Expected prevalences are 30% vs. 40% respectively. Power 80% at 95% confidence level.
    - $n = \dfrac{(1.96+0.84)^2 \, [\,0.3(1-0.3) + 0.4(1-0.4)\,]}{(0.3-0.4)^2} \approx 7.9 \times 45 = 355.5 \approx 360$, where $(1.96+0.84)^2 = 7.84$ is rounded up to 7.9
    - Total sample size = 360 × 2 = 720
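The two-proportion example can be checked with a small function. This sketch uses the slide's inputs (30% vs. 40%, z values 1.96 and 0.84); the function name is an assumption. Without the slide's intermediate rounding of 7.84 up to 7.9, the per-group figure comes out slightly lower than 355.5, and both round up comfortably within the slide's 360.

```python
import math

def n_two_proportions(p1, p2, z_alpha_half=1.96, z_beta=0.84):
    """Per-group sample size to compare two proportions."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha_half + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2

n_per_group = n_two_proportions(p1=0.3, p2=0.4)
total = 2 * math.ceil(n_per_group)

print(math.ceil(n_per_group), total)
```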
  31. 36.

    Compare two means

    $n = \dfrac{2\,(z_{\alpha/2} + z_\beta)^2 \, \sigma^2}{d^2}$

    - Where $d$ = smallest difference in means and $\sigma$ = standard deviation
    - Total sample size = $2n$
  32. 37.

    Example
    - A study to compare mean HbA1c between diabetics treated with a new drug and those on the standard drug (control). Expected means are 10.5% vs. 11.5% (a difference of 1%), with an estimated SD of 5%.
    - $n = \dfrac{2 \times 5^2}{1^2} \times (1.96+0.84)^2 \approx 50 \times 7.9 = 395$, where $(1.96+0.84)^2 = 7.84$ is rounded up to 7.9
    - Total sample size = 790
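The HbA1c example follows the same pattern. This sketch uses the slide's inputs (difference 1, SD 5, z values 1.96 and 0.84); the function names are assumptions. Because the slide rounds 7.84 up to 7.9, its 395 per group is slightly above the unrounded result computed here.

```python
import math

def n_two_means(sd, d, z_alpha_half=1.96, z_beta=0.84):
    """Per-group sample size to detect a difference d between two means."""
    return 2 * (z_alpha_half + z_beta) ** 2 * sd ** 2 / d ** 2

def round_up(n):
    """Round up, after trimming floating-point noise at the 6th decimal."""
    return math.ceil(round(n, 6))

n_per_group = n_two_means(sd=5, d=11.5 - 10.5)
total = 2 * round_up(n_per_group)

print(round_up(n_per_group), total)
```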
  33. 39.

    Using applications
    1. PS: Power and Sample Size Calculation, http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize
    2. Epi Info, http://wwwn.cdc.gov/epiinfo/7/
    3. OpenEpi, http://www.openepi.com/oe2.3/menu/openepimenu.htm
  34. 42.
  35. 46.

    Sampling & sample size
    - Decide the best design to represent your study population: random vs. non-random
    - Calculate sample size for a specific purpose
    - Sample size is an estimate