Geographic data and confidentiality in health research

Research Data Services
November 21, 2016

Presentation given as part of the RDS Holz Brown Bag series, November 2016.



  1. 1.

    Geographic data and confidentiality in health research

    Matthew J Moehr | Survey of the Health of Wisconsin | UW-Madison
  2. 2.

    Geographic data and confidentiality in health research

    Matthew J Moehr | Survey of the Health of Wisconsin | November 16, 2016
  3. 3.

    The plan

    • What are your questions? Why are you here?
    • Introduce myself and the survey
    • 4 stories about geography, confidentiality, and health
    • 2 ideas
    • Some discussion
  4. 4.

    • Survey of the Health of Wisconsin (SHOW)
      ◦ Randomly selected households, not a hospital population
      ◦ Face-to-face interviews
      ◦ 5,000 subjects with 3,000 variables per subject
      ◦ Environmental || Social || Psychological || Biological
    • We give you data to use for your research.
    • We collaborate on new research projects.
      ◦ Vitamin D
      ◦ Hair cortisol
      ◦ Sleep quality in children
  5. 7.
  6. 9.

    HIPAA - Health Insurance Portability and Accountability Act

    1. Name
    2. Geographic subdivisions smaller than state
    3. All dates
    4. Telephone numbers
    5. Fax number
    6. Email address
    7. Social Security number
    8. Medical record number
    9. Health plan beneficiary number
    10. Account number
    11. Certificate/license number
    12. Vehicle identifiers and serial numbers, including license plate numbers
    13. Device identifiers or serial numbers
    14. Web URLs
    15. IP address
    16. Biometric identifiers, including finger or voice prints
    17. Full-face photographic images and any comparable images
    18. Any other unique identifying number, characteristic, or code
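Item 2 in the list above is the one that matters most for geographic data. Under the HIPAA Safe Harbor method, the first three digits of a ZIP code may be kept only when the area formed by all ZIP codes sharing those digits holds more than 20,000 people; otherwise the value is replaced with 000. The sketch below illustrates that rule. The function name `generalize_zip` and the population table are hypothetical and made up for illustration; real work would use current Census figures.

```python
def generalize_zip(zip_code, zip3_population, threshold=20_000):
    """Return the 3-digit ZIP prefix if its area is populous enough, else '000'.

    zip3_population maps a 3-digit prefix to the combined population of all
    ZIP codes sharing that prefix (hypothetical lookup table).
    """
    zip3 = zip_code.strip()[:3]
    is_valid_prefix = len(zip3) == 3 and zip3.isdigit()
    if is_valid_prefix and zip3_population.get(zip3, 0) > threshold:
        return zip3
    return "000"


# Made-up population counts, for illustration only.
example_population = {"537": 250_000, "999": 1_200}

kept = generalize_zip("53706", example_population)        # populous prefix
suppressed = generalize_zip("99950", example_population)  # sparse prefix
unknown = generalize_zip("12345", example_population)     # prefix not in table
```

Treating an unknown prefix as suppressed is a conservative design choice in this sketch, not something the rule itself prescribes.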
  7. 10.
  8. 11.

                           Genes         Geography                           Other data
    Future utility         Increases     ??                                  Decreases
    Does it change?        “immutable”   slowly                              Depends: age, gender, illnesses, income …
    Can we obfuscate it?   no            A little. But lose a lot of info.   A lot. Most info will be preserved.
    Others give it away?   yes           yes                                 sometimes

    “Why are genes different?” from Vitaly Shmatikov
  9. 12.

    Idea 1: Confidentiality is measurement error

    Duncan G, Fienberg S, Krishnan R, Padman R, Roehrig SF. Disclosure Limitation Methods and Information Loss for Tabular Data. In: Doyle P, et al., editors. Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies. Amsterdam: North Holland; 2001. pp. 135–166.
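One way to read Idea 1 for geographic data is that protecting a location by moving it a little is the same as adding a known amount of measurement error to it. The sketch below is an illustration under that assumption, not a method taken from the talk or from the cited chapter. It displaces a point by a random distance up to `max_km` and then measures the error that was introduced. The function names, the 2 km radius, and the coordinates (roughly Madison, Wisconsin) are illustrative choices.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def jitter_point(lat, lon, max_km, rng):
    """Move a point a random distance (0..max_km) in a random direction."""
    # sqrt() spreads the masked points evenly over the disc of radius max_km.
    distance_km = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance_km * math.cos(bearing) / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with latitude, so rescale by cos(lat).
    dlon = distance_km * math.sin(bearing) / (
        KM_PER_DEGREE_LAT * math.cos(math.radians(lat))
    )
    return lat + dlat, lon + dlon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


rng = random.Random(0)            # fixed seed so the example is repeatable
true_point = (43.0731, -89.4012)  # roughly Madison, Wisconsin
masked_point = jitter_point(*true_point, max_km=2.0, rng=rng)
error_km = haversine_km(*true_point, *masked_point)
```

Because the analyst chooses `max_km`, the size of the error is known in advance, which is what lets it be treated like any other measurement error in an analysis.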
  10. 13.

    Idea 2: Alter the statistics for partitioned data.

    Xiaoqian Jiang and colleagues are doing this research for hospital records and genes.

    Li, Y., Jiang, X., Wang, S., Xiong, H., & Ohno-Machado, L. (2016). VERTIcal Grid lOgistic regression (VERTIGO). Journal of the American Medical Informatics Association, 23(3), 570–579.
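To make Idea 2 concrete, consider data that is split by column: two sites hold different variables for the same people. One property that regression on such data can exploit is that the matrix of dot products between people computed on the pooled columns equals the sum of the matrices each site computes on its own columns. The sketch below checks only that property on made-up numbers. It is not a reproduction of the VERTIGO algorithm, and the site names and values are hypothetical.

```python
def gram(rows):
    """Return the matrix of dot products between every pair of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def add_matrices(m1, m2):
    """Add two equally sized matrices element by element."""
    return [[x + y for x, y in zip(row1, row2)] for row1, row2 in zip(m1, m2)]


# Made-up data: the same three people, with columns held at two sites.
site_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # e.g. two clinical variables
site_b = [[3.0], [1.0], [0.0]]                 # e.g. one genetic marker

# Each site computes its own summary without sending raw columns.
combined = add_matrices(gram(site_a), gram(site_b))

# For comparison only: what a single analyst holding all columns would get.
pooled = [a + b for a, b in zip(site_a, site_b)]
matches_pooled = combined == gram(pooled)
```

The sites exchange person-by-person summaries instead of the variables themselves. Whether those summaries are safe to share is a separate question that this sketch does not address.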