• 5 clinical themes sharing data across 5 BRCs
  • Acute coronary syndromes (Imperial)
  • Viral hepatitis (GST/Kings)
  • Ovarian cancer (Cambridge)
  • Renal transplantation (Oxford)
  • Critical care (UCL)
• To develop
  • an IT capability supporting research and patient care
  • a capability that is sustainable (beyond 2015)
  • a capability that is scalable (on a national basis)
  • a capability that is expandable
  • the capability in the most efficient way possible
house data linkage ‘engine’ and expert to validate
• Trust database to repeatedly query HES until patient death
• Pseudo-anonymisation prior to export to UCL (as records would need updating)
• Advantages:
  • Identifiable patient data does not leave the Trust
  • Trust is able to monitor linkage quality
• Disadvantages:
  • Cost of linkage engine (£30,000 p.a.) and IT validation and support
  • Cost of querying HES (~£3,000 p.a.)
  • Difficult to update if a new database linkage is added
  • Not easily scalable to smaller hospitals with less IT capability
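The pseudo-anonymisation step can be illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA256) is used, so that the same patient always maps to the same pseudonym and later exports can update earlier records. This is one common approach, not necessarily the one used here; the function name `pseudonymise`, the key and the identifier are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a patient identifier.

    Assumption: the secret key never leaves the Trust, so the pseudonym
    cannot be reversed or regenerated by the receiving site.
    """
    # Normalise formatting so "AB 123 456" and "ab123456" give the same pseudonym.
    normalised = patient_id.replace(" ", "").upper()
    digest = hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and made-up identifier, for illustration only.
TRUST_KEY = b"example-key-held-inside-the-trust"

first_export = pseudonymise("AB 123 456", TRUST_KEY)
later_update = pseudonymise("ab123456", TRUST_KEY)

# The same patient yields the same pseudonym, so a later export can be
# matched to the earlier record without sending the identifier itself.
assert first_export == later_update
```

A keyed hash is chosen over a plain hash in this sketch because identifiers drawn from a small, structured space could otherwise be recovered by hashing every possible value.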
is securely exported from Hospital to UCL safe haven
• Repeated linkage is performed within UCL safe haven
• Data de-identified on death
• Advantages:
  • Reduces burden on local IT infrastructure
  • Data linkage can be done by a handful of experts
  • Reduces the number of times sensitive data is handled; simplified audit trail
  • Facilitates the addition of other datasets
  • Enables less ‘fortunate’ hospitals to take part
• Disadvantages:
  • Identifiable data is outside Trust boundaries
  • More complex data sharing agreements
BSI ISO 27001 security standard
• Information Governance toolkit for compliance with
  • The Data Protection Act 1998
  • The common law duty of confidentiality
  • The Confidentiality NHS Code of Practice
  • The NHS Care Record Guarantee for England
  • The Social Care Record Guarantee for England
  • The international information security standard: ISO/IEC 27002:2013 and ISO/IEC 27001:2013
  • The Information Security NHS Code of Practice
  • The Records Management NHS Code of Practice
  • The Freedom of Information Act 2000
  • The Human Rights Act article 8
• R&D approval
• Individual trust data sharing agreements
• Research ethics approval
• Section 251 (NHS Act 2006) approval
  • "was established to enable the common law duty of confidentiality to be overridden to enable disclosure of confidential patient information for medical purposes, where it was not possible to use anonymised information and where seeking consent was not practical, having regard to the cost and technology available"
[Figure: dataset comparison. Labels: Single US centre; Explicitly linkable; Complete physiology; Complete treatment; Explicit research purpose; Explicit research purpose; Audit first; 1st 24 hour treatment; Multi-centre; Complete physiology; Complete treatment; Potentially linkable]
stored data. The size of the data determines the value and potential insight, and whether it can actually be considered big data or not.
• Variety: The type and nature of the data. This helps people who analyze it to effectively use the resulting insight.
• Velocity: In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.
• Veracity: The quality of captured data can vary greatly, affecting accurate analysis.
• Variability: Inconsistency of the data set can hamper processes to handle and manage it.
• Viable: A usable and practicable resource.
Information Commissioner's Office (ICO) code of practice with respect to the Data Protection Act (DPA)
• Dual use
• Research-specific data release
• Development data
Initial anonymisation configuration
• Data requested [field list and date range]
• Delete direct identifiers
• Convert dates from absolute to relative measures
• Remove high-risk individuals and patient opt-outs
• Specify k-anonymity
• Micro-aggregation of all key continuous and date-time variables
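Two of these steps, converting dates to relative measures and micro-aggregation, can be sketched as follows. This is a minimal illustration under stated assumptions: times are expressed as hours since admission, and micro-aggregation is the simple univariate fixed-size variant (sort, group into runs of at least k, replace each value with its group mean). The function names and sample values are hypothetical and are not taken from the project.

```python
from datetime import datetime
from statistics import mean


def to_relative_hours(event_times, admission):
    """Replace absolute timestamps with hours elapsed since admission."""
    return [round((t - admission).total_seconds() / 3600, 1) for t in event_times]


def micro_aggregate(values, k):
    """Univariate micro-aggregation: each value becomes the mean of its group.

    Values are sorted and cut into consecutive groups of k; a tail shorter
    than k is folded into the last group. If there are fewer than k values
    in total, they form a single group (which is then smaller than k).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = list(values)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # remaining tail too small to stand alone
            end = n
        group = order[start:end]
        group_mean = mean(values[i] for i in group)
        for i in group:
            result[i] = group_mean
        start = end
    return result


# Made-up example data.
admission = datetime(2014, 3, 1, 8, 0)
events = [datetime(2014, 3, 1, 8, 30), datetime(2014, 3, 2, 10, 0)]
relative = to_relative_hours(events, admission)

ages = [30.0, 71.0, 32.0, 70.0, 68.0, 31.0, 90.0]
aggregated = micro_aggregate(ages, k=3)
```

With k = 3 the seven ages fall into two groups, so the outlying value of 90 is no longer visible on its own in the released data.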
• K-anonymity and L-diversity acceptable?
  • No: Micro-aggregation or local suppression; adjust anonymisation configuration
  • Yes: Additional safeguards (e.g. remove living subjects)
• Measure information loss
• Record data release
• Anonymised data
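The acceptability check and the local suppression step can be sketched as below. The sketch assumes records are dictionaries, that k is the size of the smallest group sharing the same quasi-identifier values, and that l is measured as distinct l-diversity (the number of different sensitive values in the least diverse group), which is the simplest variant. All function and field names are hypothetical.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def measure_k_and_l(records, quasi_identifiers, sensitive):
    """Return (k, l): smallest group size and fewest distinct sensitive values."""
    if not records:
        raise ValueError("no records to assess")
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    k = min(len(g) for g in groups.values())
    l = min(len({r[sensitive] for r in g}) for g in groups.values())
    return k, l


def suppress_small_groups(records, quasi_identifiers, k_min):
    """Local suppression: drop records whose group has fewer than k_min members."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return [r for g in groups.values() if len(g) >= k_min for r in g]


# Made-up example data.
records = [
    {"age_band": "60-69", "sex": "F", "outcome": "survived"},
    {"age_band": "60-69", "sex": "F", "outcome": "died"},
    {"age_band": "70-79", "sex": "M", "outcome": "survived"},
    {"age_band": "70-79", "sex": "M", "outcome": "survived"},
    {"age_band": "80-89", "sex": "F", "outcome": "died"},
]
qi = ["age_band", "sex"]

before = measure_k_and_l(records, qi, "outcome")
released = suppress_small_groups(records, qi, k_min=2)
after = measure_k_and_l(released, qi, "outcome")
```

In this example suppression raises k from 1 to 2 but leaves l at 1, because both records in the 70-79 male group share the same outcome. Under the flow above, that would send the configuration back for adjustment if l of at least 2 were required.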
[Diagram: data flow. Labels: BRC hospitals; Secure storage; Researcher; Electronic health records; Standard XML schema; Research database; Statistical analysis engine; Final scientific report; Data quality reporting; Data validation; Cloned analysis engine; Example anonymised or synthetic data; Analysis script; Example scientific report returned to research team; CCHIC Researcher Ready]