Secure foundations for healthcare startups

Goce Bonev
September 04, 2025

Startups building healthcare applications need to embed security from day one. This means designing on secure infrastructure, encrypting data at rest and in transit, and managing keys safely. Founders should understand healthcare regulations like HIPAA and GDPR, map sensitive data flows, and apply threat modeling to identify risks early. Secure coding practices, access controls, and proper handling of files and logs help prevent common vulnerabilities. The main takeaway is that secure-by-design choices not only reduce compliance risk but also build user trust and scalability into the product.

Transcript

  1. What I am going to talk about • A real

    app for a healthcare startup • Challenges and dangers • Domain discovery • Infrastructure • Secure coding practices • Encryption, types, key management, real applications and challenges • Securing files • Logs and append only records
  2. High level requirements • Secure infrastructure to run the application

    • Reliable, scalable storage that can store many GB of data • Data encryption at rest and transit • Secure secret storage • Access controls • Backups • Monitoring • Audit logs
  3. Healthcare data regulations • HIPAA (Health Insurance Portability and Accountability

    Act) • GDPR (General Data Protection Regulation) • Health Data Hosting (HDS) certification for French patients • HITECH (Health Information Technology for Economic and Clinical Health Act) • PIPEDA (Personal Information Protection and Electronic Documents Act)
  4. • A minimum security checklist for enterprise-ready products and services

    • Not a standard • Application security, not enterprise security • Simple to implement • Provide a good foundation • https://mvsp.dev/ MVSP - Minimum Viable Secure Product
  5. • Detailed security architecture guidance providing secure coding checklists •

    ASVS Level 1 is for low assurance level • ASVS Level 2 - minimum for applications that contain any sensitive data, recommended level for most apps • ASVS Level 3 - critical applications that perform high value transactions, contain sensitive medical data, or any application that requires the highest level of trust OWASP Application Security Verification Standard
  6. • What data flows in the system • Sensitive data

    mapping • Define processes • What data each process needs • Need to know basis • Data minimization • Data ownership Data / Process discovery and mapping
  7. • Security first vs security last • Security as a

    core business requirement • Identify potential threats and vulnerabilities early in the development process Secure by design
  8. The activity of identifying threats and figuring out how to

    mitigate them. PHI / Business risk OWASP Threat Dragon Threat modeling Image: Continuous Architecture in Practice: Software Architecture in the Age of Agility and DevOps, Murat Erder Pierre Pureur and Eoin Woods
  9. Threat modeling example (STRIDE)

    Threat | Type | Mitigated | Risk | Mitigation
    • Unauthorized access to the database: An attacker could access the database. | Information disclosure | Yes | High | Valid user/password is required to access the database.
    • Database credential theft: An attacker could obtain the credentials and use them to make unauthorized calls to the database. | Information disclosure | Yes | High | The database is placed behind a firewall and is only accessible from inside the ECS cluster. No access from public internet.
    • Database MITM attacks: An attacker could intercept the database queries and obtain sensitive information. | Information disclosure | No | High | All connections to the database require SSL.
    • Database data theft: An attacker or an insider can steal the database data. | Information disclosure | Yes | High | Personal and all sensitive data in the database is encrypted using envelope encryption, different DEKs per row approach. The DEK can only be decrypted by the KEK stored in AWS KMS…
    • Message tampering / Fake messages could be placed on the queue: Fake/tampered messages could be placed on the queue, resulting in incorrect processing by the service. | Tampering / Spoofing | Yes | High | All messages are signed by service that sent them and the receiving service checks the signature using the service public key.
    • An attacker could upload malicious code: An attacker could upload malicious code on the server. | Tampering | Yes | High | File uploads are disabled on application services. Container filesystems are read-only.
    • MITM attacks: Attacker can intercept requests between the user and the application or between the applications. | Information disclosure | Yes | High | SSL is used on all endpoints. Valid HSTS policy is present on the website. Microservice communication is secured with mTLS.
  10. ✔ Business Associate Addendum (BAA) ✔ Only HIPAA-Eligible services ✔

    AWS Well-Architected Framework ✔ Architecting for HIPAA Security and Compliance on AWS Whitepaper ✔ https://aws.amazon.com/compliance/hipaa-compliance/ ✔ https://aws.amazon.com/health/providers/ ✔ https://aws.amazon.com/compliance/programs/ AWS and Compliance
  11. RDS / SQL • Enable storage encryption (at rest, KMS

    key) • Transport encryption ◦ CA (ex: rds-ca-ecc384-g1 / rds-ca-rsa4096-g1) ◦ REQUIRE SSL for all users • Enable backups, retention period (30-35 days) • No public access • Enable deletion protection
  12. What can go wrong? • Leak patient information publicly •

    Leaked / stolen credentials • Allow doctors to view information for patients who are not theirs (isolation of resources); BOLA (Broken Object Level Authorization) • Create / modify / delete cases of other doctors; BFLA (Broken Function Level Authorization) • Someone can see or modify an object property that they should not have access to; BOPLA (Broken Object Property Level Authorization = Excessive data exposure / Mass assignment) • Mix up patient records or prescriptions or expose sensitive health data in wellness apps • Complexity • So many other…
  13. Treating doctors vs Backoffice Client portal • Access to his/her

    own cases only • Limited access and functions • Scoped to his/her account • Limited impact • …. Backoffice • Access to all cases for all doctors • Accounting • Administrative features • Depending on the role • For all accounts • Huge impact • ….
  14. Backoffice security • Network access ◦ IP Restriction / VPN

    ◦ mTLS • MFA - Mandatory • Roles (Principle of least privilege)
  15. Broken Object Level Authorization Broken Function Level Authorization • Permission

    checker in command / query handlers • Security context for commands (type of logged user / meta) • Teammate vs Client • Unit testing
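
The permission checks above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: `SecurityContext`, `Case`, the `case_admin` role, and the handler names are all hypothetical. It shows an object-level check (BOLA) and a function-level check (BFLA) that run inside the handler, using the type of the logged-in user.

```python
from dataclasses import dataclass
from enum import Enum


class UserType(Enum):
    CLIENT = "client"      # treating doctor using the client portal
    TEAMMATE = "teammate"  # back-office staff


@dataclass(frozen=True)
class SecurityContext:
    """Who is issuing the command or query (hypothetical shape)."""
    user_id: str
    user_type: UserType
    roles: frozenset = frozenset()


@dataclass(frozen=True)
class Case:
    case_id: str
    doctor_id: str


class AccessDenied(Exception):
    pass


def assert_can_view_case(ctx: SecurityContext, case: Case) -> None:
    """Object-level check: clients may only see their own cases."""
    if ctx.user_type is UserType.CLIENT and case.doctor_id != ctx.user_id:
        raise AccessDenied("case belongs to another doctor")


def assert_can_delete_case(ctx: SecurityContext, case: Case) -> None:
    """Function-level check: ownership for clients, an explicit role for teammates."""
    if ctx.user_type is UserType.CLIENT:
        assert_can_view_case(ctx, case)
    elif "case_admin" not in ctx.roles:
        raise AccessDenied("role 'case_admin' required")


def handle_get_case(ctx: SecurityContext, case_id: str, cases: dict) -> Case:
    """Query handler: load the case, then check access before returning it."""
    case = cases[case_id]
    assert_can_view_case(ctx, case)
    return case
```

Because the checks are plain functions that take the security context as an argument, they can be unit tested without a web framework or a database.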
  16. Broken Object Property Level Authorization • Query handlers return view

    models (DTO) • View models are reviewed and approved • Different view models for teammates / clients
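
A minimal sketch of separate view models for clients and teammates. All class and field names here are hypothetical. The point is that query handlers copy fields into a reviewed view model explicitly, so a field added to the internal record is not exposed by accident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    """Internal entity; never returned to callers directly."""
    case_id: str
    patient_name: str
    status: str
    internal_notes: str
    invoice_total_cents: int


@dataclass(frozen=True)
class ClientCaseView:
    """What a treating doctor sees in the client portal."""
    case_id: str
    patient_name: str
    status: str


@dataclass(frozen=True)
class TeammateCaseView:
    """What back-office staff see."""
    case_id: str
    patient_name: str
    status: str
    internal_notes: str
    invoice_total_cents: int


def to_client_view(record: CaseRecord) -> ClientCaseView:
    # Explicit field-by-field copy: new CaseRecord fields stay hidden until reviewed.
    return ClientCaseView(record.case_id, record.patient_name, record.status)


def to_teammate_view(record: CaseRecord) -> TeammateCaseView:
    return TeammateCaseView(
        record.case_id,
        record.patient_name,
        record.status,
        record.internal_notes,
        record.invoice_total_cents,
    )
```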
  17. ✔ Encryption in transit ✔ Encryption at rest ✔ Key

    rotation ✔ Encryption as a service ✔ Crypto-shredding Battle-tested libraries: https://github.com/paragonie/halite Encryption
  18. Do not store the keys and data on the same

    system! ▪ Version control ▪ Container images ▪ In the database* ▪ On the application server (env/memory) ▪ Encryption as a service Now add the following app vulnerabilities in the mix: ▪ SQL Injection ▪ Upload and execute code on the server Key storage
  19. Encrypt the plaintext data with a data key and then

    encrypt the data key with another key. Envelope encryption
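
A minimal sketch of envelope encryption with one data key (DEK) per row. It assumes the third-party `cryptography` package and uses AES-256-GCM for both layers. The key-encryption key (KEK) is a local byte string here purely for illustration; with a managed service such as AWS KMS the KEK would stay inside the service and the wrap/unwrap steps would be API calls. The function names and record layout are hypothetical.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_row(kek: bytes, plaintext: bytes, row_id: bytes) -> dict:
    """Encrypt plaintext with a fresh DEK, then wrap the DEK with the KEK."""
    dek = AESGCM.generate_key(bit_length=256)
    data_nonce = os.urandom(12)
    key_nonce = os.urandom(12)
    return {
        # row_id is bound as associated data, so a blob cannot be moved to another row.
        "ciphertext": AESGCM(dek).encrypt(data_nonce, plaintext, row_id),
        "data_nonce": data_nonce,
        "wrapped_dek": AESGCM(kek).encrypt(key_nonce, dek, row_id),
        "key_nonce": key_nonce,
    }


def decrypt_row(kek: bytes, record: dict, row_id: bytes) -> bytes:
    """Unwrap the DEK with the KEK, then decrypt the row data with the DEK."""
    dek = AESGCM(kek).decrypt(record["key_nonce"], record["wrapped_dek"], row_id)
    return AESGCM(dek).decrypt(record["data_nonce"], record["ciphertext"], row_id)
```

Only the wrapped DEK is stored next to the data, so a database dump alone is not enough to decrypt anything.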
  20. 1. Assuming the attacker can see the encrypted data and

    the blind indexes (access to a full SQL dump) 2. The application server has access / can use the encryption service and blind index keys (blind / data keys stored on the server encrypted with KEK) 3. The database server stores the encrypted data and blind indexes (no keys are stored on the database server) 4. The attacker cannot upload or execute malicious code on the application server Encryption threat model
  21. AWS KMS • Fully managed service • Symmetric / asymmetric

    encryption • Envelope encryption • Key rotation / management • Compliance with regulations • Cost ◦ $0.03 per 10,000 requests ◦ $0.10 per 10,000 generated data keys
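
To get a feel for how the prices quoted above add up, here is a small worked calculation. The workload figures (rows, reads, number of doctors) are made up for illustration, and the sketch assumes each uncached read of a per-row DEK costs one KMS request.

```python
from fractions import Fraction

# Prices quoted on the slide, in USD per 10,000 units.
PRICE_PER_10K_REQUESTS = Fraction(3, 100)
PRICE_PER_10K_DATA_KEYS = Fraction(10, 100)


def kms_cost_usd(data_keys_generated: int, requests: int) -> float:
    """Cost of generating data keys plus all other KMS requests."""
    cost = (
        Fraction(data_keys_generated, 10_000) * PRICE_PER_10K_DATA_KEYS
        + Fraction(requests, 10_000) * PRICE_PER_10K_REQUESTS
    )
    return float(cost)


# Hypothetical: 1,000,000 rows with one DEK each, 5,000,000 reads, no DEK caching.
per_row = kms_cost_usd(data_keys_generated=1_000_000, requests=5_000_000)  # 10 + 15 = 25 USD

# Hypothetical: one DEK per doctor (200 doctors), each unwrapped once and cached.
per_doctor = kms_cost_usd(data_keys_generated=200, requests=200)
```

The second figure comes out well under one cent, which is why encryption granularity is a trade-off between security, performance, and cost rather than a purely technical choice.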
  22. ✔ Encryption granularity that fits your risk model ✔ One

    DEK per client (for all bounded contexts) ✔ One DEK per aggregate or bounded context / service ✔ One DEK per projection / stream / use case ✔ One DEK per database row Security, performance and cost
  23. Write model != read model Write / command model •

    Data model is built for ACID writes • Patient data is encrypted with a different DEK per database row • Encrypted blobs contain a lot of different data elements • Optimized for reading one case / case element at a time • Inefficient encryption model for building listings Read / query / view model • Listings containing personal or sensitive information • Only some of the elements from each entity are needed • Listings need to display information from different bounded contexts (case progress, notifications, payments) • Cost effective encryption model that supports listings
  24. • Read only data model used for queries • Creatable

    / rebuildable from primary data models • Specifically created for a problem / question at hand • Eventually consistent or not (same transaction, async… implementation detail) • Stored in the same persistence as primary data or another (SQL, NoSQL, Elasticsearch, Neo4j, etc.) • Disposable Projections
  25. ✔ Blind index is created by applying hash functions and/or

    key-stretching algorithms on plaintext, using a secret key ✔ Blind indexes for exact match search ✔ Blind index size – smaller has more false positives, bigger is vulnerable to leakage attacks ✔ Normalize the query (uppercase / lowercase or something else) ✔ Calculate the blind index from the normalized query ✔ Do an exact match on the blind index ✔ Database returns X results ✔ Decrypt the results to find the one you are looking for Battle-tested libraries: https://github.com/paragonie/ciphersweet Searchable encryption / blind indexes
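
The lookup steps above can be sketched with a truncated HMAC-SHA256 as the blind index. This is a plain-stdlib illustration of the idea, not how CipherSweet builds its indexes; the column names `name_enc` and `name_bidx`, the 4-byte index size, and the `decrypt` callable are assumptions.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Case-fold and collapse whitespace so equivalent inputs index identically."""
    return " ".join(value.casefold().split())


def blind_index(value: str, index_key: bytes, size_bytes: int = 4) -> str:
    """Keyed hash of the normalized value, truncated to limit what it leaks."""
    digest = hmac.new(index_key, normalize(value).encode("utf-8"), hashlib.sha256).digest()
    return digest[:size_bytes].hex()


def find_by_name(rows, query: str, index_key: bytes, decrypt):
    """Exact match on the index column, then decrypt to drop false positives."""
    wanted = blind_index(query, index_key)
    # In a real system this filter is the database query: WHERE name_bidx = :wanted
    candidates = [row for row in rows if row["name_bidx"] == wanted]
    return [
        row for row in candidates
        if normalize(decrypt(row["name_enc"])) == normalize(query)
    ]
```

The index key must be different from the data encryption key and, like it, must not be stored on the database server.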
  26. Search/sort by patient name in all cases belonging to the

    same doctor (example) • Extract and group data by tenant / doctor and create a projection / helper view (ex: Get the patient names and case IDs for all active cases grouped by doctor) • Normalize data if necessary (encoding, case, etc…) • Store data in a single encrypted blob • Event driven updates • When searching decrypt the blob • Perform the search/sorting in memory and return the list of case IDs • Find the matching cases by ID from the DB or combine with other filters by unencrypted fields • Cursor pagination Searching and sorting encrypted data
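
A minimal sketch of the in-memory step, assuming the projection blob for one doctor has already been decrypted into `(patient_name, case_id)` pairs. The function name, the substring match, and the use of the last case ID as the cursor are illustrative choices; the cursor deliberately carries a case ID rather than a patient name so that no sensitive data leaves the server.

```python
def search_cases(entries, query: str, after_case_id=None, limit: int = 20):
    """Filter and sort decrypted projection entries, returning one page of case IDs."""
    needle = query.casefold()
    matches = sorted(
        (name.casefold(), case_id)
        for name, case_id in entries
        if needle in name.casefold()
    )
    ids = [case_id for _, case_id in matches]
    # Resume right after the case the previous page ended on.
    start = ids.index(after_case_id) + 1 if after_case_id in ids else 0
    page = ids[start:start + limit]
    next_cursor = page[-1] if start + limit < len(ids) else None
    return page, next_cursor
```

The returned case IDs can then be used to load the matching cases from the database, optionally combined with filters on unencrypted fields.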
  27. File uploads • Treating doctors can add photos, x-rays, scans, medical

    imaging and other health-related files • Orthodontists can also do the same • Externally connected systems too • File size up to 4GB • Minimum retention period of case files (project specific) ◦ 3 months for unsubmitted cases ◦ 10 years for submitted cases
  28. File upload risks • Execution of the uploaded file (Script

    injection / Directory traversal attacks) • Resource exhaustion (CPU/Network) • Storage space exhaustion (DoS) • Malware uploads • Many more…
  29. File names and metadata • Client provided names, ex: Goce_Bonev_Maxilla_20230217_1135.stl

    • Never include sensitive data such as names or SSN in the file name Goce_Bonev_Sofia_Hayduska_Gora_37_SSN_123.jpg • Don’t use sequential or predictable names 123.jpg (enumeration attacks) • Standardize filenames with filename patterns (new CaseFilenamePattern(CaseId $caseId))->originalFile() • Use file IDs that are not sequential (UUID4, ULID) + checksum / signature 4iOgG1pA41D5maKytUqWe-RJf0XP.jpg • Always store the original file + checksum (data integrity) • Encrypt the extracted EXIF metadata (In transit / At rest) • Disable directory indexing
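
A minimal sketch of non-sequential, signed file names and a content checksum, using only the standard library. The name layout (file ID, underscore, signature, extension), the 32-character truncated HMAC-SHA256 signature, and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import uuid


def _sign(signing_key: bytes, file_id: str, extension: str) -> str:
    message = f"{file_id}.{extension}".encode("utf-8")
    return hmac.new(signing_key, message, hashlib.sha256).hexdigest()[:32]


def signed_filename(signing_key: bytes, extension: str) -> str:
    """Random (UUID4) file ID plus a signature only the application can produce."""
    file_id = uuid.uuid4().hex
    return f"{file_id}_{_sign(signing_key, file_id, extension)}.{extension}"


def is_valid_filename(signing_key: bytes, filename: str) -> bool:
    """Recompute the signature and compare in constant time."""
    stem, _, extension = filename.rpartition(".")
    file_id, _, signature = stem.partition("_")
    expected = _sign(signing_key, file_id, extension)
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))


def sha256_checksum(content: bytes) -> str:
    """Checksum stored alongside the original file for integrity checks."""
    return hashlib.sha256(content).hexdigest()
```

The stored name carries no patient data and cannot be guessed or enumerated, and a name that was not issued by the application fails validation.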
  30. AWS S3 • Scalability, durability, and high availability • Security

    features (access control, versioning, logs) ◦ Use only HTTPS endpoints (encryption in transit) ◦ Enable bucket encryption (at rest) ◦ Versioning • Presigned URLs • Object Lock, WORM (Write once, read many) • Scalable • Compliance with regulations • Cost
  31. Standard upload flow • Limited by the resources of the

    web app, max file size we can process • We need to validate the file on the web app (resource usage) • Temporary upload file is stored on the local filesystem of the web app • File uploads are enabled on the server • Bottleneck • Scalability issues
  32. Pros • Supports large file uploads • Autoscaling • Low resource usage of

    the main app • Final destination bucket events (file overwrite lambda, deletions etc., CloudWatch alarms) • Versioning on TB allows checking if the token has been reused. TFPL checks if the file has more than one version and, if it does, triggers an exception. • TFPL lambda can do validation of the filename (signature provided by the app) and the file contents. {file_id}_{signature}.{extension} • Real file name is overridden when uploading to bucket; all files are uploaded as file.extension. The bucket does not need to know the real file name.
  33. Standard download flow • High data transfer through the app

    server • Bottleneck • Scalability issues
  34. Monolith vs microservices • Compartmentalization on service and database level

    • Per service encryption keys • Reduced information exposure if one service is breached (no access to data from other services – depending on the system design) • PoLP - Each service has access to only the permissions that it needs (billing has no access to the patient file bucket) • red • Complexity + new attack vectors
  35. Non repudiation • Non repudiation - I did not do

    that! • Append only records • Immutable – once written, records cannot be altered • Audit trail • Compliance requirement • WORM Storage • MySQL/MariaDB – Allow only INSERT, SELECT and disable UPDATE, REPLACE and DELETE for the app database user • MySQL/MariaDB – Use the ARCHIVE storage engine
  36. Logs / APM / Monitoring • Logs and logging solutions

    • Error tracking and performance monitoring • Proper data scrubbing • Proper data anonymization
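
A minimal sketch of data scrubbing at the logging layer, using a standard-library `logging.Filter`. The two patterns (email addresses and US-style SSNs) are examples only; a real deployment would need patterns that match its own sensitive fields, and pattern matching alone will not catch free-text names.

```python
import logging
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class ScrubbingFilter(logging.Filter):
    """Redact sensitive-looking values before a record reaches any handler output."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Merge the arguments into the message first so they are scrubbed too.
        message = record.getMessage()
        message = EMAIL.sub("[email]", message)
        message = SSN.sub("[ssn]", message)
        record.msg, record.args = message, ()
        return True
```

Attaching the filter to a handler with `handler.addFilter(ScrubbingFilter())` applies it to every record that handler emits, including records coming from third-party libraries.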
  37. Data Anonymization for Testing Environments • Production data is highly

    sensitive • Production data is sometimes needed to ensure realistic testing conditions • Export function: extracts data, strips it of PII, and replaces it with fictional but realistic data • Comply with HIPAA and GDPR for data handling and anonymization processes • Limited access to the export function, audit logs
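
A minimal sketch of the replacement step of such an export function. The row layout, the name lists, and the keyed mapping are all hypothetical. A keyed, deterministic mapping keeps test data consistent between exports, but it is closer to pseudonymization than to full anonymization, so whether it satisfies a given regulation has to be assessed separately.

```python
import hashlib
import hmac

# Fictional replacement values (illustrative lists only).
FIRST_NAMES = ["Alex", "Maria", "Ivan", "Sofia", "Peter", "Elena"]
LAST_NAMES = ["Stone", "Rivers", "Field", "Hill", "Brook", "Wood"]


def _pick(options, secret: bytes, field: str, original: str) -> str:
    """Deterministically map an original value to one of the fictional options."""
    digest = hmac.new(secret, f"{field}:{original}".encode("utf-8"), hashlib.sha256).digest()
    return options[int.from_bytes(digest[:4], "big") % len(options)]


def anonymize_patient(row: dict, secret: bytes) -> dict:
    """Return a copy of the row with PII replaced; non-PII fields are kept."""
    out = dict(row)
    out["first_name"] = _pick(FIRST_NAMES, secret, "first", row["first_name"])
    out["last_name"] = _pick(LAST_NAMES, secret, "last", row["last_name"])
    out["email"] = f"patient-{row['id']}@example.test"
    out["ssn"] = None  # dropped rather than replaced
    return out
```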