ComplianceOps: Containers in regulated environments

Container Days Boston 2016, an overview of using container technologies in regulated environments. Discusses secure configuration practices, mental models for navigating regulations, and practices to enable secure delivery of polyglot software.

Elliot Murphy

May 24, 2016

Transcript

  1. CONTAINERS IN REGULATED
    ENVIRONMENTS
    Elliot Murphy @sstatik
    Goal is an overview, launch point to help you be successful with using some very nice tools in a regulated environment. Delighted to see Jeff talking in detail about Vault,
    and the talk about Clair.

    * concepts and mental models that are helpful when trying to survive a regulated environment

    * containers as a virtualization technology

    * containers as packaging and deployment technology

  2. CAN I USE CONTAINERS?
    Docker or Rkt?
    Alpine or RancherOS?
    Kubernetes or ECS?
    AWS or Google Cloud or Azure?
    Usually asking a more specific question.

  3. THE ANSWER IS ALWAYS YES
    containers?
    mobile phones?
    pencil and paper?
    database technology from the 1960s?
    The answer is always yes, you can use any given technology with any given regulatory environment. The question is: how does X affect my controls?

    Strengthen?

    Weaken?

    Call for completely new controls?

  4. CONTROLS
    Technical controls
    Administrative controls
    Examples of technical controls: you must use encryption when transmitting data, you must authenticate users before exposing any data, you must have roles
    that define which data can be accessed.

    Examples of administrative controls: your firewall logs must be reviewed daily, you must screen personnel prior to hiring using background checks, you must
    monitor your service providers and vendors for their compliance status.

    Given a set of controls, some technology is a good fit and some is a bad fit. Containers are getting there and are here to stay.

    As soon as you start talking about controls, you run smack into people problems.

  5. IGNORANCE
    Not knowing what the rules are or not knowing how the
    technology works.
    This one is sad, and it happens at all levels: legislators, managers, regulators, auditors, technologists.

    I regularly talk to and hear about companies breaking the rules out of pure ignorance. One company was processing healthcare data using AWS RDS PostgreSQL
    and didn’t know that PostgreSQL is currently excluded from the AWS BAA.

    One doctor doing drug research was not following the rules in 21 CFR Part 11 about accuracy, reliability, integrity, and authenticity of records, and a bunch of drug research was
    thrown out.

  6. OPTIMISM BIAS
    The person who will do my audit doesn’t understand technology
    as much as I do
    This regulation is wasteful and enforcement is lax
    Assuming that insiders (developers, managers, system
    administrators) are honest
    Maybe a set of regulations doesn’t specifically address or account for a new technology. This happened recently with PCI 2 and the invention of JavaScript-based
    credit card processor integration with client-side encryption. When PCI 3 came out it specifically addressed sloppy practices by customers of Recurly and Stripe.

    Sure, regulatory requirements have a cost. What you get for that cost varies, but it is generally intended as additional resilience against failures. By that logic, running load balancers and
    clusters is also wasteful.

    Optimism bias is a nice trait of people, probably crucial for the survival of humanity, but it can interfere with sound reasoning about a regulated system.

  7. LACK OF SYSTEMS THINKING
    Reliability of a component vs the safety of a system
    Safety as an emergent property
    Group cognition
    The right way to work must be the EASIEST way to work
    If you design a bunch of reliable components, surely when assembled into a system they will be safe? Nope. Safety is a property that emerges from interactions given a
    specific context. It is totally possible for a system to be safe but unreliable. Very commonly held ideas about causality, root causes, are totally wrong. Safety and reliability
    often conflict! Is safety part of the mission, or merely a constraint? Accidents are complex processes involving the entire socio-technical system. Often mental models
    contribute to human error.

    It gets even more complex when you consider that it is not just one human making a decision: it is multiple humans networked together with multiple cognitive devices (other brains,
    other computers), and often no one person can comprehend the entire system as it runs. An example of group cognition is the navigation team for a ship, or the engineering
    team for any modern application that you might want to…run in containers.

  8. INVENTION, REGULATION,
    ENFORCEMENT CYCLE
    Time delay between technology availability and update of regulations
    Sometimes the laws stay the same but the interpretation and
    enforcement changes
    • Eventually technology is refined to make compliance easier
    • Castle, D., Kumagai, K., Berard, C., Cloutier, M., & Gold, R. (2009). A model of regulatory burden in technology diffusion: The case of
    plant-derived vaccines. http://www.agbioforum.org/v12n1/v12n1a10-castle.htm
    Many examples in our careers of technology leapfrogging regulations: introduction of networks, explosion of the web, explosion of mobile phones.

    In 2011 the Joint Commission ruled that it is not acceptable for doctors to text orders for patient care, services, or treatment. In May 2016, the Joint Commission revised its position,
    allowing secure texting for transmission of orders, and defined the characteristics of a secure texting platform (based on a review of industry-developed technology).

    PCI DSS 3.0 updated in 2014, look at SAQ A for card-not-present merchants with all cardholder data functions fully outsourced.

    PCI DSS 3.0 section 2.2.1 specifically talks about virtualization, one primary function per server to prevent functions that require different security levels from co-existing
    on the same server (web, DB, DNS on different servers)

    The Castle et al. paper is an interesting example of trying to model out different approaches: it discusses 3 models for vaccine development, production, and distribution with varying
    regulatory burdens and tries to model the impact on disease for a given population with each approach.

  9. DO A GOOD JOB
    Why do regulations exist?
    Safety, harm reduction, risk management
    Don’t lose sight of the big picture
    Some developers have a selective allergy to cost. Don’t do that, be willing to invest the same energy that goes into optimizing, debugging, inventing cool things.

    Don’t become so obsessed with checking off the boxes that you lose sight of the big picture. For example, sometimes folks working on HIPAA become so focused on the
    obligation to protect privacy that they forget about the patient’s right to disclose. Sadly, some corporations misapply regulations in an attempt to justify anti-competitive
    behavior and obstruct data sharing that would result in better, cheaper, safer patient care.

  10. READING LIST
    • https://mitpress.mit.edu/books/engineering-safer-world Nancy Leveson
    • https://mitpress.mit.edu/books/cognition-wild Edwin Hutchins
    • https://www.hks.harvard.edu/fs/msparrow/Publications--Books--Character%20of%20Harms.html Malcolm Sparrow
    • http://www.tempobook.com/ Venkatesh Rao
    Nancy Leveson discusses safety, causality, and a model for safety. Fascinating analysis of accidents and the entire socio-technical systems involved.

    Edwin Hutchins’s work on group cognition is amazing: a case study of a navy team operating a ship and how computations are performed in the group as a weird sort of
    distributed system.

    The Character of Harms is interesting. It takes a rather adversarial approach focused on mitigation of bad things, and is worth reading to understand the mindset of an auditor or
    regulator and to temporarily snap you out of optimism bias.

    Tempo is about narrative-driven decision making, and is incredibly helpful when deciding how to engage and interact with the various authorities. I have to imagine
    some of this was going on with the vendors that spearheaded the work to get the ruling reversed on secure texting.

  11. CONTAINERS AS
    VIRTUALIZATION TECHNOLOGY
    Distinguish between infrastructure or execution concerns and the
    application management and configuration concerns.
    Excellent overview from Randy Bias talking about VT-x, hypervisor security, paravirtualization:
    http://cloudscaling.com/blog/cloud-computing/will-containers-replace-hypervisors-almost-certainly/
    Another way to put it is running the containers vs building the containers.

  12. UNDERSTANDING AND
    HARDENING LINUX CONTAINERS
    Important paper from NCC Group published in April 2016
    https://www.nccgroup.trust/globalassets/our-research/us/whitepapers/2016/april/ncc_group_understanding_hardening_linux_containers-10pdf/
    Covers Docker, LXC, Rkt with specific hardening recommendations
    Managing security artifacts such as secrets, keys, passwords
    I will hit a few of the key areas to think about from the paper, but it’s far too detailed to cover in a single talk. There are specific hardening recommendations for each of these three
    container engines.

    It also talks about managing security artifacts: don’t put passwords and keys in your source tree! Don’t put passwords in your Dockerfiles! Environment variables still
    carry a level of risk. Use Vault.
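
As a rough illustration of the "no passwords in your Dockerfiles" advice, here is a minimal Python sketch of a pre-commit style check. The function name `find_hardcoded_secrets` and the list of suspicious key names are assumptions made for this example, not something from the NCC Group paper, and a name-based heuristic like this will miss plenty of cases.

```python
import re

# Hypothetical heuristic: ENV/ARG keys whose names suggest a credential.
SUSPICIOUS_KEY = re.compile(
    r"(PASSWORD|PASSWD|SECRET|TOKEN|API_?KEY|PRIVATE_KEY)", re.IGNORECASE
)
# Matches "ENV KEY=value", "ENV KEY value", and "ARG KEY=value".
ASSIGNMENT = re.compile(
    r"^\s*(ENV|ARG)\s+([A-Za-z_][A-Za-z0-9_]*)(?:=|\s+)(\S.*)$", re.IGNORECASE
)


def find_hardcoded_secrets(dockerfile_text):
    """Return (line_number, key) pairs for ENV/ARG lines that embed a
    literal value under a credential-looking name."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        key, value = match.group(2), match.group(3).strip()
        # A reference such as $DB_PASSWORD is not a literal secret; skip it.
        if value.startswith("$"):
            continue
        if SUSPICIOUS_KEY.search(key):
            findings.append((number, key))
    return findings
```

A check like this only catches the most obvious mistake. It does nothing about the residual risk of environment variables at run time, which is where a secrets manager such as Vault comes in.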

  13. CONSIDER THE RULES FOR YOUR
    ENVIRONMENT
    Isolation from different types of containers?
    Isolation from other tenants?
    Updates of host systems?
    Use a host distro that was designed with containers in mind:
    CoreOS, RancherOS, AtomicHost
    As you select and configure your orchestration layer, do you have specific requirements to separate different types of containers from each other? Does your scheduling
    layer allow you to express those constraints and then enforce them? (i.e. DB container can’t run on the same host as the web app containers).
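
To make the separation constraint concrete, here is a small Python sketch that audits a placement after the fact. It is not the API of any particular scheduler; the names `colocation_violations`, `placement`, and `forbidden_pairs` are hypothetical, and a real orchestration layer would enforce the rule at scheduling time rather than report on it afterwards.

```python
from collections import defaultdict
from itertools import combinations


def colocation_violations(placement, forbidden_pairs):
    """placement maps container name -> (host, role).
    forbidden_pairs is a set of frozensets of two roles that must not
    share a host. Returns a sorted list of (host, role_a, role_b)."""
    roles_by_host = defaultdict(set)
    for host, role in placement.values():
        roles_by_host[host].add(role)

    violations = []
    for host, roles in roles_by_host.items():
        # Check every pair of distinct roles present on this host.
        for role_a, role_b in combinations(sorted(roles), 2):
            if frozenset((role_a, role_b)) in forbidden_pairs:
                violations.append((host, role_a, role_b))
    return sorted(violations)
```

Running a report like this against the live cluster is one way to produce evidence for an auditor that the constraint is actually being enforced.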

    Does your environment provide sufficient isolation from other tenants? For example, in AWS to be HIPAA compliant you have to use dedicated EC2 instances - you can run
    containers on those, but you can’t use Elastic Beanstalk.

    How will you update the host systems? If you are using a pre-cloud distro, how will you handle rebooting the container hosts when needed (kernel updates)? How about
    hardening the host? I recommend using a distro that was developed with container hosting in mind: CoreOS, RancherOS, AtomicHost.

  14. SECURE TRAFFIC TO/FROM
    CONTAINERS
    Encrypted from load balancer to container?
    Encrypted from container to database?
    Encrypted from container to message queue?
    Message queue durable storage on encrypted disks?
    Other services?
    Who will run your internal CA?

    This is an area that has been poorly documented for a long time.

    Vault can be your internal CA and you should use it!
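
On the client side of any of those hops, the application has to be told to trust the internal CA and nothing else. A minimal Python sketch using the standard `ssl` module follows; the function name `make_internal_tls_context` and the `ca_bundle_path` parameter are assumptions for this example, and the path would point at whatever PEM bundle your internal CA exports.

```python
import ssl


def make_internal_tls_context(ca_bundle_path=None):
    """Build a client-side TLS context that verifies the peer certificate
    and hostname.

    ca_bundle_path is a hypothetical path to the PEM bundle from your
    internal CA. When omitted, no trust roots are loaded, so connections
    will fail verification until one is supplied."""
    # PROTOCOL_TLS_CLIENT enables CERT_REQUIRED and hostname checks by default.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if ca_bundle_path is not None:
        # Trust only certificates that chain to the internal CA.
        context.load_verify_locations(cafile=ca_bundle_path)
    return context
```

The returned context can then be passed to `context.wrap_socket(sock, server_hostname=...)` or to any client library that accepts an `SSLContext`.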

  15. LOGGING
    Do logs contain protected info?
    https://fpf.org/2016/04/25/a-visual-guide-to-practical-data-de-identification/
    Do you need to make logging or audit trails tamper resistant?
    Does your logging system provide support for automating the
    detection and alerting of key events to reduce your
    administrative burden?
    SumoLogic stands out as a particularly useful vendor (will sign a BAA). There are many competing options available here for collecting application logs.
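
One common way to make an audit trail tamper resistant is to chain entries together with hashes, so that changing an old entry invalidates everything after it. The Python sketch below illustrates the idea only; the function names and the entry layout are assumptions for this example, not a feature of any logging vendor mentioned here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(previous_hash, message):
    # Hash a canonical JSON encoding of the link and the message together.
    payload = json.dumps({"prev": previous_hash, "msg": message}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain, message):
    """Append a message, linking it to the hash of the entry before it."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"msg": message, "prev": previous_hash, "hash": _digest(previous_hash, message)}
    )


def verify_chain(chain):
    """Return True only if every entry matches its recorded hash and predecessor."""
    previous_hash = GENESIS
    for entry in chain:
        if entry["prev"] != previous_hash:
            return False
        if entry["hash"] != _digest(entry["prev"], entry["msg"]):
            return False
        previous_hash = entry["hash"]
    return True
```

This makes tampering evident rather than impossible: someone who can rewrite the whole chain can recompute every hash, so the latest hash still needs to be recorded somewhere the log writer cannot modify.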

  16. INTRUSION DETECTION &
    MONITORING
    Time for security team to learn some new tricks, ossec and
    auditd don’t fully support container environments and are harder
    to configure
    threatstack.com
    http://sysdig.org/falco
    datadoghq.com
    Sysdig Cloud
    Just like with the distros, don’t use a tool that was designed before containers. Better options are available that will greatly simplify your life.

    Threatstack is great: it is container aware, gives you compliance reports specifically tied back to chapter and verse of particular regulations, and alerts on specific types of
    activity.

    Falco is a new “Behavioral Activity Monitor with Container Support” that describes itself as an easy-to-use combo of snort, ossec, and strace.

  17. MALWARE DEFENSE
    Traditional antivirus is widely mocked as ineffective or actively
    harmful
    AWS reference architecture for PCI-DSS 3.0 completely ignores
    requirement 5: “Use and regularly update anti-virus software or
    programs” (on servers)
    strongarm.io
    This is an area where we get eye-rolls and derision instead of thinking about how to make a responsible choice in line with the spirit of the requirement, and about what would
    be the best way to address this risk in the new environment.

    strongarm.io is a Boston-area startup over in Wakefield that I like (and work with). It uses DNS as a control point and interrogates the malware as well as alerting on infection. The DNS servers can be injected
    via VPC DHCP options.

  18. CONTAINERS AS PACKAGING
    AND DEPLOYMENT TOOL
    http://rhelblog.redhat.com/2016/05/18/architecting-containers-part-5-building-a-secure-and-manageable-container-software-supply-chain/
    https://blog.acolyer.org/2016/03/30/diplomat-using-delegations-to-protect-community-repositories/
    CPAN, PyPI, RubyGems, PHP PEAR, NPM, go get, gradle
    Consider the provenance of all the source code in your container.

    Interesting perspective from Scott McCarty on modeling your container contents as a supply chain.

    Other interesting work being done is Diplomat, which looks at how to protect community repositories.
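
One small, concrete piece of the provenance question is checking that what you downloaded is what you expected. The Python sketch below compares an artifact against a pinned SHA-256 digest; the function name `matches_pinned_digest` is an assumption for this example. It demonstrates integrity checking only and is not the delegation scheme that Diplomat describes.

```python
import hashlib
import hmac


def matches_pinned_digest(artifact_bytes, pinned_sha256_hex):
    """Return True if the artifact's SHA-256 digest equals the pinned hex digest."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    # compare_digest avoids leaking where the strings differ via timing.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())
```

Pinning digests in your build tells you a dependency has not changed since you pinned it. It says nothing about whether the code was trustworthy in the first place, which is the larger supply chain question.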

  19. CONTAINER SCANNING
    https://github.com/OpenSCAP/container-compliance
    RedHat atomic scan http://developers.redhat.com/blog/2016/05/02/introducing-atomic-scan-container-vulnerability-detection/
    Docker Cloud security scanning https://docs.docker.com/docker-cloud/builds/image-scan/
    Docker best practices checking https://github.com/docker/docker-bench-security
    CoreOS clair static vulnerability analysis: https://github.com/coreos/clair
    https://www.twistlock.com/
    atomic scan defaults to OpenSCAP, but you can add other scanners.

    Docker Cloud offers image scanning.

    There will be a talk on Clair tomorrow.

    Twistlock is another offering that provides specific support for achieving compliance.

    Saw Aqua in the hall; they also help with image assurance.

  20. CONTAINER REGISTRY
    AWS ECS Registry
    Google Compute Engine Container Registry
    quay.io
    docker cloud
    VMWare Harbor
    quay.io from the CoreOS folks integrates Clair

    docker cloud has security scanning, I think it costs extra

    VMWare harbor has RBAC and auditing of image changes but no scanning

    many other registries

    The ideal is to use something like quay.io for your base images. Consider your attestation and signing requirements and where those will be enforced. The cloud-specific
    registries are appealing because they are convenient.

  21. SCANNING YOUR APP CODE
    Add static analysis of your app code to your build pipeline
    https://codeclimate.com/engines
    https://www.blackducksoftware.com/products/hub
    Run additional checks after build and before production
    http://gauntlt.org
    CodeClimate checks for code smells and security vulnerabilities (some security issues can be found via static analysis), and is extensible with additional rules.

    BlackDuck maps known vulnerabilities. There is some overlap with the container scanners we just talked about.

    Gauntlt can run interactive checks against your running software during the acceptance test phase. For example, even if you were implementing in a language where you
    didn’t have a static analysis scanner, you could use Gauntlt runs against your running app. It can find SQL injection holes and check for HTTP headers (based on the Mozilla
    secure coding standard).

    Containers make setting up this kind of infrastructure much much easier even when dealing with polyglot applications.
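
To give a feel for the kind of post-build check described above, here is a minimal Python sketch of an HTTP header check. It is not Gauntlt and does not use its attack file format; the function name and the `REQUIRED_HEADERS` baseline are assumptions for this example and should be replaced by whatever your own policy requires.

```python
# Hypothetical baseline; adjust to whatever your own policy requires.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Content-Security-Policy",
)


def missing_security_headers(response_headers, required=REQUIRED_HEADERS):
    """Return the required header names absent from a response, ignoring case."""
    present = {name.lower() for name in response_headers}
    return [name for name in required if name.lower() not in present]
```

In an acceptance test stage you would fetch a page from the running container, pass the response headers to this function, and fail the build if the returned list is not empty.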

  22. DOCUMENTATION
    Recording of configuration changes is more likely to happen in a
    container environment because of version control.
    Don’t forget to record the rationale for the changes!
    https://aws.amazon.com/about-aws/whats-new/2016/05/pci-dss-standardized-architecture-on-the-aws-cloud-quick-start-reference-deployment/
    Don’t make your commit message something stupid like “compliance LOL”

    Create a document mapping specific design decisions back to a particular regulation. This will empower future developers working on the project to change, improve, or
    drop obsolete configurations given future changes in context.

    A great starting point is the spreadsheet that Amazon published on Monday May 23 to accompany their reference PCI-DSS 3.0 architecture. The spreadsheet maps each
    requirement back to a specific design feature. Notice that a bunch are blank: they are not accounted for in the infrastructure-level design, so you should fill those in with your
    application-level choices.
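
The same mapping can live next to the code in version control. Below is a minimal Python sketch of a check that lists requirements with no recorded design decision. The function name is an assumption for this example, and the requirement numbers and decisions in the usage note are illustrative only, not taken from the Amazon spreadsheet.

```python
def unaddressed_requirements(mapping):
    """Return requirement IDs whose design decision is missing or blank.

    mapping is a dict of requirement ID -> free-text design decision
    (None or an empty string means nobody has filled it in yet)."""
    return sorted(
        req for req, decision in mapping.items() if not (decision or "").strip()
    )
```

For example, a mapping such as `{"4.1": "TLS between load balancer and containers", "5.1": ""}` would report `5.1` as still needing an application-level answer, which is a useful thing to fail a review on.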

  23. GOOD STARTING POINTS
    Amazon ECS running on Dedicated Instances
    Mesosphere DC/OS on Microsoft Azure
    Tectonic: Kubernetes, Rocket, CoreOS on AWS or packet.net
    Google Cloud Platform: Compute Engine, Cloud SQL,
    Kubernetes images.
    We’ve covered a LOT of ground, where to get started?

    Taking HIPAA compliance as one example, here are a few totally reasonable options that will not paint you into a corner with compliance.

    ECS - you can’t use Elastic Beanstalk.

    Microsoft - you can’t currently use Azure Container Service. Kudos to MS for actually publishing their BAA to the public, unlike Amazon and Google.

    packet.net is bare metal, with a full TPM story down to the firmware attestation.

    Google Cloud - it’s unclear to me whether Container Engine is covered because it runs on Compute Engine.

  24. QUESTIONS?
    Elliot Murphy @sstatik
    If you are working in a regulated environment I would love to talk to you afterwards, please come say hello.
