ComplianceOps: Containers in regulated environments
Container Days Boston 2016, an overview of using container technologies in regulated environments. Discusses secure configuration practices, mental models for navigating regulations, and practices to enable secure delivery of polyglot software.
an overview, a launch point to help you be successful using some very nice tools in a regulated environment. Delighted to see Jeff talking in detail about Vault, and the talk about Clair.
* concepts and mental models that are helpful when trying to survive a regulated environment
* containers as a virtualization technology
* containers as a packaging and deployment technology
paper? database technology from the 1960s? The answer is always yes: you can use any given technology with any given regulatory environment. The question is, how does X affect my controls? Strengthen them? Weaken them? Call for completely new controls?
would be: you must use encryption when transmitting data, you must authenticate users before exposing any data, you must have roles that define which data can be accessed. Examples of administrative controls would be: your firewall logs must be reviewed daily, you must screen personnel prior to hiring using background checks, you must monitor your service providers and vendors for their compliance status. Given a set of controls, some technology is a good fit and some is a bad fit. Containers are getting there, and they are here to stay. As soon as you start talking about controls, you run smack into people problems.
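One way to keep technical controls like these actionable is to express them as automated checks that run in your pipeline. A minimal sketch in Python - the config fields and control wording here are illustrative, not from any specific framework:

```python
# Hypothetical sketch: technical controls expressed as automated checks.
# The config dict shape and field names are made up for illustration.

def check_controls(config):
    """Return a list of control violations for a service config."""
    violations = []
    if not config.get("tls_enabled"):
        violations.append("data in transit must be encrypted (TLS)")
    if not config.get("auth_required"):
        violations.append("users must authenticate before accessing data")
    if not config.get("roles"):
        violations.append("roles must define which data can be accessed")
    return violations

service = {"tls_enabled": True, "auth_required": True, "roles": ["admin", "auditor"]}
print(check_controls(service))  # → []
```

The point is not the specific checks; it is that a control written as code can gate a deploy, while a control written only in a policy document cannot.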
how the technology works. This one is sad, and it happens at all levels: legislators, managers, regulators, auditors, technologists. I regularly talk to and hear about companies breaking the rules out of pure ignorance. One company was processing healthcare data using AWS RDS PostgreSQL and didn't know that PG is currently excluded from the AWS BAA. One doctor doing drug research was not following the rules in 21 CFR Part 11 about accuracy, reliability, integrity, and authenticity of records, and a bunch of drug research was thrown out.
safety of a system
* Safety as an emergent property
* Group cognition
* The right way to work must be the EASIEST way to work

If you design a bunch of reliable components, surely when assembled into a system they will be safe? Nope. Safety is a property that emerges from interactions in a specific context. It is entirely possible for a system to be safe but unreliable. Commonly held ideas about causality and root causes are totally wrong. Safety and reliability often conflict! Is safety part of the mission, or merely a constraint? Accidents are complex processes involving the entire socio-technical system. Mental models often contribute to human error. It gets even more complex when you consider that it is not just one human making a decision: it is multiple humans networked together with multiple cognitive devices (other brains, other computers), and often no one person can comprehend the entire system as it runs. An example of group cognition is the navigation team for a ship, or the engineering team for any modern application that you might want to…run in containers.
update of regulations
• Sometimes the laws stay the same but the interpretation and enforcement changes
• Eventually technology is refined to make compliance easier
• Castle, D., Kumagai, K., Berard, C., Cloutier, M., & Gold, R. (2009). A model of regulatory burden in technology diffusion: The case of plant-derived vaccines. http://www.agbioforum.org/v12n1/v12n1a10-castle.htm

There are many examples in our careers of technology leapfrogging regulations: the introduction of networks, the explosion of the web, the explosion of mobile phones. In 2011 the Joint Commission ruled that it is not acceptable for doctors to text orders for patient care, services, or treatment. In May 2016 the Joint Commission revised its position, allowing secure texting for transmission of orders, and defined the characteristics of a secure texting platform (based on a review of industry-developed technology). PCI DSS 3.0 was updated in 2014; look at SAQ A for card-not-present merchants with all cardholder data functions fully outsourced. PCI DSS 3.0 section 2.2.1 specifically addresses virtualization: one primary function per server, to prevent functions that require different security levels from co-existing on the same server (web, DB, DNS on different servers). The Castle et al. paper is an interesting example of trying to model out different approaches: it discusses 3 models for vaccine development, production, and distribution with varying regulatory burdens and tries to model the impact on disease for a given population under each approach.
reduction, risk management
Don't lose sight of the big picture. Some developers have a selective allergy to this kind of cost. Don't do that; be willing to invest the same energy that goes into optimizing, debugging, and inventing cool things. Don't become so obsessed with checking off the boxes that you lose sight of the big picture. For example, sometimes folks working on HIPAA become so focused on the obligation to protect privacy that they forget about the patient's right to disclose. Sadly, some corporations mis-apply regulations in an attempt to justify anti-competitive behavior and to obstruct data sharing that would result in better, cheaper, safer patient care.
• https://www.hks.harvard.edu/fs/msparrow/Publications--Books--Character%20of%20Harms.html Malcolm Sparrow
• http://www.tempobook.com/ Venkatesh Rao

Nancy Leveson discusses safety, causality, and a model for safety; a fascinating analysis of accidents and the entire socio-technical systems involved. Edwin Hutchins' work on group cognition is amazing: a case study of a navy team operating a ship and how computations are performed by the group as a weird sort of distributed system. The Character of Harms is interesting; it takes a rather adversarial approach focused on the mitigation of bad things, and it is worth reading to understand the mindset of an auditor or regulator and to temporarily snap you out of optimism bias. The Tempo book is about narrative-driven decision making, and is incredibly helpful when deciding how to engage and interact with the various authorities - I have to imagine some of this was going on with the vendors that spearheaded the work to get the ruling reversed on secure texting.
and the application management and configuration concerns. Excellent overview from Randy Bias talking about VT-x, hypervisor security, paravirtualization: http://cloudscaling.com/blog/cloud-computing/will-containers-replace-hypervisors-almost-certainly/ Another way to put it is running the containers vs. building the containers.
published in April 2016: https://www.nccgroup.trust/globalassets/our-research/us/whitepapers/2016/april/ncc_group_understanding_hardening_linux_containers-10pdf/
Covers Docker, LXC, and rkt with specific hardening recommendations. Managing security artifacts such as secrets, keys, passwords. Will hit a few of the key areas to think about from the paper, but it's far too detailed to cover in a single talk; there are specific hardening recommendations for each of these three container engines. It also talks about managing security artifacts - don't put passwords and keys in your source tree! Don't put passwords in your Dockerfiles! Environment variables still carry a level of risk. Use Vault.
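A sketch of that pattern with the Vault CLI of roughly this era - the paths and values are placeholders, and the exact commands may differ by Vault version:

```shell
# Anti-pattern: baking a secret into the image, e.g. in a Dockerfile:
#   ENV DB_PASSWORD=hunter2   <- ends up in image layers and `docker inspect`

# Better: write the secret to Vault once...
vault write secret/myapp/db password='s3cr3t'

# ...and have the container fetch it at startup with a short-lived token,
# so nothing sensitive lives in the image, the Dockerfile, or source control.
vault read -field=password secret/myapp/db
```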
of containers? Isolation from other tenants? Updates of host systems? Use a host distro that was designed with containers in mind: CoreOS, RancherOS, AtomicHost. As you select and configure your orchestration layer, do you have specific requirements to separate different types of containers from each other? Does your scheduling layer allow you to express those constraints and then enforce them? (i.e. a DB container can't run on the same host as the web app containers). Does your environment provide sufficient isolation from other tenants? For example, in AWS to be HIPAA compliant you have to use dedicated EC2 instances - you can run containers on those, but you can't use Elastic Beanstalk. How will you update the host systems? If you are using a pre-cloud distro, how will you handle rebooting the container hosts when needed (kernel updates)? How about hardening the host? Recommend using a distro that was developed with container hosting in mind: CoreOS, RancherOS, AtomicHost.
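As one example of expressing such a constraint, later Kubernetes releases let you declare pod anti-affinity so the scheduler keeps workloads apart. An illustrative sketch - the labels, names, and image are hypothetical:

```yaml
# Illustrative: keep the DB pod off any host already running a web pod
# (Kubernetes pod anti-affinity; names/labels are made up).
apiVersion: v1
kind: Pod
metadata:
  name: db
  labels:
    tier: db
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              tier: web
          topologyKey: kubernetes.io/hostname
  containers:
    - name: postgres
      image: postgres:9.5
```

The key question is whether your scheduler both lets you express the constraint and then enforces it at placement time.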
Encrypted from container to database? Encrypted from container to message queue? Message queue durable storage on encrypted disks? Other services? Who will run your internal CA? This is an area that has been poorly documented for a long time. Vault can be your internal CA and you should use it!
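A sketch of standing up Vault as an internal CA, using the mount-era CLI - the domain names, role, and TTLs are placeholders:

```shell
# Illustrative: Vault as an internal CA (names and TTLs are examples).
vault mount pki
vault write pki/root/generate/internal \
  common_name="internal.example.com" ttl=87600h

# A role constraining what certs services may request:
vault write pki/roles/db \
  allowed_domains="internal.example.com" allow_subdomains=true max_ttl=72h

# Issue a short-lived cert for the database endpoint:
vault write pki/issue/db common_name="db.internal.example.com"
```

Short-lived, automatically issued certs make it practical to encrypt container-to-database and container-to-queue traffic without hand-managing keys.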
need to make logging or audit trails tamper resistant? Does your logging system provide support for automating the detection and alerting of key events to reduce your administrative burden? SumoLogic stands out as a particularly useful vendor (will sign a BAA), many competing options available here for collecting application logs.
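One common technique for tamper resistance is hash-chaining log entries, so that editing or reordering any entry invalidates everything after it. A minimal illustrative sketch, not any particular vendor's implementation:

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel "previous hash" for the first entry

def append_entry(log, event):
    """Append an event whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    digest = hashlib.sha256(
        (prev + json.dumps(event, sort_keys=True)).encode()
    ).hexdigest()
    log.append({"event": event, "prev": prev, "hash": digest})

def verify_chain(log):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        digest = hashlib.sha256(
            (prev + json.dumps(entry["event"], sort_keys=True)).encode()
        ).hexdigest()
        if entry["prev"] != prev or entry["hash"] != digest:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"action": "login", "user": "alice"})
append_entry(log, {"action": "read", "user": "alice", "record": 42})
print(verify_chain(log))  # → True
log[0]["event"]["user"] = "mallory"
print(verify_chain(log))  # → False
```

In practice you would also ship entries off-host promptly, since a hash chain only makes tampering detectable, not impossible.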
some new tricks, ossec and auditd don't fully support container environments and are harder to configure
threatstack.com
http://sysdig.org/falco
datadoghq.com
Sysdig Cloud

Just like with the distros, don't use a tool that was designed before containers; better options are available that will greatly simplify your life. Threat Stack is great: container aware, gives you compliance reports tied back to chapter and verse of particular regulations, and alerts on specific types of activity. Falco is a new "Behavioral Activity Monitor with Container Support"; it describes itself as an easy-to-use combination of snort, ossec, and strace.
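To give a flavor of Falco, here is an illustrative rule in its YAML format, adapted from the project's early examples - the exact condition syntax may differ by version:

```yaml
# Illustrative Falco rule: alert when a shell starts inside a container.
- rule: shell_in_container
  desc: Notice a shell spawned inside a container
  condition: evt.type = execve and container.id != host and proc.name = bash
  output: "Shell in container (user=%user.name container=%container.id cmd=%proc.cmdline)"
  priority: WARNING
```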
actively harmful
AWS reference architecture for PCI-DSS 3.0 completely ignores requirement 5: "Use and regularly update anti-virus software or programs" (on servers)
strongarm.io

This is an area where we get eye-rolls and derision instead of thinking about how to make a responsible choice in line with the spirit of the requirement, and what would be the best way to address this risk in the new environment. A Boston startup over in Wakefield that I like (and work with) uses DNS as a control point and interrogates the malware as well as alerting on infection. DNS can be injected via VPC DHCP options.
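The VPC DHCP injection looks roughly like this with the AWS CLI - the resolver IP and resource IDs are placeholders:

```shell
# Illustrative: point a VPC's resolvers at a DNS-based control service.
aws ec2 create-dhcp-options \
  --dhcp-configurations "Key=domain-name-servers,Values=10.0.0.53"

# Associate the new options set with the VPC (IDs are placeholders):
aws ec2 associate-dhcp-options \
  --dhcp-options-id dopt-0123abcd --vpc-id vpc-0123abcd
```

Every instance in the VPC then resolves through the control point, containers included, with no per-host agent to deploy.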
protect-community-repositories/ CPAN, PyPI, RubyGems, PHP PEAR, NPM, go get, cradle Consider the provenance of all the source code in your container. Interesting perspective from Scott McCarty on modeling your container contents as a supply chain. Other interesting work being done is Diplomat, looking at how to protect community repositories
security scanning
Docker Cloud image scanning: https://docs.docker.com/docker-cloud/builds/image-scan/
Docker best practices checking: https://github.com/docker/docker-bench-security
CoreOS Clair static vulnerability analysis: https://github.com/coreos/clair
https://www.twistlock.com/
atomic scan defaults to OpenSCAP but can add other scanners

There will be a talk on Clair tomorrow. Twistlock is another offering with specific support for achieving compliance. Saw Aqua in the hall; they also help with image assurance.
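docker-bench-security runs as a container itself; an invocation along the lines of the project's README at the time (mounts may need adjusting for your host):

```shell
# Illustrative: run Docker's CIS best-practices checker against the local host.
docker run -it --net host --pid host \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /etc:/etc \
  --label docker_bench_security \
  docker/docker-bench-security
```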
quay.io
docker cloud
VMware Harbor

quay.io from the CoreOS folks integrates Clair. Docker Cloud has security scanning; I think it costs extra. VMware Harbor has RBAC and auditing of image changes, but no scanning. There are many other registries. The ideal is to use something like quay.io for your base images; consider your attestation and signing requirements and where those will be enforced. The cloud-specific registries are appealing because they are convenient.
code to your build pipeline
https://codeclimate.com/engines
https://www.blackducksoftware.com/products/hub
Run additional checks after build and before production: http://gauntlt.org

CodeClimate checks for code smells and security vulnerabilities; some security issues can be found via static analysis, and it is extensible with additional rules. Black Duck maps known vulnerabilities, with some overlap with the container scanners we just talked about. Gauntlt can run interactive checks against your running software during the acceptance test phase: for example, even if you were implementing in a language where you didn't have a static analysis scanner, you could run Gauntlt against your running app; it can find SQL injection holes and check for HTTP headers (based on the Mozilla secure coding standard). Containers make setting up this kind of infrastructure much, much easier, even when dealing with polyglot applications.
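As a flavor of what a Gauntlt check looks like, here is an illustrative `.attack` file using its curl adapter - the target URL and header are hypothetical:

```gherkin
Feature: Security headers on the app

  Scenario: X-Frame-Options is set
    When I launch a "curl" attack with:
      """
      curl -I http://my-app.example.com/
      """
    Then the output should contain:
      """
      X-Frame-Options
      """
```

Because the checks run against the deployed artifact rather than the source, they work the same regardless of what language the app is written in.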
in a container environment, because everything is in version control. Don't forget to record the rationale for the changes!
https://aws.amazon.com/about-aws/whats-new/2016/05/pci-dss-standardized-architecture-on-the-aws-cloud-quick-start-reference-deployment/

Don't make your commit message something stupid like "compliance LOL". Create a document mapping specific design decisions back to a particular regulation. This will empower future developers working on the project to change, improve, or drop obsolete configurations given future changes in context. A great starting point is the spreadsheet that Amazon published on Monday, May 23 to accompany their reference PCI-DSS 3.0 architecture; the spreadsheet maps each requirement back to a specific design feature. Notice that a bunch are blank - they are not accounted for in the infrastructure-level design, and you should fill those in with your application-level choices.
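The mapping document can be as simple as a machine-readable file checked in next to the code. An illustrative sketch - the requirement numbers are from PCI DSS 3.0, but the design decisions and file name are made up:

```yaml
# compliance-map.yml: map design decisions to specific requirements,
# with rationale, so future developers can revisit them as context changes.
- requirement: "PCI DSS 3.0 2.2.1"
  decision: "web, DB, and DNS run as separate containers on separate hosts"
  rationale: "one primary function per server; enforced via scheduler constraints"
- requirement: "PCI DSS 3.0 10.5.5"
  decision: "logs shipped off-host to a log service with integrity monitoring"
  rationale: "tamper detection on audit trails"
```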
DC/OS on Microsoft Azure
Tectonic: Kubernetes, Rocket, CoreOS on AWS or packet.net
Google Cloud Platform: Compute Engine, Cloud SQL, Kubernetes images

We've covered a LOT of ground - where to get started? Taking HIPAA compliance as one example, here are a few totally reasonable options that will not paint you into a corner with compliance. ECS: you can't use Elastic Beanstalk. Microsoft: you can't currently use Azure Container Service; kudos to MS for actually publishing their BAA to the public, unlike Amazon and Google. packet.net is bare metal (full TPM story down to firmware attestation). On Google Cloud it's unclear to me whether Container Engine is covered, because it runs on Compute Engine.