Container Days Boston 2016, an overview of using container technologies in regulated environments. Discusses secure configuration practices, mental models for navigating regulations, and practices to enable secure delivery of polyglot software.
CONTAINERS IN REGULATED
Elliot Murphy @sstatik
The goal is an overview, a launch point to help you be successful with using some very nice tools in a regulated environment. Delighted to see Jeff talking in detail about Vault,
and the talk about Clair.
* concepts and mental models that are helpful when trying to survive a regulated environment
* containers as a virtualization technology
* containers as packaging and deployment technology
CAN I USE CONTAINERS?
Docker or Rkt?
Alpine or RancherOS?
Kubernetes or ECS?
AWS or Google Cloud or Azure?
People are usually asking a more specific question.
THE ANSWER IS ALWAYS YES
pencil and paper?
database technology from the 1960s?
The answer is always yes: you can use any given technology with any given regulatory environment. The real question is, how does X affect my controls?
Call for completely new controls?
Examples of technical controls: you must use encryption when transmitting data, you must authenticate users before exposing any data, you must have roles
that define which data can be accessed.
Examples of administrative controls: your firewall logs must be reviewed daily, you must screen personnel prior to hiring using background checks, you must
monitor your service providers and vendors for their compliance status.
Given a set of controls, some technology is a good fit and some is a bad fit. Containers are getting there and are here to stay.
As soon as you start talking about controls, you run smack into people problems.
Not knowing what the rules are or not knowing how the
This one is sad, and it happens at all levels: legislators, managers, regulators, auditors, technologists.
I regularly talk to and hear about companies breaking the rules out of pure ignorance. One company was processing healthcare data using AWS RDS PostgreSQL
and didn’t know that PG is currently excluded from the AWS BAA.
One doctor doing drug research was not following the rules in 21 CFR Part 11 about accuracy, reliability, integrity, and authenticity of records; a bunch of drug research was
The person who will do my audit doesn’t understand technology
as much as I do
This regulation is wasteful and enforcement is lax
Assuming that insiders (developers, managers, system
administrators) are honest
Credit card processor integration with client-side encryption. When PCI 3 came out it specifically addressed sloppy practices by customers of Recurly and Stripe.
Sure, regulatory requirements have a cost. What you get for that cost varies, but it is generally intended as additional resilience against failures. By the same logic, running load balancers and
clusters is also wasteful.
Optimism bias is a nice trait of people, probably crucial for the survival of humanity, but it can interfere with sound reasoning about a regulated system.
LACK OF SYSTEMS THINKING
Reliability of a component vs the safety of a system
Safety as an emergent property
The right way to work must be the EASIEST way to work
If you design a bunch of reliable components, surely when assembled into a system they will be safe? Nope. Safety is a property that emerges from interactions given a
specific context. It is totally possible for a system to be safe but unreliable. Very commonly held ideas about causality and root causes are totally wrong. Safety and reliability
often conflict! Is safety part of the mission, or merely a constraint? Accidents are complex processes involving the entire socio-technical system. Often mental models
contribute to human error.
It gets even more complex when you consider that it is not just one human making a decision; it is multiple humans networked together with multiple cognitive devices (other brains,
other computers), and often no one person can comprehend the entire system as it runs. Examples of group cognition are the navigation team for a ship, or the engineering
team for any modern application that you might want to…run in containers.
Time delay between technology availability and update of regulations
Sometimes the laws stay the same but the interpretation and
• Eventually technology is refined to make compliance easier
• Castle, D., Kumagai, K., Berard, C., Cloutier, M., & Gold, R. (2009).
A model of regulatory burden in technology diffusion: The case of
Many examples in our careers of technology leapfrogging regulations: introduction of networks, explosion of the web, explosion of mobile phones.
In 2011 the Joint Commission ruled that it is not acceptable for doctors to text orders for patient care, services, or treatment. In May 2016, the Joint Commission revised its position,
allowing secure texting for transmission of orders, and defined the characteristics of a secure texting platform (based on review of industry-developed technology).
PCI DSS 3.0 was updated in 2014; look at SAQ A for card-not-present merchants with all cardholder data functions fully outsourced.
PCI DSS 3.0 section 2.2.1 specifically talks about virtualization: one primary function per server, to prevent functions that require different security levels from co-existing
on the same server (web, DB, DNS on different servers).
An interesting example of trying to model out different approaches: this paper discusses 3 models for vaccine development, production, and distribution with varying
regulatory burdens and tries to model the impact on disease for a given population with each approach.
DO A GOOD JOB
Why do regulations exist?
Safety, harm reduction, risk management
Don’t lose sight of the big picture
Some developers have a selective allergy to cost. Don’t do that, be willing to invest the same energy that goes into optimizing, debugging, inventing cool things.
Don’t become so obsessed with checking off the boxes that you lose sight of the big picture. For example, sometimes folks working on HIPAA become so focused on the
obligation to protect privacy that they forget about the patient’s right to disclose. Sadly some corporations misapply regulations in an attempt to justify anti-competitive
behavior and obstruct data sharing that would result in better, cheaper, safer patient care.
• https://mitpress.mit.edu/books/engineering-safer-world Nancy Leveson
• https://mitpress.mit.edu/books/cognition-wild Edwin Hutchins
Character%20of%20Harms.html Malcolm Sparrow
• http://www.tempobook.com/ Venkatesh Rao
Nancy Leveson discusses safety, causality, and a model for safety. Fascinating analysis of accidents and the entire socio-technical systems involved.
Edwin Hutchins’s work on group cognition is amazing: a case study of a navy team operating a ship and how computations are performed in the group as a weird sort of
The Character of Harms is interesting: it takes a rather adversarial approach focused on mitigation of bad things and is worth reading to understand the mindset of an auditor or
regulator and temporarily snap you out of optimism bias.
Tempo is about narrative-driven decision making, and is incredibly helpful when deciding how to engage and interact with the various authorities - I have to imagine
some of this was going on with the vendors that spearheaded the work to get the ruling reversed on secure texting.
Distinguish between infrastructure or execution concerns and the
application management and configuration concerns.
Excellent overview from Randy Bias talking about VT-x, hypervisor
security, paravirtualization http://cloudscaling.com/blog/cloud-
Another way to put it is running the containers vs building the containers.
HARDENING LINUX CONTAINERS
Important paper from NCC Group published in April 2016
Covers Docker, LXC, Rkt with specific hardening
Managing security artifacts such as secrets, keys, passwords
I will hit a few of the key areas to think about from the paper, but it’s far too detailed to cover in a single talk; there are specific hardening recommendations for these three
It also talks about managing security artifacts: don’t put passwords and keys in your source tree! Don’t put passwords in your Dockerfiles! Environment variables still
carry a level of risk. Use Vault.
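One way to back up the "no passwords in Dockerfiles" rule is a small lint step in the build pipeline. The following is only a minimal sketch: the function name, the keyword list, and the sample Dockerfile are all hypothetical, and a real policy would need broader patterns tuned to your stack.

```python
import re

# Hypothetical pattern: flags ENV/ARG instructions whose variable name looks
# secret-like and that carry an inline value. Tune the keywords for real use.
SECRET_PATTERN = re.compile(
    r"^\s*(ENV|ARG)\s+\S*(PASSWORD|SECRET|TOKEN|API_KEY|PRIVATE_KEY)\S*[=\s]+\S+",
    re.IGNORECASE,
)


def find_hardcoded_secrets(dockerfile_text):
    """Return (line_number, line) pairs that look like secrets baked into an image."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            findings.append((number, line.strip()))
    return findings


# Made-up Dockerfile used only to exercise the check.
example = """FROM alpine:3.4
ENV DB_PASSWORD=hunter2
ENV APP_PORT=8080
ARG GITHUB_TOKEN=abc123
"""

for number, line in find_hardcoded_secrets(example):
    print(f"line {number}: {line}")
```

A check like this only catches the obvious cases; it complements, and does not replace, fetching secrets at runtime from a dedicated secret store.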
CONSIDER THE RULES FOR YOUR
Isolation from different types of containers?
Isolation from other tenants?
Updates of host systems?
Use a host distro that was designed with containers in mind:
CoreOS, RancherOS, AtomicHost
As you select and configure your orchestration layer, do you have specific requirements to separate different types of containers from each other? Does your scheduling
layer allow you to express those constraints and then enforce them? (e.g. the DB container can’t run on the same host as the web app containers).
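Independent of which scheduler you pick, a separation rule such as "DB and web containers never share a host" can be audited against the actual placement. This is a rough sketch under assumed data shapes: the `placement` mapping, the role names, and the host names are invented for illustration and do not correspond to any particular scheduler's API.

```python
from collections import defaultdict


def placement_violations(placement, must_separate):
    """Find hosts that run roles which are required to be kept apart.

    placement: container name -> (role, host)
    must_separate: set of frozensets, each naming roles that may not share a host
    """
    roles_on_host = defaultdict(set)
    for _container, (role, host) in placement.items():
        roles_on_host[host].add(role)

    violations = []
    for host, roles in roles_on_host.items():
        for pair in must_separate:
            if pair <= roles:  # every role in the forbidden set is on this host
                violations.append((host, tuple(sorted(pair))))
    return sorted(violations)


# Hypothetical cluster state.
placement = {
    "web-1": ("web", "host-a"),
    "db-1": ("db", "host-a"),
    "web-2": ("web", "host-b"),
}
must_separate = {frozenset({"web", "db"})}

print(placement_violations(placement, must_separate))
```

Running the same check on a schedule gives an auditor evidence that the constraint is enforced, not just declared.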
Does your environment provide sufficient isolation from other tenants? For example, in AWS to be HIPAA compliant you have to use dedicated EC2 instances - you can run
containers on those, but you can’t use Elastic Beanstalk.
How will you update the host systems? If you are using a pre-cloud distro, how will you handle rebooting the container hosts when needed (kernel updates)? How about
hardening the host? I recommend using a distro that was developed with container hosting in mind: CoreOS, RancherOS, AtomicHost.
SECURE TRAFFIC TO/FROM
Encrypted from load balancer to container?
Encrypted from container to database?
Encrypted from container to message queue?
Message queue durable storage on encrypted disks?
Who will run your internal CA?
This is an area that has been poorly documented for a long time.
Vault can be your internal CA and you should use it!
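Whoever issues your internal certificates, each client inside the cluster still has to verify them. A minimal sketch using Python's standard `ssl` module is below; the function name and the `ca_file` parameter are assumptions for illustration, and the path to your internal CA bundle depends entirely on how certificates are distributed to containers.

```python
import ssl


def internal_client_context(ca_file=None):
    """Build a client-side TLS context that refuses unverified peers.

    ca_file is a hypothetical path to the PEM bundle of your internal CA;
    when omitted, the system trust store is used instead.
    """
    context = ssl.create_default_context(
        purpose=ssl.Purpose.SERVER_AUTH, cafile=ca_file
    )
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # reject older protocol versions
    context.check_hostname = True                     # peer name must match its certificate
    context.verify_mode = ssl.CERT_REQUIRED           # a valid certificate chain is mandatory
    return context


context = internal_client_context()
print(context.verify_mode, context.minimum_version)
```

The same context can be handed to database, message queue, or HTTP client libraries that accept an `ssl.SSLContext`, so that container-to-service traffic is both encrypted and authenticated.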
Do logs contain protected info?
Do you need to make logging or audit trails tamper resistant?
Does your logging system provide support for automating the
detection and alerting of key events to reduce your
SumoLogic stands out as a particularly useful vendor (will sign a BAA), many competing options available here for collecting application logs.
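To make the "tamper resistant" question concrete, one common technique is to chain log entries with a keyed hash so that editing or removing an earlier entry breaks verification. The sketch below is illustrative only: the function names and entry layout are made up, key management is out of scope, and this is not a description of how any particular logging vendor works.

```python
import hashlib
import hmac
import json


def _mac(message, previous_mac, key):
    """Keyed hash over the message and the previous entry's MAC."""
    payload = json.dumps({"message": message, "previous": previous_mac}, sort_keys=True)
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(chain, message, key):
    """Append a log entry that is linked to the entry before it."""
    previous_mac = chain[-1]["mac"] if chain else ""
    chain.append({
        "message": message,
        "previous": previous_mac,
        "mac": _mac(message, previous_mac, key),
    })
    return chain


def verify_chain(chain, key):
    """Return True only if every entry links to its predecessor and its MAC checks out."""
    previous_mac = ""
    for entry in chain:
        expected = _mac(entry["message"], previous_mac, key)
        if entry["previous"] != previous_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        previous_mac = entry["mac"]
    return True


key = b"demo-key-do-not-use"  # placeholder; a real key belongs in a secret store
chain = []
for message in ("user alice logged in", "record 17 viewed", "user alice logged out"):
    append_entry(chain, message, key)

print(verify_chain(chain, key))
```

Note that a chain like this detects modification and deletion in the middle, but not truncation of the most recent entries; shipping the latest MAC to a separate system is one way to cover that gap.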
INTRUSION DETECTION &
Time for security team to learn some new tricks, ossec and
auditd don’t fully support container environments and are harder
Just like with the distros, don’t use a tool that was designed before containers, better options are available that will greatly simplify your life.
Threatstack is great: container aware, gives you compliance reports specifically tied back to chapter and verse of particular regulations, alerts on specific types of
Falco is a new “Behavioral Activity Monitor with Container Support” that describes itself as an easy-to-use combo of snort, ossec, and strace.
Traditional antivirus is widely mocked as ineffective or actively
AWS reference architecture for PCI-DSS 3.0 completely ignores
requirement 5: “Use and regularly update anti-virus software or
programs” (on servers)
This is an area where we get eye-rolls and derision instead of thinking about how to make a responsible choice in line with the spirit of the requirement and what would
be the best way to address this risk in the new environment.
A Boston startup over in Wakefield that I like (and work with) uses DNS as a control point and interrogates the malware as well as alerting on infection. DNS can be injected
via VPC DHCP options.
CONTAINERS AS PACKAGING
AND DEPLOYMENT TOOL
CPAN, PyPI, RubyGems, PHP PEAR, NPM, go get, Gradle
Consider the provenance of all the source code in your container.
Interesting perspective from Scott McCarty on modeling your container contents as a supply chain.
Other interesting work being done is Diplomat, looking at how to protect community repositories
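A basic building block for provenance is refusing any dependency whose digest does not match a value you pinned when you reviewed it. The sketch below assumes you already keep such a pinned SHA-256 somewhere in version control; the function name and the sample bytes are hypothetical.

```python
import hashlib
import hmac


def verify_artifact(data, pinned_sha256):
    """Return True if the downloaded bytes match the digest pinned at review time."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(actual, pinned_sha256.lower())


# Stand-in for a downloaded package; the pin would normally come from a lock file.
package_bytes = b"example package contents"
pinned = hashlib.sha256(package_bytes).hexdigest()

print(verify_artifact(package_bytes, pinned))
print(verify_artifact(package_bytes + b" plus something extra", pinned))
```

Digest pinning only proves the bytes are the ones you reviewed; it says nothing about whether the upstream source was trustworthy in the first place, which is where signing and repository protection come in.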
RedHat atomic scan http://developers.redhat.com/blog/2016/05/02/introducing-
Docker Cloud security scanning https://docs.docker.com/docker-cloud/builds/
Docker best practices checking https://github.com/docker/docker-bench-security
CoreOS clair static vulnerability analysis: https://github.com/coreos/clair
atomic scan defaults to OpenSCAP but can add other scanners
docker cloud image scanning
will be a talk on Clair tomorrow
Twistlock is another offering with specific support for achieving compliance.
I saw Aqua in the hall; they also help with image assurance.
AWS ECS Registry
Google Compute Engine Container Registry
quay.io from the CoreOS folks integrates Clair
docker cloud has security scanning, I think it costs extra
VMware Harbor has RBAC and auditing of image changes but no scanning
many other registries
The ideal is to use something like quay.io for your base images; consider your attestation and signing requirements and where those will be enforced. The cloud-specific
registries are appealing because they are convenient.
SCANNING YOUR APP CODE
Add static analysis of your app code to your build pipeline
Run additional checks after build and before production
CodeClimate checks for code smells and security vulnerabilities (some security issues can be found via static analysis) and is extensible with additional rules.
BlackDuck maps known vulnerabilities; there is some overlap with the container scanners we talked about recently.
Gauntlt can run interactive checks against your running software during the acceptance test phase. For example, even if you were implementing in a language where you
didn’t have a static analysis scanner, you could use Gauntlt runs against your running app: it can find SQL injection holes and check for HTTP headers (based on the Mozilla
secure coding standard).
Containers make setting up this kind of infrastructure much much easier even when dealing with polyglot applications.
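As an illustration of a language-independent check against the running app, the sketch below inspects a response's headers for a handful of security headers. The header list is an assumption chosen for the example, not a transcription of the Mozilla guidance, and the function is not part of any of the tools named above.

```python
# Illustrative list only; align it with whichever standard you are held to.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Content-Security-Policy",
)


def missing_security_headers(response_headers):
    """Return the required headers that are absent, ignoring header-name case."""
    present = {name.lower() for name in response_headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]


# Hypothetical headers captured from an acceptance-test request.
observed = {
    "Content-Type": "text/html",
    "strict-transport-security": "max-age=63072000",
}

print(missing_security_headers(observed))
```

In a pipeline, a non-empty result would fail the acceptance stage before the image is promoted to production.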
Recording of configuration changes is more likely to happen in a
container environment because of version control.
Don’t forget to record the rationale for the changes!
Don’t make your commit message something stupid like “compliance LOL”
Create a document mapping specific design decisions back to a particular regulation. This will empower future developers working on the project to change, improve, or
drop obsolete configurations given future changes in context.
A great starting point is the spreadsheet that Amazon published on Monday May 23 to accompany their reference PCI-DSS 3.0 architecture; the spreadsheet maps each
requirement back to a specific design feature. Notice that a bunch are blank: they are not accounted for in the infrastructure-level design, so you should fill those in with your
application-level choices.
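The same mapping can live next to the code and be checked automatically for blanks. This is a minimal sketch with invented data: the requirement identifiers and decision records are placeholders for illustration and are not taken from the Amazon spreadsheet.

```python
def unmapped_requirements(requirements, decisions):
    """Return requirement ids that have no design decision with a recorded rationale."""
    covered = {
        decision["requirement"]
        for decision in decisions
        if decision.get("design") and decision.get("rationale")
    }
    return [requirement for requirement in requirements if requirement not in covered]


# Placeholder requirement ids and decisions, for illustration only.
requirements = ["2.2.1", "5", "10.5"]
decisions = [
    {
        "requirement": "2.2.1",
        "design": "DB and web containers scheduled on separate hosts",
        "rationale": "one primary function per server",
    },
    {"requirement": "5", "design": "", "rationale": ""},  # still blank
]

print(unmapped_requirements(requirements, decisions))
```

Failing the build when this list is non-empty keeps the rationale from drifting out of date as the design changes.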
GOOD STARTING POINTS
Amazon ECS running on Dedicated Instances
Mesosphere DC/OS on Microsoft Azure
Tectonic: Kubernetes, Rocket, CoreOS on AWS or packet.net
Google Cloud Platform: Compute Engine, Cloud SQL,
We’ve covered a LOT of ground, where to get started?
Taking HIPAA compliance as one example, here are a few totally reasonable options that will not paint you into a corner with compliance:
ECS - you can’t use Elastic Beanstalk.
Microsoft - you can’t currently use Azure Container Service. Kudos to MS for actually publishing their BAA to the public, unlike Amazon and Google.
packet.net is bare metal, with a full TPM story down to the firmware attestation.
Google Cloud - it’s unclear to me whether Container Engine is covered because it runs on Compute Engine.
If you are working in a regulated environment I would love to talk to you afterwards, please come say hello.