Slide 1

Slide 1 text

HTTP and Internet security
 Henrique Vicente

Slide 2

Slide 2 text

HTTP: stateless • HTTP is a RESTful protocol:
 REST: Representational State Transfer • Each request MUST be treated as unique • What is a request?
 Answer: each call to any resource (e.g., image, text, script, redirect)

Slide 3

Slide 3 text

HTTP request / response messages • The request/response message consists of the following: • Request line, such as GET /logo.gif HTTP/1.1 or Status line, such as HTTP/1.1 200 OK, • Headers • An empty line • Optional HTTP message body data • From
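A minimal sketch of the message layout listed above, assuming the raw request is already available as a Python string. The helper name parse_request and the Host header in the sample are made up for illustration.

```python
def parse_request(raw: str):
    """Split a raw HTTP/1.x request into request line, headers, and body."""
    # The first empty line separates the headers from the optional body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line, such as "GET /logo.gif HTTP/1.1".
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


raw = "GET /logo.gif HTTP/1.1\r\nHost: example.com\r\n\r\n"
print(parse_request(raw))
```

A response can be split the same way; only the first line differs (a status line instead of a request line).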

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

What about SPDY, HTTP 2, QUIC • Today we’re at HTTP/1.1 • Some large web sites are playing with SPDY and QUIC
 for better performance

Slide 6

Slide 6 text

HTTP is a textual protocol
 [Figure: an annotated HTTP exchange, labeling the request header (no request body for this request), the response header, and the response body]

Slide 7

Slide 7 text

Advantages and drawbacks of a textual protocol • Easy for human beings to read, write, and edit without specialized tools • Less compact than binary (or is it?) • More easily adaptable (WebSockets, SPDY, HTTP 2, QUIC…)

Slide 8

Slide 8 text

First request to the server

Slide 9

Slide 9 text

The browser sends the
 cookie back to the server
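The round trip can be sketched with Python's standard http.cookies module. The cookie name "session" and its value are hypothetical, and the browser side is only simulated by a string.

```python
from http.cookies import SimpleCookie

# Server side: build the value of the Set-Cookie header for the first response.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"  # hypothetical value
set_cookie_header = outgoing["session"].OutputString()

# Browser side (simulated): the same name=value pair comes back
# in the Cookie header of the following requests.
cookie_header = "session=abc123"

# Server side: parse the Cookie header of the next request.
incoming = SimpleCookie()
incoming.load(cookie_header)
session_value = incoming["session"].value
```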

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

• Cookies were not designed with security in mind • You have to use this technology carefully • Example:
 A university lets its students and professors have their own web pages at and a library store at. Even if no member of the academy intends to do harm, this would be dangerous, since you can read/write cookies for subdomains • It would be against the rules of PCI DSS:
 Payment Card Industry
 Data Security Standard

Slide 12

Slide 12 text

Cloud computing / Content Delivery Network • Collection of servers distributed across multiple locations to deliver services and content more efficiently • lower latency • higher data transfer speeds • reliability • more resilience to disasters and attacks: both physical and virtual some you may recognize
 a* for th* for farm* for

Slide 13

Slide 13 text

Why the different domains? • Principle of least privilege
 Twitter doesn’t use “www.” on its links, so it avoids sending the application cookie with requests for static assets, which has two benefits:
 a) avoids overhead
 b) fewer security risks are involved • The trade-off is just an additional DNS lookup • You can fetch more content at once if you request it from different domains, due to browser limits

Slide 14

Slide 14 text

What do you usually store in authentication cookies? • Hopefully, a user session identifier • If you store the user’s password you’re doing it wrong • But before continuing:
 Never use http://example/?SESSION_ID=faf151515
 URI parameters are evil:
 Prone to unintentional disclosure and other risks
 - If you see anything like SID= or session_id= in the URI parameters, chances are the page you’re accessing is compromised
 And POST-based sessions are a horrible workaround that breaks the REST paradigm; don’t use them

Slide 15

Slide 15 text

Using authentication cookies the right way • Avoid session fixation issues
 Never trust a cookie NOT created by the server: if the server ever receives an authentication cookie value it doesn’t recognize, that cookie must be destroyed and a new one created to replace it
 Regenerate your authentication session when the user logs in
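A minimal sketch of both rules, assuming an in-memory session store. The names sessions, new_session, session_for_request, and log_in are hypothetical, and a real application would use its framework's session machinery.

```python
import secrets

sessions = {}  # hypothetical in-memory store: session id -> session data


def new_session(data=None):
    """Create a session under a fresh, unguessable, server-generated id."""
    sid = secrets.token_urlsafe(32)
    sessions[sid] = dict(data or {})
    return sid


def session_for_request(cookie_sid):
    """Return a valid session id, never adopting one the server did not issue."""
    if cookie_sid in sessions:
        return cookie_sid
    return new_session()  # unknown or missing id: discard it and start fresh


def log_in(old_sid, user):
    """Regenerate the session id at login so a pre-planted id becomes useless."""
    data = sessions.pop(old_sid, {})
    data["user"] = user
    return new_session(data)
```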

Slide 16

Slide 16 text

• Learn how to set cookies with the HttpOnly flag and what it is for (it helps prevent XSS attacks from hijacking your cookies when, for example, a badly implemented HTML sanitizer lets hostile user content through), HTTPS-only cookies, etc. • Supercookies: if can
 read/write a cookie for, why
 can’t it write for .com?
 Answer: there’s a blacklist
 but it has failed in the past

Slide 17

Slide 17 text

Secure cookie • Set-Cookie with the Secure flag (over HTTPS, of course) • So it is not sent over an insecure (regular HTTP) connection. • Otherwise, if your user connects over regular HTTP, the cookie data is compromised.
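As an illustration, the Secure and HttpOnly flags can be set with Python's standard http.cookies module. The cookie name and value are hypothetical, and the Path attribute is an extra assumption of this sketch.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"          # hypothetical session identifier
cookie["session"]["secure"] = True    # only sent over HTTPS
cookie["session"]["httponly"] = True  # not readable from JavaScript
cookie["session"]["path"] = "/"

# Value to send in the Set-Cookie response header.
header_value = cookie["session"].OutputString()
print(header_value)
```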

Slide 18

Slide 18 text

Privacy concern: Zombie cookie • localStorage, evil extensions, and so on • when the cookie gets erased, it is recreated! • Please, don’t be evil.

Slide 19

Slide 19 text

How DNS works • Shared-nothing architecture: decentralized / no SPOF (single point of failure) • Root servers • Many servers all over the world… • It resolves a name to an IP address
 (or more) or to another name (CNAME)… • You can resolve a name based on service: HTTP, mail, Samba, or… • Geolocation, available resources, load balancing, etc.

Slide 20

Slide 20 text

DNS cache poisoning • A “rogue” DNS server may contain wrong information:
 this may happen either by mistake or intentionally • This may cause Denial-of-Service or cause your system to route data through a transparent proxy that intercepts confidential unencrypted information:
 the proxy may strip the encryption (and the certificate) for you like this:
 HTTP (you) <-> evil proxy <-> HTTPS (server) • The new HTTP Strict Transport Security mechanism tries to fix it: a site tells browsers, through a response header, that they MUST NOT use plain HTTP to talk to it, and browsers also ship a built-in list of such web addresses
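A small sketch of what a server would send to opt in to HTTP Strict Transport Security. The helper name hsts_header and the one-year default are assumptions of this example. Browsers only honor the header when it arrives over HTTPS.

```python
def hsts_header(max_age_seconds=31536000, include_subdomains=True):
    """Build a Strict-Transport-Security response header as a (name, value) pair."""
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)


print(hsts_header())
```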

Slide 21

Slide 21 text

HTTP on wireless networks • If you use a shared Wi-Fi network with a shared passphrase, anyone who has access to it can see what data is being transferred • If you don’t use a passphrase, anyone nearby can see what data is being transferred • You still have to make sure to use WPA2 the right way, with WPS disabled, and so on… (we take it for granted and, yep, usually this is enough)

Slide 22

Slide 22 text

HTTPS = HTTP + SSL/TLS • A combination of protocols • Reduces the points of failure • Renders man-in-the-middle attacks ineffective • Renders the DNS poisoning attack (alone) ineffective • TLS = evolution of SSL • Limitation: no more than one validly certified secure web site per IP address, due to the HTTPS protocol design. But this should change, or IPv6 will fix it sooner rather than later.

Slide 23

Slide 23 text

• Keep in mind: you are using a secure protocol for a reason • This means: your JavaScript, CSS, and images should also use HTTPS • At least your JS (given that it is code that runs on the client side) and private images or files associated with your system (not part of the layout).

Slide 24

Slide 24 text

How the HTTPS protocol works • We have Certificate Authorities (CAs):
 - VeriSign
 - Thawte
 - RSA Security
 - Cisco
 - AOL Time Warner
 - (many others...)

Slide 25

Slide 25 text

About certificates • A Certificate Authority is an entity that issues digital certificates • Most browsers have the root certificates of a dozen CAs • A certificate is a document which can be used to verify that a public key belongs to an individual

Slide 26

Slide 26 text

About certificates • Not every certificate is the same. There are different levels of certificates. • A certificate is (hopefully) signed by a recognized CA root certificate and checked against its revocation list • A certificate can be self-signed (stupid, not recommended, yet useful) • They have expiration dates

Slide 27

Slide 27 text

Extended Validation (EV) Certificate • Extensive verification of identity before issuing the certificate to the requester • No more secure than a non-EV certificate • Might make part of your address bar green, etc.

Slide 28

Slide 28 text

Certificates • Software vendors (like Apple or Ubuntu) trust the most reputable CAs and embed their public keys in their systems. If you are an incompetent CA you’re out of the market (hopefully). • You trust your browser & operating system • You check the identity of whoever the certificate belongs to (don’t you?)

Slide 29

Slide 29 text

Certificates for intranet • If you have a certificate for http://secret-docs.intranet/ you’re doing it wrong:
 - There might be other secret-docs.intranet hosts on other intranets, including the one from the bad guy wanting to steal your data • Sadly, some CAs do issue certificates for them • You’d have to read the certificate document each time before allowing your browser to send sensitive data (session IDs, anyone?) to the requesting server: not gonna happen.

Slide 30

Slide 30 text

Fixing your intranet server security • • You can still restrict it to be accessible only inside the intranet, but now it is safer (just beware of wildcard SSL certificates, which you don’t want to trust).

Slide 31

Slide 31 text

Deploying your website or app • Are you still using FTP?
 - Don’t.

Slide 32

Slide 32 text

FTP is… just don’t use it • really slow • insecure • complicated, prone to errors • …

Slide 33

Slide 33 text

Great deployment tools • WebDAV over HTTPS (great native support in modern machines) • SFTP (based on SSH; and is NOT ‘secure FTP’) • git push • torrent (if you are Facebook, Twitter, and the like - and know what you’re doing this is almost certainly the best option)

Slide 34

Slide 34 text

Sending email • Does your mail sender have proper permission? Like...
 MX records for the domain you are sending from?
 SPF1, SPF2 maybe? • If it doesn’t, your message may end up as spam.

Slide 35

Slide 35 text

Sending email • Don’t keep the user waiting:
 - Queue the message with a local relay • Using PHP? Do yourself a favor and avoid mail();
 - for security, simplicity, and performance
 - use PEAR Mail_Mime or Zend_Mail

Slide 36

Slide 36 text

Text Encoding • UTF-8 is here to stay • you must adopt UTF-8 or you lose consumers • however...

Slide 37

Slide 37 text

Intelligent UTF-8 • Adopt with care: • you don’t want two usernames like: “frédéric” and “frederic”
 - (just the second if you don’t mind, please). • But you want different passwords to be different:
 “n” is not “ñ”
 - (ok, don’t use single-letters for passwords)
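One way to get both behaviors, sketched with Python's standard unicodedata module. The helper names username_key and password_bytes are hypothetical, and stripping accents for usernames is a policy choice assumed here, not the only option.

```python
import unicodedata


def username_key(name: str) -> str:
    """Fold case and strip accents so look-alike usernames collide."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold()


def password_bytes(password: str) -> bytes:
    """Normalize the encoding form only; distinct characters stay distinct."""
    return unicodedata.normalize("NFC", password).encode("utf-8")


print(username_key("frédéric") == username_key("frederic"))
print(password_bytes("n") == password_bytes("ñ"))
```

The uniqueness check would compare username_key values, while the password is hashed from password_bytes, so the same "ñ" typed on different keyboards (precomposed or as "n" plus a combining tilde) still matches.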

Slide 38

Slide 38 text

txt.evihcra.exe = exe.archive.txt? • Unicode (and therefore UTF-8) has tricky control characters, like the right-to-left override (U+202E), which shows the text that follows it reversed • for example: it might make someone execute a file thinking it is just a text file, when it actually is a binary one
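A minimal sketch of a check for uploaded or downloaded file names. The set of characters to reject and the helper name has_bidi_controls are assumptions of this example, and the sample file name is made up.

```python
# Unicode characters that change the display direction of the text around them.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings and overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # isolates
    "\u200e", "\u200f",                                # left-to-right / right-to-left marks
}


def has_bidi_controls(filename: str) -> bool:
    """Return True if the name contains a direction-changing control character."""
    return any(ch in BIDI_CONTROLS for ch in filename)


print(has_bidi_controls("archive\u202etxt.exe"))
print(has_bidi_controls("archive.txt"))
```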

Slide 39

Slide 39 text

UTF-8 is tricky • badly encoded UTF-8 with invalid byte sequences is also a headache • A byte order mark (may show up as “ï»¿”) at the top of your PHP files drives you insane:
 - “how come the headers are already sent?”
 - “why is this JSON invalid in this browser and not in that one?”
 May happen with other scripting languages as well

Slide 40

Slide 40 text

UTF-8 is tricky • There are more than half a dozen reasons to use an IDE or a great text editor. • Don’t use WordPad or Notepad to make a quick change to your code, please. BOM (0xEF, 0xBB, 0xBF) will happen. • And check what you commit (you do use version control, don’t you?) and if you see something weird or unexpected... Do something about it.
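If a BOM does slip in, it can be detected and removed. A minimal sketch using Python's standard codecs module, with the hypothetical helper name strip_bom:

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 byte order mark (0xEF, 0xBB, 0xBF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


print(strip_bom(b"\xef\xbb\xbf<?php"))
```

When reading text rather than bytes, decoding with the "utf-8-sig" codec drops the mark as well.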

Slide 41

Slide 41 text

User input data • Always filter and validate each data entry

Slide 42

Slide 42 text

What a filter should do • Filtering in action:
 - telephone:
 input: "700-7202222", output: "+1-700-720-2222"
 - name:
 input: " Henrique Pinto ", output: "Henrique Pinto"
 - price:
 input: "0.51 ", output: "$0.51 USD" • A filter might strip whitespace, normalize input, etc.
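The three filters above could look roughly like this. The function names, the default country code, and the default currency are assumptions made for the sketch; input a filter cannot normalize is passed on for the validator to reject.

```python
import re
from decimal import Decimal


def filter_name(raw: str) -> str:
    """Trim the ends and collapse inner runs of whitespace."""
    return " ".join(raw.split())


def filter_phone(raw: str, country_code: str = "1") -> str:
    """Normalize a 10-digit number to +C-AAA-BBB-DDDD."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        return raw.strip()  # not normalizable: leave it for the validator
    return f"+{country_code}-{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def filter_price(raw: str, currency: str = "USD") -> str:
    """Normalize a decimal amount to a display string with two decimals."""
    text = raw.strip()
    if not text:
        return ""  # empty input: leave it for the validator
    return f"${Decimal(text):.2f} {currency}"


print(filter_phone("700-7202222"))
print(filter_name(" Henrique Pinto "))
print(filter_price("0.51 "))
```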

Slide 43

Slide 43 text

What a validator should do • Checks if the given data (post-filtered) is valid • If not valid:
 - give feedback telling why it didn’t pass • Examples:
 - telephone: "+1" (error: incomplete phone number)
 - price: "" (error: price is required, price cannot be empty)
 - price: "0" (not an error: product is free) • Avoid mixing filtered and unfiltered data
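A sketch of validators for the examples above, where each one returns an error message or None. The function names, the phone pattern, and the extra messages are assumptions of this example.

```python
import re
from decimal import Decimal, InvalidOperation


def validate_phone(value: str):
    """Return an error message, or None if the value is acceptable."""
    # Country code followed by at least one group of digits.
    if not re.fullmatch(r"\+\d{1,3}(-\d+)+", value):
        return "incomplete phone number"
    return None


def validate_price(value: str):
    """Return an error message, or None if the value is acceptable."""
    if value.strip() == "":
        return "price is required"
    try:
        amount = Decimal(value)
    except InvalidOperation:
        return "price is not a number"
    if not amount.is_finite():
        return "price is not a number"
    if amount < 0:
        return "price cannot be negative"
    return None  # "0" is valid: the product is free


print(validate_phone("+1"))
print(validate_price(""))
print(validate_price("0"))
```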

Slide 44

Slide 44 text

Filtered data != escaped data • Take a whitelist approach (rather than a blacklist one):
 - be aware: parsing HTML is more complicated than it seems • Use a solid parser! Don’t try to create one!

Slide 45

Slide 45 text

Sanitize on input, escape on output • e.g., phone number
 - first character may be “+”, others are digits (and no more than ~20) • escape at each step: SQL, HTML, JSON, BASH, Regex, XML, etc.
 - escaping isn’t magic, each step requires different types of escaping
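To illustrate that each output context needs its own escaping, here is the same (made-up) user input escaped for HTML, JSON, and a shell command line using only Python's standard library.

```python
import html
import json
import shlex

user_input = '<b>"Tom" & Jerry\'s</b>'  # hypothetical hostile-looking input

as_html = html.escape(user_input)    # for embedding in an HTML page
as_json = json.dumps(user_input)     # for embedding in a JSON document
as_shell = shlex.quote(user_input)   # for passing as one shell argument

print(as_html)
print(as_json)
print(as_shell)
```

None of the three results is safe in the other two contexts, which is why escaping belongs at output time, per destination.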

Slide 46

Slide 46 text

Sanitize on input, escape on output • know your tools
 - MySQL is way more permissive (in the bad sense) than PostgreSQL
 - try to add “test” to a char(2) field on a MySQL DB (in non-strict SQL mode):
 - it will save “te”, while PostgreSQL would fail with an error saying the value is too long for the field. • Don’t take data from external APIs for granted; take the same care with it as you would with user input data

Slide 47

Slide 47 text

Use prepared statements • Better performance • Easy way to escape things the right way
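A minimal sketch using Python's built-in sqlite3 module, chosen here only because it needs no server. The table and the hostile value are made up. The "?" placeholders keep the data out of the SQL text, so no manual escaping is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

hostile = "Robert'); DROP TABLE users;--"  # hypothetical malicious input

# The value is bound as a parameter, never concatenated into the statement.
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))

rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()
print(rows)
```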

Slide 48

Slide 48 text

memcached caveats • No security out of the box: you have to protect it yourself • It doesn’t have the concept of separate databases.
 A prefix in the key part will suffice to fix this limitation:
 - e.g., keys: session_S929JDLRJ223, cache_page_index, profile_henvic • By default, it allows connection from any client with no authentication.
 SASL might be used. Strict firewall rules are advised.
 Sadly many memcached installs on production environments are unprotected.

Slide 49

Slide 49 text

Storing passwords: don’t

Slide 50

Slide 50 text

Storing passwords: don’t • Cracking passwords gets easier every day • Rainbow tables out there help crack passwords easily • Adding a salt helps but, as time has proven, that alone isn’t enough • If you are just hashing with MD5, SHA1, SHA256, SHA512, etc. you’re doing it wrong. • Even if your system is just a game, remember users are lazy and reuse their passwords from games to banking accounts, so be responsible

Slide 51

Slide 51 text

Solution • Generic algorithm
 - that cannot be optimized with dedicated hardware • Adaptive
 - as hardware gets faster you can make it harder / slower with more iterations than before

Slide 52

Slide 52 text

A Future-Adaptable Password Scheme

Slide 53

Slide 53 text

bcrypt The original implementation was for OpenBSD

Slide 54

Slide 54 text

bcrypt • You can increase the number of iterations needed to check whether a password matches a given hash • Rainbow tables become impractical, since every hash carries its own salt

Slide 55

Slide 55 text

Password • Alternatives to bcrypt:
 - scrypt: slightly more secure, seems to be less supported
 - PBKDF2: less secure • Read about the subject
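bcrypt is not part of Python's standard library, so this sketch swaps in scrypt (one of the alternatives above) through hashlib.scrypt, which needs a Python build linked against a recent OpenSSL. The helper names and the cost parameters n, r, and p are assumptions for illustration, not a tuning recommendation.

```python
import hashlib
import hmac
import os


def hash_password(password: str, *, n: int = 2**14, r: int = 8, p: int = 1):
    """Derive a slow, salted hash; store the whole record, never the password."""
    salt = os.urandom(16)  # unique per password, which defeats rainbow tables
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32
    )
    return salt, n, r, p, digest


def verify_password(password: str, record) -> bool:
    """Recompute the hash with the stored salt and cost, then compare."""
    salt, n, r, p, expected = record
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=n, r=r, p=p, dklen=32
    )
    return hmac.compare_digest(candidate, expected)  # constant-time comparison
```

Storing the cost parameters next to the hash is what makes the scheme adaptive: new passwords can use a higher cost while old records still verify.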

Slide 56

Slide 56 text

Brute-force attacks • Doors to Denial-of-service
 - compute-intensive password hashing methods
 - compute-intensive actions triggered by the final users
 - ... • Doors to security compromise
 - dictionary attacks to your login form
 - ... • Limit requests at the front-end servers or the application level
 - e.g., HttpLimitReqModule on nginx, captchas (tip: reCAPTCHA)
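At the application level, request limiting can be sketched as a sliding window per client. The class name, the limits, and the in-memory storage are assumptions of this example; a multi-server deployment would need a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        recent = self.hits[client_id]
        # Forget requests that fell out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

For example, SlidingWindowLimiter(5, 60) in front of a login handler would reject the sixth attempt from one address within a minute.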

Slide 57

Slide 57 text

Two-factor authentication • Extra protection against unauthorized access • Various technologies:
 - SMS one-time password
 - Time-based One-time Password Algorithm
 - Hardware-based (e.g., RSA tokens)
 - Software-based (e.g., Google Authenticator) • Various implementations, services, and APIs
 for use with SSH, HTTP(S), etc. are available.
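The Time-based One-time Password Algorithm is small enough to sketch with the standard library. This follows the HOTP/TOTP construction from RFC 4226 and RFC 6238 with HMAC-SHA1; the function names are this example's own, and production systems would normally rely on a maintained library.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low 4 bits pick the offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: float = None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of elapsed time steps."""
    at = time.time() if at is None else at
    return hotp(secret, int(at // step), digits)
```

The server and the user's device share the secret; both compute the code for the current time step and the server compares them.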

Slide 58

Slide 58 text

Captcha • Completely Automated Public Turing test to tell Computers and Humans Apart • It’s intrusive (you don’t want to use it everywhere, every single time):
 - e.g., use it after 3 or 4 failed authentication attempts
 (this even has the great side effect of throttling the attack) • A bad implementation might make the experience very bad for people with disabilities or on mobile devices

Slide 59

Slide 59 text

• Originally developed at Carnegie Mellon University • Acquired by Google • Web service for free • Support for blind (with audio), etc • Constantly up-to-date • Use it instead of implementing your own

Slide 60

Slide 60 text

Panopticlick • How Unique - and Trackable - Is Your Browser? • • As it turns out, our browsers give away too much information
 - even in private mode

Slide 61

Slide 61 text

Powerful GET? • GET SHOULD NOT be used for transformation / destructive actions. • Why?
 - It should be used to... get content.
 - It can be easily forged.
 - It can be unintentionally / automatically retrieved. • Must use it? Add a token (hash) + check the origin (referrer) + hide the link from the user + rel="nofollow" + find another way to do so + etc. • Really? Isn’t this a case for a hidden field in a form, or something better?
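The "token (hash)" part could be sketched as an HMAC over the session and the action, so a forged or guessed link does not validate. SECRET_KEY, the helper names, and the action format are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical server-side secret, never sent to clients


def action_token(session_id: str, action: str) -> str:
    """Token that is only valid for this session and this exact action."""
    message = f"{session_id}:{action}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_valid(session_id: str, action: str, token: str) -> bool:
    """Check a submitted token using a constant-time comparison."""
    return hmac.compare_digest(action_token(session_id, action), token)
```

The same token works just as well in a hidden form field sent with POST, which is the better home for a destructive action.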

Slide 62

Slide 62 text

A word on XML • Prefer JSON over XML
 - Why?
 - Because it is simpler
 - Less complicated
 - Easier to parse
 Only major drawback:
 - Harder on the human eyes (subjective)

Slide 63

Slide 63 text

Apache and .htaccess • .htaccess is evil
 - slow
 - reduced I/O performance + bottleneck on the disk
 - even insecure
 - an invitation to adopt bad practices • A problem: server-side executables / code in the public area
 - public_html/

Slide 64

Slide 64 text

Deploying on shared hosting • Common issues & disturbances:
 - Low performance
 - Security holes • Cheap & Dirty • You decide: it might be worth it • Static files? Great!

Slide 65

Slide 65 text

Deploying on shared hosting • Sessions
 - storing in files? Don’t store them in a public place like /tmp • An instance (“virtual machine”) on Amazon Web Services, Slicehost, Rackspace, or Linode is not that expensive and might be a better fit.

Slide 66

Slide 66 text

Fail-safe systems • failures will happen someday (it’s a fact) • failing gracefully is less damaging than a disaster • a shared-nothing architecture is a good approach • Minimize the number of single points of failure

Slide 67

Slide 67 text

What can cause a failure • Configuration • Database • Filesystem • APIs • Email systems • Sockets • Software • Releases • Hardware • Power outage • Network outage • ...

Slide 68

Slide 68 text

a:link, a:visited, etc. • Remember that, by default (HTML with no styling), a visited link gets a different color in most browsers? • Now what if a page could check whether you visited a given address?
 - this is a privacy breach
 - modern browsers are starting to work on fixing this

Slide 69

Slide 69 text

On Quality Assurance (QA) • It should be the developer’s goal that QA finds nothing wrong • The developer should deliver high-quality work. The following helps:
 - Test-Driven Development (TDD)
 - Behavior-Driven Development (BDD)
 - Continuous Integration (CI)
 - unit tests, behavior tests, integration tests
 - code versioning (git is the most prominent)
 - issue & bug tracking (e.g., JIRA, GitHub Issue Tracker, etc.)

Slide 70

Slide 70 text

Physical security: don’t take it for granted • Lest We Remember: Cold Boot Attacks on Encryption Keys • Payment Card Industry Data Security Standard (PCI DSS) • Adventures with Daisy in Thunderbolt-DMA-land: Hacking Macs through the Thunderbolt interface

Slide 71

Slide 71 text

Never stop learning • The best way to build rock-solid secure apps • Read research papers • Be part of your local user group • Get involved with open source projects: push code on GitHub

Slide 72

Slide 72 text

Some good resources • Common Vulnerabilities and Exposures (CVE)
 The Open Web Application Security Project • (talks organized by events, very useful) • Voices of the ElePHPant

