
This is your database on Linux

Sam Kottler
September 22, 2015

Transcript

  1. ABOUT ME
    ▸ Platform engineering @ DigitalOcean
    ▸ Formerly of Red Hat & Venmo
    ▸ Committer to CentOS, Fedora, Icinga, Ansible
  2. VM.SWAPPINESS
    ▸ Instructs the OS about when to swap
    ▸ Default value of 60
    ▸ Set this to 0
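
To check the swappiness recommendation above on a given host, the current value can be read from procfs. The following is a minimal sketch, assuming the usual mapping of sysctl names to files under /proc/sys; the helper names `read_sysctl` and `swappiness_matches` are made up for illustration and are not part of the talk.

```python
from pathlib import Path


def read_sysctl(name: str, root: str = "/proc/sys") -> str:
    """Return the current value of a sysctl such as 'vm.swappiness'.

    Dots in the sysctl name map to directory separators under /proc/sys.
    The root parameter exists only so the function can be pointed at a
    test directory instead of the live kernel interface.
    """
    return (Path(root) / name.replace(".", "/")).read_text().strip()


def swappiness_matches(expected: int = 0, root: str = "/proc/sys") -> bool:
    """True when vm.swappiness equals the expected value (0 per the slide)."""
    return int(read_sysctl("vm.swappiness", root)) == expected
```

On a Linux host, `swappiness_matches()` with no arguments would compare the live value against the 0 suggested on the slide.
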
  3. "My point is that decreasing the tendency of the kernel to swap stuff out is wrong. You really don't want hundreds of megabytes of BloatyApp's untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful." - Andrew Morton
  4. TRANSPARENT HUGE PAGES
    ▸ Intended to keep fewer entries in the TLB
    ▸ Hands out 2M pages behind malloc() (1G pages need hugetlbfs)
    ▸ Breaks when paired with madvise(), particularly with jemalloc
    ▸ Disable THP
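
Whether THP is actually disabled can be confirmed from sysfs, where the kernel lists the available modes and marks the active one in square brackets. This is a small sketch, assuming the standard upstream path /sys/kernel/mm/transparent_hugepage/enabled (some distribution kernels may use a different path); `active_thp_mode` is a hypothetical helper name.

```python
import re

# Upstream sysfs location; treated as an assumption for any given distro.
THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(text: str) -> str:
    """Extract the bracketed mode from text like 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError("no active mode found in: %r" % text)
    return match.group(1)
```

Reading `THP_ENABLED` and passing its contents to `active_thp_mode` should yield `never` on a host that follows the slide's advice.
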
  5. DIRTY RATIO
    ▸ vm.dirty_background_ratio: start flushing to disk
    ▸ vm.dirty_ratio: require synchronous I/O
    ▸ /proc/vmstat should be carefully monitored
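
One way to monitor /proc/vmstat for dirty-page pressure is to sample the `nr_dirty` and `nr_writeback` counters, both reported in pages. The sketch below only parses the text; the function names are illustrative, and which counters matter for a given workload is an assumption rather than something the slide specifies.

```python
from typing import Dict


def parse_vmstat(text: str) -> Dict[str, int]:
    """Parse '/proc/vmstat'-style 'name value' lines into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def dirty_counters(text: str) -> Dict[str, int]:
    """Return only the dirty/writeback page counts (missing keys become 0)."""
    stats = parse_vmstat(text)
    return {key: stats.get(key, 0) for key in ("nr_dirty", "nr_writeback")}
```

Sampling `dirty_counters(open("/proc/vmstat").read())` at intervals gives a rough view of how close the system runs to the two thresholds.
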
  6. NUMA
    ▸ Unlike UMA, NUMA means that each CPU has its own local memory
    ▸ Modern systems generally have 2+ NUMA nodes
    ▸ Node-local allocations can cause swapping even if memory is free on another node
    ▸ /usr/bin/numactl --interleave all <cmd>
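
Before wrapping a database in numactl, it can help to confirm how many NUMA nodes the host exposes. A minimal sketch, assuming the kernel lists nodes as `nodeN` directories under /sys/devices/system/node; `numa_nodes` is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import List


def numa_nodes(base: str = "/sys/devices/system/node") -> List[int]:
    """Return the sorted NUMA node ids found as 'nodeN' directories."""
    ids = []
    for entry in Path(base).iterdir():
        match = re.fullmatch(r"node(\d+)", entry.name)
        if match and entry.is_dir():
            ids.append(int(match.group(1)))
    return sorted(ids)
```

A result with two or more ids is the situation where the interleave policy from the slide becomes relevant.
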
  7. FILESYSTEMS
    ▸ ext4 is still a very safe bet
    ▸ xfs has had performance issues, largely solved now
    ▸ btrfs is interesting
  8. SSDs
    ▸ Use them, they're generally okay
    ▸ Don't run multi-threaded workloads on consumer-grade drives
    ▸ deadline/noop scheduler
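
The I/O scheduler in use for a drive is exposed in /sys/block/<device>/queue/scheduler, with the active choice in square brackets. The sketch below parses that text; it assumes that a file without brackets (for example a bare `none`) names the only available option, and `active_scheduler` is an illustrative name.

```python
import re


def active_scheduler(text: str) -> str:
    """Return the active scheduler from text like 'noop [deadline] cfq'."""
    match = re.search(r"\[([\w-]+)\]", text)
    if match:
        return match.group(1)
    # Assumption: no brackets means a single fixed choice, e.g. 'none'.
    return text.strip()


def is_recommended(text: str) -> bool:
    """True when the active scheduler is one of the two named on the slide."""
    return active_scheduler(text) in ("deadline", "noop")
```
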
  9. MORE BITS ON SSDs
    ▸ Controller firmware quality is generally bad
    ▸ Did I mention controller firmware quality is low?
    ▸ Your drives might just sometimes die because of firmware
    ▸ Find a working firmware release, rarely change versions
  10. NVME
    ▸ Non-volatile memory via PCIe
    ▸ SATA/SAS/Fibre Channel are too slow for high-end flash
    ▸ Currently economical for use as read-through/write-back cache
    ▸ Supported in Linux since 3.3
  11. TCP METRICS
    ▸ Stores information about congestion and window size for roughly 1k connections
    ▸ Window size and congestion information are based on previous conditions
    ▸ Set net.ipv4.tcp_no_metrics_save to 1
  12. FIN TIMEOUT
    ▸ Determines how long to wait for a FIN
    ▸ Set net.ipv4.tcp_fin_timeout to something below 60 (the default, in seconds)
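
The two network settings from the last slides could be persisted as a sysctl configuration fragment. The sketch below renders one in the usual `key = value` form. The slide only says "something below 60" for the FIN timeout, so the value 30 is purely an illustrative assumption, and `render_sysctl_conf` is a hypothetical helper.

```python
from typing import Dict

# tcp_no_metrics_save = 1 comes from the slides; 30 is an assumed example
# value standing in for "something below 60".
NETWORK_SETTINGS = {
    "net.ipv4.tcp_no_metrics_save": 1,
    "net.ipv4.tcp_fin_timeout": 30,
}


def render_sysctl_conf(settings: Dict[str, int]) -> str:
    """Render settings as sorted 'key = value' lines, one per sysctl."""
    return "".join(
        "%s = %d\n" % (key, value) for key, value in sorted(settings.items())
    )
```

The rendered text could be written to a file under /etc/sysctl.d/ and loaded with `sysctl --system`; the choice of file name is left to the reader.
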