Slide 2

Slide 2 text

About
PostgreSQL DBA. Linux system administrator.
PostgreSQL-Consulting.com:
● 24/7 support.
● Audit, performance optimizations.
● Consulting and Training.
● Monitoring and Emergency.
● Capacity planning.
Slides: https://goo.gl/awmZ2H

Slide 3

Slide 3 text

Agenda
RDBMS on Linux, why?
Databases and Resources.
OS subsystems.
CPU, Process scheduling, Power saving policies.
Memory, VM, NUMA, Huge pages.
Storage, File Systems, Input/Output.
Other misc.

Slide 4

Slide 4 text

Why Linux?
Linux is a good choice:
● Active development & Community support.
● A lot of features & Fast implementation.
● Stable & Mature & Durable.

Slide 5

Slide 5 text

Databases & Resources
CPU: Concurrency; Query speed; Sort, group, hash, ...
Memory: OS page cache; DB buffer pool; Local process cache.
Storage: DB data files; Transaction Log; Cold start.

Slide 6

Slide 6 text

Databases & Resources
CPU: CPU Scheduling; NUMA; Power Saving.
Memory: Virtual Memory; NUMA; Huge Pages.
Storage: File Systems; Storage I/O.

Slide 7

Slide 7 text

Resources CPU scheduler. Virtual memory and NUMA. Huge pages. File systems. Storage IO. Power saving policy. Others.

Slide 9

Slide 9 text

CPU scheduling
The CPU scheduler is responsible for distributing CPU time among processes.
Sysctl:
● kernel.sched_migration_cost_ns = 5000000 (default: 500000).
● kernel.sched_autogroup_enabled = 0 (default: 1).
http://www.postgresql.org/message-id/[email protected]
http://kernelnewbies.org/Linux_2_6_38#head-59575a6aeafa38490226a560ee02de89829a5b20
Be aware on Ubuntu 12.04 (#1055222) and 14.04 (#1422016): use the noautogroup kernel parameter instead of sysctl.conf.
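A minimal sketch of how the two scheduler settings above could be checked on a running host. The function name check_sysctl and its optional third argument (an alternative root directory, used only for testing) are illustrative assumptions, not part of the original slides. On some newer kernels kernel.sched_migration_cost_ns may no longer be exposed under /proc/sys, which is why the sketch reports "not available" instead of failing.

```shell
#!/bin/sh
# Compare a kernel tunable against the value recommended on the slide.
# $1 = sysctl key (dotted), $2 = recommended value,
# $3 = root of the sysctl tree (hypothetical parameter, defaults to /proc/sys).
check_sysctl() {
    key=$1; want=$2; root=${3:-/proc/sys}
    # sysctl keys map to files: kernel.foo -> kernel/foo
    path="$root/$(printf '%s' "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "$key: not available"
        return 0
    fi
    have=$(cat "$path")
    if [ "$have" = "$want" ]; then
        echo "$key: ok ($have)"
    else
        echo "$key: $have, recommended $want"
    fi
}

# Read-only check of the two values from the slide.
check_sysctl kernel.sched_migration_cost_ns 5000000
check_sysctl kernel.sched_autogroup_enabled 0
```

The check only reads values; applying them is still done through sysctl.conf or, for autogroup on the affected Ubuntu releases, the noautogroup boot parameter.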

Slide 10

Slide 10 text

Virtual Memory What is it? Allocator, Caching, Dirty pages and Writeback.

Slide 11

Slide 11 text

Virtual Memory

Slide 12

Slide 12 text

Virtual Memory
Sysctl:
● vm.dirty_background_ratio & vm.dirty_ratio: disable them (use the *_bytes settings instead).
● vm.dirty_background_bytes & vm.dirty_bytes: depends on ... RAID cache size, 64MB/128MB otherwise.
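The byte-based thresholds have to be written in bytes, so a small conversion helps. The sketch below assumes that the 64MB/128MB pair on the slide maps to vm.dirty_background_bytes and vm.dirty_bytes respectively; that reading, and the helper name dirty_bytes_conf, are assumptions.

```shell
#!/bin/sh
# Print sysctl.conf lines for the dirty-page thresholds, given sizes in MB.
# $1 = background writeback threshold (MB), $2 = hard limit (MB).
dirty_bytes_conf() {
    bg_mb=$1; max_mb=$2
    echo "vm.dirty_background_bytes = $((bg_mb * 1024 * 1024))"
    echo "vm.dirty_bytes = $((max_mb * 1024 * 1024))"
}

# The fallback values from the slide (no RAID cache to size against).
dirty_bytes_conf 64 128
# vm.dirty_background_bytes = 67108864
# vm.dirty_bytes = 134217728
```

Writing a *_bytes value makes the kernel ignore the corresponding *_ratio setting, so the ratio keys do not need to be zeroed separately.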

Slide 13

Slide 13 text

Virtual Memory
Out-of-memory & OOM-Killer.
Sysctl: vm.swappiness = 1 (default: 60).

Slide 14

Slide 14 text

NUMA
S: Socket. C: CPU core. M: Memory bank.

Slide 15

Slide 15 text

NUMA
BIOS: enable memory node interleaving.
Kernel boot: numa=off.
numactl utility.
Sysctl:
● vm.zone_reclaim_mode = 0 (default: 0).
● kernel.numa_balancing = 0 (default: 0).
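Before applying any of the NUMA settings above, it helps to know whether the host exposes more than one node at all. The following is a rough sketch that counts node directories in sysfs; the function name numa_node_count and its optional directory argument are illustrative assumptions.

```shell
#!/bin/sh
# Count NUMA nodes by listing nodeN directories.
# $1 = sysfs node directory (hypothetical parameter; defaults to the real location).
numa_node_count() {
    dir=${1:-/sys/devices/system/node}
    count=0
    for d in "$dir"/node[0-9]*; do
        # An unmatched glob stays literal and fails the -d test.
        if [ -d "$d" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

nodes=$(numa_node_count)
if [ "$nodes" -gt 1 ]; then
    echo "$nodes NUMA nodes: interleaving or numa=off is worth evaluating"
else
    echo "single node (or NUMA not exposed): nothing to tune here"
fi
```

Where numactl is installed, `numactl --hardware` shows the same topology in more detail, and `numactl --interleave=all` followed by a command starts that command with interleaved memory allocation.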

Slide 17

Slide 17 text

Huge Pages
Huge pages vs. Transparent huge pages.
Huge pages are supported by many RDBMS.
Always disable transparent huge pages.
/etc/rc.local:
● echo never > /sys/kernel/mm/transparent_hugepage/enabled
● echo never > /sys/kernel/mm/transparent_hugepage/defrag
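To verify that the rc.local lines above took effect, the sysfs files can be read back: the kernel marks the active value with square brackets, for example `always madvise [never]`. A small sketch, with active_choice being a hypothetical helper name:

```shell
#!/bin/sh
# Print the active choice from a sysfs file whose content looks like
# "always madvise [never]" (the bracketed word is the active value).
active_choice() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
    if [ -r "$f" ]; then
        echo "$f: $(active_choice "$f")"
    fi
done
```

Both lines should report never once transparent huge pages are disabled.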

Slide 21

Slide 21 text

Filesystems
Ext3 vs Ext4 vs XFS: what is better?
Filesystem Barriers.
Disable Write Cache:
● hdparm -W0 /dev/device
● MegaCli64 -LDSetProp -DisDskCache -Lall -aALL
Hardware RAID + BBU: barrier=0 (disable).
Software RAID: barrier=1 (enable).
Enterprise SSD with Power Loss Protection: barrier=0 (disable).
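One way to audit the barrier settings above is to look at the mount options actually in effect. The sketch below scans a mounts table for ext3, ext4 and XFS entries and prints any barrier-related option it finds; "default" means no such option is listed. The function name barrier_report and its optional file argument are assumptions, and some newer kernels no longer accept barrier options for every filesystem, so treat the output as a starting point only.

```shell
#!/bin/sh
# List ext3/ext4/xfs mounts with their barrier-related mount option.
# $1 = mounts table (hypothetical parameter; defaults to /proc/mounts).
barrier_report() {
    awk '$3 == "ext3" || $3 == "ext4" || $3 == "xfs" {
        state = "default"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "nobarrier" || opts[i] == "barrier" ||
                index(opts[i], "barrier=") == 1)
                state = opts[i]
        print $2, $3, state
    }' "${1:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    barrier_report
fi
```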

Slide 24

Slide 24 text

Storage IO
SATA/SAS vs SSD.
IO elevators:
● noop: SSD, PCIe SSD, high-end storage.
● deadline: RAID, SATA/SAS.
● cfq: good default.
● none (multi-queue block IO): SSD, PCIe SSD.
# echo 'elevator_name' > /sys/block//queue/scheduler
kernel boot: elevator=
/sys/block/*/queue/: rotational, rq_affinity, read_ahead_kb
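The elevator choice above can be roughly tied to the rotational flag that the kernel exposes per device. The sketch below is a simplification under stated assumptions: it maps 0 to noop and 1 to deadline as on the slide, ignores the "cfq: good default" case, and uses the hypothetical helper name suggest_elevator. On multi-queue kernels the available names differ (for example none and mq-deadline), so compare the suggestion with what the scheduler file actually lists.

```shell
#!/bin/sh
# Map the rotational flag to the elevator suggested on the slide:
# 0 (non-rotational, e.g. SSD) -> noop, 1 (spinning disks) -> deadline.
suggest_elevator() {
    case $1 in
        0) echo noop ;;
        1) echo deadline ;;
        *) echo unknown ;;
    esac
}

# Read-only report for every block device that exposes a queue directory.
for q in /sys/block/*/queue; do
    [ -r "$q/rotational" ] || continue
    dev=${q#/sys/block/}
    dev=${dev%/queue}
    echo "$dev: current=$(cat "$q/scheduler" 2>/dev/null)" \
         "suggested=$(suggest_elevator "$(cat "$q/rotational")")"
done
```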

Slide 27

Slide 27 text

Power Saving Policy
Drivers: acpi_cpufreq vs. intel_pstate.
scaling_governor:
● /sys/devices/system/cpu/cpuX/cpufreq/scaling_available_governors
● /sys/devices/system/cpu/cpuX/cpufreq/scaling_governor
acpi_cpufreq + performance.
intel_pstate + powersave.
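Since the governor is set per CPU, a quick summary across all cores shows whether the policy is consistent. This is a minimal sketch; governor_summary and its optional root-directory argument are illustrative names, not something from the slides.

```shell
#!/bin/sh
# Print each distinct scaling governor with the number of CPUs using it.
# $1 = cpu sysfs root (hypothetical parameter; defaults to the real location).
governor_summary() {
    root=${1:-/sys/devices/system/cpu}
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null |
        sort | uniq -c |
        awk '{ print $2 ": " $1 " cpu(s)" }'
}

# Prints nothing on hosts without cpufreq support (e.g. many virtual machines).
governor_summary
```

More than one line in the output means the cores run under different governors, which is usually worth fixing before benchmarking.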

Slide 28

Slide 28 text

Misc: Clocksources
What is a clocksource?
acpi_pm vs. hpet vs. tsc.
/sys/devices/system/clocksource/clocksource0/available_clocksource
/sys/devices/system/clocksource/clocksource0/current_clocksource
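The two sysfs files above can be combined into a short report of the active clocksource and the alternatives the kernel offers. A sketch, assuming the hypothetical helper name clocksource_report and an optional directory argument for testing:

```shell
#!/bin/sh
# Report the active clocksource and the other available ones.
# $1 = clocksource sysfs directory (hypothetical parameter; defaults to clocksource0).
clocksource_report() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    if [ ! -r "$dir/current_clocksource" ]; then
        echo "clocksource: not available"
        return 0
    fi
    current=$(cat "$dir/current_clocksource")
    available=$(cat "$dir/available_clocksource" 2>/dev/null)
    echo "current: $current"
    # available_clocksource is a space-separated list on a single line.
    for cs in $available; do
        if [ "$cs" != "$current" ]; then
            echo "alternative: $cs"
        fi
    done
}

clocksource_report
```

Switching is done by writing one of the listed names into current_clocksource; the sketch deliberately stays read-only.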

Slide 29

Slide 29 text

Summary
Linux is a good choice for RDBMS: Modern, Universal, Flexible, Stable.
Adapt Linux for your workloads.
Test → Change → Test → Commit/Rollback.

Slide 30

Slide 30 text

Questions?
Alexey Lesovsky [email protected]
PostgreSQL-Consulting.com: Data maintenance at its best
https://postgresql-consulting.com