Taming memory: performance-tuning a (Crystal) application [SoundCloud HQ edition]

When developing a game, you need to pay attention to performance. A game needs to run fast and keep a predictable frame rate, because stuttering will throw people off.

I’ve had performance issues even in Crystal, a fast, compiled, statically-typed language with a syntax inspired by Ruby. As it turns out, the way a program handles memory can have a huge impact on performance. Luckily, Crystal gives a great deal of control over how this can be done. It’s also possible to use familiar tools with Crystal to debug issues and identify bottlenecks.

In this talk, I’ll share what I’ve learnt about memory and performance tuning, and give an introduction to several powerful tools for identifying performance issues.

Denis Defreyne

November 24, 2015

Transcript

  14. 26.

    0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 …
  15. 27.

    0 1 2 3 4 5 6 7 E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 …
  16. 28.

    0 1 2 3 4 5 6 7 3 G R E Y Ø E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21 22 23 …
  18. 34.

    10 LET S = 0
    15 MAT INPUT V
    20 LET N = NUM
    30 IF N = 0 THEN 99
    40 FOR I = 1 TO N
    45 LET S = S + V(I)
    50 NEXT I
    60 PRINT S/N
    70 GO TO 10
    99 END
  21. 38.

    9074: eb 4a                   jmp 90c0 <_mysql_init_character_set+0x124>
    9076: 8b 43 f8                mov -0x8(%rbx),%eax
    9079: 83 f8 01                cmp $0x1,%eax
    907c: 74 04                   je 9082 <_mysql_init_character_set+0xe6>
    907e: 85 c0                   test %eax,%eax
    9080: 75 06                   jne 9088 <_mysql_init_character_set+0xec>
    9082: 4c 8b 7b f0             mov -0x10(%rbx),%r15
    9086: eb 38                   jmp 90c0 <_mysql_init_character_set+0x124>
    9088: 48 8b 4b f0             mov -0x10(%rbx),%rcx
    908c: 48 8d 35 15 a3 03 00    lea 0x3a315(%rip),%rsi # 433a8 <_zcfree+0x1fd6>
    9093: bf 51 04 00 00          mov $0x451,%edi
    9098: 31 d2                   xor %edx,%edx
    909a: 31 c0                   xor %eax,%eax
    909c: e8 72 76 02 00          callq 30713 <_my_printf_error>
    90a1: 48 8d 35 56 a3 03 00    lea 0x3a356(%rip),%rsi # 433fe <_zcfree+0x202c>
    90a8: 4c 8d 3d 61 b9 03 00    lea 0x3b961(%rip),%r15 # 44a10 <_zcfree+0x363e>
    90af: bf 51 04 00 00          mov $0x451,%edi
    90b4: 31 d2                   xor %edx,%edx
    90b6: 31 c0                   xor %eax,%eax
    90b8: 4c 89 f9                mov %r15,%rcx
  22. 39.

    0000000000030713 <_my_printf_error>:
    30713: 55                     push %rbp
    30714: 48 89 e5               mov %rsp,%rbp
    30717: 41 57                  push %r15
    30719: 41 56                  push %r14
    3071b: 41 54                  push %r12
    3071d: 53                     push %rbx
    3071e: 48 81 ec d0 02 00 00   sub $0x2d0,%rsp
    …
    30932: 4c 3b 75 e8            cmp -0x18(%rbp),%r14
    30936: 75 0c                  jne 30944 <_my_printf_warning+0xda>
    30938: 48 81 c4 d0 02 00 00   add $0x2d0,%rsp
    3093f: 5b                     pop %rbx
    30940: 41 5e                  pop %r14
    30942: 5d                     pop %rbp
    30943: c3                     retq
  34. 73.

    param 1
    param 2
    return address
    local variable 1
    local variable 2
    local variable 3
    local variable
    local variable
  47. 116.

    If data is available in the cache, we have a cache hit. If it’s not available in the cache, we have a cache miss.
  48. 120.

    position, velocity, rotation, armor, shield

    14 used bytes / 32 total bytes = 44% efficiency (for movement)
  55. 139.

    lldb, gdb: debugger
    Instruments (Mac OS X): performance analyser and visualiser
    dtrace: dynamic tracing framework
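
The 44% figure on slide 120 is simply 14 / 32 = 0.4375, rounded. The sketch below reproduces that arithmetic in Python (not Crystal, and not code from the talk). The function names are hypothetical, and the 64-byte cache line is an assumption that the transcript does not state. Under that assumption, it also shows how many records would fit in one cache line if only the 14 bytes used for movement were stored together, which is one plausible reading of why the slide measures efficiency at all.

```python
def efficiency(used_bytes: int, total_bytes: int) -> float:
    """Fraction of the bytes in one record that a pass actually reads."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def records_per_line(line_bytes: int, record_bytes: int) -> int:
    """Whole records that fit in one cache line, ignoring alignment padding."""
    return line_bytes // record_bytes


USED_BYTES = 14   # from the slide: bytes used for movement
TOTAL_BYTES = 32  # from the slide: bytes in one full record
LINE_BYTES = 64   # ASSUMPTION: cache line size, not given in the talk

if __name__ == "__main__":
    print(f"efficiency: {efficiency(USED_BYTES, TOTAL_BYTES):.0%}")
    print("full records per line:", records_per_line(LINE_BYTES, TOTAL_BYTES))
    print("movement-only records per line:", records_per_line(LINE_BYTES, USED_BYTES))
```

With these numbers, two full 32-byte records fit in an assumed 64-byte line, against four 14-byte movement-only records. Real layouts add alignment padding, so treat the second figure as an upper bound.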