Slide 1

Separating Allocation from Code
A journey of discovery

Slide 2

Separating Allocation from Code
A blob implementation in C++

Slide 3

Separating Allocation from Code
Things I learnt coding AAA games that I’ve had to unlearn

Slide 4

Separating Allocation from Code
How my mind was blown repeatedly

Slide 5

Separating Allocation from Code
How John McCarthy was right after all

Slide 6

Separating Allocation from Code
The endless quest for a descriptive subtitle

Slide 7

Elixir Studios

Slide 8


Slide 9

class Monster : public Entity {
    int size;
    string name;
    int attributes[100];
    Animation* animation;
public:
    Monster(string const& n, EntityTemplate* t) {
        size = t->size;
        name = n;
        animation = new Animation(n + ".ani");
    }
};

Slide 10

vector<Monster*> monsters;
Monster* m = new Monster("Shrek", t);
monsters.push_back(m);

Slide 11

auto it = monsters.begin();
for (; it != monsters.end(); ++it) {
    if ((*it)->size > SMALL) {
        render(*it);
    }
}

Slide 12

Ruby on Rails, Coaching, Training

Slide 13

Sol Trader
soltrader.net

Slide 14

Blob

Slide 15

Blob

Slide 16


Slide 17

“I want to say something about implementing blobs that also applies to programming in general. Separate your memory allocation policy from your algorithm code. A higher level module should be used to glue the two together in the most optimal way. It always troubles me to see modules that allocate their own memory…”
www.johnmccutchan.com/2011/10/programmer-and-his-blobs.html

Slide 18

class Blob {
    int8* data;
public:
    Blob(size_t size) {
        data = new int8[size];
    }
    …
};

Blob* obj = new Blob(1024);

Slide 19

struct Blob {
    int8* data;
    size_t size;
};

int8 backend[1024];
Blob blob = { backend, 1024 };

Slide 20

struct Blob {
    int8* data;
    size_t size;
};

int8* backend = new int8[1024];
Blob blob = { backend, 1024 };

Slide 21

struct Blob {
    int8* data;
    size_t size;
};

int8* file = getFileArray("foo.txt");
Blob blob = { file, 1024 };

Slide 22

“Tying allocation to code is a form of coupling, and coupling sucks.”

Slide 23

Tying allocation to code is a form of coupling
bit.ly/1pHzTNf

Slide 24

class Blob {
    char *_data;
public:
    Blob() : _data(NULL) {}
    Blob(char* data) : _data(data) {}
    template <typename T>
    T* get(unsigned offset) const {
        return (T*)(_data + offset);
    }
};

Slide 25

Integer(Blob blob) : _blob(blob) {}

void set(int value) {
    _blob.set1(0, BL_INT);
    _blob.set4(1, value);
}

Slide 26

Array(Blob blob, unsigned capacity) {
    _blob = blob;
    _blob.set1(0, BL_ARRAY);
    _blob.set4(CAPACITY_OFFSET, capacity);
    _blob.set4(USED_OFFSET, FIRST_VALUE);
    _blob.set4(SIZE_OFFSET, 0);
}

char data[128];
Blob blob(data);
Array array(blob, 128);

[Diagram: blob header after construction: TYPE | CAPACITY (128) | USED (13) | SIZE (0)]

Slide 27

void Array::push_back(int value) {
    fix(place().set(value));
}

[Diagram: blob after one push: TYPE | CAPACITY | USED (18) | SIZE (1), followed by TYPE | VALUE for the element at offset 13]

Slide 28

Results
Random access was slower.
Sequential iteration was faster.

Slide 29

RANDOM ACCESS
Random access was at least an order of magnitude slower for all sizes.
These blobs aren’t designed for random access.

Slide 30

SEQUENTIAL TRAVERSAL
With a map containing 50 vectors of 500 numbers each: STL implementation was ~20% quicker.
With a map containing 500 vectors of 50 numbers each: Blob implementation was ~30% quicker.

Slide 31

void Array::push_back(int value) {
    fix(place().set(value));
}

Slide 32

struct Blob {
    int8* data;
    size_t size;
};

int8 backend[1024];
Blob blob = { backend, 1024 };

Slide 33


Slide 34

class AiData {
    char _goalData[GOAL_DATA_SIZE];
    char _knData[MAX_ACTORS][ACTOR_K_SIZE];
    char _qData[MAX_ACTORS][ACTOR_Q_SIZE];
    char _sharedKnData[SHARED_K_SIZE];
    char _sharedQData[SHARED_Q_SIZE];
};

AiData* ai = new AiData;

Slide 35

Data-oriented design

Slide 36

Memory is slow

Slide 37

Memory is getting slower (relatively)

Slide 38

Memory is expensive

Slide 39

This presentation brought to you by…
[Diagram: Intel Core i5 (Haswell) memory hierarchy: two cores, each with 64 KB L1, an L2 cache (512 KB), an L3 cache (3,072 KB), and main memory (16,777,216 KB) reached over the bus]

Slide 40

class Blob {
    int8* data;
public:
    Blob(size_t size) {
        data = new int8[size];
    }
    …
};

Blob* obj = new Blob(1024);

Slide 41

[Diagram: the CPU fetches the Blob object from the heap (1), then follows its pointer to the data elsewhere on the heap (2): two dependent memory accesses]

Slide 42

sed https://www.flickr.com/photos/k_putt/sets/72157644129911474/

Slide 43

auto it = monsters.begin();
for (; it != monsters.end(); ++it) {
    if ((*it)->size > SMALL) {
        render(*it);
    }
}

[Diagram: Monsters scattered on the heap, with render() called on each large one]

Slide 44

auto it = monsters.begin();
for (; it != monsters.end(); ++it) {
    // update, then
    if ((*it)->sizeChanged()) {
        copyToRender(*it, monstersToRender);
    }
}

Slide 45

[Diagram: Monsters copied into MonstersToRender]

Slide 46

auto it = monstersToRender.begin();
for (; it != monstersToRender.end(); ++it) {
    render(*it);
}

[Diagram: MonstersToRender, with render() called on each element]

Slide 47

Advantages

Slide 48

Parallelisation is cheap
[Diagram: MonstersToRender split across Thread 1, Thread 2, Thread 3]

Slide 49


Slide 50

vector<Monster*> monsters;

[Diagram: Monsters and MonstersToRender hold pointers (*) to objects scattered around the heap]

Slide 51

vector<Monster> monsters;

[Diagram: Monsters and MonstersToRender hold Monster objects (M) inline]
Now we’re copying the object, not a pointer!

Slide 52

Unit testing is easy
[Diagram: Monsters mapped to MonstersToRender]

Slide 53

It’s rather like… Functional Programming

Slide 54

Functional programming
We’re mapping data to data: easy to reason about, referential transparency for free.
You’re copying data (or modifying in place as an optimisation), leveraging the properties of immutability.

Slide 55

So what?

Slide 56

Optimise early! (sometimes)
Sometimes you need to completely change your paradigm.

Slide 57

Gold-plate it! (sometimes)
Sometimes you need to keep cleaning code, learning and investigating. It’s amazing what you’ll find out.
Kent Beck: “I’m in the habit of trying stupid things out to see what happens”

Slide 58

Pay attention

Slide 59

Web developers
You’re already doing this kind of work with SQL.
Consider your data before building elaborate and messy OO code.
Thinking this way helps to break work into asynchronous jobs.
Keep an eye on Gary Bernhardt’s new Structured Design work.

Slide 60

C++/C# developers
Check out data-oriented design if you haven’t already.
Be wary of naïve OO design.
The future of performance is parallelisation: don’t write a program that’s hard to parallelise.
Unlock the power of modern GPUs.

Slide 61

Questions?
[email protected] / @chrismdp on Twitter
I coach and train teams on agile, code quality, and BDD.
I make Sol Trader (http://soltrader.net) and Card Pirates (http://cardpirates.com).
I co-founded Kickstart Academy: http://kickstartacademy.io