Slide 1

Slide 1 text

@afilina Doctrine Performance Optimization SymfonyCon, Cluj - November 17, 2017

Slide 2

Slide 2 text

Anna Filina • Project rescue • Legacy code • Development • Conferences • Private workshops • elePHPant smuggling

Slide 3

Slide 3 text

Lots of code & numbers.

Slide 4

Slide 4 text

Use science.

Slide 5

Slide 5 text

Resources • CPU • Memory • Disk access

Slide 6

Slide 6 text

Good benchmarking is hard.

Slide 7

Slide 7 text

Time Command $ time php benchmarks/1-no-optimizations.php real 0m5.486s user 0m3.298s sys 0m0.406s

Slide 8

Slide 8 text

Logger $config = new \Doctrine\ORM\Configuration(); $sqlLogger = new \Doctrine\DBAL\Logging\DebugStack(); $sqlLogger->enabled; $config->setSQLLogger($sqlLogger); $entityManager = EntityManager::create($dbParams, $config);

Slide 9

Slide 9 text

Benchmark Output $queryTime = 0; foreach ($sqlLogger->queries as $query) { $queryTime += $query['executionMS']; } $totalTime = microtime(true) - $startTime; $scriptTime = $totalTime - $queryTime; $numQueries = count($sqlLogger->queries);

Slide 10

Slide 10 text

Output Query time: 2.4335s Other time: 3.2928s Num queries: 8001

Slide 11

Slide 11 text

Personalize E-mails. 200,000 of them.

Slide 12

Slide 12 text

The E-mail Hello {{ firstName }}, You have {{ points }} to spend with us! {% if originalSource == 'ebgames' %} Can we interest you in some video games? {% else %} Can we interest you in some other products? {% endif %}

Slide 13

Slide 13 text

Models

Slide 14

Slide 14 text

Copy production data or seed.

Slide 15

Slide 15 text

Production Data • Copy a subset if too big. • Anonymize.

Slide 16

Slide 16 text

fzaninotto/faker

Slide 17

Slide 17 text

Seed $schemaTool = new \Doctrine\ORM\Tools\SchemaTool($entityManager); $metadata = $entityManager->getMetadataFactory()- >getAllMetadata(); $schemaTool->dropSchema($metadata); $schemaTool->createSchema($metadata);

Slide 18

Slide 18 text

Generate Sets function createPropertySet($faker) { foreach (['company', 'originalSource', 'jobTitle', 'dateOfBirth', 'points'] as $name) { $property = new Model\CustomProperty(); $property->name = $name; $property->value = $faker->text(50); $set[] = $property; } return $set; }

Slide 19

Slide 19 text

Generate Sets for ($i=0; $i<2000; $i++) { $user = new Model\User(); $user->firstName = $faker->firstName(); $user->lastName = $faker->lastName(); $user->email = $faker->email(); for ($j=0; $j < 5; $j++) { $property = clone $propertySet[$j]; $property->user = $user; $user->customProperties[] = $property; } $entityManager->persist($user); if ($i % 25 == 0) { $entityManager->flush(); } } $entityManager->flush();

Slide 20

Slide 20 text

Output Batches: real 0m11.490s user 0m9.457s sys 0m0.427s No batches: real 3m34.874s user 3m26.552s sys 0m1.846s

Slide 21

Slide 21 text

1. The Original Benchmark

Slide 22

Slide 22 text

Fetch Property For Each User $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user') ->setMaxResults($numUsers) ->getQuery()->getResult(); foreach ($users as $i => $user) { foreach ($propertyNames as $property) { $value = $entityManager ->getRepository(Model\CustomProperty::class) ->createQueryBuilder('property') ->select('property') ->where('property.name = :property_name') ->andWhere('property.user = :user_id') ->setParameter('property_name', $property) ->setParameter('user_id', $user->id) ->setMaxResults(1) ->getQuery()->getResult(); } }

Slide 23

Slide 23 text

Output Query time: 2.4335s Other time: 3.2928s Num queries: 8001

Slide 24

Slide 24 text

2. Group By Property

Slide 25

Slide 25 text

Original User 1 User 2 User 3 Property 1 Lorem Ipsum Dolorem Property 2 Quibusdam Esse ipsa Sed Property 3 Non quae Magnam ea Veritatis

Slide 26

Slide 26 text

Group User 1 User 2 User 3 Property 1 Lorem Ipsum Dolorem Property 2 Quibusdam Esse ipsa Sed Property 3 Non quae Magnam ea Veritatis

Slide 27

Slide 27 text

Fetch Property For All Users $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user') ->setMaxResults($numUsers) ->getQuery()->getResult(); $userIds = array_map(function($value) { return $value->id; }, $users); foreach ($propertyNames as $property) { $values = $entityManager ->getRepository(Model\CustomProperty::class) ->createQueryBuilder('property') ->select('property') ->where('property.name = :property_name') ->setParameter('property_name', $property) ->andWhere('property.user IN (:user_ids)') ->setParameter('user_ids', $userIds) ->getQuery()->getResult(); }

Slide 28

Slide 28 text

Output Query time: 0.0695s Other time: 0.3923s Num queries: 5

Slide 29

Slide 29 text

Original Original Grouped Diff Query Time 2.4335 0.0695 35 Other time 3.2928 0.3923 8 Num Queries 8001 5 1600

Slide 30

Slide 30 text

The End

Slide 31

Slide 31 text

Wait, what about memory?

Slide 32

Slide 32 text

Log Memory logMemory(); $users = ... logMemory(); $userIds = ... logMemory(); foreach ($propertyNames as $property) { $values = ... logMemory(); }

Slide 33

Slide 33 text

Log Memory $memoryLog = []; function logMemory() { global $memoryLog; $backtrace = debug_backtrace(); $line = $backtrace[0]['line']; $memoryLog[] = [ 'usage' => memory_get_usage(false), 'line' => $line ]; }

Slide 34

Slide 34 text

memory_get_usage • true = total memory allocated • false = used memory reported • 22.01MB used = 24.00MB allocated

Slide 35

Slide 35 text

Memory Log Output • Group logs by line number (in case of loops). • Get highest memory usage for each group. • Calculate the diff between each line. • Get highest memory of all logs (peak).

Slide 36

Slide 36 text

Output Memory used: 1.87MB (0.00B) line:7 Memory used: 10.48MB (+8.61MB) line:17 Memory used: 10.54MB (+68.42kB) line:23 Memory used: 22.01MB (+11.47MB) line:38 Peak memory used: 22.01MB line:38

Slide 37

Slide 37 text

Comparison Original Grouped Diff users 8.61MB 8.61MB 0 userIds 0 68.42kB 68.42kB value(s) 140.20kB 11.47MB 11.33MB Peak 10.61MB 22.01MB 11.4MB

Slide 38

Slide 38 text

3. Reduce Hydration

Slide 39

Slide 39 text

Result Hydration • Create entities. • Copy properties from SQL result. • CPU + memory.

Slide 40

Slide 40 text

Select Specific Columns $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user.id') ->setMaxResults($numUsers) ->getQuery()->getResult(); $userIds = ... foreach ($propertyNames as $property) { $values = $entityManager ->getRepository(Model\CustomProperty::class) ->createQueryBuilder('property') ->select('property.id, property.value') ->where('property.name = :property_name') ->andWhere('property.user IN (:user_ids)') ->setParameter('property_name', $property) ->setParameter('user_ids', $userIds) ->getQuery()->getResult(); }

Slide 41

Slide 41 text

Output Query time: 0.0633s Other time: 0.2389s Num queries: 5 Memory used: 1.87MB (0.00B) line:7 Memory used: 5.09MB (+3.22MB) line:21 Memory used: 5.09MB (+568.00B) line:25 Memory used: 6.43MB (+1.34MB) line:47 Peak memory used: 6.43MB line:47

Slide 42

Slide 42 text

Comparison Original Grouped Arrays Query Time 2.4335 0.0695 0.0663 Other time 3.2928 0.3923 0.1955 Num Queries 8001 5 5 Peak Memory 10.61MB 22.01MB 6.43MB

Slide 43

Slide 43 text

But I needed those entities!

Slide 44

Slide 44 text

Entity Methods $user = new Model\User(); if ($property == 'email') { foreach ($values as $value) { $user->email = $value['email']; $mailToTag = $user->getMailToTag(); } }

Slide 45

Slide 45 text

4. Deallocate Memory

Slide 46

Slide 46 text

Unset Users $users = ... $userIds = ... unset($users); foreach ($propertyNames as $property) { $values = ... }

Slide 47

Slide 47 text

Output Query time: 0.0634s Other time: 0.1916s Num queries: 5 Memory used: 5.02MB (0.00B) line:15 Memory used: 5.09MB (+68.73kB) line:21 Memory used: 4.30MB (-0.78MB) line:23 Memory used: 5.64MB (+1.34MB) line:47 Peak memory used: 5.64MB line:47

Slide 48

Slide 48 text

Comparison Original Grouped Arrays Unset Query Time 2.4335 0.0695 0.0663 0.0634 Other time 3.2928 0.3923 0.1955 0.1916 Num Queries 8001 5 5 5 Peak Memory 10.61MB 22.01MB 6.43MB 5.64MB

Slide 49

Slide 49 text

5. Split to batches

Slide 50

Slide 50 text

Unset Users $batchSize = 200; $userIterations = ceil($numUsers / $batchSize); for ($i=0; $i < $userIterations; $i++) { $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user.id') ->setFirstResult($i * $batchSize) ->setMaxResults($batchSize) ->getQuery()->getResult(); $userIds = ... unset($users); foreach ($propertyNames as $property) { $values = ... } unset($values); }

Slide 51

Slide 51 text

Output Query time: 0.1134s Other time: 0.1004s Num queries: 50 Memory used: 1.87MB (0.00B) line:7 Memory used: 1.88MB (+3.77kB) line:13 Memory used: 4.97MB (+3.10MB) line:30 Memory used: 4.89MB (-85.16kB) line:32 Memory used: 5.04MB (+155.90kB) line:54 Memory used: 4.94MB (-0.10MB) line:58 Peak memory used: 5.04MB line:54

Slide 52

Slide 52 text

Comparison Original Grouped Arrays Unset Batches Query Time 2.4335 0.0695 0.0663 0.0634 0.1134 Other time 3.2928 0.3923 0.1955 0.1916 0.1004 Num Queries 8001 5 5 5 50 Peak Memory 10.61MB 22.01MB 6.43MB 5.64MB 5.04MB

Slide 53

Slide 53 text

The query log uses CPU & memory.

Slide 54

Slide 54 text

Without Query Log Original Grouped Arrays Unset Batches Query Time 2.4335 0.0695 0.0663 0.0634 0.1134 Overall time 4.7104 0.4044 0.2519 0.2949 0.2157 Num Queries 8001 5 5 5 50 Peak Memory 10.61MB 21.74MB 6.16MB 5.37MB 4.52MB

Slide 55

Slide 55 text

6. With lazy loading

Slide 56

Slide 56 text

Lazy Load $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user') ->getQuery() ->getResult(); $billing = []; foreach ($users as $user) { if ($user->billingAddress) { $billing[$user->id] = $user->billingAddress->address; } }

Slide 57

Slide 57 text

Output Query time: 0.2978s Other time: 0.2514s Num queries: 2001 Memory used: 1.87MB (0.00B) line:7 Memory used: 10.50MB (+8.63MB) line:16 Memory used: 15.54MB (+5.04MB) line:26

Slide 58

Slide 58 text

7. Use joins

Slide 59

Slide 59 text

Lazy Load $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user', 'billingAddress') ->leftJoin('user.billingAddress', 'billingAddress') ->getQuery() ->getResult(); $billing = []; foreach ($users as $user) { if ($user->billingAddress) { $billing[$user->id] = $user->billingAddress->address; } }

Slide 60

Slide 60 text

Output Query time: 0.0121s Other time: 0.1342s Num queries: 1 Memory used: 1.87MB (0.00B) line:7 Memory used: 12.15MB (+10.28MB) line:17 Memory used: 12.22MB (+68.42kB) line:27 Peak memory used: 12.22MB line:27

Slide 61

Slide 61 text

Comparison Original Left join Query Time 0.2978 0.0121s Other time 0.2514 0.1342s Num Queries 2001 1 Peak Memory 15.54MB 12.22MB

Slide 62

Slide 62 text

Commit your benchmarks to the repo.

Slide 63

Slide 63 text

Automate performance testing.

Slide 64

Slide 64 text

API { "data": [], "perf": { "query_time": 0.0634, "query_count": 5, "overall_time": 0.2949, "peak_memory": 5632304 } }

Slide 65

Slide 65 text

Behat And the memory should not exceed 6MB And the request should not exceed 0.2s And the request should not exceed 5 queries

Slide 66

Slide 66 text

Resources • http://docs.doctrine-project.org/ projects/doctrine-orm/en/latest/ reference/improving-performance.html • https://ocramius.github.io/blog/ doctrine-orm-optimization-hydration/

Slide 67

Slide 67 text

@afilina afilina.com joind.in