Doctrine Performance Optimization

B3b2139e4f2c0eca4efe2379fcebc1c5?s=47 Anna Filina
November 17, 2017

Doctrine Performance Optimization

Is your Doctrine too slow or using too many resources on the server? In this presentation, you will learn the memory, I/O and CPU usage of different approaches. You will also see learn how to create proofs of concept and benchmarks to make a decision based on science instead of your gut feeling. I will even show you how to automate your performance testing.

B3b2139e4f2c0eca4efe2379fcebc1c5?s=128

Anna Filina

November 17, 2017
Tweet

Transcript

  1. @afilina Doctrine Performance Optimization SymfonyCon, Cluj - November 17, 2017

  2. Anna Filina • Project rescue • Legacy code • Development

    • Conferences • Private workshops • elePHPant smuggling
  3. Lots of code & numbers.

  4. Use science.

  5. Resources • CPU • Memory • Disk access

  6. Good benchmarking is hard.

  7. Time Command $ time php benchmarks/1-no-optimizations.php real 0m5.486s user 0m3.298s

    sys 0m0.406s
  8. Logger $config = new \Doctrine\ORM\Configuration(); $sqlLogger = new \Doctrine\DBAL\Logging\DebugStack(); $sqlLogger->enabled;

    $config->setSQLLogger($sqlLogger); $entityManager = EntityManager::create($dbParams, $config);
  9. Benchmark Output $queryTime = 0; foreach ($sqlLogger->queries as $query) {

    $queryTime += $query['executionMS']; } $totalTime = microtime(true) - $startTime; $scriptTime = $totalTime - $queryTime; $numQueries = count($sqlLogger->queries);
  10. Output Query time: 2.4335s Other time: 3.2928s Num queries: 8001

  11. Personalize E-mails. 200,000 of them.

  12. The E-mail Hello {{ firstName }}, You have {{ points

    }} to spend with us! {% if originalSource == 'ebgames' %} Can we interest you in some video games? {% else %} Can we interest you in some other products? {% endif %}
  13. Models

  14. Copy production data or seed.

  15. Production Data • Copy a subset if too big. •

    Anonymize.
  16. fzaninotto/faker

  17. Seed $schemaTool = new \Doctrine\ORM\Tools\SchemaTool($entityManager); $metadata = $entityManager->getMetadataFactory()- >getAllMetadata(); $schemaTool->dropSchema($metadata);

    $schemaTool->createSchema($metadata);
  18. Generate Sets function createPropertySet($faker) { foreach (['company', 'originalSource', 'jobTitle', 'dateOfBirth',

    'points'] as $name) { $property = new Model\CustomProperty(); $property->name = $name; $property->value = $faker->text(50); $set[] = $property; } return $set; }
  19. Generate Sets for ($i=0; $i<2000; $i++) { $user = new

    Model\User(); $user->firstName = $faker->firstName(); $user->lastName = $faker->lastName(); $user->email = $faker->email(); for ($j=0; $j < 5; $j++) { $property = clone $propertySet[$j]; $property->user = $user; $user->customProperties[] = $property; } $entityManager->persist($user); if ($i % 25 == 0) { $entityManager->flush(); } } $entityManager->flush();
  20. Output Batches: real 0m11.490s user 0m9.457s sys 0m0.427s No batches:

    real 3m34.874s user 3m26.552s sys 0m1.846s
  21. 1. The Original Benchmark

  22. Fetch Property For Each User $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user')

    ->select('user') ->setMaxResults($numUsers) ->getQuery()->getResult(); foreach ($users as $i => $user) { foreach ($propertyNames as $property) { $value = $entityManager ->getRepository(Model\CustomProperty::class) ->createQueryBuilder('property') ->select('property') ->where('property.name = :property_name') ->andWhere('property.user = :user_id') ->setParameter('property_name', $property) ->setParameter('user_id', $user->id) ->setMaxResults(1) ->getQuery()->getResult(); } }
  23. Output Query time: 2.4335s Other time: 3.2928s Num queries: 8001

  24. 2. Group By Property

  25. Original User 1 User 2 User 3 Property 1 Lorem

    Ipsum Dolorem Property 2 Quibusdam Esse ipsa Sed Property 3 Non quae Magnam ea Veritatis
  26. Group User 1 User 2 User 3 Property 1 Lorem

    Ipsum Dolorem Property 2 Quibusdam Esse ipsa Sed Property 3 Non quae Magnam ea Veritatis
  27. Fetch Property For All Users $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user')

    ->select('user') ->setMaxResults($numUsers) ->getQuery()->getResult(); $userIds = array_map(function($value) { return $value->id; }, $users); foreach ($propertyNames as $property) { $values = $entityManager ->getRepository(Model\CustomProperty::class) ->createQueryBuilder('property') ->select('property') ->where('property.name = :property_name') ->setParameter('property_name', $property) ->andWhere('property.user IN (:user_ids)') ->setParameter('user_ids', $userIds) ->getQuery()->getResult(); }
  28. Output Query time: 0.0695s Other time: 0.3923s Num queries: 5

  29. Original Original Grouped Diff Query Time 2.4335 0.0695 35 Other

    time 3.2928 0.3923 8 Num Queries 8001 5 1600
  30. The End

  31. Wait, what about memory?

  32. Log Memory logMemory(); $users = ... logMemory(); $userIds = ...

    logMemory(); foreach ($propertyNames as $property) { $values = ... logMemory(); }
  33. Log Memory $memoryLog = []; function logMemory() { global $memoryLog;

    $backtrace = debug_backtrace(); $line = $backtrace[0]['line']; $memoryLog[] = [ 'usage' => memory_get_usage(false), 'line' => $line ]; }
  34. memory_get_usage • true = total memory allocated • false =

    used memory reported • 22.01MB used = 24.00MB allocated
  35. Memory Log Output • Group logs by line number (in

    case of loops). • Get highest memory usage for each group. • Calculate the diff between each line. • Get highest memory of all logs (peak).
  36. Output Memory used: 1.87MB (0.00B) line:7 Memory used: 10.48MB (+8.61MB)

    line:17 Memory used: 10.54MB (+68.42kB) line:23 Memory used: 22.01MB (+11.47MB) line:38 Peak memory used: 22.01MB line:38
  37. Comparison Original Grouped Diff users 8.61MB 8.61MB 0 userIds 0

    68.42kB 68.42kB value(s) 140.20kB 11.47MB 11.33MB Peak 10.61MB 22.01MB 11.4MB
  38. 3. Reduce Hydration

  39. Result Hydration • Create entities. • Copy properties from SQL

    result. • CPU + memory.
  40. Select Specific Columns $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user.id') ->setMaxResults($numUsers)

    ->getQuery()->getResult(); $userIds = ... foreach ($propertyNames as $property) { $values = $entityManager ->getRepository(Model\CustomProperty::class) ->createQueryBuilder('property') ->select('property.id, property.value') ->where('property.name = :property_name') ->andWhere('property.user IN (:user_ids)') ->setParameter('property_name', $property) ->setParameter('user_ids', $userIds) ->getQuery()->getResult(); }
  41. Output Query time: 0.0633s Other time: 0.2389s Num queries: 5

    Memory used: 1.87MB (0.00B) line:7 Memory used: 5.09MB (+3.22MB) line:21 Memory used: 5.09MB (+568.00B) line:25 Memory used: 6.43MB (+1.34MB) line:47 Peak memory used: 6.43MB line:47
  42. Comparison Original Grouped Arrays Query Time 2.4335 0.0695 0.0663 Other

    time 3.2928 0.3923 0.1955 Num Queries 8001 5 5 Peak Memory 10.61MB 22.01MB 6.43MB
  43. But I needed those entities!

  44. Entity Methods $user = new Model\User(); if ($property == 'email')

    { foreach ($values as $value) { $user->email = $value['email']; $mailToTag = $user->getMailToTag(); } }
  45. 4. Deallocate Memory

  46. Unset Users $users = ... $userIds = ... unset($users); foreach

    ($propertyNames as $property) { $values = ... }
  47. Output Query time: 0.0634s Other time: 0.1916s Num queries: 5

    Memory used: 5.02MB (0.00B) line:15 Memory used: 5.09MB (+68.73kB) line:21 Memory used: 4.30MB (-0.78MB) line:23 Memory used: 5.64MB (+1.34MB) line:47 Peak memory used: 5.64MB line:47
  48. Comparison Original Grouped Arrays Unset Query Time 2.4335 0.0695 0.0663

    0.0634 Other time 3.2928 0.3923 0.1955 0.1916 Num Queries 8001 5 5 5 Peak Memory 10.61MB 22.01MB 6.43MB 5.64MB
  49. 5. Split to batches

  50. Unset Users $batchSize = 200; $userIterations = ceil($numUsers / $batchSize);

    for ($i=0; $i < $userIterations; $i++) { $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user.id') ->setFirstResult($i * $batchSize) ->setMaxResults($batchSize) ->getQuery()->getResult(); $userIds = ... unset($users); foreach ($propertyNames as $property) { $values = ... } unset($values); }
  51. Output Query time: 0.1134s Other time: 0.1004s Num queries: 50

    Memory used: 1.87MB (0.00B) line:7 Memory used: 1.88MB (+3.77kB) line:13 Memory used: 4.97MB (+3.10MB) line:30 Memory used: 4.89MB (-85.16kB) line:32 Memory used: 5.04MB (+155.90kB) line:54 Memory used: 4.94MB (-0.10MB) line:58 Peak memory used: 5.04MB line:54
  52. Comparison Original Grouped Arrays Unset Batches Query Time 2.4335 0.0695

    0.0663 0.0634 0.1134 Other time 3.2928 0.3923 0.1955 0.1916 0.1004 Num Queries 8001 5 5 5 50 Peak Memory 10.61MB 22.01MB 6.43MB 5.64MB 5.04MB
  53. The query log uses CPU & memory.

  54. Without Query Log Original Grouped Arrays Unset Batches Query Time

    2.4335 0.0695 0.0663 0.0634 0.1134 Overall time 4.7104 0.4044 0.2519 0.2949 0.2157 Num Queries 8001 5 5 5 50 Peak Memory 10.61MB 21.74MB 6.16MB 5.37MB 4.52MB
  55. 6. With lazy loading

  56. Lazy Load $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user') ->getQuery() ->getResult();

    $billing = []; foreach ($users as $user) { if ($user->billingAddress) { $billing[$user->id] = $user->billingAddress->address; } }
  57. Output Query time: 0.2978s Other time: 0.2514s Num queries: 2001

    Memory used: 1.87MB (0.00B) line:7 Memory used: 10.50MB (+8.63MB) line:16 Memory used: 15.54MB (+5.04MB) line:26
  58. 7. Use joins

  59. Lazy Load $users = $entityManager ->getRepository(Model\User::class) ->createQueryBuilder('user') ->select('user', 'billingAddress') ->leftJoin('user.billingAddress',

    'billingAddress') ->getQuery() ->getResult(); $billing = []; foreach ($users as $user) { if ($user->billingAddress) { $billing[$user->id] = $user->billingAddress->address; } }
  60. Output Query time: 0.0121s Other time: 0.1342s Num queries: 1

    Memory used: 1.87MB (0.00B) line:7 Memory used: 12.15MB (+10.28MB) line:17 Memory used: 12.22MB (+68.42kB) line:27 Peak memory used: 12.22MB line:27
  61. Comparison Original Left join Query Time 0.2978 0.0121s Other time

    0.2514 0.1342s Num Queries 2001 1 Peak Memory 15.54MB 12.22MB
  62. Commit your benchmarks to the repo.

  63. Automate performance testing.

  64. API { "data": [], "perf": { "query_time": 0.0634, "query_count": 5,

    "overall_time": 0.2949, "peak_memory": 5632304 } }
  65. Behat And the memory should not exceed 6MB And the

    request should not exceed 0.2s And the request should not exceed 5 queries
  66. Resources • http://docs.doctrine-project.org/ projects/doctrine-orm/en/latest/ reference/improving-performance.html • https://ocramius.github.io/blog/ doctrine-orm-optimization-hydration/

  67. @afilina afilina.com joind.in