$30 off During Our Annual Pro Sale. View Details »

Doctrine Performance Optimization

Doctrine Performance Optimization

Is your Doctrine too slow or using too many resources on the server? In this presentation, you will learn the memory, I/O and CPU usage of different approaches. You will also see learn how to create proofs of concept and benchmarks to make a decision based on science instead of your gut feeling. I will even show you how to automate your performance testing.

Anna Filina
PRO

November 17, 2017
Tweet

More Decks by Anna Filina

Other Decks in Programming

Transcript

  1. @afilina
    Doctrine Performance Optimization
    SymfonyCon, Cluj - November 17, 2017

    View Slide

  2. Anna Filina
    • Project rescue
    • Legacy code
    • Development
    • Conferences
    • Private workshops
    • elePHPant smuggling

    View Slide

  3. Lots of code &
    numbers.

    View Slide

  4. Use science.

    View Slide

  5. Resources
    • CPU
    • Memory
    • Disk access

    View Slide

  6. Good benchmarking
    is hard.

    View Slide

  7. Time Command
    $ time php benchmarks/1-no-optimizations.php
    real 0m5.486s
    user 0m3.298s
    sys 0m0.406s

    View Slide

  8. Logger
    $config = new \Doctrine\ORM\Configuration();
    $sqlLogger = new \Doctrine\DBAL\Logging\DebugStack();
    $sqlLogger->enabled;
    $config->setSQLLogger($sqlLogger);
    $entityManager = EntityManager::create($dbParams, $config);

    View Slide

  9. Benchmark Output
    $queryTime = 0;
    foreach ($sqlLogger->queries as $query) {
    $queryTime += $query['executionMS'];
    }
    $totalTime = microtime(true) - $startTime;
    $scriptTime = $totalTime - $queryTime;
    $numQueries = count($sqlLogger->queries);

    View Slide

  10. Output
    Query time: 2.4335s
    Other time: 3.2928s
    Num queries: 8001

    View Slide

  11. Personalize E-mails.
    200,000 of them.

    View Slide

  12. The E-mail
    Hello {{ firstName }},
    You have {{ points }} to spend with us!
    {% if originalSource == 'ebgames' %}
    Can we interest you in some video games?
    {% else %}
    Can we interest you in some other products?
    {% endif %}

    View Slide

  13. Models

    View Slide

  14. Copy production
    data or seed.

    View Slide

  15. Production Data
    • Copy a subset if too big.
    • Anonymize.

    View Slide

  16. fzaninotto/faker

    View Slide

  17. Seed
    $schemaTool = new
    \Doctrine\ORM\Tools\SchemaTool($entityManager);
    $metadata = $entityManager->getMetadataFactory()-
    >getAllMetadata();
    $schemaTool->dropSchema($metadata);
    $schemaTool->createSchema($metadata);

    View Slide

  18. Generate Sets
    function createPropertySet($faker)
    {
    foreach (['company', 'originalSource', 'jobTitle',
    'dateOfBirth', 'points'] as $name) {
    $property = new Model\CustomProperty();
    $property->name = $name;
    $property->value = $faker->text(50);
    $set[] = $property;
    }
    return $set;
    }

    View Slide

  19. Generate Sets
    for ($i=0; $i<2000; $i++)
    {
    $user = new Model\User();
    $user->firstName = $faker->firstName();
    $user->lastName = $faker->lastName();
    $user->email = $faker->email();
    for ($j=0; $j < 5; $j++) {
    $property = clone $propertySet[$j];
    $property->user = $user;
    $user->customProperties[] = $property;
    }
    $entityManager->persist($user);
    if ($i % 25 == 0) {
    $entityManager->flush();
    }
    }
    $entityManager->flush();

    View Slide

  20. Output
    Batches:
    real 0m11.490s
    user 0m9.457s
    sys 0m0.427s
    No batches:
    real 3m34.874s
    user 3m26.552s
    sys 0m1.846s

    View Slide

  21. 1. The Original
    Benchmark

    View Slide

  22. Fetch Property For Each User
    $users = $entityManager
    ->getRepository(Model\User::class)
    ->createQueryBuilder('user')
    ->select('user')
    ->setMaxResults($numUsers)
    ->getQuery()->getResult();
    foreach ($users as $i => $user) {
    foreach ($propertyNames as $property) {
    $value = $entityManager
    ->getRepository(Model\CustomProperty::class)
    ->createQueryBuilder('property')
    ->select('property')
    ->where('property.name = :property_name')
    ->andWhere('property.user = :user_id')
    ->setParameter('property_name', $property)
    ->setParameter('user_id', $user->id)
    ->setMaxResults(1)
    ->getQuery()->getResult();
    }
    }

    View Slide

  23. Output
    Query time: 2.4335s
    Other time: 3.2928s
    Num queries: 8001

    View Slide

  24. 2. Group By
    Property

    View Slide

  25. Original
    User 1 User 2 User 3
    Property 1 Lorem Ipsum Dolorem
    Property 2 Quibusdam Esse ipsa Sed
    Property 3 Non quae Magnam ea Veritatis

    View Slide

  26. Group
    User 1 User 2 User 3
    Property 1 Lorem Ipsum Dolorem
    Property 2 Quibusdam Esse ipsa Sed
    Property 3 Non quae Magnam ea Veritatis

    View Slide

  27. Fetch Property For All Users
    $users = $entityManager
    ->getRepository(Model\User::class)
    ->createQueryBuilder('user')
    ->select('user')
    ->setMaxResults($numUsers)
    ->getQuery()->getResult();
    $userIds = array_map(function($value) {
    return $value->id;
    }, $users);
    foreach ($propertyNames as $property) {
    $values = $entityManager
    ->getRepository(Model\CustomProperty::class)
    ->createQueryBuilder('property')
    ->select('property')
    ->where('property.name = :property_name')
    ->setParameter('property_name', $property)
    ->andWhere('property.user IN (:user_ids)')
    ->setParameter('user_ids', $userIds)
    ->getQuery()->getResult();
    }

    View Slide

  28. Output
    Query time: 0.0695s
    Other time: 0.3923s
    Num queries: 5

    View Slide

  29. Original
    Original Grouped Diff
    Query Time 2.4335 0.0695 35
    Other time 3.2928 0.3923 8
    Num Queries 8001 5 1600

    View Slide

  30. The End

    View Slide

  31. Wait, what about
    memory?

    View Slide

  32. Log Memory
    logMemory();
    $users = ...
    logMemory();
    $userIds = ...
    logMemory();
    foreach ($propertyNames as $property)
    {
    $values = ...
    logMemory();
    }

    View Slide

  33. Log Memory
    $memoryLog = [];
    function logMemory()
    {
    global $memoryLog;
    $backtrace = debug_backtrace();
    $line = $backtrace[0]['line'];
    $memoryLog[] = [
    'usage' => memory_get_usage(false),
    'line' => $line
    ];
    }

    View Slide

  34. memory_get_usage
    • true = total memory allocated
    • false = used memory reported
    • 22.01MB used = 24.00MB allocated

    View Slide

  35. Memory Log Output
    • Group logs by line number (in case of
    loops).
    • Get highest memory usage for each
    group.
    • Calculate the diff between each line.
    • Get highest memory of all logs (peak).

    View Slide

  36. Output
    Memory used: 1.87MB (0.00B) line:7
    Memory used: 10.48MB (+8.61MB) line:17
    Memory used: 10.54MB (+68.42kB) line:23
    Memory used: 22.01MB (+11.47MB) line:38
    Peak memory used: 22.01MB line:38

    View Slide

  37. Comparison
    Original Grouped Diff
    users 8.61MB 8.61MB 0
    userIds 0 68.42kB 68.42kB
    value(s) 140.20kB 11.47MB 11.33MB
    Peak 10.61MB 22.01MB 11.4MB

    View Slide

  38. 3. Reduce Hydration

    View Slide

  39. Result Hydration
    • Create entities.
    • Copy properties from SQL result.
    • CPU + memory.

    View Slide

  40. Select Specific Columns
    $users = $entityManager
    ->getRepository(Model\User::class)
    ->createQueryBuilder('user')
    ->select('user.id')
    ->setMaxResults($numUsers)
    ->getQuery()->getResult();
    $userIds = ...
    foreach ($propertyNames as $property) {
    $values = $entityManager
    ->getRepository(Model\CustomProperty::class)
    ->createQueryBuilder('property')
    ->select('property.id, property.value')
    ->where('property.name = :property_name')
    ->andWhere('property.user IN (:user_ids)')
    ->setParameter('property_name', $property)
    ->setParameter('user_ids', $userIds)
    ->getQuery()->getResult();
    }

    View Slide

  41. Output
    Query time: 0.0633s
    Other time: 0.2389s
    Num queries: 5
    Memory used: 1.87MB (0.00B) line:7
    Memory used: 5.09MB (+3.22MB) line:21
    Memory used: 5.09MB (+568.00B) line:25
    Memory used: 6.43MB (+1.34MB) line:47
    Peak memory used: 6.43MB line:47

    View Slide

  42. Comparison
    Original Grouped Arrays
    Query Time 2.4335 0.0695 0.0663
    Other time 3.2928 0.3923 0.1955
    Num Queries 8001 5 5
    Peak Memory 10.61MB 22.01MB 6.43MB

    View Slide

  43. But I needed those
    entities!

    View Slide

  44. Entity Methods
    $user = new Model\User();
    if ($property == 'email') {
    foreach ($values as $value) {
    $user->email = $value['email'];
    $mailToTag = $user->getMailToTag();
    }
    }

    View Slide

  45. 4. Deallocate
    Memory

    View Slide

  46. Unset Users
    $users = ...
    $userIds = ...
    unset($users);
    foreach ($propertyNames as $property) {
    $values = ...
    }

    View Slide

  47. Output
    Query time: 0.0634s
    Other time: 0.1916s
    Num queries: 5
    Memory used: 5.02MB (0.00B) line:15
    Memory used: 5.09MB (+68.73kB) line:21
    Memory used: 4.30MB (-0.78MB) line:23
    Memory used: 5.64MB (+1.34MB) line:47
    Peak memory used: 5.64MB line:47

    View Slide

  48. Comparison
    Original Grouped Arrays Unset
    Query Time 2.4335 0.0695 0.0663 0.0634
    Other time 3.2928 0.3923 0.1955 0.1916
    Num
    Queries
    8001 5 5 5
    Peak
    Memory
    10.61MB 22.01MB 6.43MB 5.64MB

    View Slide

  49. 5. Split to batches

    View Slide

  50. Unset Users
    $batchSize = 200;
    $userIterations = ceil($numUsers / $batchSize);
    for ($i=0; $i < $userIterations; $i++) {
    $users = $entityManager
    ->getRepository(Model\User::class)
    ->createQueryBuilder('user')
    ->select('user.id')
    ->setFirstResult($i * $batchSize)
    ->setMaxResults($batchSize)
    ->getQuery()->getResult();
    $userIds = ...
    unset($users);
    foreach ($propertyNames as $property) {
    $values = ...
    }
    unset($values);
    }

    View Slide

  51. Output
    Query time: 0.1134s
    Other time: 0.1004s
    Num queries: 50
    Memory used: 1.87MB (0.00B) line:7
    Memory used: 1.88MB (+3.77kB) line:13
    Memory used: 4.97MB (+3.10MB) line:30
    Memory used: 4.89MB (-85.16kB) line:32
    Memory used: 5.04MB (+155.90kB) line:54
    Memory used: 4.94MB (-0.10MB) line:58
    Peak memory used: 5.04MB line:54

    View Slide

  52. Comparison
    Original Grouped Arrays Unset Batches
    Query
    Time
    2.4335 0.0695 0.0663 0.0634 0.1134
    Other
    time
    3.2928 0.3923 0.1955 0.1916 0.1004
    Num
    Queries
    8001 5 5 5 50
    Peak
    Memory
    10.61MB 22.01MB 6.43MB 5.64MB 5.04MB

    View Slide

  53. The query log uses
    CPU & memory.

    View Slide

  54. Without Query Log
    Original Grouped Arrays Unset Batches
    Query
    Time
    2.4335 0.0695 0.0663 0.0634 0.1134
    Overall
    time
    4.7104 0.4044 0.2519 0.2949 0.2157
    Num
    Queries
    8001 5 5 5 50
    Peak
    Memory
    10.61MB 21.74MB 6.16MB 5.37MB 4.52MB

    View Slide

  55. 6. With lazy loading

    View Slide

  56. Lazy Load
    $users = $entityManager
    ->getRepository(Model\User::class)
    ->createQueryBuilder('user')
    ->select('user')
    ->getQuery()
    ->getResult();
    $billing = [];
    foreach ($users as $user) {
    if ($user->billingAddress) {
    $billing[$user->id] = $user->billingAddress->address;
    }
    }

    View Slide

  57. Output
    Query time: 0.2978s
    Other time: 0.2514s
    Num queries: 2001
    Memory used: 1.87MB (0.00B) line:7
    Memory used: 10.50MB (+8.63MB) line:16
    Memory used: 15.54MB (+5.04MB) line:26

    View Slide

  58. 7. Use joins

    View Slide

  59. Lazy Load
    $users = $entityManager
    ->getRepository(Model\User::class)
    ->createQueryBuilder('user')
    ->select('user', 'billingAddress')
    ->leftJoin('user.billingAddress', 'billingAddress')
    ->getQuery()
    ->getResult();
    $billing = [];
    foreach ($users as $user) {
    if ($user->billingAddress) {
    $billing[$user->id] = $user->billingAddress->address;
    }
    }

    View Slide

  60. Output
    Query time: 0.0121s
    Other time: 0.1342s
    Num queries: 1
    Memory used: 1.87MB (0.00B) line:7
    Memory used: 12.15MB (+10.28MB) line:17
    Memory used: 12.22MB (+68.42kB) line:27
    Peak memory used: 12.22MB line:27

    View Slide

  61. Comparison
    Original Left join
    Query Time 0.2978 0.0121s
    Other time 0.2514 0.1342s
    Num Queries 2001 1
    Peak Memory 15.54MB 12.22MB

    View Slide

  62. Commit your
    benchmarks to the repo.

    View Slide

  63. Automate performance
    testing.

    View Slide

  64. API
    {
    "data": [],
    "perf": {
    "query_time": 0.0634,
    "query_count": 5,
    "overall_time": 0.2949,
    "peak_memory": 5632304
    }
    }

    View Slide

  65. Behat
    And the memory should not exceed 6MB
    And the request should not exceed 0.2s
    And the request should not exceed 5 queries

    View Slide

  66. Resources
    • http://docs.doctrine-project.org/
    projects/doctrine-orm/en/latest/
    reference/improving-performance.html
    • https://ocramius.github.io/blog/
    doctrine-orm-optimization-hydration/

    View Slide

  67. @afilina afilina.com
    joind.in

    View Slide