Teaching Doctrine to be Lazy

Doctrine object managers are greedy: when you query for a set of objects, they love to load everything, all at once. That's normally great, but what if you're working with large data sets, where you might load tens of thousands of objects?

In this talk, we'll teach Doctrine how to be lazy by demonstrating how to efficiently query and work with large data sets. We'll cover:

- Lazy queries
- Lazy relationships
- Profiling and reducing object "hydrations"
- Efficient batch processing
- An alternate, "lazy-by-default" repository pattern

Kevin Bond

June 15, 2023

Transcript

  1. Me?
      - From Ontario, Canada
      - Husband, father of three
      - Symfony user since 1.0
      - Symfony Core Team
      - @kbond on GitHub/Slack
      - @zenstruck on Twitter
  2. zenstruck?
      - A GitHub organization where my open source packages live
      - zenstruck/foundry
      - zenstruck/browser
      - zenstruck/messenger-test
      - zenstruck/filesystem (wip)
      - zenstruck/schedule-bundle (for <6.3)
      - ...
      - Many now co-maintained by Nicolas PHILIPPE (@nikophil)
  3. What we'll cover
      - Hydration considerations
      - Lazy batch iterating (readonly)
      - Lazy batch processing (Updating/Deleting/Persisting)
      - Lazy relationships
      - Future ideas
      Teaching Doctrine to be Lazy • Kevin Bond • @zenstruck • github.com/kbond
  4. Sample App
      +----------+       +------------+
      | PRODUCT  |       | PURCHASE   |
      |----------|       |------------|
      | id       |---+   | id         |
      | sku      |   +--<| product_id |
      | stock    |       | date       |
      | category |       | amount     |
      +----------+       +------------+
      - 1,000+ products, 100,000+ purchases
      - Products may have 1,000's of purchases
  5. Mongo?
      - With some tweaks, the demonstrated techniques should/could apply to any doctrine/persistence implementation
      - I'm using doctrine/orm for the examples in this talk
  6. Part 1: Hydration Considerations
      - Hydration is expensive
      - Some rules:
        - Only hydrate what you need
        - Only hydrate when you need it
        - Clean up after yourself
  7. Profiling Hydrations
      - Web Profiler?
        - debesha/doctrine-hydration-profiler-bundle
        - DoctrineBundle? Needs a hook in doctrine/orm
      - Blackfire.io
        - metrics.doctrine.entities.hydrated
  8. Part 2: Batch Iterating
      - Read-only
      - Use SQL?
      - purchase:report command
        - Generates a report for all purchases
  9. $repo->findAll()
      100000/100000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 1 sec/1 sec 166.0 MiB
      // Time: 2 secs, Queries: 1
      - Only hydrate what you need
      - Only hydrate when you need it
      - Clean up after yourself
  10. $repo->matching(new Criteria())
      100000/100000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 1 sec/1 sec 168.0 MiB
      // Time: 1 sec, Queries: 2
      - Only hydrate what you need
      - Only hydrate when you need it
      - Clean up after yourself
  11. Doctrine\ORM\Query::toIterable()
      100000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 2 secs 166.0 MiB
      // Time: 2 secs, Queries: 1
      - Only hydrate what you need
      - Only hydrate when you need it
      - Clean up after yourself
  12. Batch Utilities - Iterator
      - ocramius/doctrine-batch-utils
      - Takes an ORM Query object and iterates over the result set in batches
      - Clear the ObjectManager after each batch to free memory
      - Enhanced: Accepts any iterable and any ObjectManager instance
  13. Use BatchIterator
      $iterator = new BatchIterator($query->toIterable(), $this->em);
      100000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 2 secs 20.0 MiB
      // Time: 2 secs, Queries: 1
      - Only hydrate what you need
      - Only hydrate when you need it
      - Clean up after yourself
  14. Memory Stays Constant, Time Increases
      - 200,000 purchases?
      200000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 4 secs 20.0 MiB
      // Time: 4 secs, Queries: 1
      - 1,000,000 purchases?
      1000000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 19 secs 22.0 MiB
      // Time: 19 secs, Queries: 1
  15. Part 3: Batch Processing
  16. Batch Updating
      - product:stock-update Command
      - Loop through all products
      - Update stock level from a source (e.g. CSV files, API, etc.)
  17. $repo->findAll()
      foreach ($repo->findAll() as $product) {
          /** @var Product $product */
          $product->setStock($this->currentStockFor($product));
          $this->em->flush();
      }
  18. $repo->findAll()
      foreach ($repo->findAll() as $product) {
          /** @var Product $product */
          $product->setStock($this->currentStockFor($product));
          $this->em->flush();
      }
      1000/1000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% 8 secs/8 secs 16.0 MiB
      // Time: 8 secs, Queries: 988
  19. $repo->findAll(), Delay Flush
      foreach ($repo->findAll() as $product) {
          /** @var Product $product */
          $product->setStock($this->currentStockFor($product));
      }
      $this->em->flush();
  20. $repo->findAll(), Delay Flush
      foreach ($repo->findAll() as $product) {
          /** @var Product $product */
          $product->setStock($this->currentStockFor($product));
      }
      $this->em->flush();
      1000/1000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% < 1 sec/< 1 sec 16.0 MiB
      // Time: < 1 sec, Queries: 2
  21. $repo->findAll(), Delay Flush
      - 100,000 products?
      100000/100000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100% < 1 sec/< 1 sec 186.0 MiB
      // Time: 12 secs, Queries: 2
  22. Batch Utilities - Processor
      - ocramius/doctrine-batch-utils
      - Takes an ORM Query object and iterates over the result set in batches
      - Flush and clear the ObjectManager after each batch to free memory and save changes
      - Wrap everything in a transaction
      - Enhanced: Accepts any iterable and any ObjectManager instance
  23. Using BatchProcessor
      $processor = new BatchProcessor($query->toIterable(), $this->em);
      foreach ($processor as $product) {
          /** @var Product $product */
          $product->setStock($this->currentStockFor($product));
      }
      // no need for "flush"
      1000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] < 1 sec 16.0 MiB
      // Time: < 1 sec, Queries: 1
  24. Batch Deleting
      - DQL DELETE statement?
        - PreRemove / PostRemove events?
      - purchase:purge Command
        - Delete all purchases older than X days
        - Imagine a PostRemove event that archives the purged purchases
  25. Using BatchProcessor
      $processor = new BatchProcessor($query->toIterable(), $this->em);
      foreach ($processor as $purchase) {
          /** @var Purchase $purchase */
          $this->em->remove($purchase);
          // no need for "flush"
      }
  26. Batch Persisting
      - product:import Command
      - Imports products from a source (e.g. CSV files, API, etc.)
      - We'll use a Generator to yield Product instances from our source
      - Requires enhanced BatchProcessor
        - Accepts any iterable
  27. Using BatchProcessor
      $processor = new BatchProcessor(
          $this->products(), // Product[] - our "source"
          $this->em,
      );
      foreach ($processor as $product) {
          /** @var Product $product */
          $this->em->persist($product);
          // no need for "flush"
      }
  28. Using BatchProcessor - Import 1,000
      1000 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] < 1 sec 16.0 MiB
      // Time: < 1 sec, Queries: 1
  29. Part 4: Lazy Relationships
      - product:report Command
      - Loop over all products (using our BatchIterator)
      - For each product:
        - Fetch details on the most recent purchase
        - Fetch number of purchases in the last 30 days
      - Some products have 10,000+ purchases
  30. Command Code
      foreach ($products as $product) {
          /** @var Product $product */
          /** @var Collection&Selectable $purchases */
          $purchases = $product->getPurchases();
          $last30Days = Criteria::create()->where(
              Criteria::expr()->gte('date', new \DateTimeImmutable('-30 days'))
          );
          $this->addToReport(
              $product->getSku(),
              $purchases->first() ?: null, // most recent purchase
              $purchases->matching($last30Days)->count(),
          );
      }
  31. Standard One-to-Many Relationship
      #[ORM\Entity]
      class Product
      {
          #[ORM\OneToMany(mappedBy: 'product', targetEntity: Purchase::class)]
          #[ORM\OrderBy(['date' => 'DESC'])]
          private Collection $purchases;
          public function getPurchases(): Collection
          {
              return $this->purchases;
          }
      }
  32. Standard One-to-Many Relationship
      $purchases = $product->getPurchases();
      $purchases->count(); // initializes entire collection
      $purchases->first(); // initializes entire collection
      $purchases->slice(0, 10); // initializes entire collection
      foreach ($purchases as $purchase) {
          // initializes entire collection
      }
  33. Extra Lazy One-to-Many Relationship
      #[ORM\Entity]
      class Product
      {
          #[ORM\OneToMany(
              mappedBy: 'product',
              targetEntity: Purchase::class,
              fetch: 'EXTRA_LAZY', // !!!
          )]
          #[ORM\OrderBy(['date' => 'DESC'])]
          private Collection $purchases;
      }
  34. Extra Lazy One-to-Many Relationship
      Assuming the collection hasn't been previously initialized, certain methods create new queries:
      $purchases = $product->getPurchases();
      $purchases->count(); // creates an additional "count" query
      $purchases->first(); // initializes entire collection !!
      $purchases->slice(0, 10); // creates an additional "slice" query
      foreach ($purchases as $purchase) {
          // initializes entire collection
      }
  35. Extra Lazy One-to-Many Relationship
      More efficient first():
      $purchases = $product->getPurchases();
      $purchases->slice(0, 1)[0] ?? null;
  36. Updated Command Code
      foreach ($products as $product) {
          // ...
          $this->addToReport(
              $product->getSku(),
              $purchases->slice(0, 1)[0] ?? null, // most recent purchase
              $purchases->matching($last30Days)->count(),
          );
      }
  37. n+x Problem?
      - ...it depends...
      - Saving the number of queries at all costs is not always the best solution
      - If the collection has many items, hydration will be more expensive than the extra queries
      - Evaluate your models and use cases
  38. Batch Summary
      - Hydration is expensive
      - The BatchIterator / Processor can keep the expense down to time only
      - When you have a large or unknown amount of data to process, it's better to move the processing to background tasks
  39. Part 5: Future Ideas
      - Exploring some ideas in zenstruck/collection.
  40. Alternate Lazy by Default ObjectRepository
  41. New ObjectRepository Interface
      /**
       * @template T of object
       * @extends \IteratorAggregate<T>
       */
      interface ObjectRepository extends \IteratorAggregate, \Countable
      {
          /**
           * @param mixed|Criteria $specification
           *
           * @return Result<T>
           */
          public function filter(mixed $specification): Result;
      }
  42. The Result Interface
      /**
       * @template T of object
       * @extends \IteratorAggregate<T>
       */
      interface Result extends \IteratorAggregate, \Countable
      {
          public function first(): T|null;
          public function take(int $limit, int $offset = 0): self;
          public function process(int $chunkSize = 100): BatchProcessor;
          public function toArray(): array;
          // ...
      }
  43. ORM ObjectRepository::filter()
      $specification can be:
      - array<string,mixed>: works like findBy()
      - Criteria: works like matching()
      - callable(QueryBuilder, string): void: custom query
  44. Using the $specification callable
      $purchases = $repo->filter(
          function(QueryBuilder $qb, string $root) use ($newerThan) {
              $qb->where("{$root}.date > :newerThan")
                  ->setParameter('newerThan', $newerThan)
              ;
          }
      );
  45. Specification Objects
      You could extend this ObjectRepository to add your methods, but, because filter() accepts callable(QueryBuilder), you can create invokable specification objects instead.
  46. Between Specification
      final class Between
      {
          public function __invoke(QueryBuilder $qb, string $root): void
          {
              if ($this->from) {
                  $qb->andWhere("{$root}.date >= :from")
                      ->setParameter('from', $this->from)
                  ;
              }
              // "to" logic...
          }
      }
  47. Inject as a Service (Symfony 6.3+)
      /**
       * @param ObjectRepository<Purchase> $repo
       */
      public function someAction(
          // extends "Autowire" (creates repo from factory service)
          #[ForClass(Purchase::class)]
          ObjectRepository $repo,
      ) {
          $products = $repo->filter(new Between('2021-01-01', '2021-12-31'));
          // ...
      }
  48. Thank You!
      - @kbond on GitHub/Slack
      - @zenstruck on Twitter
      - Sample Code: github.com/kbond/lazy-doctrine
      - Slides: speakerdeck.com/kbond
      - zenstruck/collection
  49. Paginating the Result
      class ResultPagerfantaAdapter implements AdapterInterface
      {
          public function getNbResults(): int
          {
              return $this->result->count();
          }
          public function getSlice(int $offset, int $length): array
          {
              return $this->result->take($length, $offset)->toArray();
          }
      }
  50. Lazier Doctrine Collection
      $purchase = $purchases->first(); // use slice(0, 1)[0] ?? null internally
      foreach ($purchases as $purchase) {
          // lazily iterate "chunks" if large count
      }
  51. Generic Specification System
      $specification = Spec::andX(
          new Between(from: new \DateTimeImmutable('-1 year')), // in last year
          Spec::greaterThan('amount', 100.00), // amount > $100.00
          Spec::sortDesc('date'), // sort by date
      );
  52. Generic Specification System
      Use the same specification object in multiple places:
      // use with ORM
      $purchases = $ormPurchaseRepository->filter($specification);
      // use with Mongo
      $purchases = $mongoPurchaseRepository->filter($specification);
      // use with Collection
      $purchases = $product->getPurchases()->filter($specification);
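
The batch-iterating idea from Part 2 (iterate a lazy source, and clear the object manager after each batch so memory stays flat) can be sketched in plain PHP without Doctrine. This is only an illustration under stated assumptions: `batchIterate()` and `numbers()` are hypothetical names, not the API of ocramius/doctrine-batch-utils or of the enhanced BatchIterator shown in the slides, and the `$clear` callback stands in for a call such as `$em->clear()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: yields every item of any iterable and invokes a
 * "clear" callback after each full batch, plus once for a trailing
 * partial batch. With Doctrine, the callback would typically clear the
 * object manager.
 */
function batchIterate(iterable $items, callable $clear, int $chunkSize = 100): \Generator
{
    if ($chunkSize < 1) {
        throw new \InvalidArgumentException('Chunk size must be at least 1.');
    }

    $count = 0;

    foreach ($items as $item) {
        yield $item;

        if (0 === ++$count % $chunkSize) {
            $clear(); // batch boundary: drop references to processed items
        }
    }

    if (0 !== $count % $chunkSize) {
        $clear(); // release the final, partial batch
    }
}

/** Hypothetical stand-in for a lazy source such as a query iterated row by row. */
function numbers(int $total): \Generator
{
    for ($i = 1; $i <= $total; ++$i) {
        yield $i;
    }
}

$clears = 0;
$seen = 0;
$countClear = function () use (&$clears): void {
    ++$clears;
};

foreach (batchIterate(numbers(250), $countClear, 100) as $number) {
    ++$seen;
}

echo "{$seen} items, {$clears} clears\n"; // 250 items, 3 clears
```

Because the source is a generator and the wrapper is a generator, only one item is held by the loop at a time; the periodic callback is what lets a real object manager forget the entities it has already handed out.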