Magento 2 Indexer

A brief introduction to most of the indexing stuff in Magento 2.

- What's an index?
- What's an indexer?
- What's indexing?
- What's an mview?
- When will data be processed?
- Naming: realtime, update on schedule, update on save, ...
- Lock Management
- Parallel processing
- Dependencies
- Trigger
- Full reindex
- Cronjobs

Christian Münch

October 02, 2022

Transcript

  1. Definition by Adobe/Magento

    Indexing is how Adobe Commerce and Magento Open Source transform data such as products and categories, to improve the performance of your storefront. As data changes, the transformed data must be updated or reindexed. The application has a very sophisticated architecture that stores lots of merchant data (including catalog data, prices, users, and stores) in many database tables. To optimize storefront performance, the application accumulates data into special tables using indexers.

    https://developer.adobe.com/commerce/php/development/components/indexing/

    netz98 a valantic company | Christian Münch @cmuench
  2. Terminology

    Dictionary: Original data entered to the system. Dictionaries are organized in normal form to facilitate maintenance (updating the data).

    Index: Representation of the original data for optimized reading and searching. Indexes can contain results of aggregations and various calculations. Index data can be always re-created from a dictionary using a certain algorithm.

    Indexer: Object that creates an index.

    https://developer.adobe.com/commerce/php/development/components/indexing/#indexing-terminology
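The dictionary/index relationship can be sketched in a few lines of plain PHP. This is not Magento code: the row layout, the `buildMinPriceIndex` function and the "minimum price per product" aggregation are all assumptions made up for the illustration. The point is only that the index is derived data and can be thrown away and rebuilt from the dictionary at any time.

```php
<?php
declare(strict_types=1);

// "Dictionary": normalized source rows as a merchant would maintain them.
// The row layout is invented for this example.
$dictionary = [
    ['product_id' => 1, 'customer_group' => 'general',   'price' => 19.90],
    ['product_id' => 1, 'customer_group' => 'wholesale', 'price' => 14.50],
    ['product_id' => 2, 'customer_group' => 'general',   'price' => 5.00],
];

/**
 * "Indexer": derives a read-optimized structure from the dictionary.
 * Here the index stores the minimum price per product ID.
 */
function buildMinPriceIndex(array $dictionary): array
{
    $index = [];
    foreach ($dictionary as $row) {
        $id = $row['product_id'];
        if (!isset($index[$id]) || $row['price'] < $index[$id]) {
            $index[$id] = $row['price'];
        }
    }
    return $index;
}

// First build: product_id => minimum price.
$index = buildMinPriceIndex($dictionary);

// The dictionary changes, so the index is stale and gets rebuilt from scratch.
$dictionary[] = ['product_id' => 2, 'customer_group' => 'wholesale', 'price' => 4.20];
$index = buildMinPriceIndex($dictionary);
```

Reading a price from `$index` is a single array lookup, while answering the same question from `$dictionary` means scanning all rows. That trade (more work on write, less on read) is what the indexers do at database scale.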
  3. Available Indexers

    php bin/magento indexer:info

        customer_grid                     Customer Grid
        design_config_grid                Design Config Grid
        catalog_category_product          Category Products
        catalog_product_category          Product Categories
        catalogrule_rule                  Catalog Rule Product
        catalog_product_attribute         Product EAV
        cataloginventory_stock            Stock
        catalog_product_price             Product Price
        catalogrule_product               Catalog Product Rule
        catalogsearch_fulltext            Catalog Search
        targetrule_product_rule           Product/Target Rule
        targetrule_rule_product           Target Rule/Product
        salesrule_rule                    Sales Rule
        elasticsuite_categories_fulltext  ElasticSuite Category Indexing
        elasticsuite_cms_page_fulltext    Elasticsuite Cms Page Indexing
        elasticsuite_thesaurus            ElasticSuite Thesaurus Indexing
  4. Indexer Mode

    php bin/magento indexer:show-mode

        Customer Grid:                    Update by Schedule
        Design Config Grid:               Update by Schedule
        Category Products:                Update by Schedule
        Product Categories:               Update by Schedule
        Catalog Rule Product:             Update by Schedule
        Product EAV:                      Update by Schedule
        Stock:                            Update by Schedule
        Product Price:                    Update by Schedule
        Catalog Product Rule:             Update by Schedule
        Catalog Search:                   Update by Schedule
        Product/Target Rule:              Update by Schedule
        Target Rule/Product:              Update by Schedule
        Sales Rule:                       Update by Schedule
        ElasticSuite Category Indexing:   Update on Save
        Elasticsuite Cms Page Indexing:   Update on Save
        ElasticSuite Thesaurus Indexing:  Update on Save
  5. Indexer Mode

    Update "on save"
    - Synchronous, aka "realtime"
    - Runs after every model save
    - Uses PHP logic
    - Minimal delay
    - Can cause database locks on concurrent writes

    Update "on schedule"
    - Asynchronous
    - Processed sequentially by a cron job (group "index")
    - Cron job creates database triggers
    - Detects database changes, including changes made without PHP logic
    - Delay (depends on cron job runtime and frequency)
    - It scales
    - No database locks
  6. Change Mode

    php bin/magento indexer:set-mode <realtime|schedule>

        Index mode for Indexer Customer Grid has not been changed
        Index mode for Indexer Design Config Grid has not been changed
        Index mode for Indexer Category Products has not been changed
        Index mode for Indexer Product Categories has not been changed
        Index mode for Indexer Catalog Rule Product has not been changed
        Index mode for Indexer Product EAV has not been changed
        Index mode for Indexer Stock has not been changed
        Index mode for Indexer Product Price has not been changed
        Index mode for Indexer Catalog Product Rule has not been changed
        Index mode for Indexer Catalog Search has not been changed
        Index mode for Indexer Product/Target Rule has not been changed
        Index mode for Indexer Target Rule/Product has not been changed
        Index mode for Indexer Sales Rule has not been changed
        Index mode for Indexer ElasticSuite Category Indexing was changed from 'Update on Save' to 'Update by Schedule'
        Index mode for Indexer Elasticsuite Cms Page Indexing was changed from 'Update on Save' to 'Update by Schedule'
        Index mode for Indexer ElasticSuite Thesaurus Indexing was changed from 'Update on Save' to 'Update by Schedule'
  7. Reset after an interrupted indexer:reindex

        php bin/magento indexer:reset
        php bin/magento indexer:reset catalog_product_price
  8. Indexer Mode

    Currently only the price indexer is supported.

        bin/magento indexer:show-dimensions-mode
        bin/magento indexer:set-dimensions-mode catalog_product_price website

    Mode can be "website", "customer_group" or "website_and_customer_group".
  9. Indexer Registration

    via config in a module

        .
        └── MyModule
            └── etc
                └── indexer.xml

        <?xml version="1.0" encoding="UTF-8"?>
        <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:noNamespaceSchemaLocation="urn:magento:framework:Indexer/etc/indexer.xsd">
            <indexer id="inventory" view_id="inventory" class="Magento\InventoryIndexer\Indexer\InventoryIndexer">
                <title translate="true">Inventory</title>
                <description translate="true">Inventory index (MSI)</description>
            </indexer>
            <indexer id="catalog_product_price">
                <dependencies>
                    <indexer id="inventory"/>
                </dependencies>
            </indexer>
        </config>
  10. Dependency "update on save"

        class \Magento\Indexer\Model\Indexer\DependencyDecorator implements \Magento\Framework\Indexer\IndexerInterface
        {
            // ...
            public function reindexRow($id)
            {
                $this->cacheCleaner->start();
                $this->indexer->reindexRow($id);
                $dependentIndexerIds = $this->dependencyInfoProvider->getIndexerIdsToRunAfter($this->indexer->getId());
                foreach ($dependentIndexerIds as $indexerId) {
                    $dependentIndexer = $this->indexerRegistry->get($indexerId);
                    if (!$dependentIndexer->isScheduled()) {
                        $dependentIndexer->reindexRow($id);
                    }
                }
                $this->cacheCleaner->flush();
            }
        }

    Default implementation defined in di.xml of Magento_Indexer module.
  11. Dependency indexer:reindex command (Hip hip array)

    \Magento\Indexer\Console\Command\IndexerReindexCommand::getIndexers

        protected function getIndexers(InputInterface $input)
        {
            $indexers = parent::getIndexers($input);
            $allIndexers = $this->getAllIndexers();
            if (!array_diff_key($allIndexers, $indexers)) {
                return $indexers;
            }

            $relatedIndexers = [];
            $dependentIndexers = [];
            foreach ($indexers as $indexer) {
                $relatedIndexers[] = $this->getRelatedIndexerIds($indexer->getId());
                $dependentIndexers[] = $this->getDependentIndexerIds($indexer->getId());
            }
            $relatedIndexers = array_unique(array_merge([], ...$relatedIndexers));
            $dependentIndexers = array_merge([], ...$dependentIndexers);

            $invalidRelatedIndexers = [];
            foreach ($relatedIndexers as $relatedIndexer) {
                if ($allIndexers[$relatedIndexer]->isInvalid()) {
                    $invalidRelatedIndexers[] = $relatedIndexer;
                }
            }

            return array_intersect_key(
                $allIndexers,
                array_flip(
                    array_unique(
                        array_merge(
                            array_keys($indexers),
                            $invalidRelatedIndexers,
                            $dependentIndexers
                        )
                    )
                )
            );
        }
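The `array_merge([], ...$lists)` idiom from `getIndexers` is easy to miss, so here is a small standalone sketch of it. It collects one list of IDs per indexer and then flattens the list of lists in a single call. The sample IDs are taken from the indexer list earlier in the deck; the variable names are made up. The leading `[]` keeps the call valid when there is nothing to spread, which mattered on older PHP versions where `array_merge()` required at least one argument.

```php
<?php
declare(strict_types=1);

// One list of indexer IDs per requested indexer (a list of lists).
$perIndexer = [
    ['catalog_product_price', 'cataloginventory_stock'],
    ['catalog_product_price'],
    [],
];

// Spread the outer array into separate arguments and merge them into one flat list.
$flat = array_merge([], ...$perIndexer);

// Duplicates survive the merge, so they are removed in a second step.
$unique = array_values(array_unique($flat));

// With nothing to spread, the leading [] is the only argument and the result is empty.
$none = array_merge([], ...[]);
```

Collecting the partial lists first and merging once avoids calling `array_merge` inside the loop, which would copy the growing result array on every iteration.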
  12. Indexer Class

        class MyIndexer implements
            \Magento\Framework\Indexer\ActionInterface,
            \Magento\Framework\Mview\ActionInterface
        {
            // ...
        }

    Logical separation:
    - \Magento\Framework\Indexer\ActionInterface: Update on save / adhoc
    - \Magento\Framework\Mview\ActionInterface: Update on schedule
  13. \Magento\Framework\Indexer\ActionInterface

        interface ActionInterface
        {
            /**
             * Execute full indexation
             *
             * @return void
             */
            public function executeFull();

            /**
             * Execute partial indexation by ID list
             *
             * @param int[] $ids
             * @return void
             */
            public function executeList(array $ids);

            /**
             * Execute partial indexation by ID
             *
             * @param int $id
             * @return void
             */
            public function executeRow($id);
        }
  14. \Magento\Framework\Mview\ActionInterface

    Called by scheduled indexer job (via cronjob)

        interface ActionInterface
        {
            /**
             * Execute materialization on ids entities
             *
             * @param int[] $ids
             * @return void
             * @api
             */
            public function execute($ids);
        }
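A minimal sketch of how one class can serve both interfaces, assuming a design where every entry point funnels into a single private method. To keep it runnable without Magento, the two interfaces are replaced by local stand-ins (`IndexerActionInterface`, `MviewActionInterface`) that only copy the method signatures shown on the slides. The in-memory `$source` and `$indexTable` arrays and the uppercase "transformation" are invented placeholders for real tables and real index logic.

```php
<?php
declare(strict_types=1);

// Stand-in for \Magento\Framework\Indexer\ActionInterface (update on save / adhoc).
interface IndexerActionInterface
{
    public function executeFull();
    public function executeList(array $ids);
    public function executeRow($id);
}

// Stand-in for \Magento\Framework\Mview\ActionInterface (update on schedule).
interface MviewActionInterface
{
    public function execute($ids);
}

// Hypothetical indexer: all four entry points delegate to reindex().
class MyIndexer implements IndexerActionInterface, MviewActionInterface
{
    /** @var array<int, string> entity ID => name, stand-in for the source tables */
    private array $source;

    /** @var array<int, string> entity ID => indexed value, stand-in for the index table */
    public array $indexTable = [];

    public function __construct(array $source)
    {
        $this->source = $source;
    }

    public function executeFull()
    {
        $this->indexTable = [];
        $this->reindex(array_keys($this->source));
    }

    public function executeList(array $ids)
    {
        $this->reindex($ids);
    }

    public function executeRow($id)
    {
        $this->reindex([$id]);
    }

    public function execute($ids)
    {
        $this->reindex($ids);
    }

    private function reindex(array $ids): void
    {
        foreach ($ids as $id) {
            if (isset($this->source[$id])) {
                $this->indexTable[$id] = strtoupper($this->source[$id]);
            } else {
                unset($this->indexTable[$id]); // entity no longer exists
            }
        }
    }
}

$indexer = new MyIndexer([1 => 'shirt', 2 => 'shoe']);
$indexer->executeRow(2);        // "update on save": a single entity
$afterRow = $indexer->indexTable;
$indexer->execute([1, 99]);     // "update on schedule": IDs collected from a changelog
$afterSchedule = $indexer->indexTable;
```

Handling unknown IDs in `reindex()` matters for the scheduled path: a changelog can contain IDs of entities that were deleted in the meantime.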
  15. MView Definition

    In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function.

    -- Wikipedia
  16. MView Definition -> Trigger Config

        .
        └── MyModule
            └── etc
                └── mview.xml

        <?xml version="1.0" encoding="UTF-8"?>
        <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:noNamespaceSchemaLocation="urn:magento:framework:Mview/etc/mview.xsd">
            <view id="catalog_product_price" class="Magento\Catalog\Model\Indexer\Product\Price" group="indexer">
                <subscriptions>
                    <table name="catalog_product_entity" entity_column="entity_id" />
                    <table name="catalog_product_entity_datetime" entity_column="entity_id" />
                    <table name="catalog_product_entity_decimal" entity_column="entity_id" />
                    <table name="catalog_product_entity_int" entity_column="entity_id" />
                    <table name="catalog_product_entity_tier_price" entity_column="entity_id" />
                </subscriptions>
            </view>
        </config>
  17. Indexer State in the database

    Status: working, valid, invalid
  18. What triggers look like

    Insert:

        BEGIN
            INSERT IGNORE INTO `catalog_category_product_cl` (`entity_id`) VALUES (NEW.`entity_id`);
        END

    Update:

        BEGIN
            IF (NOT(NEW.`entity_id` <=> OLD.`entity_id`)
                OR NOT(NEW.`attribute_set_id` <=> OLD.`attribute_set_id`)
                OR NOT(NEW.`parent_id` <=> OLD.`parent_id`)
                OR NOT(NEW.`created_at` <=> OLD.`created_at`)
                OR NOT(NEW.`path` <=> OLD.`path`)
                OR NOT(NEW.`position` <=> OLD.`position`)
                OR NOT(NEW.`level` <=> OLD.`level`)
                OR NOT(NEW.`children_count` <=> OLD.`children_count`)) THEN
                INSERT IGNORE INTO `catalog_category_product_cl` (`entity_id`) VALUES (NEW.`entity_id`);
            END IF;
        END

    Delete:

        BEGIN
            INSERT IGNORE INTO `catalog_category_product_cl` (`entity_id`) VALUES (OLD.`entity_id`);
        END
  19. MView Status

    - Version-ID (last id in changelog table)
    - Status: idle, working, suspended
    - Up-to-date: version_id == last version_id in _CL table
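The version bookkeeping can be sketched as a small in-memory model. This is an illustration, not the Magento implementation: `ChangelogModel` and its array storage are invented, and `$stateVersionId` stands in for the version_id stored with the view state. `record()` plays the role of a trigger writing to the `_cl` table, and the block at the end does roughly what a scheduled update has to do: compare versions, fetch the unique entity IDs in between, hand them to the indexer, then remember the new version.

```php
<?php
declare(strict_types=1);

// Simplified model of a changelog ("_cl") table; names are invented for this sketch.
class ChangelogModel
{
    /** @var array<int, int> version_id => entity_id */
    private array $rows = [];
    private int $autoIncrement = 0;

    // What a trigger does: append the changed entity ID under a new version_id.
    public function record(int $entityId): void
    {
        $this->rows[++$this->autoIncrement] = $entityId;
    }

    // Last version_id in the changelog.
    public function getVersion(): int
    {
        return $this->autoIncrement;
    }

    /** Unique entity IDs with $from < version_id <= $to. */
    public function getList(int $from, int $to): array
    {
        $ids = [];
        foreach ($this->rows as $versionId => $entityId) {
            if ($versionId > $from && $versionId <= $to) {
                $ids[$entityId] = $entityId; // keyed by ID, so duplicates collapse
            }
        }
        return array_values($ids);
    }
}

$changelog = new ChangelogModel();
$stateVersionId = 0; // last processed version, kept with the view state

// Three writes hit subscribed tables; entity 10 changes twice.
$changelog->record(10);
$changelog->record(11);
$changelog->record(10);

// Scheduled update: only runs if the changelog is ahead of the stored version.
$processed = [];
$currentVersionId = $changelog->getVersion();
if ($currentVersionId > $stateVersionId) {
    $processed = $changelog->getList($stateVersionId, $currentVersionId);
    // The indexer's execute($processed) would be called here.
    $stateVersionId = $currentVersionId;
}
```

After the run the stored version equals the last version_id in the changelog, which is the "up-to-date" condition from the slide. Entity 10 was changed twice but is handed to the indexer only once.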
  20. Indexer Cron

        php bin/magento cron:run

        # only indexer jobs
        php bin/magento cron:run --group index
  21. Indexer Cronjobs

        Job                           Description                            Frequency
        indexer_clean_all_changelogs  Cleanup old entries in _CL tables.     daily
        indexer_reindex_all_invalid   Start full reindex of broken indexers  every minute
        indexer_update_all_views      Executes "update on schedule"          every minute
  22. Optimizations

    - Indexer batching (fine-tuning, depending on data)
      Configure batch size via ENV variables or in env.php
    - Indexer table switching (for full reindex)

    https://developer.adobe.com/commerce/php/development/components/indexing/optimization/
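What batching means in practice can be shown with a generic sketch. It is not how any particular Magento indexer implements it: `reindexInBatches` and the `$processBatch` callback are hypothetical names, and the batch size of 4 is an arbitrary example value. The idea is only that a long ID list is cut into fixed-size chunks, so memory use and query size per step stay bounded.

```php
<?php
declare(strict_types=1);

/**
 * Process IDs in fixed-size batches.
 *
 * $processBatch is a hypothetical callback standing in for the real index work
 * (for example one INSERT ... SELECT per batch).
 *
 * @return int number of batches processed
 */
function reindexInBatches(array $ids, int $batchSize, callable $processBatch): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batches = 0;
    foreach (array_chunk($ids, $batchSize) as $batch) {
        $processBatch($batch);
        $batches++;
    }
    return $batches;
}

// Ten entity IDs, four per batch: the last batch is smaller.
$seen = [];
$batchCount = reindexInBatches(range(1, 10), 4, function (array $batch) use (&$seen): void {
    $seen[] = $batch;
});
```

The right batch size depends on the data: larger batches mean fewer round trips but bigger temporary tables and longer locks, which is why the slide calls it fine-tuning.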
  23. Don't trust the rabbit

    The status values in the indexer_state or mview_state database tables may not be the same as what is observed, because they sometimes do not get updated when an indexer fails.

    https://developer.adobe.com/commerce/php/development/components/indexing/#using-application-lock-mode-for-reindex-processes

    - More accurate status by using a \Magento\Framework\Lock\LockManagerInterface implementation (DB, Zookeeper, FileLock, ...)
    - The cron job retries indexing without a reset of the indexer

    env.php

        'indexer' => [
            'use_application_lock' => true
        ]
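Why a lock gives a more trustworthy status than a stored value can be sketched with a tiny model. The assumptions are stated up front: `InMemoryLockManager` is an invented stand-in whose three method names follow `\Magento\Framework\Lock\LockManagerInterface`, the lock name is hypothetical, and treating a stale "working" as "invalid" is a simplification chosen for this sketch rather than a description of the exact Magento logic.

```php
<?php
declare(strict_types=1);

// In-memory stand-in for a lock manager. Method names follow
// \Magento\Framework\Lock\LockManagerInterface; the storage is invented.
class InMemoryLockManager
{
    /** @var array<string, true> */
    private array $locks = [];

    public function lock(string $name, int $timeout = -1): bool
    {
        if (isset($this->locks[$name])) {
            return false; // another process already holds the lock
        }
        $this->locks[$name] = true;
        return true;
    }

    public function unlock(string $name): bool
    {
        unset($this->locks[$name]);
        return true;
    }

    public function isLocked(string $name): bool
    {
        return isset($this->locks[$name]);
    }
}

/**
 * A stored "working" status only counts while the lock is actually held.
 * Mapping the stale case to "invalid" is an assumption of this sketch.
 */
function effectiveStatus(string $storedStatus, bool $lockHeld): string
{
    if ($storedStatus === 'working' && !$lockHeld) {
        return 'invalid';
    }
    return $storedStatus;
}

$locks = new InMemoryLockManager();
$lockName = 'indexer_lock_catalog_product_price'; // hypothetical lock name

// Indexer starts: it takes the lock and stores "working".
$locks->lock($lockName);
$storedStatus = 'working';
$whileRunning = effectiveStatus($storedStatus, $locks->isLocked($lockName));

// The process dies: the lock goes away with it, the stored status does not change.
$locks->unlock($lockName);
$afterCrash = effectiveStatus($storedStatus, $locks->isLocked($lockName));
```

Without the lock check the indexer would look "working" forever after a crash and would need a manual reset. With it, the stale state is detected and the next cron run can pick the indexer up again.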
  24. Lock Manager is configured correct IMHO no ... if
  25. Ignore columns for mview to be specified at the subscription level

        <type name="Magento\Framework\Mview\View\Subscription">
            <arguments>
                <argument name="ignoredUpdateColumnsBySubscription" xsi:type="array">
                    <item name="my_custom_view" xsi:type="array">
                        <item name="salesrule" xsi:type="array">
                            <item name="times_used" xsi:type="string">times_used</item>
                        </item>
                    </item>
                </argument>
            </arguments>
        </type>

    https://github.com/magento/magento2/issues/30243
  26. Force trigger recreation

    If triggers are missing (this can happen if a DB dump was exported without triggers) ...

        n98-magerun2.phar index:trigger:recreate
  27. Update On Save

    - Must be handled manually: create, update, delete
    - Different kinds of implementations in Magento:
      - Commit callbacks in resource models (own implementation)
      - Plugins (modify an entity which is not fully controlled by your module)
      - Events/Observer (process/entity dependency -> trigger indexer of another entity)
    - Most of the time we extend an indexer and do not need to handle the re-indexing ourselves.
    - Test if the indexer processes the data after a modification (e.g. mview.xml)
      - Test in Magento Admin if there is a UI
      - Test changes via webapi
  28. Example: Commit Callback / Resource Model

    \Magento\Catalog\Model\Product::afterSave

        public function afterSave()
        {
            $this->getLinkInstance()->saveProductRelations($this);
            $this->getTypeInstance()->save($this);

            if ($this->getStockData()) {
                $this->setForceReindexEavRequired(true);
            }

            $this->_getResource()->addCommitCallback([$this, 'priceReindexCallback']);
            $this->_getResource()->addCommitCallback([$this, 'eavReindexCallback']);

            $result = parent::afterSave();

            $this->_getResource()->addCommitCallback([$this, 'reindex']);
            $this->reloadPriceInfo();

            return $result;
        }

        /**
         * Reindex callback for EAV indexer
         *
         * @return void
         */
        public function eavReindexCallback()
        {
            if ($this->isObjectNew() || $this->isDataChanged()) {
                $this->_productEavIndexerProcessor->reindexRow($this->getEntityId());
            }
        }
  29. Example: Plugin

    Magento_CatalogRule module di.xml

        <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
            <type name="Magento\Catalog\Model\Product">
                <plugin name="apply_catalog_rules_after_product_save_and_reindex"
                        type="Magento\CatalogRule\Plugin\Indexer\Product\Save\ApplyRulesAfterReindex"/>
            </type>
        </config>

        class ApplyRulesAfterReindex
        {
            /**
             * @param ProductRuleProcessor $productRuleProcessor
             */
            public function __construct(ProductRuleProcessor $productRuleProcessor)
            {
                $this->productRuleProcessor = $productRuleProcessor;
            }

            // ...

            public function afterReindex(Product $subject)
            {
                $this->productRuleProcessor->reindexRow($subject->getId());
            }
        }
  30. public function execute(EventObserver $observer)

        public function execute(EventObserver $observer)
        {
            // Reindex quote ids
            $quote = $observer->getEvent()->getQuote();
            $productIds = [];
            foreach ($quote->getAllItems() as $item) {
                $productIds[$item->getProductId()] = $item->getProductId();
                $children = $item->getChildrenItems();
                if ($children) {
                    foreach ($children as $childItem) {
                        $productIds[$childItem->getProductId()] = $childItem->getProductId();
                    }
                }
            }

            if ($productIds) {
                $this->stockIndexerProcessor->reindexList($productIds); // <-- CALL INDEXER 1
            }

            // Reindex previously remembered items
            $productIds = [];
            foreach ($this->itemsForReindex->getItems() as $item) {
                $item->save();
                $productIds[] = $item->getProductId();
            }

            if (!empty($productIds)) {
                $this->priceIndexer->reindexList($productIds); // <-- CALL INDEXER 2
            }

            $this->itemsForReindex->clear(); // Clear list of remembered items - we don't need it anymore
        }
  31. Plugins in Magento_Indexer module

    vendor/magento/module-indexer/etc/di.xml

        <type name="Magento\Framework\Indexer\ActionInterface">
            <plugin name="cache_cleaner_after_reindex" type="Magento\Indexer\Model\Indexer\CacheCleaner" />
        </type>
        <type name="Magento\Framework\Indexer\CacheContext">
            <plugin name="defer_cache_cleaning" type="Magento\Indexer\Model\Indexer\DeferCacheCleaning" />
        </type>
  32. class CacheCleaner

        class CacheCleaner
        {
            /**
             * Defer cache cleaning until after execute full
             *
             * @param ActionInterface $subject
             * @return void
             * @SuppressWarnings(PHPMD.UnusedFormalParameter)
             */
            public function beforeExecuteFull(ActionInterface $subject)
            {
                $this->cacheCleaner->start();
            }

            /**
             * Clean cache after full reindex
             *
             * @param ActionInterface $subject
             * @return void
             * @SuppressWarnings(PHPMD.UnusedFormalParameter)
             */
            public function afterExecuteFull(ActionInterface $subject)
            {
                $this->cacheCleaner->flush();
            }

            /**
             * Defer cache cleaning until after execute list
             *
             * @param ActionInterface $subject
             * @return void
             * @SuppressWarnings(PHPMD.UnusedFormalParameter)
             */
            public function beforeExecuteList(ActionInterface $subject)
            {
                $this->cacheCleaner->start();
            }

            /**
             * Clean cache after reindexed list.
             *
             * @param ActionInterface $subject
             * @return void
             * @SuppressWarnings(PHPMD.UnusedFormalParameter)
             */
            public function afterExecuteList(ActionInterface $subject)
            {
                $this->cacheCleaner->flush();
            }

            /**
             * Defer cache cleaning until after execute row
             *
             * @param ActionInterface $subject
             * @return void
             * @SuppressWarnings(PHPMD.UnusedFormalParameter)
             */
            public function beforeExecuteRow(ActionInterface $subject)
            {
                $this->cacheCleaner->start();
            }

            /**
             * Clean cache after reindexed row.
             *
             * @param ActionInterface $subject
             * @return void
             * @SuppressWarnings(PHPMD.UnusedFormalParameter)
             */
            public function afterExecuteRow(ActionInterface $subject)
            {
                $this->cacheCleaner->flush();
            }
        }
  33. class DeferredCacheCleaner

        class DeferredCacheCleaner
        {
            // ...

            /**
             * @param EventManager $eventManager
             * @param CacheInterface $appCache
             * @param DeferredCacheContext $deferredCacheContext
             * @param CacheContext $cacheContext
             */
            public function __construct(
                EventManager $eventManager,
                CacheInterface $appCache,
                DeferredCacheContext $deferredCacheContext,
                CacheContext $cacheContext
            ) {
                $this->eventManager = $eventManager;
                $this->deferredCacheContext = $deferredCacheContext;
                $this->appCache = $appCache;
                $this->cacheContext = $cacheContext;
            }

            /**
             * Defer cache cleaning until flush() is called
             *
             * @see flush()
             */
            public function start(): void
            {
                $this->deferredCacheContext->start();
            }

            /**
             * Flush cache
             */
            public function flush(): void
            {
                $this->deferredCacheContext->commit();
                $this->eventManager->dispatch('clean_cache_by_tags', ['object' => $this->cacheContext]);
                $identities = $this->cacheContext->getIdentities();
                if (!empty($identities)) {
                    $this->appCache->clean($identities);
                    $this->cacheContext->flush();
                }
            }
        }
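The effect of the start()/flush() pair can be illustrated with a simplified model. It is not the Magento class: `DeferredCleanerModel`, `registerTags()` and the nesting counter are assumptions made for this sketch, and the cache tags are example strings. What it shows is the pattern itself: while cleaning is deferred, tags are only collected, and one combined clean happens when the outermost flush() is reached.

```php
<?php
declare(strict_types=1);

// Simplified model of deferred cache cleaning; names are invented for this sketch.
class DeferredCleanerModel
{
    /** Nesting depth of start() calls; cleaning is deferred while it is above zero. */
    private int $depth = 0;

    /** @var array<string, true> tags waiting to be cleaned */
    private array $pendingTags = [];

    /** @var string[][] one entry per cache clean that was actually performed */
    public array $cleanCalls = [];

    public function start(): void
    {
        $this->depth++;
    }

    // Called whenever index work touched entities with the given cache tags.
    public function registerTags(array $tags): void
    {
        foreach ($tags as $tag) {
            $this->pendingTags[$tag] = true; // keyed by tag, so duplicates collapse
        }
        if ($this->depth === 0) {
            $this->clean(); // not deferred: clean right away
        }
    }

    public function flush(): void
    {
        if ($this->depth > 0) {
            $this->depth--;
        }
        if ($this->depth === 0) {
            $this->clean();
        }
    }

    private function clean(): void
    {
        if ($this->pendingTags !== []) {
            $this->cleanCalls[] = array_keys($this->pendingTags);
            $this->pendingTags = [];
        }
    }
}

$cleaner = new DeferredCleanerModel();
$cleaner->start();                                // what the before-plugin does
$cleaner->registerTags(['cat_p_1']);              // index work, first entity
$cleaner->registerTags(['cat_p_2', 'cat_p_1']);   // index work, more entities
$cleaner->flush();                                // what the after-plugin does
```

Two registrations with three tags in total end up as one clean with two unique tags. For a reindex that touches many entities, this is the difference between one cache invalidation and thousands.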
  34. clean_cache_by_tags -> FPC Invalidation

    vendor/magento/module-cache-invalidate/etc/events.xml

        <event name="clean_cache_by_tags">
            <observer name="invalidate_varnish" instance="Magento\CacheInvalidate\Observer\InvalidateVarnishObserver"/>
        </event>

    vendor/magento/module-page-cache/etc/events.xml

        <event name="clean_cache_by_tags">
            <observer name="invalidate_builtin" instance="Magento\PageCache\Observer\FlushCacheByTags" />
        </event>
  35. "Structure" Handling / Save Handler

    vendor/magento/module-customer/etc/indexer.xml

        <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:noNamespaceSchemaLocation="urn:magento:framework:Indexer/etc/indexer.xsd">
            <indexer id="customer_grid" class="Magento\Framework\Indexer\Action\Entity" primary="customer">
                <!-- ... -->
                <fieldset name="customer" source="Magento\Customer\Model\Indexer\Source"
                          provider="Magento\Customer\Model\Indexer\AttributeProvider">
                    <field name="name" xsi:type="searchable" dataType="text" handler="CustomerNameHandler"/>
                    <field name="email" xsi:type="searchable" dataType="varchar"/>
                    <field name="group_id" xsi:type="filterable" dataType="int"/>
                    <field name="created_at" xsi:type="filterable" dataType="timestamp"/>
                    <!-- ... -->
                </fieldset>
                <fieldset name="shipping" source="Magento\Customer\Model\ResourceModel\Address\Collection">
                    <reference fieldset="customer" from="entity_id" to="default_shipping"/>
                    <field name="full" xsi:type="searchable" dataType="text" handler="ShippingAddressHandler"/>
                </fieldset>
                <fieldset name="billing">
                    <!-- ... -->
                </fieldset>
                <saveHandler class="Magento\Framework\Indexer\SaveHandler\Grid"/>
                <structure class="Magento\Framework\Indexer\GridStructure"/>
            </indexer>
        </config>