Magento 2 Indexer

A brief introduction to most of the indexing topics in Magento 2.
- What's an index?
- What's an indexer?
- What's indexing?
- What's an mview?
- When will data be processed?
- Naming: realtime, update on schedule, update on save, ...
- Lock management
- Parallel processing
- Dependencies
- Triggers
- Full reindex
- Cron jobs

Christian Münch

October 02, 2022

Transcript

  1. Magento 2 Indexer

  2. Indexing? Indexer? Index? What?
    netz98 a valantic company | Christian Münch @cmuench

  3. Indexing is how Adobe Commerce and Magento Open Source transform data
    such as products and categories, to improve the performance of your storefront.
    As data changes, the transformed data must be updated or reindexed. The
    application has a very sophisticated architecture that stores lots of merchant data
    (including catalog data, prices, users, and stores) in many database tables. To
    optimize storefront performance, the application accumulates data into special
    tables using indexers.

    https://developer.adobe.com/commerce/php/development/components/indexing/
    Definition by Adobe/Magento

  4. Dictionary: Original data entered to the system. Dictionaries are organized in
    normal form to facilitate maintenance (updating the data).
    Index: Representation of the original data for optimized reading and searching.
    Indexes can contain results of aggregations and various calculations. Index data
    can be always re-created from a dictionary using a certain algorithm.
    Indexer: Object that creates an index.
    https://developer.adobe.com/commerce/php/development/components/indexing/#indexing-terminology
    Terminology
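
The three terms can be illustrated with a minimal sketch in plain PHP. Everything here is hypothetical (the row layout, the `buildMinPriceIndex` function, the prices); it only shows the relationship between a normalized dictionary, an indexer, and the read-optimized index it produces, not how Magento stores its data.

```php
<?php
declare(strict_types=1);

// Hypothetical "dictionary": normalized rows, one price per product and customer group.
$dictionary = [
    ['product_id' => 1, 'customer_group' => 0, 'price' => 19.99],
    ['product_id' => 1, 'customer_group' => 1, 'price' => 17.49],
    ['product_id' => 2, 'customer_group' => 0, 'price' => 5.00],
];

/**
 * Hypothetical "indexer": aggregates the dictionary rows into one
 * read-optimized entry per product (here: the minimal price).
 */
function buildMinPriceIndex(array $rows): array
{
    $index = [];
    foreach ($rows as $row) {
        $id = $row['product_id'];
        $index[$id] = isset($index[$id]) ? min($index[$id], $row['price']) : $row['price'];
    }
    return $index;
}

// The "index" can be thrown away and re-created from the dictionary at any time.
$index = buildMinPriceIndex($dictionary);
print_r($index);
```

Reading the minimal price of a product is now a single array lookup instead of a scan over all price rows, which is the trade-off an index makes: more work on write, less on read.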

  5. php bin/magento indexer:info

    customer_grid                     Customer Grid
    design_config_grid                Design Config Grid
    catalog_category_product          Category Products
    catalog_product_category          Product Categories
    catalogrule_rule                  Catalog Rule Product
    catalog_product_attribute         Product EAV
    cataloginventory_stock            Stock
    catalog_product_price             Product Price
    catalogrule_product               Catalog Product Rule
    catalogsearch_fulltext            Catalog Search
    targetrule_product_rule           Product/Target Rule
    targetrule_rule_product           Target Rule/Product
    salesrule_rule                    Sales Rule
    elasticsuite_categories_fulltext  ElasticSuite Category Indexing
    elasticsuite_cms_page_fulltext    Elasticsuite Cms Page Indexing
    elasticsuite_thesaurus            ElasticSuite Thesaurus Indexing

    Available Indexers

  6. php bin/magento indexer:reindex

    php bin/magento indexer:reindex catalog_category_product catalogrule_product

    Full Reindex

  7. php bin/magento indexer:show-mode

    Customer Grid:                    Update by Schedule
    Design Config Grid:               Update by Schedule
    Category Products:                Update by Schedule
    Product Categories:               Update by Schedule
    Catalog Rule Product:             Update by Schedule
    Product EAV:                      Update by Schedule
    Stock:                            Update by Schedule
    Product Price:                    Update by Schedule
    Catalog Product Rule:             Update by Schedule
    Catalog Search:                   Update by Schedule
    Product/Target Rule:              Update by Schedule
    Target Rule/Product:              Update by Schedule
    Sales Rule:                       Update by Schedule
    ElasticSuite Category Indexing:   Update on Save
    Elasticsuite Cms Page Indexing:   Update on Save
    ElasticSuite Thesaurus Indexing:  Update on Save

    Indexer Mode

  8. Update "on save"
    - Synchronous, aka "realtime"
    - Runs after every model save, using PHP logic
    - Minimal delay
    - Can cause database locks on concurrent writes
    Update "by schedule"
    - Asynchronous
    - Processed sequentially by a cron job (group "index")
    - The cron job creates database triggers, which detect database changes even when they are made without PHP logic
    - Delay (depends on cron job runtime and frequency)
    - Scales well: no database locks
    Indexer Mode

  9. php bin/magento indexer:set-mode schedule

    Index mode for Indexer Customer Grid has not been changed
    Index mode for Indexer Design Config Grid has not been changed
    Index mode for Indexer Category Products has not been changed
    Index mode for Indexer Product Categories has not been changed
    Index mode for Indexer Catalog Rule Product has not been changed
    Index mode for Indexer Product EAV has not been changed
    Index mode for Indexer Stock has not been changed
    Index mode for Indexer Product Price has not been changed
    Index mode for Indexer Catalog Product Rule has not been changed
    Index mode for Indexer Catalog Search has not been changed
    Index mode for Indexer Product/Target Rule has not been changed
    Index mode for Indexer Target Rule/Product has not been changed
    Index mode for Indexer Sales Rule has not been changed
    Index mode for Indexer ElasticSuite Category Indexing was changed from 'Update on Save' to 'Update by Schedule'
    Index mode for Indexer Elasticsuite Cms Page Indexing was changed from 'Update on Save' to 'Update by Schedule'
    Index mode for Indexer ElasticSuite Thesaurus Indexing was changed from 'Update on Save' to 'Update by Schedule'

    Change Mode

  10. php bin/magento indexer:status

    Is everything up-to-date? - Indexer Status

  11. php bin/magento indexer:reset

    php bin/magento indexer:reset catalog_product_price

    Reset after an interrupted indexer:reindex

  12. Currently only the price indexer is supported.
    bin/magento indexer:show-dimensions-mode
    bin/magento indexer:set-dimensions-mode catalog_product_price website
    Mode can be "none", "website", "customer_group" or "website_and_customer_group".
    Indexer Mode

  13. MAGE_INDEXER_THREADS_COUNT=3 php -f bin/magento indexer:reindex catalog_product_price

    Uses pcntl_fork
    Parallel

  14. Indexer Implementation

  15. via config in a module
    .
    └── MyModule
        └── etc
            └── indexer.xml

    Inventory
    Inventory index (MSI)

    Indexer Registration

  16. class \Magento\Indexer\Model\Indexer\DependencyDecorator implements \Magento\Framework\Indexer\IndexerInterface
    {
        // ...
        public function reindexRow($id)
        {
            $this->cacheCleaner->start();
            $this->indexer->reindexRow($id);
            $dependentIndexerIds = $this->dependencyInfoProvider->getIndexerIdsToRunAfter($this->indexer->getId());
            foreach ($dependentIndexerIds as $indexerId) {
                $dependentIndexer = $this->indexerRegistry->get($indexerId);
                if (!$dependentIndexer->isScheduled()) {
                    $dependentIndexer->reindexRow($id);
                }
            }
            $this->cacheCleaner->flush();
        }
    }

    Default implementation defined in di.xml of Magento_Indexer module.
    Dependency "update on save"

  17. \Magento\Indexer\Console\Command\IndexerReindexCommand::getIndexers
    protected function getIndexers(InputInterface $input)
    {
        $indexers = parent::getIndexers($input);
        $allIndexers = $this->getAllIndexers();
        if (!array_diff_key($allIndexers, $indexers)) {
            return $indexers;
        }
        $relatedIndexers = [];
        $dependentIndexers = [];
        foreach ($indexers as $indexer) {
            $relatedIndexers[] = $this->getRelatedIndexerIds($indexer->getId());
            $dependentIndexers[] = $this->getDependentIndexerIds($indexer->getId());
        }
        $relatedIndexers = array_unique(array_merge([], ...$relatedIndexers));
        $dependentIndexers = array_merge([], ...$dependentIndexers);
        $invalidRelatedIndexers = [];
        foreach ($relatedIndexers as $relatedIndexer) {
            if ($allIndexers[$relatedIndexer]->isInvalid()) {
                $invalidRelatedIndexers[] = $relatedIndexer;
            }
        }
        return array_intersect_key(
            $allIndexers,
            array_flip(
                array_unique(
                    array_merge(
                        array_keys($indexers),
                        $invalidRelatedIndexers,
                        $dependentIndexers
                    )
                )
            )
        );
    }
    Dependency indexer:reindex command (Hip hip array)
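
The array juggling in this method is easier to follow with toy data. The sketch below replays the same calls on plain strings; the indexer ids are taken from the list earlier in the deck, but which indexer depends on which is invented purely for illustration.

```php
<?php
declare(strict_types=1);

// Toy data: keys are indexer ids, values stand in for the indexer objects.
$allIndexers = [
    'catalog_category_product' => 'Category Products',
    'catalog_product_price'    => 'Product Price',
    'catalogrule_rule'         => 'Catalog Rule Product',
    'catalogsearch_fulltext'   => 'Catalog Search',
];

// The user asked for a single indexer.
$requested = ['catalog_product_price' => 'Product Price'];

// One list of ids per requested indexer, as collected in the foreach loop (assumed relations).
$dependentLists = [['catalogsearch_fulltext']];
$invalidRelated = ['catalogrule_rule'];

// array_merge([], ...$lists) flattens a list of lists; the leading [] keeps
// the call valid when $lists is empty.
$dependent = array_merge([], ...$dependentLists);

$ids = array_unique(array_merge(array_keys($requested), $invalidRelated, $dependent));

// array_flip turns the ids into keys so that array_intersect_key can pick the
// matching entries while keeping the order of $allIndexers.
$toRun = array_intersect_key($allIndexers, array_flip($ids));

print_r(array_keys($toRun));
```

The result keeps the order of `$allIndexers`, not the order in which the ids were collected, because `array_intersect_key` preserves the order of its first argument.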

  18. class MyIndexer implements
        \Magento\Framework\Indexer\ActionInterface,
        \Magento\Framework\Mview\ActionInterface
    {
        // ...
    }

    Logical separation:
    \Magento\Framework\Indexer\ActionInterface: Update on save / ad hoc
    \Magento\Framework\Mview\ActionInterface: Update on schedule
    Indexer Class

  19. interface ActionInterface
    {
        /**
         * Execute full indexation
         *
         * @return void
         */
        public function executeFull();

        /**
         * Execute partial indexation by ID list
         *
         * @param int[] $ids
         * @return void
         */
        public function executeList(array $ids);

        /**
         * Execute partial indexation by ID
         *
         * @param int $id
         * @return void
         */
        public function executeRow($id);
    }

    \Magento\Framework\Indexer\ActionInterface

  20. Called by the scheduled indexer job (via cron job)
    interface ActionInterface
    {
        /**
         * Execute materialization on ids entities
         *
         * @param int[] $ids
         * @return void
         * @api
         */
        public function execute($ids);
    }

    \Magento\Framework\Mview\ActionInterface
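
A common way to satisfy both interfaces is to funnel all four entry points into one private method. The following is only a sketch under assumptions: `IndexerActionInterface` and `MviewActionInterface` are local stand-ins that mirror the method lists shown on the slides so the example runs without Magento, and the in-memory `$source` and `$index` arrays replace real tables.

```php
<?php
declare(strict_types=1);

// Stand-in for \Magento\Framework\Indexer\ActionInterface (same methods as on the slide).
interface IndexerActionInterface
{
    public function executeFull();
    public function executeList(array $ids);
    public function executeRow($id);
}

// Stand-in for \Magento\Framework\Mview\ActionInterface.
interface MviewActionInterface
{
    public function execute($ids);
}

class MyIndexer implements IndexerActionInterface, MviewActionInterface
{
    /** @var array<int, string> in-memory stand-in for the index table */
    public array $index = [];

    /** @var array<int, string> hypothetical source data keyed by entity id */
    private array $source;

    public function __construct(array $source)
    {
        $this->source = $source;
    }

    // Full reindex: rebuild everything from the source data.
    public function executeFull()
    {
        $this->index = [];
        $this->reindex(array_keys($this->source));
    }

    // Update on save / ad hoc, several entities.
    public function executeList(array $ids)
    {
        $this->reindex($ids);
    }

    // Update on save / ad hoc, single entity.
    public function executeRow($id)
    {
        $this->reindex([$id]);
    }

    // Update on schedule: ids come from the changelog table.
    public function execute($ids)
    {
        $this->reindex($ids);
    }

    private function reindex(array $ids): void
    {
        foreach ($ids as $id) {
            if (isset($this->source[$id])) {
                $this->index[$id] = strtoupper($this->source[$id]);
            } else {
                unset($this->index[$id]); // entity no longer exists
            }
        }
    }
}

$indexer = new MyIndexer([1 => 'foo', 2 => 'bar']);
$indexer->executeFull();
$indexer->execute([2, 3]); // id 3 does not exist and is ignored
```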

  21. How update on schedule works

  22. In computing, a materialized view is a database object that contains the results of
    a query. For example, it may be a local copy of data located remotely, or may be a
    subset of the rows and/or columns of a table or join result, or may be a summary
    using an aggregate function.
    -- Wikipedia
    MView Definition

  23. © https://oracle-base.com/articles/misc/materialized-views
    Materialized Views in an Oracle Database

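  24. .
    └── MyModule
        └── etc
            └── mview.xml

    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="urn:magento:framework:Mview/etc/mview.xsd">
        <view id="catalog_product_price"
              class="Magento\Catalog\Model\Indexer\Product\Price"
              group="indexer">
            <subscriptions>
                ...
            </subscriptions>
        </view>
    </config>
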
    MView Definition -> Trigger Config

  25. Status: working, valid, invalid
    Indexer State in the database

  26. When will triggers be created?

  27. Insert:
    BEGIN
        INSERT IGNORE INTO `catalog_category_product_cl` (`entity_id`) VALUES (NEW.`entity_id`);
    END
    Update:
    BEGIN
        IF (NEW.`entity_id` <=> OLD.`entity_id`
            OR NEW.`attribute_set_id` <=> OLD.`attribute_set_id`
            OR NEW.`parent_id` <=> OLD.`parent_id`
            OR NEW.`created_at` <=> OLD.`created_at`
            OR NEW.`path` <=> OLD.`path`
            OR NEW.`position` <=> OLD.`position`
            OR NEW.`level` <=> OLD.`level`
            OR NEW.`children_count` <=> OLD.`children_count`)
        THEN INSERT IGNORE INTO `catalog_category_product_cl` (`entity_id`) VALUES (NEW.`entity_id`);
        END IF;
    END
    Delete:
    BEGIN
        INSERT IGNORE INTO `catalog_category_product_cl` (`entity_id`) VALUES (OLD.`entity_id`);
    END
    What triggers look like

  28. Changelog Tables

  29. Version-ID (last id in changelog table)
    status -> idle, working, suspended
    Up-to-date: version_id == last version_id in _CL table
    MView Status
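
The version comparison can be sketched in plain PHP. This is a simplified model under assumptions, not Magento's implementation: the changelog is an array instead of a `_cl` table, `pendingEntityIds` is a hypothetical helper, and the state array only carries the two fields named on this slide.

```php
<?php
declare(strict_types=1);

// Rows of a hypothetical *_cl table: version_id is auto-increment,
// entity_id is written by the database triggers.
$changelog = [
    ['version_id' => 1, 'entity_id' => 10],
    ['version_id' => 2, 'entity_id' => 11],
    ['version_id' => 3, 'entity_id' => 10],
    ['version_id' => 4, 'entity_id' => 12],
];

// Hypothetical mview state: the last changelog version that has been processed.
$state = ['version_id' => 1, 'status' => 'idle'];

/**
 * Collect the distinct entity ids changed after $fromVersion up to and including $toVersion.
 */
function pendingEntityIds(array $changelog, int $fromVersion, int $toVersion): array
{
    $ids = [];
    foreach ($changelog as $row) {
        if ($row['version_id'] > $fromVersion && $row['version_id'] <= $toVersion) {
            $ids[$row['entity_id']] = $row['entity_id']; // key by id to de-duplicate
        }
    }
    return array_values($ids);
}

$lastVersion = (int) max(array_column($changelog, 'version_id'));
$isUpToDate = $state['version_id'] === $lastVersion;

$processed = [];
if (!$isUpToDate) {
    $state['status'] = 'working';
    $processed = pendingEntityIds($changelog, $state['version_id'], $lastVersion);
    // A real view would now hand $processed to the indexer's execute($ids).
    $state = ['version_id' => $lastVersion, 'status' => 'idle'];
}
```

Entity 10 was changed twice but is only processed once, which is one reason scheduled indexing copes better with bursts of writes than reindexing on every save.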

  30. php bin/magento cron:run

    # only indexer jobs
    php bin/magento cron:run --group index

    Indexer Cron

  31. Job                         Description                            Frequency
    indexer_clean_all_changelogs  Clean up old entries in _CL tables     daily
    indexer_reindex_all_invalid   Start full reindex of broken indexers  every minute
    indexer_update_all_views      Execute "update on schedule"           every minute
    Indexer Cronjobs

  32. Update on scheduled processing

  33. Indexer batching (fine-tuning, depending on the data)
    Configure the batch size via ENV variables or in env.php
    Indexer table switching (for full reindex)
    https://developer.adobe.com/commerce/php/development/components/indexing/optimization/
    Optimizations
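
The idea behind batching can be shown with a few lines of plain PHP. This is a generic sketch, not Magento's batching code: `reindexInBatches` and the callback are hypothetical, and the batch size of 4 is arbitrary. The right size depends on the data, which is why it is configurable.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: process ids in fixed-size batches so that memory use
 * and query size stay bounded regardless of how many entities changed.
 *
 * @param int[] $ids
 * @return int number of batches processed
 */
function reindexInBatches(array $ids, int $batchSize, callable $reindexBatch): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }
    $batches = array_chunk($ids, $batchSize);
    foreach ($batches as $batch) {
        $reindexBatch($batch);
    }
    return count($batches);
}

// Ten ids with a batch size of four: batches of 4, 4 and 2 ids.
$seen = [];
$batchCount = reindexInBatches(range(1, 10), 4, function (array $batch) use (&$seen): void {
    $seen[] = count($batch);
});
```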

  34. The status values in the indexer_state or mview_state database tables may not be the same as what is observed,
    because they sometimes do not get updated when an indexer fails.
    https://developer.adobe.com/commerce/php/development/components/indexing/#using-application-lock-mode-for-reindex-processes

    A more accurate status is available by using a \Magento\Framework\Lock\LockManagerInterface implementation
    (DB, Zookeeper, FileLock, ...).
    The cron job will then retry indexing without a reset of the indexer.

    env.php
    'indexer' => [
        'use_application_lock' => true
    ]

    Don't trust the rabbit
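
Why a lock is more trustworthy than a status column can be sketched as follows. The `LockManager` interface is a local stand-in with the same three methods as \Magento\Framework\Lock\LockManagerInterface; the in-memory implementation, the `reindexWithLock` helper and the lock name are assumptions made up for this example and do not protect across processes.

```php
<?php
declare(strict_types=1);

// Stand-in with the same three methods as \Magento\Framework\Lock\LockManagerInterface.
interface LockManager
{
    public function lock(string $name, int $timeout = -1): bool;
    public function unlock(string $name): bool;
    public function isLocked(string $name): bool;
}

// In-memory implementation for illustration only.
final class InMemoryLockManager implements LockManager
{
    private array $locks = [];

    public function lock(string $name, int $timeout = -1): bool
    {
        if (isset($this->locks[$name])) {
            return false; // already held, no waiting in this sketch
        }
        $this->locks[$name] = true;
        return true;
    }

    public function unlock(string $name): bool
    {
        unset($this->locks[$name]);
        return true;
    }

    public function isLocked(string $name): bool
    {
        return isset($this->locks[$name]);
    }
}

function reindexWithLock(LockManager $locks, string $indexerId, callable $reindex): bool
{
    $name = 'indexer_lock_' . $indexerId; // hypothetical lock name
    if (!$locks->lock($name, 0)) {
        return false; // another run holds the lock, so skip this one
    }
    try {
        $reindex();
        return true;
    } finally {
        $locks->unlock($name); // released even if $reindex throws
    }
}

$locks = new InMemoryLockManager();
$nested = null;
$ran = reindexWithLock($locks, 'catalog_product_price', function () use ($locks, &$nested): void {
    // While the lock is held, a second attempt for the same indexer is rejected.
    $nested = reindexWithLock($locks, 'catalog_product_price', function (): void {
    });
});
```

Because the lock is released in `finally`, a crashed run does not leave a stale "working" marker behind, so the next cron run can simply try again.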

  35. https://twitter.com/ospadano/status/1570859136379924482?t=R9Gp9EkBz1eYiYw05fgFFQ
    Can Indexer Cronjobs conflict?

  36. IMHO no ... if the Lock Manager is configured correctly

  37. times_used

    https://github.com/magento/magento2/issues/30243
    Ignore columns for mview to be specified at the subscription level

  38. If triggers are missing (possible if a DB dump was exported without triggers) ...
    n98-magerun2.phar index:trigger:recreate
    Force trigger recreation

  39. Update On Save

  40. Must be handled manually
    - create, update, delete
    Different kinds of implementation in Magento
    - Commit callbacks in resource models (own implementation)
    - Plugins (modify an entity which is not fully controlled by your module)
    - Events/observers (process/entity dependency -> trigger the indexer of another entity)
    Most of the time we extend an existing indexer and do not need to handle the re-indexing ourselves.
    - Test whether the indexer processes the data after a modification (e.g. mview.xml)
    - Test in the Magento Admin if there is a UI
    - Test changes via webapi
    Update On Save

  41. \Magento\Catalog\Model\Product::afterSave
    public function afterSave()
    {
        $this->getLinkInstance()->saveProductRelations($this);
        $this->getTypeInstance()->save($this);
        if ($this->getStockData()) {
            $this->setForceReindexEavRequired(true);
        }
        $this->_getResource()->addCommitCallback([$this, 'priceReindexCallback']);
        $this->_getResource()->addCommitCallback([$this, 'eavReindexCallback']);
        $result = parent::afterSave();
        $this->_getResource()->addCommitCallback([$this, 'reindex']);
        $this->reloadPriceInfo();
        return $result;
    }

    /**
     * Reindex callback for EAV indexer
     *
     * @return void
     */
    public function eavReindexCallback()
    {
        if ($this->isObjectNew() || $this->isDataChanged()) {
            $this->_productEavIndexerProcessor->reindexRow($this->getEntityId());
        }
    }
    Example: Commit Callback / Resource Model

  42. Magento_CatalogRule module di.xml

    <type name="Magento\Catalog\Model\Product">
        <plugin name="apply_catalog_rules_after_product_save_and_reindex" type="Magento\CatalogRule\Plugin\Indexer\Product\Save\ApplyRulesAfterReindex"/>
    </type>

    class ApplyRulesAfterReindex
    {
        /**
         * @param ProductRuleProcessor $productRuleProcessor
         */
        public function __construct(ProductRuleProcessor $productRuleProcessor)
        {
            $this->productRuleProcessor = $productRuleProcessor;
        }

        // ...

        public function afterReindex(Product $subject)
        {
            $this->productRuleProcessor->reindexRow($subject->getId());
        }
    }
    Example: Plugin

  43. vendor/magento/module-catalog-inventory/etc/events.xml
    Example 3: Event/Observer

  44. public function execute(EventObserver $observer)
    {
        // Reindex quote ids
        $quote = $observer->getEvent()->getQuote();

        $productIds = [];
        foreach ($quote->getAllItems() as $item) {
            $productIds[$item->getProductId()] = $item->getProductId();
            $children = $item->getChildrenItems();
            if ($children) {
                foreach ($children as $childItem) {
                    $productIds[$childItem->getProductId()] = $childItem->getProductId();
                }
            }
        }
        if ($productIds) {
            $this->stockIndexerProcessor->reindexList($productIds); // <-- CALL INDEXER 1
        }

        // Reindex previously remembered items
        $productIds = [];
        foreach ($this->itemsForReindex->getItems() as $item) {
            $item->save();
            $productIds[] = $item->getProductId();
        }
        if (!empty($productIds)) {
            $this->priceIndexer->reindexList($productIds); // <-- CALL INDEXER 2
        }
        $this->itemsForReindex->clear();
        // Clear list of remembered items - we don't need it anymore
    }


  45. Cache Cleaning

  46. vendor/magento/module-indexer/etc/di.xml
    Plugins in Magento_Indexer module

  47. class CacheCleaner
    {
        /**
         * Defer cache cleaning until after execute full
         *
         * @param ActionInterface $subject
         * @return void
         * @SuppressWarnings(PHPMD.UnusedFormalParameter)
         */
        public function beforeExecuteFull(ActionInterface $subject)
        {
            $this->cacheCleaner->start();
        }

        /**
         * Clean cache after full reindex full
         *
         * @param ActionInterface $subject
         * @return void
         * @SuppressWarnings(PHPMD.UnusedFormalParameter)
         */
        public function afterExecuteFull(ActionInterface $subject)
        {
            $this->cacheCleaner->flush();
        }

        /**
         * Defer cache cleaning until after execute list
         *
         * @param ActionInterface $subject
         * @return void
         * @SuppressWarnings(PHPMD.UnusedFormalParameter)
         */
        public function beforeExecuteList(ActionInterface $subject)
        {
            $this->cacheCleaner->start();
        }

        /**
         * Clean cache after reindexed list.
         *
         * @param ActionInterface $subject
         * @return void
         * @SuppressWarnings(PHPMD.UnusedFormalParameter)
         */
        public function afterExecuteList(ActionInterface $subject)
        {
            $this->cacheCleaner->flush();
        }

        /**
         * Defer cache cleaning until after execute row
         *
         * @param ActionInterface $subject
         * @return void
         * @SuppressWarnings(PHPMD.UnusedFormalParameter)
         */
        public function beforeExecuteRow(ActionInterface $subject)
        {
            $this->cacheCleaner->start();
        }

        /**
         * Clean cache after reindexed row.
         *
         * @param ActionInterface $subject
         * @return void
         * @SuppressWarnings(PHPMD.UnusedFormalParameter)
         */
        public function afterExecuteRow(ActionInterface $subject)
        {
            $this->cacheCleaner->flush();
        }
    }


  48. class DeferredCacheCleaner
    {
        // ...

        /**
         * @param EventManager $eventManager
         * @param CacheInterface $appCache
         * @param DeferredCacheContext $deferredCacheContext
         * @param CacheContext $cacheContext
         */
        public function __construct(
            EventManager $eventManager,
            CacheInterface $appCache,
            DeferredCacheContext $deferredCacheContext,
            CacheContext $cacheContext
        ) {
            $this->eventManager = $eventManager;
            $this->deferredCacheContext = $deferredCacheContext;
            $this->appCache = $appCache;
            $this->cacheContext = $cacheContext;
        }

        /**
         * Defer cache cleaning until flush() is called
         *
         * @see flush()
         */
        public function start(): void
        {
            $this->deferredCacheContext->start();
        }

        /**
         * Flush cache
         */
        public function flush(): void
        {
            $this->deferredCacheContext->commit();
            $this->eventManager->dispatch('clean_cache_by_tags', ['object' => $this->cacheContext]);
            $identities = $this->cacheContext->getIdentities();
            if (!empty($identities)) {
                $this->appCache->clean($identities);
                $this->cacheContext->flush();
            }
        }
    }


  49. vendor/magento/module-cache-invalidate/etc/events.xml
    vendor/magento/module-page-cache/etc/events.xml
    clean_cache_by_tags -> FPC Invalidation
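
The start()/flush() pattern used around each indexer call boils down to collecting cache tags and cleaning them once. The sketch below is a deliberately simplified, hypothetical model: `DeferredTagCleaner`, `registerTags` and the recorded `$cleanCalls` are invented for this example and do not mirror the real classes or the event dispatch.

```php
<?php
declare(strict_types=1);

// Hypothetical simplification: tags registered between start() and flush()
// are collected and cleaned once, instead of once per reindexed entity.
final class DeferredTagCleaner
{
    /** @var string[] tags collected while cleaning is deferred */
    private array $pending = [];

    private bool $deferred = false;

    /** @var array<int, string[]> every clean call that was issued, for inspection */
    public array $cleanCalls = [];

    public function start(): void
    {
        $this->deferred = true;
    }

    public function registerTags(array $tags): void
    {
        if ($this->deferred) {
            $this->pending = array_merge($this->pending, $tags);
            return;
        }
        $this->clean($tags);
    }

    public function flush(): void
    {
        $this->deferred = false;
        $tags = array_values(array_unique($this->pending));
        $this->pending = [];
        if ($tags !== []) {
            $this->clean($tags);
        }
    }

    private function clean(array $tags): void
    {
        // A real implementation would invalidate cache entries here.
        $this->cleanCalls[] = $tags;
    }
}

$cleaner = new DeferredTagCleaner();
$cleaner->start();
$cleaner->registerTags(['cat_p_1', 'cat_c_5']);
$cleaner->registerTags(['cat_p_2', 'cat_c_5']);
$cleaner->flush();
```

Two registrations lead to a single clean call with three distinct tags, which keeps the number of full page cache invalidation requests low during larger reindex runs.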

  50. vendor/magento/module-customer/etc/indexer.xml

    xsi:noNamespaceSchemaLocation="urn:magento:framework:Indexer/etc/indexer.xsd">
    provider="Magento\Customer\Model\Indexer\AttributeProvider">

    "Structure" Handling / Save Handler

  51. https://experienceleague.adobe.com/docs/commerce-operations/configuration-guide/cli/manage-indexers.html
    MView Implementation: vendor/magento/framework/Mview
    https://amasty.com/blog/comprehensive-guide-to-magento-2-indexing/
    https://developer.adobe.com/commerce/php/development/components/indexing/
    Performance:
    https://developer.adobe.com/commerce/php/development/components/indexing/optimization/
    Resources

  53. Photos provided by
    Canva
    Tobias Fischer