From Bottlenecks to Breakthroughs: How the New York Post Mastered Scalability

Linnea Huxford

September 19, 2024

Transcript

  1. Agenda
    • The New York Post’s WordPress Journey
    • An Overview of Scaling Strategies
    • Deep Dive into Bottlenecks & Solutions
  2. A long history of building and publishing with WordPress
    • Number of years on WordPress: 11 years
    • Database rows for the WP_Posts table: 15 million
  3. Timeline
    • 2013: Launched the NY Post on WordPress
    • 2015: Redesigned NY Post & Page Six
    • 2020: Migrated sites from VIP Classic to VIP Go
    • 2021: A/B tests and rollout of the 2021 NY Post redesign
    • 2023: Launched the redesign of Page Six as a child theme
  4. We run a WordPress Multisite on WordPress VIP with a “monolith” architecture
    • Gutenberg block editor with custom blocks
    • WP VIP’s mu-plugins have many performance tweaks to support enterprise usage
    • Persistent object cache: memcached
    • Reliable cron system with Automattic’s Cron Control plugin
    • CDN for images to deliver them quickly from servers close to the user’s location
    • Robust page cache to serve pre-built HTML to the majority of users
  5. A typical page request has very few database queries because most data is already in the object cache.
  6. Key Principle: No database writes from frontend traffic
    • API integrations often hook into admin actions, for example the wp_after_insert_post hook, which fires when posts are saved or updated.
    • Scheduled cron jobs update data from external sources.
    • API data is stored in the object cache, minimizing repeated external requests.
    • Rewrite rules are updated via a CLI command.
    • Analytics, comments, and membership features are powered by external third parties, which use JavaScript HTTP calls and scripts instead of backend calls to WordPress.
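The object-cache bullet on slide 6 follows a cache-aside pattern. The sketch below is a language-neutral illustration in Python, not the Post's code: `ObjectCache`, `cached_api_data`, and the five-minute default TTL are all hypothetical names and values chosen for the example. The in-memory dictionary stands in for a persistent cache such as memcached.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ObjectCache:
    """Tiny in-memory stand-in for a persistent object cache (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        """Store a value that expires ttl_seconds from now."""
        self._store[key] = (self._clock() + ttl_seconds, value)


def cached_api_data(
    cache: ObjectCache,
    key: str,
    fetch: Callable[[], Any],
    ttl_seconds: float = 300.0,
) -> Any:
    """Serve from the cache; call the external API only on a miss.

    A cached value of None is treated as a miss in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = fetch()
        cache.set(key, value, ttl_seconds)
    return value
```

In a setup like the one described, the `fetch` step would typically run from a cron job or an admin-side hook, so frontend requests only ever read from the cache.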
  7. We use the following Elasticsearch plugins for better query performance
    • SearchPress: Indexes content in Elasticsearch. https://github.com/alleyinteractive/searchpress
    • ES WP Query (ES_WP_Query): Offloads queries to Elasticsearch. https://github.com/alleyinteractive/es-wp-query
  8. Simply add 'es' => true and the query will use Elasticsearch to get the Post IDs
    [Diagram labels: Complex Query: Metro Snowstorm Posts with Video; Elasticsearch DSL; Elasticsearch Cluster; Post IDs]
  9. ES WP Query will then shoehorn the Post IDs from Elasticsearch into a highly efficient post__in query to retrieve the full WP_Post objects from the database.
    Source file: es-wp-query/class-es-wp-query-shoehorn.php
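The two-step lookup on slides 8 and 9 (the search index returns IDs, then the database returns full records by primary key) can be modeled as follows. This is a conceptual Python sketch, not the plugin's implementation: `fetch_posts_in_order` and the sample data are hypothetical, and it assumes the search engine's ranking should be preserved in the final result.

```python
from typing import Dict, List


def fetch_posts_in_order(
    post_ids: List[int], posts_by_id: Dict[int, dict]
) -> List[dict]:
    """Return full post records for post_ids, keeping the search ranking."""
    # One batched primary-key lookup (the post__in step), modeled with a dict.
    found = {pid: posts_by_id[pid] for pid in set(post_ids) if pid in posts_by_id}
    # Re-apply the search order and skip IDs the database no longer has.
    return [found[pid] for pid in post_ids if pid in found]


# Hypothetical data: IDs as ranked by the search index, and a fake posts table.
ranked_ids = [42, 7, 19]
posts_table = {
    7: {"ID": 7, "title": "Metro delays"},
    19: {"ID": 19, "title": "Plow tracker"},
    42: {"ID": 42, "title": "Snowstorm video"},
}
titles = [post["title"] for post in fetch_posts_in_order(ranked_ids, posts_table)]
print(titles)  # ['Snowstorm video', 'Metro delays', 'Plow tracker']
```

The expensive filtering and sorting happens in the search index; the database only performs a cheap lookup by ID.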
  10. We built Remote Backstop as a safety net
    https://github.com/alleyinteractive/remote-backstop
    [Diagram labels: Object Cache; Remote Backstop; Remote Services]
  11. Remote Backstop has several key benefits
    1. Site Stability: The plugin prevents site downtime when external resources are unavailable.
    2. Stress Reduction: Remote Backstop alleviates the load on a stressed resource by serving cached responses.
    3. User Experience: End users are unaffected by, and unaware of, any outage because the fallback is seamless.
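The fallback behaviour described on slide 11 can be sketched as a wrapper that remembers the last good response for each URL. This Python sketch is an illustration of the general idea only and is not taken from the Remote Backstop source: the `Backstop` class, its cool-down window, and the 60-second default are assumptions made for the example.

```python
import time
from typing import Callable, Dict


class Backstop:
    """Serve the last good response when a remote call fails (hypothetical)."""

    def __init__(
        self,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._last_good: Dict[str, str] = {}
        self._retry_after: Dict[str, float] = {}

    def request(self, url: str, fetch: Callable[[str], str]) -> str:
        cached = self._last_good.get(url)
        # While the resource is marked down, serve the stored copy and do not
        # call it, which takes load off the struggling service.
        if cached is not None and self._clock() < self._retry_after.get(url, 0.0):
            return cached
        try:
            body = fetch(url)
        except Exception:
            # Mark the resource as down for the cool-down window.
            self._retry_after[url] = self._clock() + self._cooldown
            if cached is not None:
                return cached
            raise  # Nothing stored yet, so the caller must handle the error.
        self._last_good[url] = body
        return body
```

The cool-down is the design choice that maps to the "stress reduction" benefit: a failing service is left alone for a while instead of being retried on every page build.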
  12. The WP New Relic Transactions plugin adds WordPress context to New Relic
    https://github.com/alleyinteractive/wp-new-relic-transactions
  13. Bottleneck 2: Simultaneous cache expiry
    The homepage is very long and each module’s query is cached. If all of the caches expire at the same time, the page becomes slow to build.
  14. Solution: Implementing staggered cache durations
    Giving each cache a random duration spreads expirations out more evenly over time, so the server never has to rebuild every module at once.
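A staggered duration like the one on slide 14 can be produced by adding random jitter to a base TTL. The Python sketch below is one way to do it under stated assumptions: the function name `staggered_ttl`, the 300-second base, and the ±20% spread are hypothetical values, not figures from the talk.

```python
import random
from typing import Optional


def staggered_ttl(
    base_seconds: int,
    jitter_fraction: float = 0.2,
    rng: Optional[random.Random] = None,
) -> int:
    """Return base_seconds shifted by a random amount within ±jitter_fraction."""
    rng = rng if rng is not None else random.Random()
    spread = int(base_seconds * jitter_fraction)
    return base_seconds + rng.randint(-spread, spread)


# Hypothetical homepage modules, each cached with its own jittered duration.
modules = ["top_stories", "metro", "sports", "opinion"]
ttls = {name: staggered_ttl(300) for name in modules}
for name, ttl in ttls.items():
    print(f"{name}: cache for {ttl}s")
```

Because each module draws its own duration, caches that were filled in the same request drift apart and expire at different moments.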
  15. Solutions to slow queries
    • Solution 1: Avoid post__not_in
      ◦ Get more posts than needed and exclude posts in PHP.
    • Solution 2: Disable Getting All Found Rows
      ◦ If pagination isn’t necessary, set 'no_found_rows' => true
    • Solution 3: Optimize Query Parameters
      ◦ 'ignore_sticky_posts' => true
      ◦ 'suppress_filters' => false
    • Solution 4: Offload to Elasticsearch and Cache Results
    • Solution 5: Avoid Unnecessary Queries
      ◦ Whenever possible, avoid running the query entirely. Sometimes, plugins add unnecessary functionality that can be disabled to improve performance.
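Solution 1 on slide 15 (over-fetch, then exclude in application code) comes down to a filter and a slice. This Python sketch shows the idea with hypothetical names and IDs; on the site itself the same step would be written in PHP.

```python
from typing import Iterable, List, Set


def exclude_after_fetch(
    fetched_ids: Iterable[int], excluded_ids: Set[int], wanted: int
) -> List[int]:
    """Drop excluded IDs from an over-fetched result, then trim to size."""
    kept = [pid for pid in fetched_ids if pid not in excluded_ids]
    return kept[:wanted]


# Hypothetical case: 3 posts are wanted and 2 are already shown elsewhere,
# so the query asks for 3 + 2 = 5 posts with no exclusion clause.
already_shown = {101, 102}
fetched = [110, 102, 109, 101, 108]
print(exclude_after_fetch(fetched, already_shown, wanted=3))  # [110, 109, 108]
```

Requesting `wanted + len(excluded_ids)` rows guarantees enough results remain after filtering, and the query no longer varies with the exclusion list, which makes its cached result reusable.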
  16. Add a Date Query
    • Getting 100 recent posts without a date query: 171 ms
    • Limiting the query to the last 3 months: 46 ms
  17. Conclusion
    WordPress, coupled with WordPress VIP hosting, offers a robust foundation for building enterprise solutions that scale effectively. We’ve relied on several key principles to achieve this scalability:
    • Leveraging Elasticsearch to enhance WP Query performance and offload complex queries.
    • Utilizing Object Cache to efficiently store data from queries and external APIs, reducing load times.
    • Adhering to the principle of “No Frontend Database Writes,” ensuring data integrity and performance.
    • Identifying bottlenecks using tools like New Relic and Query Monitor to pinpoint and resolve performance issues.
    • Applying common solutions such as query optimizations and caching strategies to further boost efficiency.
  18. “Great things are not done by impulse, but by a series of small things brought together.” - Vincent Van Gogh