Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Webinar_-_Entity_Framework_Core_Performance-Opt...

 Webinar_-_Entity_Framework_Core_Performance-Optimierung_aus_der_Praxis.pdf

Pawel Gerr

April 19, 2023
Tweet

More Decks by Pawel Gerr

Other Decks in Programming

Transcript

  1. Entity Framework Core Performance-Optimierung aus der Praxis • Optimization options

    and KPIs • Reduction of database requests • Reduction of query complexity • Query statistics and execution plans Agenda P
  2. Entity Framework Core Performance-Optimierung aus der Praxis My usual options

    for Entity Framework performance optimization 2nd level cache Hardware upgrade Materialized views Stored procedures Triggers Database Features Server Raw SELECT * FROM … Raw SQL Bulk operations Indices Entity Framework Requests reduction Query restructure Use 3rd party libs if available CPU, Memory, I/O
  3. Entity Framework Core Performance-Optimierung aus der Praxis Did „the change“

    anything good? – My KPI Execution time (LINQ) SELECT * FROM … SELECT * FROM … SELECT * FROM … Number of requests Query statistics SELECT * FROM … “Shape” of SQL Execution plans Parse time Compile time Execution time (DB) Physical reads Logical reads Row count Round trip time SQL generation Execution time (DB) Materialization My key performance indicators
  4. Entity Framework Core Performance-Optimierung aus der Praxis N+1 queries problem

    Execution of hundreds of requests for one use case Caused by: unintentional execution of queries in loops Possible solutions: • Better code structure (software architecture) • Avoid overgeneralization (shared/core projects) • Help each other to improve (review/feedback) • Know the insights of Entity Framework • Lazy loading • Limitations Reducing database requests Demos 1.1 + 1.2
  5. Simplified case of indirect execution of database requests in a

    loop Library Main Entity Framework Core Performance-Optimierung aus der Praxis Multiple requests in loops Reducing database requests var products = LoadProducts(); var prices = new List<Price>(); foreach (var product in products) { var price = GetPrice(product.Id); prices.Add(price); } public Price GetPrice(Guid productId) { MyDbContext ctx = …; return ctx.Prices.FirstOrDefault(p => p.ProductId == productId); } Executed N times
  6. Library Main Entity Framework Core Performance-Optimierung aus der Praxis LINQ

    queries are loops Reducing database requests var products = LoadProducts(); var prices = products.Select(product => GetPrice(product.Id)) .ToList(); public Price GetPrice(Guid productId) { MyDbContext ctx = …; return ctx.Prices.FirstOrDefault(p => p.ProductId == productId); }
  7. Library Main Entity Framework Core Performance-Optimierung aus der Praxis Approach

    1 – Use-case specific methods Reducing database requests var productsWithPrices = LoadProductsWithPrices(); public List<ProductWithPrices> LoadProductsWithPrices() { return ctx.Products .Include(p => p.Prices) // merely symbolic, fetching prices is usually more complex .ToList(); } Approach 1.1
  8. Library Main var products = LoadProducts(); var productIds = products.Select(p

    => p.Id); var pricesByProductId = GetPrices(productIds); public ILookup<int, Price> GetPrices(IEnumerable<Guid> productIds) { return ctx.Prices .Where(p => productIds.Contains(p.ProductId)) .ToLookup(p => p.ProductId); } Entity Framework Core Performance-Optimierung aus der Praxis Approach 2 – load all required prices at once Reducing database requests Approach 1.2
  9. Given: 100 products being in 5 studio Loading of all

    products “in one go” leads to 1 + 5 queries Library Main var productsByStudio = ctx.Products .Include(p => p.Studio) .ToLookup(p => p.Studio.Name); Entity Framework Core Performance-Optimierung aus der Praxis Lazy loading Reducing database requests public static ILookup<TKey, TSource> ToLookup<TSource, TKey>( this IEnumerable<TSource> source, Func<TSource, TKey> keySelector) { ... } Use eager loading to prevent unnecessary requests IEnumerable<T> always forces query execution Demo 1.3 Approach 1.3
  10. Entity Framework Core Performance-Optimierung aus der Praxis Finding the problem

    area Use profiling tools In case of Microsoft SQL Server: XEvent Profiler in SQL Server Management Studio SQL Server Profiler (deprecated, resource intensive) Azure Data Studio Logs coming from EF with EnableSensitiveDataLogging Reducing database requests Logging DbContextOptionsBuilder builder = ...; builder.UseSqlServer("...") // or any other database .UseLoggerFactory(loggerFactory) .EnableSensitiveDataLogging(); In development only because possible security leak!
  11. Entity Framework Core Performance-Optimierung aus der Praxis Finding the problem

    area Look for repetitive lines and cycles Reducing database requests Repetive lines Cycles Demo: SQL Server Profiler
  12. Query tags: correlation between LINQ and SQL LINQ SQL Entity

    Framework Core Performance-Optimierung aus der Praxis Finding the problem area Reducing database requests var products = ctx.Products .TagWith("My Query") .ToList(); -- My Query SELECT * FROM Products
  13. Entity Framework Core Performance-Optimierung aus der Praxis Cartesian explosion problem

    Fetching a lot of data from multiple tables at once • Increased memory consumption and execution time • High I/O load Caused by: loading of multiple navigational collection-properties Possible solutions: • Reduce Includes (eager loading) • Reduce access to navigational properties in projections (i.e. Select) Reducing query complexity Query splitting Demo 2.1
  14. Entity Framework Core Performance-Optimierung aus der Praxis Eager loading via

    Include EF 7 loads all data using 1 SQL statement by default (QuerySplittingBehavior.SingleQuery) • Bigger result sets: 100 products with 10 prices each and 2 sellers (incl. 200 product-seller-relationship) (100 * 10 * 2) rows > (100 + 1000 + 2 + 200) rows • EF-forced ORDER BY clause produces considerable load Reducing query complexity Product Id Price Id Seller Id 1 1 1 1 1 2 1 2 1 1 2 2 1 3 1 1 3 2 SQL LINQ var products = ctx.Products .Include(p => p.Studio) .Include(p => p.Prices) .Include(p => p.Sellers) .ToList(); SELECT * FROM Products INNER JOIN Studios ... LEFT JOIN Prices ON ... LEFT JOIN Sellers ON ... ... ORDER BY Products.Id, Studios.Id, Prices.Id, … Cartesian explosion
  15. EF 7 allows to change the QuerySplittingBehavior from SingleQuery to

    SplitQuery Entity Framework Core Performance-Optimierung aus der Praxis Include: query splitting by EF Reducing query complexity Approach 2.1 LINQ var products = ctx.Products .AsSplitQuery() .Include(p => p.Studio) .Include(p => p.Prices) .Include(p => p.Sellers) .ToList();
  16. Entity Framework Core Performance-Optimierung aus der Praxis Fetching collections in

    projections Cartesian Explosion due to fetching of collection in Select Reducing query complexity LINQ var result = _ctx.Studios .Select(s => new { Studio = s, Infinity = s.Products .Where(p => p.Name.Contains("Infinity")) .Select(p => new { Product = p, p.Prices, p.Sellers }), Endgame = s.Products .Where(p => p.Name.Contains("Endgame")) .Select(p => new { Product = p, p.Prices, p.Sellers }) }) .ToList(); Demo 2.2 Before Statistics: • CPU: 1766 • Duration: 3180 • Reads: 20773 • Row Count: 200000
  17. Entity Framework Core Performance-Optimierung aus der Praxis Projections: query splitting

    by EF Cartesian Explosion due to fetching of collection in Select Reducing query complexity LINQ var result = _ctx.Studios .AsSplitQuery() .Select(s => new { Studio = s, Infinity = s.Products .Where(p => p.Name.Contains("Infinity")) .Select(p => new { Product = p, p.Prices, p.Sellers }), Endgame = s.Products .Where(p => p.Name.Contains("Endgame")) .Select(p => new { Product = p, p.Prices, p.Sellers }) }) .ToList(); Approach 2.2 AsSplitQuery Before Statistics: • CPU: 1766 • Duration: 3180 • Reads: 20773 • Row Count: 200000 AsSplitQuery 13 318 568 855
  18. Usually, loading collection separately requires more code but less resources

    on database LINQ Entity Framework Core Performance-Optimierung aus der Praxis Manual query splitting Reducing query complexity var studiosQuery = _ctx.Studios; var studios = studiosQuery.ToList(); var infinityProductsQuery = studiosQuery.SelectMany(s => s.Products).Where(p => p.Name.Contains("Infinity")); var endgameProductsQuery = studiosQuery.SelectMany(s => s.Products).Where(p => p.Name.Contains("Endgame")); var infinityProducts = infinityProductsQuery.ToList(); var endgameProducts = endgameProductsQuery.ToList(); var productIds = infinityProducts.Concat(endgameProducts).Select(p => p.Id); var prices = _ctx.Prices.Where(p => productIds.Contains(p.ProductId)).ToList(); // cannot use "productsQuery.Select(p => p.Sellers)" because JoinTable "Seller_Product" won't be selected var sellers = _ctx.SellerProducts.Where(sp => productIds.Contains(sp.ProductId)).Include(sp => sp.Seller).ToList(); // Build the desired data structure Approach 2.2 Manual Splitting Before Statistics: • CPU: 1766 • Duration: 3180 • Reads: 20773 • Row Count: 200000 AsSplitQuery 13 318 568 855 Manual Splitting 1 270 22 1305
  19. LINQ 2 LINQ 1 Entity Framework Core Performance-Optimierung aus der

    Praxis Which one is better? Understanding Queries var studios = ctx.Studios .Select(s => new { s.Products.OrderBy(p => p.Id).FirstOrDefault().Id, s.Products.OrderBy(p => p.Id).FirstOrDefault().Name }) .ToList(); var studios = ctx.Studios .Select(s => s.Products .OrderBy(p => p.Id) .Select(p => new { p.Id, p.Name }) .FirstOrDefault()) .ToList(); 2x “FirstOrDefault()” before selecting properties Selection of properties before “FirstOrDefault()”
  20. SQL 2 SQL 1 Entity Framework Core Performance-Optimierung aus der

    Praxis Lets try with SQL … Understanding Queries SELECT ( SELECT TOP(1) p.Id FROM Products p WHERE s.Id = p.StudioId ORDER BY p.Id) AS FirstProductId, ( SELECT TOP(1) p.Name FROM Products p WHERE s.Id = p.StudioId ORDER BY p.Id) AS FirstProductName FROM Studios s SELECT p.Id, p.Name FROM Studios s LEFT JOIN ( SELECT Id, Name, StudioId FROM ( SELECT Id, Name, StudioId, ROW_NUMBER() OVER(PARTITION BY StudioId ORDER BY Id) AS row FROM Products ) p WHERE row <= 1 ) p ON s.Id = p.StudioId 2 sub-selects Window function “ROW_NUMBER”
  21. Entity Framework Core Performance-Optimierung aus der Praxis Textual representation of

    all database operations • Type and the order of operations (JOIN, filter, projection, ...) • Indexes being used • Amount of data flowing between two operations • Costs of an operation and a subtree Some tools have built-in support for displaying execution plans as a graph Execution plans
  22. Entity Framework Core Performance-Optimierung aus der Praxis Another try …

    Execution plans Execution plan 1 SQL 1 SELECT ( SELECT TOP(1) p.Id FROM Products p WHERE s.Id = p.StudioId ) AS FirstProductId, ( SELECT TOP(1) p.Name FROM Products p WHERE s.Id = p.StudioId ) AS FirstProductName FROM Studios s Why not “index seek”?
  23. Entity Framework Core Performance-Optimierung aus der Praxis Execution plans Execution

    plan 2 SQL 2 SELECT p.Id, p.Name FROM Studios s LEFT JOIN ( SELECT Id, Name, StudioId FROM ( SELECT Id, Name, StudioId, ROW_NUMBER() OVER(PARTITION BY StudioId ORDER BY Id) AS row FROM Products ) p WHERE row <= 1 ) p ON s.Id = p.StudioId Why “index scan” + “sort”?
  24. Entity Framework Core Performance-Optimierung aus der Praxis Comparing • Compare

    estimated subtree costs • Deviance: estimated vs actual number of rows • Take statistics (reads) into consideration Execution plans Demo 3.1
  25. Entity Framework Core Performance-Optimierung aus der Praxis SQL Server Operators

    General icon Cursors Parallelism T-SQL Execution plans
  26. Entity Framework Core Performance-Optimierung aus der Praxis Execution plans Crash

    course Table scan Clustered index scan Non-clustered index scan Clustered index seek Non-clustered index seek Scan No clustered index Seek Filter Filtering Filtering Filtering Key lookup Missing (include) columns? RID lookup If no clustered index Sort “Stop & Go” operator! Required? Missing index? Sorting in .NET cheaper? Parallelism Expensive High query complexity Missing index? Fuzzy search? Bad discriminator?
  27. Entity Framework Core Performance-Optimierung aus der Praxis JOINs Execution plans

    DB may perform sort on its own • Resource-saving • Data sets must be ordered • The “default” • Kind-of 2 nested loops • Speed depends on: • Scan of data set A • Seek of data set B • Large and unsorted sets • 2 phases: • Builds hash table for set A • Matches hash values from B Should be looked at Merge join Nested loop join Hash match join
  28. Entity Framework Core Performance-Optimierung aus der Praxis Learnings • Look

    for database requests in “loops” • Split complex queries if necessary • Learn to read execution plans - optimize queries accordingly Resources • Entity Framework Core: https://docs.microsoft.com/en-us/ef/core/ • Execution Plans: https://docs.microsoft.com/en-us/sql/relational-databases/performance/execution-plans • Webinar demos: https://github.com/thinktecture/ef-core-performance-webinar • My EF playground: https://github.com/PawelGerr/EntityFrameworkCore-Demos Ask me @pawelgerr [email protected]