Slide 1

Slide 1 text

Entity Framework Core 7 Performance-Optimierung aus der Praxis Pawel Gerr @pawelgerr [email protected]

Slide 2

Slide 2 text

Entity Framework Core Performance-Optimierung aus der Praxis • Optimization options and KPIs • Reduction of database requests • Reduction of query complexity • Query statistics and execution plans Agenda P

Slide 3

Slide 3 text

Entity Framework Core Performance-Optimierung aus der Praxis Example for today – Products DB Products Prices Sellers Studio

Slide 4

Slide 4 text

Entity Framework Core Performance-Optimierung aus der Praxis My usual options for Entity Framework performance optimization 2nd level cache Hardware upgrade Materialized views Stored procedures Triggers Database Features Server Raw SELECT * FROM … Raw SQL Bulk operations Indices Entity Framework Requests reduction Query restructure Use 3rd party libs if available CPU, Memory, I/O

Slide 5

Slide 5 text

Entity Framework Core Performance-Optimierung aus der Praxis Did „the change“ anything good? – My KPI Execution time (LINQ) SELECT * FROM … SELECT * FROM … SELECT * FROM … Number of requests Query statistics SELECT * FROM … “Shape” of SQL Execution plans Parse time Compile time Execution time (DB) Physical reads Logical reads Row count Round trip time SQL generation Execution time (DB) Materialization My key performance indicators

Slide 6

Slide 6 text

Entity Framework Core Performance-Optimierung aus der Praxis Reducing database requests

Slide 7

Slide 7 text

Entity Framework Core Performance-Optimierung aus der Praxis N+1 queries problem Execution of hundreds of requests for one use case Caused by: unintentional execution of queries in loops Possible solutions: • Better code structure (software architecture) • Avoid overgeneralization (shared/core projects) • Help each other to improve (review/feedback) • Know the insights of Entity Framework • Lazy loading • Limitations Reducing database requests Demos 1.1 + 1.2

Slide 8

Slide 8 text

Simplified case of indirect execution of database requests in a loop Library Main Entity Framework Core Performance-Optimierung aus der Praxis Multiple requests in loops Reducing database requests var products = LoadProducts(); var prices = new List(); foreach (var product in products) { var price = GetPrice(product.Id); prices.Add(price); } public Price GetPrice(Guid productId) { MyDbContext ctx = …; return ctx.Prices.FirstOrDefault(p => p.ProductId == productId); } Executed N times

Slide 9

Slide 9 text

Library Main Entity Framework Core Performance-Optimierung aus der Praxis LINQ queries are loops Reducing database requests var products = LoadProducts(); var prices = products.Select(product => GetPrice(product.Id)) .ToList(); public Price GetPrice(Guid productId) { MyDbContext ctx = …; return ctx.Prices.FirstOrDefault(p => p.ProductId == productId); }

Slide 10

Slide 10 text

Library Main Entity Framework Core Performance-Optimierung aus der Praxis Approach 1 – Use-case specific methods Reducing database requests var productsWithPrices = LoadProductsWithPrices(); public List LoadProductsWithPrices() { return ctx.Products .Include(p => p.Prices) // merely symbolic, fetching prices is usually more complex .ToList(); } Approach 1.1

Slide 11

Slide 11 text

Library Main var products = LoadProducts(); var productIds = products.Select(p => p.Id); var pricesByProductId = GetPrices(productIds); public ILookup GetPrices(IEnumerable productIds) { return ctx.Prices .Where(p => productIds.Contains(p.ProductId)) .ToLookup(p => p.ProductId); } Entity Framework Core Performance-Optimierung aus der Praxis Approach 2 – load all required prices at once Reducing database requests Approach 1.2

Slide 12

Slide 12 text

Given: 100 products being in 5 studio Loading of all products “in one go” leads to 1 + 5 queries Library Main var productsByStudio = ctx.Products .Include(p => p.Studio) .ToLookup(p => p.Studio.Name); Entity Framework Core Performance-Optimierung aus der Praxis Lazy loading Reducing database requests public static ILookup ToLookup( this IEnumerable source, Func keySelector) { ... } Use eager loading to prevent unnecessary requests IEnumerable always forces query execution Demo 1.3 Approach 1.3

Slide 13

Slide 13 text

Entity Framework Core Performance-Optimierung aus der Praxis Finding the problem area Use profiling tools In case of Microsoft SQL Server: XEvent Profiler in SQL Server Management Studio SQL Server Profiler (deprecated, resource intensive) Azure Data Studio Logs coming from EF with EnableSensitiveDataLogging Reducing database requests Logging DbContextOptionsBuilder builder = ...; builder.UseSqlServer("...") // or any other database .UseLoggerFactory(loggerFactory) .EnableSensitiveDataLogging(); In development only because possible security leak!

Slide 14

Slide 14 text

Entity Framework Core Performance-Optimierung aus der Praxis Finding the problem area Look for repetitive lines and cycles Reducing database requests Repetive lines Cycles Demo: SQL Server Profiler

Slide 15

Slide 15 text

Query tags: correlation between LINQ and SQL LINQ SQL Entity Framework Core Performance-Optimierung aus der Praxis Finding the problem area Reducing database requests var products = ctx.Products .TagWith("My Query") .ToList(); -- My Query SELECT * FROM Products

Slide 16

Slide 16 text

Entity Framework Core Performance-Optimierung aus der Praxis Reducing query complexity

Slide 17

Slide 17 text

Entity Framework Core Performance-Optimierung aus der Praxis Cartesian explosion problem Fetching a lot of data from multiple tables at once • Increased memory consumption and execution time • High I/O load Caused by: loading of multiple navigational collection-properties Possible solutions: • Reduce Includes (eager loading) • Reduce access to navigational properties in projections (i.e. Select) Reducing query complexity Query splitting Demo 2.1

Slide 18

Slide 18 text

Entity Framework Core Performance-Optimierung aus der Praxis Eager loading via Include EF 7 loads all data using 1 SQL statement by default (QuerySplittingBehavior.SingleQuery) • Bigger result sets: 100 products with 10 prices each and 2 sellers (incl. 200 product-seller-relationship) (100 * 10 * 2) rows > (100 + 1000 + 2 + 200) rows • EF-forced ORDER BY clause produces considerable load Reducing query complexity Product Id Price Id Seller Id 1 1 1 1 1 2 1 2 1 1 2 2 1 3 1 1 3 2 SQL LINQ var products = ctx.Products .Include(p => p.Studio) .Include(p => p.Prices) .Include(p => p.Sellers) .ToList(); SELECT * FROM Products INNER JOIN Studios ... LEFT JOIN Prices ON ... LEFT JOIN Sellers ON ... ... ORDER BY Products.Id, Studios.Id, Prices.Id, … Cartesian explosion

Slide 19

Slide 19 text

EF 7 allows to change the QuerySplittingBehavior from SingleQuery to SplitQuery Entity Framework Core Performance-Optimierung aus der Praxis Include: query splitting by EF Reducing query complexity Approach 2.1 LINQ var products = ctx.Products .AsSplitQuery() .Include(p => p.Studio) .Include(p => p.Prices) .Include(p => p.Sellers) .ToList();

Slide 20

Slide 20 text

Entity Framework Core Performance-Optimierung aus der Praxis Fetching collections in projections Cartesian Explosion due to fetching of collection in Select Reducing query complexity LINQ var result = _ctx.Studios .Select(s => new { Studio = s, Infinity = s.Products .Where(p => p.Name.Contains("Infinity")) .Select(p => new { Product = p, p.Prices, p.Sellers }), Endgame = s.Products .Where(p => p.Name.Contains("Endgame")) .Select(p => new { Product = p, p.Prices, p.Sellers }) }) .ToList(); Demo 2.2 Before Statistics: • CPU: 1766 • Duration: 3180 • Reads: 20773 • Row Count: 200000

Slide 21

Slide 21 text

Entity Framework Core Performance-Optimierung aus der Praxis Projections: query splitting by EF Cartesian Explosion due to fetching of collection in Select Reducing query complexity LINQ var result = _ctx.Studios .AsSplitQuery() .Select(s => new { Studio = s, Infinity = s.Products .Where(p => p.Name.Contains("Infinity")) .Select(p => new { Product = p, p.Prices, p.Sellers }), Endgame = s.Products .Where(p => p.Name.Contains("Endgame")) .Select(p => new { Product = p, p.Prices, p.Sellers }) }) .ToList(); Approach 2.2 AsSplitQuery Before Statistics: • CPU: 1766 • Duration: 3180 • Reads: 20773 • Row Count: 200000 AsSplitQuery 13 318 568 855

Slide 22

Slide 22 text

Usually, loading collection separately requires more code but less resources on database LINQ Entity Framework Core Performance-Optimierung aus der Praxis Manual query splitting Reducing query complexity var studiosQuery = _ctx.Studios; var studios = studiosQuery.ToList(); var infinityProductsQuery = studiosQuery.SelectMany(s => s.Products).Where(p => p.Name.Contains("Infinity")); var endgameProductsQuery = studiosQuery.SelectMany(s => s.Products).Where(p => p.Name.Contains("Endgame")); var infinityProducts = infinityProductsQuery.ToList(); var endgameProducts = endgameProductsQuery.ToList(); var productIds = infinityProducts.Concat(endgameProducts).Select(p => p.Id); var prices = _ctx.Prices.Where(p => productIds.Contains(p.ProductId)).ToList(); // cannot use "productsQuery.Select(p => p.Sellers)" because JoinTable "Seller_Product" won't be selected var sellers = _ctx.SellerProducts.Where(sp => productIds.Contains(sp.ProductId)).Include(sp => sp.Seller).ToList(); // Build the desired data structure Approach 2.2 Manual Splitting Before Statistics: • CPU: 1766 • Duration: 3180 • Reads: 20773 • Row Count: 200000 AsSplitQuery 13 318 568 855 Manual Splitting 1 270 22 1305

Slide 23

Slide 23 text

Entity Framework Core Performance-Optimierung aus der Praxis Understanding queries P

Slide 24

Slide 24 text

LINQ 2 LINQ 1 Entity Framework Core Performance-Optimierung aus der Praxis Which one is better? Understanding Queries var studios = ctx.Studios .Select(s => new { s.Products.OrderBy(p => p.Id).FirstOrDefault().Id, s.Products.OrderBy(p => p.Id).FirstOrDefault().Name }) .ToList(); var studios = ctx.Studios .Select(s => s.Products .OrderBy(p => p.Id) .Select(p => new { p.Id, p.Name }) .FirstOrDefault()) .ToList(); 2x “FirstOrDefault()” before selecting properties Selection of properties before “FirstOrDefault()”

Slide 25

Slide 25 text

SQL 2 SQL 1 Entity Framework Core Performance-Optimierung aus der Praxis Lets try with SQL … Understanding Queries SELECT ( SELECT TOP(1) p.Id FROM Products p WHERE s.Id = p.StudioId ORDER BY p.Id) AS FirstProductId, ( SELECT TOP(1) p.Name FROM Products p WHERE s.Id = p.StudioId ORDER BY p.Id) AS FirstProductName FROM Studios s SELECT p.Id, p.Name FROM Studios s LEFT JOIN ( SELECT Id, Name, StudioId FROM ( SELECT Id, Name, StudioId, ROW_NUMBER() OVER(PARTITION BY StudioId ORDER BY Id) AS row FROM Products ) p WHERE row <= 1 ) p ON s.Id = p.StudioId 2 sub-selects Window function “ROW_NUMBER”

Slide 26

Slide 26 text

Entity Framework Core Performance-Optimierung aus der Praxis Execution plans

Slide 27

Slide 27 text

Entity Framework Core Performance-Optimierung aus der Praxis Textual representation of all database operations • Type and the order of operations (JOIN, filter, projection, ...) • Indexes being used • Amount of data flowing between two operations • Costs of an operation and a subtree Some tools have built-in support for displaying execution plans as a graph Execution plans

Slide 28

Slide 28 text

Entity Framework Core Performance-Optimierung aus der Praxis Another try … Execution plans Execution plan 1 SQL 1 SELECT ( SELECT TOP(1) p.Id FROM Products p WHERE s.Id = p.StudioId ) AS FirstProductId, ( SELECT TOP(1) p.Name FROM Products p WHERE s.Id = p.StudioId ) AS FirstProductName FROM Studios s Why not “index seek”?

Slide 29

Slide 29 text

Entity Framework Core Performance-Optimierung aus der Praxis Execution plans Execution plan 2 SQL 2 SELECT p.Id, p.Name FROM Studios s LEFT JOIN ( SELECT Id, Name, StudioId FROM ( SELECT Id, Name, StudioId, ROW_NUMBER() OVER(PARTITION BY StudioId ORDER BY Id) AS row FROM Products ) p WHERE row <= 1 ) p ON s.Id = p.StudioId Why “index scan” + “sort”?

Slide 30

Slide 30 text

Entity Framework Core Performance-Optimierung aus der Praxis Comparing • Compare estimated subtree costs • Deviance: estimated vs actual number of rows • Take statistics (reads) into consideration Execution plans Demo 3.1

Slide 31

Slide 31 text

Entity Framework Core Performance-Optimierung aus der Praxis SQL Server Operators General icon Cursors Parallelism T-SQL Execution plans

Slide 32

Slide 32 text

Entity Framework Core Performance-Optimierung aus der Praxis Execution plans Crash course Table scan Clustered index scan Non-clustered index scan Clustered index seek Non-clustered index seek Scan No clustered index Seek Filter Filtering Filtering Filtering Key lookup Missing (include) columns? RID lookup If no clustered index Sort “Stop & Go” operator! Required? Missing index? Sorting in .NET cheaper? Parallelism Expensive High query complexity Missing index? Fuzzy search? Bad discriminator?

Slide 33

Slide 33 text

Entity Framework Core Performance-Optimierung aus der Praxis JOINs Execution plans DB may perform sort on its own • Resource-saving • Data sets must be ordered • The “default” • Kind-of 2 nested loops • Speed depends on: • Scan of data set A • Seek of data set B • Large and unsorted sets • 2 phases: • Builds hash table for set A • Matches hash values from B Should be looked at Merge join Nested loop join Hash match join

Slide 34

Slide 34 text

Entity Framework Core Performance-Optimierung aus der Praxis Learnings • Look for database requests in “loops” • Split complex queries if necessary • Learn to read execution plans - optimize queries accordingly Resources • Entity Framework Core: https://docs.microsoft.com/en-us/ef/core/ • Execution Plans: https://docs.microsoft.com/en-us/sql/relational-databases/performance/execution-plans • Webinar demos: https://github.com/thinktecture/ef-core-performance-webinar • My EF playground: https://github.com/PawelGerr/EntityFrameworkCore-Demos Ask me @pawelgerr [email protected]