Database Indexes - Speaker Deck

Slide 1

Slide 1 text

DATABASE   INDEXES #SperasoftTalks

Slide 2

Slide 2 text

Who We Are? We are a team of professionals specializing in game development, art production, online engineering and creation of amazing products. Our technology competencies include solid experience and background in delivering scalable platforms and online solutions. That serve millions of players all over the world and run beyond amazing games. Read more: http://www.sperasoft.com

Slide 3

Slide 3 text

•  RDBMS store data only in Trees •  Index is a tree in terms of data structure •  a Table is an Index •  a Clustered Index is a Table itself •  a Non-‐clustered Index is a copy of data •  all Non-‐clustered Indexes refer to Clustered one •  all keys in Tree Nodes are always unique The Simple Truth

Slide 4

Slide 4 text

•  Oracle Database •  SQL Server •  IBM DB2 •  MySQL •  PostgreSQL •  Sybase •  Informix What’s Common Between

Slide 5

Slide 5 text

RDBMS is a type of Database Management System that stores data in the form of related tables RDBMS is a Database Management System that is based on the relaJonal model introduced by E.F. Codd Data is stored in tables and the relaJonships among the data are also stored in tables Relational Database Management Systems

Slide 6

Slide 6 text

•  Born on the Isle of Portland in England in 1923 •  Died in Florida US in 2003, aged 79 •  MathemaJc •  Worked for IBM Edgar Frank “Ted” Codd

Slide 7

Slide 7 text

•  Introduced “A RelaJonal Model of Data for Large Shared Data Banks” and Alpha database language •  IBM started implemenJng the RelaJonal model and introduced another language named SEQUEL Edgar Frank “Ted” Codd

Slide 8

Slide 8 text

•  Larry Ellison came up in Jme with his implementaJon of RelaJonal model and the language – Oracle Database and SQL •  ANSI started making SQL standard Birth of Oracle

Slide 9

Slide 9 text

•  It’s all about Table RelaJons Relation Model Briefly

Slide 10

Slide 10 text

•  Database contains tables (two dimensional arrays) •  Tables have relaJonships enforced by Foreign Key constraints (1-‐to-‐Many relaJonship) •  NormalizaJon of tables is a key concept •  That’s why RDBMS are called RelaJonal Relation Model

Slide 11

Slide 11 text

What’s Database Physically

Slide 12

Slide 12 text

•  Files are ﬂat in nature FILE READING CURSOR 0 OFFSET All Tables Are Stored in a File

Slide 13

Slide 13 text

•  What’s the value behind relaJons? •  What is a database table? •  What is a table index? •  RelaJons vs How data is stored What’s Actually Matter

Slide 14

Slide 14 text

Id User Name Country City Age 1 Michael USA Boston 30 2 Jane USA Boston 24 3 Scoe USA NYC 18 4 Bob UK London 41 5 Prescoe UK London 35 •  Such array seems to be a table •  How to ﬁnd Users from Boston faster? ArrayList users = new ArrayList(); How to Handle Millions of Users

Slide 15

Slide 15 text

Boston 1, Michael, USA, Boston, 30 2, Jane, USA, Boston, 24 NYC 3, Scoe, USA, NYC, 18 London 4, Bob, UK, London, 41 5, Prescoe, UK, London, 35 Index is a Tree

Slide 16

Slide 16 text

ID = 2 2, Jane, USA, Boston, 24 ID = 1 1, Michael, USA, Boston, 30 ID = 3 3, Scoe, USA, NYC, 18 Can replace an initial array with Index

Slide 17

Slide 17 text

•  Key values in a Key node should be unique •  Otherwise Trees do not work What’s important to note

Slide 18

Slide 18 text

•  Indexes are Trees in terms of data structure •  Trees are suitable to store any array of data to make search faster Returning to our sheep

Slide 19

Slide 19 text

•  All RDBMS store data as Balanced Trees •  The concrete implementaJon of B-‐Tree could diﬀer from vendor to vendor •  It means the only way to store data is Tree •  No excepJons here -‐ table is a tree, index is a tree Balanced Trees

Slide 20

Slide 20 text

What’s a Clustered Index

Slide 21

Slide 21 text

Slide 22

Slide 22 text

Have a quesKon? Like this deck? Tweet us @SperasoR Like deck on SlideShare.com/sperasoR

Slide 23

Slide 23 text

•  Clustered Indexes •  Non-‐clustered indexes •  Both could be unique and non-‐unique •  Table can be without any indexes •  How is that comply with how data is actually stored? What SQL allows us to do

Slide 24

Slide 24 text

•  Unique and non-‐unique •  CREATE CLUSTERED INDEX [name] ON [table_name] ([column1], [column2]) •  CREATE UNIQUE CLUSTERED INDEX [name] ON [table_name] ([column1], [column2]) Clustered Indexes

Slide 25

Slide 25 text

•  Unique and non-‐unique •  CREATE NONCLUSTERED INDEX [name] ON [table_name] ([column1], [column2]) •  CREATE UNIQUE NONCLUSTERED INDEX [name] ON [table_name] ([column1], [column2]) None Clustered Indexes

Slide 26

Slide 26 text

ID = 2 Jane, USA, Boston, 24 ID = 1 Michael, USA, Boston, 30 ID = 3 Scoe, USA, NYC, 18 Unique Clustered Index

Slide 27

Slide 27 text

•  We know Key values should be unique •  How RDBMS resolves this problem? Non-unique Clustered Index

Slide 28

Slide 28 text

•  SQL Server adds 4-‐byte uniquiﬁer to each duplicated key value •  Algorithms could diﬀer from vendor to vendor •  But the principle is the same – add something to make them unique Non-unique Clustered Index

Slide 29

Slide 29 text

•  Just omitng Unique keyword makes Key values bigger (why it’s bad realize later) •  The simple truth is that Each table should have Clustered Index •  The Clustered Index should be always Unique •  The situaJons when its not so should be excepJonal Clustered Indexes

Slide 30

Slide 30 text

•  Such tables are called Heap Tables •  How are they stored in database if they do not have a Key value speciﬁed? Tables without Clustered Index

Slide 31

Slide 31 text

•  Heap Tables are also stored in Trees •  What’s in a Key value for Tables without Clustered Index? •  The value called RID •  the unique idenJﬁer which refers to the physical locaJon of the record in a ﬁle No magic over here

Slide 32

Slide 32 text

•  There is no meaningful data in Keys •  Table records are not stored physically in Keys’ order Why Heap Tables are so bad

Slide 33

Slide 33 text

•  Clustered Index has the actual data columns in Leaf-‐ nodes •  What’s in Leaf-‐node of Non-‐clustered index? •  Remember that Non-‐clustered Indexes are duplicated data Non-clustered Indexes

Slide 34

Slide 34 text

Jane Lookup value: ID=2 Michael Lookup value: ID=1 Scoe Lookup value: ID=3 •  Leaf-‐nodes contain the lookup values •  Lookup value is Clustered Index’s Key Non-clustered Index

Slide 35

Slide 35 text

•  We know Key values should be unique •  How non-‐clustered index’s key becomes unique? Non-unique Non-clustered Index

Slide 36

Slide 36 text

•  SQL Server adds Clustered Index Key value to Non-‐clustered Index Key value to make it unique Jane, 2 Lookup value: ID=2 Michael, 1 Lookup value: ID=1 Scoe, 3 Lookup value: ID=3 Non-unique Non-clustered Index

Slide 37

Slide 37 text

•  from SELECT statement the WHERE condiJon is taken •  based on the Columns in WHERE we know what columns we search by •  look through available indexes trying to ﬁnd the appropriate one, starJng from Clustered •  found out non-‐clustered index which ﬁts best How indexes are used (1)

Slide 38

Slide 38 text

•  get the needed Node in Non-‐clustered index •  get the Lookup value from that Node •  use that lookup value to ﬁnd a record in Clustered index •  get selected columns from Clustered index (table itself) How indexes are used (2)

Slide 39

Slide 39 text

•  Unique Clustered Index on Id column •  Non-‐unique Non-‐clustered Index on City column •  Select UserName from tbl where City = ‘Boston’ Sample 1

Slide 40

Slide 40 text

•  Unique Clustered Index on Id column •  Non-‐unique Non-‐clustered Index on City column •  Select Id from tbl where City = ‘Boston’ Sample 2

Slide 41

Slide 41 text

•  Unique Clustered Index on Id column •  Non-‐unique Non-‐clustered Index on City column •  Select UserName from tbl where City = ‘Boston’ select should not go to Clustered Index Sample 3

Slide 42

Slide 42 text

•  Unique Clustered Index on Id, UserName column •  Select Id from tbl where City = ‘Boston’ and UserName = ‘Michael’ •  What columns Non-‐unique Non-‐clustered Index would include? Sample 1

Slide 43

Slide 43 text

WE ARE SPERASOFT DELIVERING AMAZING PRODUCTS Follow us: @SperasoA hCp://www.sperasoA.com